EDmodel / ED2

Ecosystem Demography Model

h5 output question #231

Closed istfer closed 3 years ago

istfer commented 6 years ago

I have a question regarding h5 outputs:

When I set up a run that starts on Jan 1st 2005, e.g.:

   NL%IMONTHA  = 01
   NL%IDATEA   = 01
   NL%IYEARA   = 2005 
   NL%ITIMEA   = 0000

I always get an analysis-T-2004-00-00-000000-g01.h5 file that has 1 value in it, and the 2005 file is missing one value. Has anyone else experienced this before? Is it related to how I initialize runs and use the namelist file, or does it have something to do with Fortran indexing?

manfredo89 commented 6 years ago

Do you really need the T files? So far I've not seen many people using them, but I may be wrong. To prevent them from being written, you can set NL%ITOUTPUT = 0 in the ED2IN.

mdietze commented 6 years ago

@manfredo89 Yes, we definitely need the T files! They're the primary means for calibrating and validating ED2 against fast-time scale variables (e.g. eddy covariance) and also the default files that PEcAn processes into our across-models standard output format.

manfredo89 commented 6 years ago

Ok. I was also writing this because when I started using ED 2 years ago, I had some crashes that (I believe) I solved by turning off T file output. This is probably unrelated but I always felt like the flag should be turned off if there is no specific need to have this output.

istfer commented 6 years ago

Is @mrjohnston using tower files? Maybe she encountered this before.

I can handle this mismatch on the post-process end but I was wondering whether there is an easy fix on the settings or maybe in the code.

ashehad commented 6 years ago

Try running from the start of the growing season (e.g. 1st June) if it's a relatively cold site that you're simulating. Maybe this will fix your issue.

istfer commented 6 years ago

Thanks, but having the 2004 file is the symptom, not the problem.

That way I won't get the 2004 file, yes. But the outputs are still shifted by one time step, so they don't align perfectly with the flux tower data. A half-hour misalignment might not be a big issue normally (in fact, I didn't even notice it in time series plots against data until I started comparing day-to-day), but it changes the model-data comparison metrics, and thus my model calibration results.
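To illustrate why a one-step shift matters for comparison metrics, here is a minimal sketch. It uses a synthetic sine wave as a stand-in for a flux (not tower data), and the half-hourly step count and the direction of the shift are assumptions made only for this example.

```python
import math

STEPS_PER_DAY = 48  # half-hourly output; an assumption for this sketch


def diurnal(step):
    """Synthetic diurnal signal standing in for a flux; not real data."""
    return math.sin(2.0 * math.pi * step / STEPS_PER_DAY)


def rmse(a, b):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


n = 2 * STEPS_PER_DAY
observed = [diurnal(i) for i in range(n)]
# Model values that are otherwise perfect but stamped one step late.
shifted = [diurnal(i - 1) for i in range(n)]

# Comparing index by index penalizes the model for the time stamp alone.
rmse_misaligned = rmse(observed, shifted)

# Re-aligning on the post-processing side: drop the first model value
# and the last observation so that matching times are compared.
rmse_realigned = rmse(observed[:-1], shifted[1:])
```

In this toy case the misaligned RMSE is nonzero even though the underlying signal is identical, and it drops to zero once the series are re-aligned.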

mrjohnston commented 6 years ago

@istfer I haven't used T files for ages, and never for anything that wasn't quite quick-and-dirty. Sorry not to be of more help.

istfer commented 6 years ago

Thanks, everyone, for the answers.

Can it be due to the early increment of irec_opt here?

I think it should start at 0 and be incremented after this chunk.

I mean, should it look like this:

            if (outmonth == 1 .and. outdate == 1 .and. outhour == 0) then
               new_file = .true.
               irec_opt = 0
            end if
            irec = irec_opt
            if (irec == 0) new_file = .true.

            irec_opt = irec_opt + 1

so that it starts writing from the first record instead of bumping the first value into the previous year's h5 file.

What do you think?

istfer commented 6 years ago

Me again.. I finally had to debug this because it was overwriting the output file immediately before the restart date during my stop/restart experiments with the tower outputs. So I started printing out some time info.

Turns out it wasn't an indexing issue as I suggested in my previous comment. Two fixes solve my issue, and they would only concern OPTI file output. But one of them could be relevant to all outputs, so I wanted to note it down (see item 2 below).

1) When I start a run from Jan 1st (e.g. from 2005/01/01 00:00:00), the h5 output skips the first time step because the model time is updated before the output driver is called. The rest of the output is therefore shifted. My solution is to add a call to the h5_output subroutine before entering the loop, e.g. if (itoutput /= 0) call h5_output('OPTI'), as seems to be done for the other output types.

2) I realized that the main model driver ed_model.F90 and the h5_output subroutine were tracking different times, with a 6-hour discrepancy. At first I thought it was a time zone thing, but it looks like a 6-hour subtraction (time-21600.0d0) is hardcoded into the date_add_to calls throughout the h5_output subroutine:

call date_add_to(iyeara,imontha,idatea,itimea*100,time-21600.0d0,'s',outyear   &
,outmonth,outdate,outhour)

As far as I can tell, this goes back to the very beginning, when @mpaiao initially committed the code, and I assume it was copy-pasted to other places after that. I'm going to remove that hardcoded value from the date_add_to call under the OPTI case in an upcoming PR. I'm not sure why it was there in the first place, or whether we want to remove it from the other calls as well.
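The ordering problem in item 1 can be shown with a toy time loop. This is only a sketch of the pattern, not ED2's actual driver: the function name, the 1800 s step, and the list standing in for written records are all made up for illustration.

```python
DT = 1800  # seconds per output step; half-hourly is an assumption


def run(n_steps, write_initial):
    """Return the model times (seconds since start) at which output is written.

    Toy stand-in for a driver loop that advances time before writing output.
    """
    written = []
    time = 0
    if write_initial:
        # Stands in for an output call made once before entering the loop.
        written.append(time)
    for _ in range(n_steps):
        time += DT            # model time is advanced first ...
        written.append(time)  # ... and only then is the record written
    return written


without_initial = run(3, write_initial=False)  # [1800, 3600, 5400]
with_initial = run(3, write_initial=True)      # [0, 1800, 3600, 5400]
```

Without the initial call, the record for the start time never appears and every later record sits one slot earlier than expected, which matches the shift described above.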

mpaiao commented 6 years ago

@istfer The hardcoded subtraction is needed for many of the output files. I can't tell anything about 'OPTI' and 'YEAR' because I have not implemented these, but for 'DAIL','MONT', and 'DCYC', this subtraction makes sure that the time stamp corresponds to the average period.

Just to give an example, let's say we are integrating monthly averages of February 2018. The model updates all the "mmean" variables throughout the month, and the average is ready for output when the model time is March 1st 2018 at 00UTC. When the code creates the file, this shift of 6 hours sends the time back to February so the monthly average of February 2018 will be in a file like myhist-E-2018-02-00-000000-g01.h5; the hour and day are ignored in this case. If we didn't subtract these hours, the February averages would end up in file myhist-E-2018-03-00-000000-g01.h5, which would be confusing. Notice that neither the history nor the inst files ("case default") have this shift in time.
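The February example can be checked with plain date arithmetic. This is a sketch using Python's datetime rather than date_add_to, and file_month is a hypothetical helper that only mimics the year-month part of the file name.

```python
from datetime import datetime, timedelta

OFFSET_S = 21600  # the 6-hour shift discussed in this thread, in seconds


def file_month(model_time, offset_s):
    """Year-month label for a monthly-mean file (illustrative helper only)."""
    stamp = model_time - timedelta(seconds=offset_s)
    return f"{stamp.year:04d}-{stamp.month:02d}"


# The February 2018 means are complete when the model reaches this time.
ready = datetime(2018, 3, 1, 0, 0)

label_with_shift = file_month(ready, OFFSET_S)  # "2018-02"
label_without_shift = file_month(ready, 0)      # "2018-03"
```

With the shift the label falls in February, and without it the same averages would be labeled March.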

istfer commented 6 years ago

Thanks for the explanation @mpaiao. Likewise it sends 2005 values to a 2004 file for -T- outputs, lagging behind the actual time. It probably shouldn't be the case for OPTI outputs.

I guess two things then. First, isn't this a hack? I mean, isn't there somewhere else that this could be fixed (e.g. the check where we decide when to write monthly outputs, etc.)?

And second, is there a particular reason for this shift to be exactly 6 hours, i.e. why not 4 or 12 hours? If this has to stay, then it might be nice to assign this value to a named variable and document the reasoning a little bit.

mpaiao commented 6 years ago

@istfer This part of the code is really old, actually older than the "very beginning": the date_add_to routine came from RAMS/BRAMS. But I completely agree that this offset should not be hardcoded, and that the rationale should be explained in the code. The other option would be to add a logical variable as a new argument to date_add_to, and use this variable to decide whether to shift the time stamp to the previous day or not.

Under the current approach, there is no particular reason for this offset to be 6 hours: 4 or 12 hours would work fine too. I have a vague recollection that this number had to be bigger than a tiny offset (like 1 s) because of the numerical precision of the 'time' variable. It also has to be less than 86400 s so that the -D- files have the correct time stamp.
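A quick way to see the range of workable offsets for daily files is to try a few values. This is a sketch with Python's datetime, not the Fortran routine; daily_label is a hypothetical helper, and the 0 h and 25 h cases are included only to show what goes wrong outside the range.

```python
from datetime import datetime, timedelta


def daily_label(model_time, offset_s):
    """Date label for a daily-mean file (illustrative helper only)."""
    stamp = model_time - timedelta(seconds=offset_s)
    return stamp.strftime("%Y-%m-%d")


# The daily mean for 10 Feb 2018 is complete when the model reaches this time.
ready = datetime(2018, 2, 11, 0, 0)

# Offsets in hours mapped to the date that would label the file.
labels = {hours: daily_label(ready, hours * 3600) for hours in (0, 4, 6, 12, 25)}
```

Here the 4, 6, and 12 hour offsets all give 2018-02-10, no offset gives 2018-02-11, and a 25 hour offset overshoots to 2018-02-09.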