EC-Earth / ece2cmor3

Post-processing and cmorization of ec-earth output
Apache License 2.0

QA-DKRZ in EC-Earth #504

Closed zklaus closed 5 years ago

zklaus commented 5 years ago

Here we collect information about QA-DKRZ as it is applied in EC-Earth and how we progress in solving outstanding issues. It is the follow-up issue to #497.

The idea here is to keep everything we know and learn about how to work with QA-DKRZ in this first post. Feel free to edit it as epiphanies occur. The second post contains every open question, particularly those annotations that still should be dealt with.

Lessons so far:

zklaus commented 5 years ago

@oloapinivad made a test run and shared the following information:

I installed the GitHub version following Uwe's to-do list (albeit with some difficulties) and ran a fresh QA-DKRZ with CHECK_MODE=TIME,DATA,CNSTY,CF,DRS,DRS_F,DRS_P, which produced a new report. Of course, CV is still missing.

Annotations: http://wilma.to.isac.cnr.it/ecearth/diag/CMIP6/chis/infocmor/EC-Earth-Consortium_EC-Earth3_historical_r4i1p1f1.json
Logfile: http://wilma.to.isac.cnr.it/ecearth/diag/CMIP6/chis/infocmor/EC-Earth-Consortium_EC-Earth3_historical_r4i1p1f1.log

A note on QA-DKRZ speed: it took 50 minutes with 18 cores and 5 years (i.e. about 1350 files)
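
As a rough back-of-the-envelope sketch of what these numbers imply: the snippet below assumes that the run time scales linearly with the number of simulated years on the same 18 cores (an assumption, not something measured here) and uses the 165 years of a full CMIP6 historical run (1850-2014) as an example. The helper name `estimate_minutes` is made up for illustration.

```python
def estimate_minutes(years, ref_years=5, ref_minutes=50):
    """Scale the reported timing (50 minutes for 5 years) linearly to `years`."""
    return ref_minutes * years / ref_years

# Throughput implied by the reported test run
files_per_minute = 1350 / 50                    # about 27 files per minute
files_per_core_minute = files_per_minute / 18   # about 1.5 files per minute and core

# Hypothetical extrapolation to a full 165-year historical run
historical_minutes = estimate_minutes(165)
print(f"{historical_minutes / 60:.1f} hours")
```

Under these assumptions a complete historical run would need roughly a day of QA-DKRZ time, so the real figure should be checked on a longer run before planning around it.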

Looking at the annotations and ignoring those that are already covered by the lessons learned so far in the first comment of this issue, the following are still outstanding:

| Status | Tag | Description | Comment |
|---|---|---|---|
| #506 | CF_0b | Unused dimension bnds | |
| IS-ENES-Data/QA-DKRZ#21 | R25600 | Variable <*>: _FillValue at rec# <0> | |
| #462 | a65b | CMOR The entry wap could not be found in CMOR table | |
| #462 | 8067 | CMOR The entry zg could not be found in CMOR table | |
| Open | CF_33e | Attribute :units = <1> is not CF compatible with standard_name=region | Seems a legitimate complaint, but needs more investigation wrt flag values in CF |
| #471 | 6_15 | Variable empty data body | |
| #471 | 6_2 | Data set entirely of _FillValue | |

oloapinivad commented 5 years ago

Reporting from #471, and on the Veg configuration too. I also added 6_1: "Variable <*>: Entire file of const value=0." since there are many instances of it in the Veg run.

  1. AOGCM historical:
    • 6_15 + 6_2: Omon variables ficeberg and hfibthermds
    • 6_1: Omon variables wfcorr and hfcorr (we decided to keep this)
  2. Veg historical:
    • 6_15 + 6_2: Omon variable fgcfc12
    • 6_1: variables from different tables: fDeforestToProduct, fLuc, fLulccProductLut, fNProduct, fProductDecomp, fProductDecompLut, nProduct, fracInLut, fracOutLut, shrubFrac, cProduct, residualFrac.
    • R200: variables from different tables: cTotFireLut, fFireNat, fLuc, fProductDecomp, fProductDecompLut, treeFracNdlDcd, fFire, fGrazing (however, these seem much more reasonable: some LPJG events occur only once per year, see https://dev.ec-earth.org/issues/670, so several monthly fields could be empty. We need an LPJG expert for this and for the 6_1 list above anyhow)

The annotations from the Veg historical are here: http://wilma.to.isac.cnr.it/ecearth/diag/CMIP6/vhis/infocmor/EC-Earth-Consortium_EC-Earth3-Veg_historical_r4i1p1f1.json

@zklaus can you add in the table above also the errors/warnings we have detected but ignored (i.e. the ones you mentioned in the first post, as CF251e and CF732a)

What was the decision on CF_73c? I think I missed it.

BTW, 8_8a and 8_8b are solved by setting PT_PATH_INDEX=2,3,6,8 (this is discussed in #497)

zklaus commented 5 years ago

Thanks, @oloapinivad, that's really helpful! :beers:

One little quibble: my GitHub handle is @zklaus. You are sending a bunch of notifications to a certain Klaus Ita, whom I don't know.

oloapinivad commented 5 years ago

> One little quibble: my GitHub handle is @zklaus. You are sending a bunch of notifications to a certain Klaus Ita, whom I don't know.

Ahahah, thanks for spotting!

zklaus commented 5 years ago

> @klaus can you add in the table above also the errors/warnings we have detected but ignored (i.e. the ones you mentioned in the first post, as CF251e and CF732a)

I deliberately left them out, but I just added an explanation to the very first post in this issue of my idea. Basically, the first post contains what we know, the second what we must still figure out. Both should be edited along our journey.

> What was the decision on CF_73c? I think I missed it.

Well spotted! I added it to the first post (another issue in QA-DKRZ; should be ignored by us for now).

> BTW, 8_8a and 8_8b are solved by setting PT_PATH_INDEX=2,3,6,8 (this is discussed in #497)

Hm, could you try again with the original PT_PATH_INDEX=2,3,4? I suspect that the real error (apart from the wrong documentation that confused us) was a wrong frequency detection, which prevented this mechanism from working properly. Now that the frequency is fixed, maybe it works as intended. That would provide better coverage.

Also, could you try the CV test one more time? @h-dh thinks this should be solved by now, so would be nice to confirm.

oloapinivad commented 5 years ago

Well, neither PT_PATH_INDEX=2,3,4 nor CV is working for me. The former still gives me the checksum error, while for the latter I opened a new issue: https://github.com/IS-ENES-Data/QA-DKRZ/issues/20.

@uwefladrich did you experience the same?

uwefladrich commented 5 years ago

@oloapinivad I've been busy producing new cmorised data. Back to qa-dkrz tomorrow, probably.

zklaus commented 5 years ago

I guess we should go with PT_PATH_INDEX=2,3,6,8 then, though I am not sure how effective that will be. Oh, one question: did you wipe the result directories before retrying? Otherwise there may have been an old pt_... file lying around, giving spurious results.

As for the CV, I think you are right that we should keep the issue in QA-DKRZ open. But is there any more output that you could share there? The error messages you posted definitely look similar, but not quite the same and there is little context to judge the exact origin of the error. Maybe you could post the entire output of that run as a .txt attachment?

oloapinivad commented 5 years ago

> I guess we should go with PT_PATH_INDEX=2,3,6,8 then, though I am not sure how effective that will be. Oh, one question: did you wipe the result directories before retrying? Otherwise there may have been an old pt_... file lying around, giving spurious results.

Yes, every QA-DKRZ run starts from scratch.

> As for the CV, I think you are right that we should keep the issue in QA-DKRZ open. But is there any more output that you could share there? The error messages you posted definitely look similar, but not quite the same and there is little context to judge the exact origin of the error. Maybe you could post the entire output of that run as a .txt attachment?

Done.

So the question now is: are any of these remaining open issues preventing us from publishing data? To my eyes, the only really relevant ones are 6_15 and 6_2, discussed in issue 609 of the EC-Earth portal (which is currently offline).

zklaus commented 5 years ago

@oloapinivad you mean 609, right?

@h-dh just commented on the CV issue. Basically, CHECK_MODE=CV activates QA-DKRZ's CV checking, but this has been abandoned because PrePARE now checks for CV problems. Hence, CV should just remain deactivated and will probably be removed completely from QA-DKRZ at some point.

One less problem, hurray!

oloapinivad commented 5 years ago

> @oloapinivad you mean 609, right?

Yes, of course, sorry for the typo; corrected.

> @h-dh just commented on the CV issue. Basically, CHECK_MODE=CV activates QA-DKRZ's CV checking, but this has been abandoned because PrePARE now checks for CV problems. Hence, CV should just remain deactivated and will probably be removed completely from QA-DKRZ at some point.
>
> One less problem, hurray!

I have just seen it! Very good!

oloapinivad commented 5 years ago

As you can see from #502, the PrePARE wap and zg issue is gone. Indeed, the QA-DKRZ annotations no longer report any of these errors. @zklaus, can you update the table at the top of the post?

zklaus commented 5 years ago

I can and I have :smile:. In the future, feel free to make these kinds of edits yourself, though if you prefer I will also be happy to do them upon request.

As for the question of publication, I agree that we can publish now. Indeed, we already have published an AMIP and a historical run; the piControl is scheduled for the next week, and we hope to follow up with the scenarios really soon.

What we do with any problematic variables (think 6_15 style errors) is usually just to remove them from the catalog prior to publication.

oloapinivad commented 5 years ago

> I can and I have 😄. In the future, feel free to make these kinds of edits yourself, though if you prefer I will also be happy to do them upon request.

Actually, I don't think I have the permissions to edit your posts... or at least I don't know how to do it 😄

> What we do with any problematic variables (think 6_15 style errors) is usually just to remove them from the catalog prior to publication.

👍

I am now digging into R25600: on first inspection the files involved are OK, but they are characterized by a mask that evolves in time. Indeed, they are all related to sea ice. Here is an example; you can see the number of missing points changing in time.

~/scratch/ece3/chis/cmorized/cmor_1850/CMIP6/CMIP/EC-Earth-Consortium/EC-Earth3/historical/r4i1p1f1/SImon/sitemptop/gn/v20190626> cdo info sitemptop_SImon_EC-Earth3_historical_r4i1p1f1_gn_185001-185012.nc 
    -1 :       Date     Time   Level Gridsize    Miss :     Minimum        Mean     Maximum : Parameter ID
     1 : 1850-01-16 12:00:00       0   105704   94540 :      224.31      259.00      273.15 : -1            
     2 : 1850-02-15 00:00:00       0   105704   96060 :      224.94      256.16      273.15 : -1            
     3 : 1850-03-16 12:00:00       0   105704   95932 :      221.99      257.71      273.15 : -1            
     4 : 1850-04-16 00:00:00       0   105704   94766 :      228.84      259.60      273.15 : -1            
     5 : 1850-05-16 12:00:00       0   105704   93297 :      236.99      263.98      273.15 : -1            
     6 : 1850-06-16 00:00:00       0   105704   91769 :      237.51      266.14      273.15 : -1            
     7 : 1850-07-16 12:00:00       0   105704   90620 :      233.58      265.98      273.15 : -1            
     8 : 1850-08-16 12:00:00       0   105704   90411 :      215.20      264.46      273.15 : -1            
     9 : 1850-09-16 00:00:00       0   105704   90646 :      237.16      263.05      273.15 : -1            
    10 : 1850-10-16 12:00:00       0   105704   90256 :      239.68      262.45      273.15 : -1            
    11 : 1850-11-16 00:00:00       0   105704   90445 :      232.85      262.98      273.15 : -1            
    12 : 1850-12-16 12:00:00       0   105704   91612 :      226.19      262.41      273.15 : -1            
cdo info: Processed 1268448 values from 1 variable over 12 timesteps [0.08s 36MB]

Could it be that this non-constant number of missing points triggers the QA-DKRZ _FillValue warning?
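
A quick way to check several files for such a time-varying mask is to read the Miss column from the `cdo info` output. The snippet below is only a minimal sketch: it assumes the column layout shown above (fields separated by " : ", with Miss as the fifth field of the second group), and the function names are made up for illustration. The sample text is an excerpt of the output above.

```python
def missing_counts(cdo_info_text):
    """Return the 'Miss' column of `cdo info` output as a list of ints."""
    counts = []
    for line in cdo_info_text.splitlines():
        parts = line.split(" : ")
        if len(parts) < 3:
            continue  # not a table row (e.g. the final summary line)
        fields = parts[1].split()  # date, time, level, gridsize, miss
        if len(fields) == 5 and fields[4].isdigit():
            counts.append(int(fields[4]))  # the header row ("Miss") is skipped
    return counts


def mask_varies(counts):
    """True if the number of missing points differs between time steps."""
    return len(set(counts)) > 1


SAMPLE = """\
    -1 :       Date     Time   Level Gridsize    Miss :     Minimum        Mean     Maximum : Parameter ID
     1 : 1850-01-16 12:00:00       0   105704   94540 :      224.31      259.00      273.15 : -1
     2 : 1850-02-15 00:00:00       0   105704   96060 :      224.94      256.16      273.15 : -1
     3 : 1850-03-16 12:00:00       0   105704   95932 :      221.99      257.71      273.15 : -1
"""

counts = missing_counts(SAMPLE)
print(counts, mask_varies(counts))
```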

zklaus commented 5 years ago

As far as I understand the QA-DKRZ source code, it checks in src/QA_data.cpp if there are any _FillValues in variables that have unit "K". I don't understand why. Maybe it's time for another issue at QA-DKRZ?

oloapinivad commented 5 years ago

> As far as I understand the QA-DKRZ source code, it checks in src/QA_data.cpp if there are any _FillValues in variables that have unit "K". I don't understand why. Maybe it's time for another issue at QA-DKRZ?

And this explains why we only get it for variables related to sea-ice temperature. BTW, this was turned off for CMIP5. I will ask h-dh.

zklaus commented 5 years ago

Perfect. Oh yeah, about the editing: There should be a little area with three dots in the top right corner of my comment. If you click there, you should be presented with a small pop-up menu that contains an entry for "edit". However, maybe some rights issue prevents you from seeing that?

zklaus commented 5 years ago

Ok, @h-dh just confirmed that R25600 is obsolete (thanks!). That means we only have CF_33e and the various 6... left.

Thanks for all the work, @oloapinivad !

I will have a look at CF_33e, but probably only next week.

The 6_ errors seem to be most likely to actually point to problematic variables that we have left. @oloapinivad made a list in an earlier comment already, but maybe it is useful to track them separately from this issue since it is no longer about the QA-DKRZ tool, but really about our data.

@treerink, what is currently the preferred way to track issues with variables? Should we resolve this by configuration? By Experiment?

uwefladrich commented 5 years ago

I've completed a qa-dkrz run on our r1* ScenarioMIP experiments. Here's the annotation file for the SSP1-2.6 experiment (the other three are similar): EC-Earth-Consortium_EC-Earth3-Veg_ssp126_r1i1p1f1.json.txt

EDIT: As a summary: All of the tags that my run reports have been mentioned in this issue. So no new problems appeared :-)

uwefladrich commented 5 years ago

> I am now digging into R25600: from a first inspection the files involved are ok, but they are characterized by an evolving mask in time. Indeed, they are all related to sea-ice. An example here, you can see the missing point changing in time. [...]

There is a counterexample, unfortunately: LImon/tsn, which has nothing to do with sea ice (Snow Internal Temperature) and has a constant number of missing values. We need another hypothesis.

oloapinivad commented 5 years ago

> There is a counterexample, unfortunately: LImon/tsn, which has nothing to do with sea ice (Snow Internal Temperature) and has a constant number of missing values. We need another hypothesis.

Actually, I was wrong; the full story is well explained here: https://github.com/IS-ENES-Data/QA-DKRZ/issues/21. The test has been disabled in the most recent commit, so I think that R25600 can be ignored for now.

oloapinivad commented 5 years ago

> I've completed a qa-dkrz run on our r1* ScenarioMIP experiments. Here's the annotation file for the SSP1-2.6 experiment (the other three are similar): EC-Earth-Consortium_EC-Earth3-Veg_ssp126_r1i1p1f1.json.txt
>
> EDIT: As a summary: All of the tags that my run reports have been mentioned in this issue. So no new problems appeared :-)

This is very good! I think the last effort needed is to dig a bit more into the LPJG variables that trigger the empty-data warnings such as R200, 6_1, 6_15 and 6_2. We probably need to list them all in a separate issue.

uwefladrich commented 5 years ago

With the risk of stating/repeating the obvious, here are some of the variables with their meaning:

Can we ping some LPJG/PISCES person to tick off the variables that EC-Earth just doesn't have or those that have a good reason for being zero?

treerink commented 5 years ago

I think you have to ask David & Lars.

treerink commented 5 years ago

> @treerink, what is currently the preferred way to track issues with variables? Should we resolve this by configuration? By Experiment?

By error/problem I would say. But feel free to do it in a way which is most easy, most structured.

warlind commented 5 years ago

  • fLuc {Emon}: is working as it should. I don't know why you are getting zeros. No land use in your run?
  • treeFracNdlDcd {Emon}: is working as it should, but we only have one PFT of this kind (BNS - Larix), which usually grows very poorly, and since it is colder in our runs than in CRUNCEP it grows even less.
  • fFire {Lmon}: is working as it should. It might be that the run you are looking at has fire turned off (iffire 0) in the instruction file (global.ins).
  • fGrazing {Lmon}: is working as it should. I don't know why you are getting zeros.
  • cCwd {Lmon}: is a placeholder (set to zero), as we don't yet have a coarse woody dead organic matter pool that is distinct from litter. We will have one in the future.

Definition from CMIP6 Data Request (http://clipc-services.ceda.ac.uk/dreq/mipVars.html)

oloapinivad commented 5 years ago

> • 6_1: from different tables variables fDeforestToProduct, fLulccProductLut, fNProduct, fProductDecomp, fProductDecompLut, nProduct, fracInLut, fracOutLut, shrubFrac, cProduct, residualFrac.

@warlind could you also give us some advice about these? They come from a historical simulation with LPJG. These are the warnings we get:

        {
            "DRS_6": [ "Emon", "Lmon" ],
            "DRS_7": [ "cTotFireLut", "fFireNat", "fLuc", "fProductDecomp", "fProductDecompLut", "treeFracNdlDcd", "fFire", "fGrazing" ],
            "DRS_8": [ "gr" ],
            "annotation": "Warning: Data record totally with constant value <0>.",
            "tag": "R200",
            "severity": "Warning"
        },
        {
            "DRS_6": [ "Emon", "Eyr", "Lmon" ],
            "DRS_7": [ "fDeforestToProduct", "fLuc", "fLulccProductLut", "fNProduct", "fProductDecomp", "fProductDecompLut", "nProduct", "fracInLut", "fracOutLut", "shrubFrac", "cProduct", "residualFrac" ],
            "DRS_8": [ "gr" ],
            "annotation": "Variable <*>: Entire file of const value=0.",
            "example": "Variable <fDeforestToProduct>: Entire file of const value=0.",
            "tag": "6_1",
            "severity": "Warning"
        },

Thanks a lot for any help!
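
To see at a glance which variables are flagged under which tag, entries like the ones above can be grouped with a few lines of Python. This is only a sketch: the overall layout of the annotation file is not shown in this thread, so the function takes an already extracted list of entries, and the sample data below is an abridged, hand-made version of the two entries above. The name `variables_by_tag` is made up for illustration.

```python
import json
from collections import defaultdict


def variables_by_tag(entries):
    """Map each tag to the sorted variable names (DRS_7) it was raised for."""
    by_tag = defaultdict(set)
    for entry in entries:
        by_tag[entry["tag"]].update(entry.get("DRS_7", []))
    return {tag: sorted(names) for tag, names in by_tag.items()}


# Abridged entries in the shape shown above (variable lists shortened)
SAMPLE = json.loads("""
[
  {"DRS_6": ["Emon", "Lmon"], "DRS_7": ["fLuc", "fFire"],
   "tag": "R200", "severity": "Warning"},
  {"DRS_6": ["Emon", "Eyr", "Lmon"], "DRS_7": ["fLuc", "shrubFrac"],
   "tag": "6_1", "severity": "Warning"}
]
""")

summary = variables_by_tag(SAMPLE)
# Variables flagged both per record (R200) and per file (6_1)
both = sorted(set(summary["R200"]) & set(summary["6_1"]))
print(summary, both)
```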

warlind commented 5 years ago

What kind of simulation was this? Without any land-use change fDeforestToProduct will be zero.

For cTotFireLut I don't know why you get zeros. I have output in my runs. Same here as with fFire: fire might be turned off.

oloapinivad commented 5 years ago

Re. fires, I have just checked and fires are active in our simulation (i.e. iffire 1 in output.ins), so I guess they are also fine in @uwefladrich's runs.

Tag R200, as far as I understand, means that one time step is completely empty: but LPJG (https://dev.ec-earth.org/issues/670) computes many variables in December, so it is completely reasonable to have empty values in all the other months. So sorry, it was my fault: these are not a problem.

Conversely, tag 6_1 means that an entire cmorized file, i.e. at least an entire year, is completely empty. I have checked the first 5 years of my simulation and this is what I get

Does anybody know if there is any form of spinup that would explain empty values in the first years? I can also add that, as far as I know, shrubFrac and residualFrac should be zero since LPJG has no shrubs.
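
The difference between the record-level tag R200 and the file-level tag 6_1 described above can be illustrated with a small sketch. It is based only on the annotation texts quoted in this thread and is not QA-DKRZ's actual logic; the function names and the toy data are made up.

```python
def zero_records(records):
    """Indices of time records whose values are all zero."""
    return [i for i, rec in enumerate(records) if rec and all(v == 0 for v in rec)]


def classify(records):
    """Distinguish an all-zero file from a file with some all-zero records."""
    zeros = zero_records(records)
    if records and len(zeros) == len(records):
        return "file-level (6_1-like)"
    if zeros:
        return "record-level (R200-like)"
    return "none"


# Toy monthly file for a quantity that is only computed in December
yearly_event = [[0.0, 0.0]] * 11 + [[1.2, 0.4]]
# Toy monthly file for a quantity that is zero all year
always_zero = [[0.0, 0.0]] * 12

print(classify(yearly_event))
print(classify(always_zero))
```

In this picture a variable that is only written once per year is expected to show the record-level pattern, while the file-level pattern needs an explanation from the model side.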

uwefladrich commented 5 years ago

I'm afraid we haven't listed R100 yet. I get it on our DECK runs (piControl, 4xCO2, 1pctCO2):

> ack -B4 R100 heap-*/QC/qa-dkrz/check_logs/Annotations/*.json
heap-04-pict/QC/qa-dkrz/check_logs/Annotations/EC-Earth-Consortium_EC-Earth3-Veg_piControl_r1i1p1f1.json
18-    [
19-        {
20-            "DRS_7": [ "huss", "mrsos", "ps" ],
21-            "annotation": "Data record totally with _FillValue.",
22:            "tag": "R100",

heap-09-4xCO2/QC/qa-dkrz/check_logs/Annotations/EC-Earth-Consortium_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1.json
20-            "DRS_6": [ "3hr" ],
21-            "DRS_7": [ "huss", "mrsos", "ps", "tas", "tslsi", "uas", "vas" ],
22-            "DRS_8": [ "gr" ],
23-            "annotation": "Data record totally with _FillValue.",
24:            "tag": "R100",

heap-10-1pctCO2/QC/qa-dkrz/check_logs/Annotations/EC-Earth-Consortium_EC-Earth3-Veg_1pctCO2_r1i1p1f1.json
20-            "DRS_6": [ "3hr" ],
21-            "DRS_7": [ "huss", "mrsos", "ps", "tas", "tslsi", "uas", "vas" ],
22-            "DRS_8": [ "gr" ],
23-            "annotation": "Data record totally with _FillValue.",
24:            "tag": "R100",

It is similar to R200, just that one record is completely _FillValue, not zero.

The same list of 3hr variables is marked in the Period/*.range files of qa-dkrz as follows:

table_id: 3hr, number of variables: 22
clt_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr             --> 1850-01-01T03:00:00 - 2001-01-01T00:00:00    
hfls_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr                1850-01-01T00:00:00 - 2001-01-01T00:00:00    
hfss_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr                1850-01-01T00:00:00 - 2001-01-01T00:00:00    
mrro_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr                1850-01-01T00:00:00 - 2001-01-01T00:00:00    
huss_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr                1850-01-01T00:00:00 - 2000-12-31T21:00:00 <--
mrsos_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr               1850-01-01T00:00:00 - 2000-12-31T21:00:00 <--
pr_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr                  1850-01-01T00:00:00 - 2001-01-01T00:00:00    
prc_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr                 1850-01-01T00:00:00 - 2001-01-01T00:00:00    
prsn_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr                1850-01-01T00:00:00 - 2001-01-01T00:00:00    
ps_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr                  1850-01-01T00:00:00 - 2000-12-31T21:00:00 <--
rlds_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr                1850-01-01T00:00:00 - 2001-01-01T00:00:00    
rldscs_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr              1850-01-01T00:00:00 - 2001-01-01T00:00:00    
rlus_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr                1850-01-01T00:00:00 - 2001-01-01T00:00:00    
rsds_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr                1850-01-01T00:00:00 - 2001-01-01T00:00:00    
rsdscs_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr              1850-01-01T00:00:00 - 2001-01-01T00:00:00    
rsus_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr                1850-01-01T00:00:00 - 2001-01-01T00:00:00    
rsuscs_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr              1850-01-01T00:00:00 - 2001-01-01T00:00:00    
tas_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr                 1850-01-01T00:00:00 - 2000-12-31T21:00:00 <--
tos_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gn             --> 1850-01-01T03:00:00 - 2001-01-01T00:00:00    
tslsi_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr               1850-01-01T00:00:00 - 2000-12-31T21:00:00 <--
uas_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr                 1850-01-01T00:00:00 - 2000-12-31T21:00:00 <--
vas_3hr_EC-Earth3-Veg_abrupt-4xCO2_r1i1p1f1_gr                 1850-01-01T00:00:00 - 2000-12-31T21:00:00 <--

i.e. they're all marked at the end of the time range. It turns out that these are the variables in the 3hr table that have time stamps at 0:00, 3:00, ..., 21:00 every day. The variables that pass the test (all others in the 3hr table) have 1:30, 4:30, ..., 22:30 as time stamps. Could that be a qa-dkrz problem, because in the latter case there are eight time steps that are clearly within a day interval, whereas the former case has seven plus one at midnight? Note that nctime does not complain about the range.

Note that in the scenario runs, the variables huss and tas (the only ones from the above list written in scenarios) do not trigger R100.
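
The two kinds of end dates in the listing can be reproduced with a few lines of Python. This is a sketch under an assumption that is not confirmed anywhere in this thread: that the .range file reports the plain time stamps for instantaneous variables and the time bounds for 3-hourly means. The function names are made up for illustration.

```python
from datetime import datetime, timedelta

STEP = timedelta(hours=3)


def instantaneous_range(start, end):
    """First and last time stamp of 3-hourly instantaneous output in [start, end)."""
    n = (end - start) // STEP
    return start, start + (n - 1) * STEP


def mean_range(start, end):
    """For 3-hourly means in [start, end): (covered bounds, mid-point time stamps)."""
    n = (end - start) // STEP
    first_mid = start + STEP / 2
    last_mid = start + (n - 1) * STEP + STEP / 2
    bounds = (first_mid - STEP / 2, last_mid + STEP / 2)
    return bounds, (first_mid, last_mid)


start, end = datetime(1850, 1, 1), datetime(2001, 1, 1)
inst = instantaneous_range(start, end)
bounds, mids = mean_range(start, end)
print(inst[1].isoformat())    # last instantaneous time stamp
print(bounds[1].isoformat())  # upper bound of the last 3-hourly mean
```

If that assumption holds, an end date of 21:00 on the last day is simply what instantaneous 3-hourly output looks like, and the arrows would reflect a difference in convention rather than missing data.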

oloapinivad commented 5 years ago

I managed to compile and run the latest commit of the QA-DKRZ code.

The test ran on the first 10 years of the historical AOGCM run, and now most of the errors are gone: see the attached file, logfile_090719_AOGCM.txt. The only one that perhaps still deserves a little attention is CF_33e, but overall I think we are at a very good point.

Re. the time windows, I see complaints in the .range file similar to those Uwe reported:

table_id: 3hr, number of variables: 22
pr_3hr_EC-Earth3_historical_r4i1p1f1_gr                     1850-01-01T00:00:00 - 1860-01-01T00:00:00
prc_3hr_EC-Earth3_historical_r4i1p1f1_gr                    1850-01-01T00:00:00 - 1860-01-01T00:00:00
prsn_3hr_EC-Earth3_historical_r4i1p1f1_gr                   1850-01-01T00:00:00 - 1860-01-01T00:00:00
ps_3hr_EC-Earth3_historical_r4i1p1f1_gr                     1850-01-01T00:00:00 - 1859-12-31T21:00:00 <--
rlds_3hr_EC-Earth3_historical_r4i1p1f1_gr                   1850-01-01T00:00:00 - 1860-01-01T00:00:00
mrsos_3hr_EC-Earth3_historical_r4i1p1f1_gr                  1850-01-01T00:00:00 - 1859-12-31T21:00:00 <--
clt_3hr_EC-Earth3_historical_r4i1p1f1_gr                    1850-01-01T00:00:00 - 1860-01-01T00:00:00
rldscs_3hr_EC-Earth3_historical_r4i1p1f1_gr                 1850-01-01T00:00:00 - 1860-01-01T00:00:00
hfls_3hr_EC-Earth3_historical_r4i1p1f1_gr                   1850-01-01T00:00:00 - 1860-01-01T00:00:00
hfss_3hr_EC-Earth3_historical_r4i1p1f1_gr                   1850-01-01T00:00:00 - 1860-01-01T00:00:00
huss_3hr_EC-Earth3_historical_r4i1p1f1_gr                   1850-01-01T00:00:00 - 1859-12-31T21:00:00 <--
mrro_3hr_EC-Earth3_historical_r4i1p1f1_gr                   1850-01-01T00:00:00 - 1860-01-01T00:00:00
rlus_3hr_EC-Earth3_historical_r4i1p1f1_gr                   1850-01-01T00:00:00 - 1860-01-01T00:00:00
rsds_3hr_EC-Earth3_historical_r4i1p1f1_gr                   1850-01-01T00:00:00 - 1860-01-01T00:00:00
rsdscs_3hr_EC-Earth3_historical_r4i1p1f1_gr                 1850-01-01T00:00:00 - 1860-01-01T00:00:00
rsus_3hr_EC-Earth3_historical_r4i1p1f1_gr                   1850-01-01T00:00:00 - 1860-01-01T00:00:00
rsuscs_3hr_EC-Earth3_historical_r4i1p1f1_gr                 1850-01-01T00:00:00 - 1860-01-01T00:00:00
tas_3hr_EC-Earth3_historical_r4i1p1f1_gr                    1850-01-01T00:00:00 - 1859-12-31T21:00:00 <--
tos_3hr_EC-Earth3_historical_r4i1p1f1_gn                --> 1850-01-01T03:00:00 - 1860-01-01T00:00:00
tslsi_3hr_EC-Earth3_historical_r4i1p1f1_gr                  1850-01-01T00:00:00 - 1859-12-31T21:00:00 <--
uas_3hr_EC-Earth3_historical_r4i1p1f1_gr                    1850-01-01T00:00:00 - 1859-12-31T21:00:00 <--
vas_3hr_EC-Earth3_historical_r4i1p1f1_gr                    1850-01-01T00:00:00 - 1859-12-31T21:00:00 <--

The ones with the arrow on the left are missing the first time step: for instance, tos comes from the NEMO data, which surprisingly does not have it. Should we fill the first time step with an empty value? In principle it should be the initial condition, so that value should be available somewhere.

The ones with the arrow on the right are the instantaneous values, so I am wondering how we are handling the LAST time step of the simulation: this is a subset of the data, but when the simulation is over at 2016-01-01T00:00:00 the question is whether we should add the midnight tas to the 2015 file or not...

Unfortunately I also have this one, which is weird:

sidivvel_SImon_EC-Earth3_historical_r4i1p1f1_gn         --> 1850-02-01T00:00:00 - 1860-01-01T00:00:00

I am opening a separate issue for this; there are a few things that do not add up here...

oloapinivad commented 5 years ago

An update for ScenarioMIP, AOGCM configuration. I ran the first 5 years of SSP5-8.5, cmorized the output and tested it with QA-DKRZ. I am using the varlist.json from EC-Earth3 revision r6970.

Luckily, there are not many new results (see here http://wilma.to.isac.cnr.it/ecearth/diag/CMIP6/c585/infocmor/EC-Earth-Consortium_EC-Earth3_ssp585_r4i1p1f1.json). However, there is a set of variables that turn out to have a completely empty data body: some of them have never been mentioned before (as far as I remember, fgsf6 and cfc11), and they should not be uploaded to the ESGF.

        {
            "DRS_6": [ "Omon" ],
            "DRS_7": [ "cfc11", "fgcfc12", "fgsf6", "ficeberg" ],
            "DRS_8": [ "gn" ],
            "annotation": "Variable <*> empty data body.",
            "example": "Variable <cfc11> empty data body.",
            "tag": "6_15",
            "severity": "Error"
        },

treerink commented 5 years ago

The variables fgcfc12 and ficeberg were omitted earlier from the json data request list by removing them from the task list in drq2varlist.py via an exception statement.

The variable fgsf6 will be added to that exception list, because the raw NEMO output of this field is still filled with invalid values in the *1m_*_opa_grid_T_2D.nc file.

For Oyr cfc11 (and the rest of the variables that are part of the problematic Oyr cfc11 group) the cmorisation crashes (#493), while Omon cfc11 only contains invalid values (which is already the case in the NEMO raw output file *1m_*_opa_grid_T_3D.nc). Therefore, adding the Oyr cfc11 and the rest of this Oyr cfc11 group to the exception list seems a practical solution that catches two problems at once. Note that when rerunning genecec with this solution, both cfc11 problems are indeed addressed, while the rest of the Oyr cfc11 group just drops out of the test-all tests (see #542), which is exactly what we need.
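
The idea of such an exception list can be sketched as follows. This is not the actual drq2varlist.py code: the function name, the `EXCLUDED` set and the `{table: [variables]}` layout of the variable list are all assumptions made for illustration.

```python
def drop_excluded(varlist, excluded):
    """Return a copy of varlist without the (table, variable) pairs in excluded."""
    cleaned = {}
    for table, variables in varlist.items():
        kept = [v for v in variables if (table, v) not in excluded]
        if kept:  # drop tables that end up empty
            cleaned[table] = kept
    return cleaned


# Hypothetical exception list with the Omon variables named above
EXCLUDED = {("Omon", "fgcfc12"), ("Omon", "ficeberg"), ("Omon", "fgsf6")}

# Hypothetical, heavily shortened variable list
varlist = {"Omon": ["tos", "fgcfc12", "fgsf6", "ficeberg"], "Oyr": ["cfc11"]}

cleaned = drop_excluded(varlist, EXCLUDED)
print(cleaned)
```

Keying the exceptions on (table, variable) pairs rather than on the variable name alone keeps a variable that is problematic in one table from being dropped in another.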

treerink commented 5 years ago

I think the issues (see https://github.com/EC-Earth/ece2cmor3/issues/504#issuecomment-512199333) which were not reported elsewhere are solved now.

Therefore I am closing this issue. Please reopen if there are still open issues which can be resolved, or maybe better open a new specific issue on that with reference to this issue.