Closed oloapinivad closed 5 years ago
Some comments per annotation:
_FillValue
and missing_value
should be allowed.comment:
should be left out. However, I don't consider this critical.lat
, lon
, and time
. However, since these variables are ocean variables, lat
and lon
are not used and since they are instantaneous, time
doesn't have bounds. Are these the only two instantaneous ocean fields that we produce? Then all the data is essentially correct, but the bnds
dimension really could be removed.ncdump -h
for two files with this problem and attach them here, best in a .txt
file?Thanks a lot for the complete analysis!!! I reply only where I have a bit of competence ;-)
- Seems to be a bug in qa-dkrz. CF Section 4.1 clearly states that degrees_north is the correct unit for latitude. @oloapinivad, could you file a bug report with qa-dkrz?
Ok, I will report it.
- I think this means that one time step is masked completely. Probably not good. Have a closer look at these variables.
- Everything masked. With important variables (psl, ps, tas, ta,...) as well. Not good. Check closely. Note the tables!
At least for 7 this is discussed in #501. It seems to come from the first time step missing. Probably fixing that issue fill fix also 6 since all the variables but sitemptop
comes from IFS.
- Interesting. @oloapinivad, could you produce the
ncdump -h
for two files with this problem and attach them here, best in a.txt
file?
I tried by myself without spotting anything. The txt is here attached! msftyz.txt
- No idea. Maybe a good idea to check shapes of all the files.
- See 11.
Checked the file shape without finding any strange behaviour. No clue.
- These are flux corrections and could be correct. I don't know if we correct the flux in any way.
Well, these are never used but in DCPP experiments. We don't have any flux correction. They are discussed in #471, I think they should - at least - not be uploaded on the ESGF.
Thanks a lot for the complete analysis!!!
:beers:
I reply only where I have a bit of competence ;-)
So, everywhere. Good.
- Seems to be a bug in qa-dkrz. CF Section 4.1 clearly states that degrees_north is the correct unit for latitude. @oloapinivad, could you file a bug report with qa-dkrz?
Ok, I will report it.
Great, thanks! Do you have an issue number?
At least for 7 this is discussed in #501. It seems to come from the first time step missing. Probably fixing that issue fill fix also 6 since all the variables but
sitemptop
comes from IFS.
Ok, I put the issue number in the comment above so we can track it.
- Interesting. @oloapinivad, could you produce the
ncdump -h
for two files with this problem and attach them here, best in a.txt
file?I tried by myself without spotting anything. The txt is here attached! msftyz.txt
Hm. I also don't find anything. Could you perhaps also share the pertinent check log? There we will have the individual annotations and not only the summary, allowing us, among other things, to identify the files where this originates.
- No idea. Maybe a good idea to check shapes of all the files.
- See 11. Checked the file shape without finding any strange behaviour. No clue.
Let's have a look at the check log also for this.
- These are flux corrections and could be correct. I don't know if we correct the flux in any way. Well, these are never used but in DCPP experiments. We don't have any flux correction. They are discussed in #471, I think they should - at least - not be uploaded on the ESGF.
Hm, I am not so sure about that. It seems a bit wasteful, true, but it is requested and it seems somebody in the EC-Earth consortium thought it should be provided, so why not? We just need to be aware that this is legitimately all zero.
So I had a crash of conda and reinstall everything from scratch. I re-run the QA-DKRZ and now the cmor error in 5. is gone (which was related to PrePARE when I looked at check-logs) but it seems to have "evolved" in a problem with udunits that is now everywhere in my logs (I have asked for clarification here https://github.com/IS-ENES-Data/QA-DKRZ/issues/14)
In https://github.com/IS-ENES-Data/QA-DKRZ/issues/14 you can find the issue with dimensions discussed in 2.
Attached you can find the check_logs from the most recent QA-DKRZ runs: data have been cmorized with version 1.2.0 and the fix of #501 included. The problems of 10. ,11. and 12. seems to be occurring to all files. EC-Earth-Consortium_EC-Earth3_historical_r4i1p1f1.log
the problem in 7. seems gone after the #501 fix, but we still have some report similar to 6. I visually inspected those files and they seem ok at first glance fine, could it be that QA-DKRZ is complaining about the present of missing values (these fields - at least the most of them - are masked over ocean)?
BTW, I think it would be better to track each warning/error of QA-DKRZ separately since it is hard to keep track of all of them here, considering the evolving reports of QA-DKRZ and versions of ece2cmor (I don't even mention yet the different family of errors I get from the Veg version).
Hm. The udunits error smell of an installation problem since the unit system is part of the normal udunits installation. Can you try something like the following in an interactive python session?
from cf_units import Unit
print(Unit('K'))
@oloapinivad, check the UDUNITS2_XML_PATH
path in your ~/.qa-dkrz/config.txt
, I had this misconfigured at some point and saw the same error. Note that the file can have multiple sections, particularly if you re-installed qa-dkrz, make sure the section that is actually used in your case has the correct values.
I think the issue about cell_method
s is also a qa-dkrz bug; I reported it as IS-ENES-Data/QA-DKRZ#15.
w.r.t.
- Solved by deactivating CF_732a.
This can be done by editing ~/.qa-dkrz/QA_TABLES/tables/projects/CF/CF_check-list.conf
. Either comment out:
# Attribute cell_methods should have key-word <comment:> omitted, if there is \
# no standardised information & CF_732a
or deactivate it like this:
Attribute cell_methods should have key-word <comment:> omitted, if there is \
no standardised information & CF_732a,D
I'll test if we can put all our ignore's in one separate config file.
@uwefladrich thanks a lot!!! I didn't have any UDUNITS2_XML_PATH
in the ~/.qa-dkrz/config.txt
but adding it now PrePARE is no longer failing, the udunits error is gone!
@oloapinivad, nice! That might influence some of the other errors, namely 3. Could you update us with a fresh round of logs once it is finished?
I was about to post it!
Actually it seems that both 2 and 3 are now gone! So the degrees_north was connected to a failure in udunits? Of course 5 and 7 are gone too.
BTW now PrePARE is working fine so we have the wap and zg failure reported in #462
I think it tried to parse the units string with udunits and then perform a check like ut_are_convertible. But because it didn't load the unit system this didn't work and the correct units couldn't be compared successfully with the found units.
One comment on the checksum errors (11 and 14): The problem might be related to the fact that some variables in different tables are reported on different sets of pressure levels with the coordinate variable always being called plev
. This is organized in the tables by using different dimensions
in the variable_entries
like plev19
, plev7h
, ... These can be found in CMIP6_coordinates.json
where they all have out_name
plev
. The problem for me is that I don't have files where this is occuring and that the error message is rather unclear (which is reported as IS-ENES-Data/QA-DKRZ#18).
@zklaus here a couple of ncdump
output from a ua
file from day
and Eday
table. The number of vertical levels is indeed different. The full files are quite heavy, but I can provide them if you need them.
< netcdf ua_Eday_EC-Earth3_historical_r4i1p1f1_gr_18510101-18511231 {
---
> netcdf ua_day_EC-Earth3_historical_r4i1p1f1_gr_18510101-18511231 {
4c4
< plev = 19 ;
---
> plev = 8 ;
44c44
< ua:history = "2019-06-26T09:19:14Z altered by CMOR: Reordered dimensions, original order: lat lon plev time." ;
---
> ua:history = "2019-06-26T09:26:28Z altered by CMOR: Reordered dimensions, original order: lat lon plev time." ;
56c56
< :creation_date = "2019-06-26T09:19:38Z" ;
---
> :creation_date = "2019-06-26T09:26:44Z" ;
66c66
< :history = "2019-06-26T09:19:38Z ; CMOR rewrote data to be consistent with CMIP6, CF-1.7 CMIP-6.2 and CF standards.;\n",
---
> :history = "2019-06-26T09:16:16Z ; CMOR rewrote data to be consistent with CMIP6, CF-1.7 CMIP-6.2 and CF standards.;\n",
97c97
< :table_id = "Eday" ;
---
> :table_id = "day" ;
99a100
> :tracking_id = "hdl:21.14100/f87be62c-da3a-4e21-a11c-d7d7dbd8d33c" ;
104d104
< :tracking_id = "hdl:21.14100/a1012c43-b60b-47f9-9b8c-9ac44880ca6e" ;
ua_Eday_EC-Earth3_historical_r4i1p1f1_gr_18510101-18511231.txt ua_day_EC-Earth3_historical_r4i1p1f1_gr_18510101-18511231.txt
@h-dh provided some clarification in IS-ENES-Data/QA-DKRZ#18. @oloapinivad, maybe you can read that and have a look at your results folder?
So following the discussion https://github.com/IS-ENES-Data/QA-DKRZ/issues/18 we can say that 11 and 14 are not problem in our data but rather a tuning of QA-DKRZ.
To avoid the warnings we could set the QA-DKRZ with something like PT_PATH_INDEX=2,3,6
so that it would be the same for all the experiments and ensemble members but it will distinguish on the tables. Do you agree?
Probably yes, but let's wait just a bit longer to see what precipitates over there.
- IS-ENES-Data/QA-DKRZ#17 As discussed in a different issue I think this is ok since these are instantaneous variables.
Number 9 hasn't shown up with a recent version of qa-dkrz
from github. Problem solved?
My suggestion for PT_PATH_INDEX
is as follows
#CMIP[1]/EC-Earth-Consortium[2]/EC-Earth3-Veg[3]/historical[4]/r1i1p1f1[5]/Omon[6]/volo[7]/gn[8]/v20190605[9]
PT_PATH_INDEX=2,3,6,7,8
Rationale:
Does that make sense? Do you agree?
I totally agree on 2,3,6. Do we have any experience of cases for which 7 and 8 are needed?
As long as I understand the distinction is made on frequencies and variables name anyhow, so probably 7 is redundant (we should have encountered the issue already I think since now we have 2,3,4 only). 8 I agree is still possibile, but I don't know if it is ever happening. However, we can keep it to stay on the safe side.
Wrt T_9, this had been addressed already in 2018. Not sure why the error showed up here at all. Maybe something with the detection of the instantaneous property went wrong?
@oloapinivad, if you can confirm that this error isn't occurring for you any longer, I think we can close the upstream issue.
Wrt T_9, this had been addressed already in 2018. Not sure why the error showed up here at all. Maybe something with the detection of the instantaneous property went wrong?
@oloapinivad, if you can confirm that this error isn't occurring for you any longer, I think we can close the upstream issue.
I have the conda version qa-dkrz-0.6.7-59,master-5fdfa30
and I have T_9
there. So probably it will be solved in a new release since in the Github is gone.
I have the conda version
qa-dkrz-0.6.7-59,master-5fdfa30
and I haveT_9
there. So probably it will be solved in a new release since in the Github is gone.
I can confirm that I had the error with the conda version (as mentioned above) and that it disappeared when switching to the github install.
Btw. I am putting together a short install-from-github howto, coming soon.
I totally agree on 2,3,6. Do we have any experience of cases for which 7 and 8 are needed?
As long as I understand the distinction is made on frequencies and variables name anyhow, so probably 7 is redundant (we should have encountered the issue already I think since now we have 2,3,4 only). 8 I agree is still possibile, but I don't know if it is ever happening. However, we can keep it to stay on the safe side.
Just to comment that we have variables if IFS and TM5 which are on a different grid, but not sure if the case 8 applies since they are also on different tables. But that might be one instance, where it matters.
I forked the repository and added tags where the conda file conda-recipes/qa-dkrz/meta.yaml
changes, which is the way that the tags that are present in upstream have been handled and which agrees with the hash as given in @oloapinivad's version output. You can find it at https://github.com/zklaus/QA-DKRZ
One thing to note is that there has been a change to the detection of instantaneous by means of "...Pt" frequencies, which is likely responsible for the absence of T_9 and could also impact the checksum stuff since perhaps the frequency detection didn't work properly before.
In light of this and seeing as this issue is getting really long I propose that @oloapinivad also moves to the github version (following @uwefladrich's howto) and then we close this issue and move forward with a fresh one.
To summarize what we have learned so far:
Everything else has either been resolved now or should resurface with the github version. I would also like to use the github version to determine the correct PT_PATH_INDEX
because the changed frequency detection could impact this.
Ok?
So I installed the github version using the Uwe's to do list (even if with some difficulties) and run a fresh QA-DKRZ with CHECK_MODE=TIME,DATA,CNSTY,CF,DRS,DRS_F,DRS_P
that produces a new report. Of course CV
is still missing.
Annotations: http://wilma.to.isac.cnr.it/ecearth/diag/CMIP6/chis/infocmor/EC-Earth-Consortium_EC-Earth3_historical_r4i1p1f1.json Logfile: http://wilma.to.isac.cnr.it/ecearth/diag/CMIP6/chis/infocmor/EC-Earth-Consortium_EC-Earth3_historical_r4i1p1f1.log
I have a considerably reduced set of errors (T_9 is gone), but a few new ones. :-) We can go for a new issue I think...
@zkaus, actually your version is the same as the one h-dh so I am wondering if you pushed your commit.
A note on QA-DKRZ speed: it took 50 minutes with 18 cores and 5 years (i.e. about 1350 files)
@zkaus, actually your version is the same as the one h-dh so I am wondering if you pushed your commit.
I didn't actually add any commits. I merely added tags to existing commits. So if you add my repo as a remote
git remote add zklaus https://github.com/zklaus/QA-DKRZ
and fetch everything, including tags
git fetch --all --tags
you should find the new tags
[a002160@c21236 QA-DKRZ]$ git tag
v0.6.5-31
v0.6.5-32
v0.6.5-34
v0.6.5-35
v0.6.5-36
v0.6.5-37
v0.6.7-30
v0.6.7-31
v0.6.7-49
v0.6.7-50
v0.6.7-51
v0.6.7-53
v0.6.7-54
v0.6.7-55
v0.6.7-56
v0.6.7-58
v0.6.7-59
v0.6.7-60
that will also show up as yellow labels in gitk
.
I now opened the new issue #504 to track further progress.
I have just run the QA-DKRZ on 5 years of simulation from the AOGCM historical. You can find the json report produced here: http://wilma.to.isac.cnr.it/ecearth/diag/CMIP6/chis/infocmor/EC-Earth-Consortium_EC-Earth3_historical_r4i1p1f1.json.old Almost all of them are warnings, and it is unclear to what extent they are limitation of the tool or caveats in our cmorised output.
The only error is reported involves the empty field and is discussed in #471. Also the missing time boundaries in #472 seems to me relevant.
On top of these two, we need to decide which from these warnings we would like to address with the current version, in the future or maybe never: many of them are complaints about the CF data format, and as we discussed with @zklaus, @klauswyser and @uwefladrich they may be not be so relevant. The
unused time dimension
and thechecksum of layout have changed
ones seems a bit more worrisome, but please let me know what you think about.Best, Paolo