Closed aearamos closed 5 years ago
@aearamos Indeed, this error occurs when updating to drq version 1.00.27; ece2cmor3 currently uses version 1.00.26 (so you were a bit too fast). I will try to update ece2cmor3 soon, though. The easy, and hopefully correct, fix is to replace in taskloader.py
priority_colname = "Priority"
with
priority_colname = "Default Priority"
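A slightly more defensive variant of this fix, sketched here rather than taken from ece2cmor3, would accept either header so that the loader works with spreadsheets from both drq versions. The names find_priority_index and PRIORITY_COLNAMES are hypothetical; only the two column labels and the row.index lookup come from this thread.

```python
# Hypothetical helper, not part of ece2cmor3: look up the priority column
# under either of the two header labels mentioned in this thread.
PRIORITY_COLNAMES = ("Priority", "Default Priority")


def find_priority_index(header_row):
    """Return the index of the first known priority label in header_row."""
    for name in PRIORITY_COLNAMES:
        if name in header_row:
            return header_row.index(name)
    raise ValueError(
        "none of %s found in header %s" % (PRIORITY_COLNAMES, header_row)
    )


# Illustrative header rows; the non-priority labels are made up for the example.
old_header = ["Priority", "Long name", "units"]
new_header = ["Default Priority", "Long name", "units"]
print(find_priority_index(old_header), find_priority_index(new_header))
```

Whether the two columns are interchangeable in meaning is a separate question, which is what the rest of this thread is about.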
I am only a bit puzzled by the drq 01.00.27 release notes:
improved spreadsheets provided on web site: the "cmvme" tables have a priority in column one: previously this was the CMOR variable default priority, which caused some confusion. Now changed to be the priority set by the requesting MIP for that experiment.
@treerink That's how I fixed my program and generated the files. Thanks!
About their issue, as I pointed out before, the same variable can have two different priorities depending on the MIP. e.g. I think salinity has a Default Priority of 1, but for DCPP its priority is 2. I think that's why it's different.
I also noticed that in generate-ec-earth-namelists.sh, drq2ppt uses the cmvmm_TOTAL file, while drq2file_def-nemo uses the cmvme_experiment file. I mention this because the first (cmvmm_TOTAL) has Default_Priority and the second (cmvme_experiment) has Priority in it. One fix is to use the same Excel file for both functions and set the column name in taskloader.py accordingly. I think the result will be the same, right?
Thanks
Hi, I posted a question at their open issue, as the strategy to identify all variables relies on this. So I hope it will become really clear. This issue is separate for some reason but related to this closed one.
The original issue here (changed name across drq versions) seems to be resolved. Shall we close this?
Actually, I kept it open because it is rather relevant to have an answer from the data request people.
I will, however, change the subject, as I myself also wonder every time why this one is not closed.
Hi Thomas,
I see. Since that ticket hasn't gotten any response since August last year, and indeed no response to your comment at all, let's see if we can update the question with the developments since then. Next I will then prod Martin about it again. Let's see if I understand your questions correctly. Summarizing your question over there:
The "aggregated spreadsheet" are the files labeled by cmvmm_, correct?
I don't know, but the cmvme files also seem to carry aggregated information.
The column labeled "Default Priority" is a kind of default priority of a certain variable for this MIP if I understand correctly. This priority however can differ within a MIP for different experiments I understand.
Ok, a couple of things to unpack. When talking about variables in the data request, we have to distinguish at least three different entities in the dreqML: MIP Variable [var], CMOR Variable [CMORvar], and Request variable [requestVar]. None of them directly have anything to do with the netcdf files, i.e. variable names; this information comes later from the cmip tables.
vars are very general. They really only fix cf standard name and units, and give some textual information. Despite the name (MIP Variable) they are not connected to any mip, except that they might give a textual clue about which mip came up with them originally, which is, in general, not an activity in the sense of CMIP6 controlled vocabulary, but can be something like CMIP5. They are also independent of tables.
CMORvars. They link to a var and add a lot of information. At this level we find the connection with the tables, the grids (both spatial and temporal), crucial processing instructions (like here), and the defaultPriority. Note that this is simply the priority that was deemed appropriate by the creator of the entry. At this level there is no connection to activities (vulgo mips) yet, and there is no aggregation of their priorities going on.
requestVars. These link the CMORvars with mips and requestVarGroups, which are collections of variables that the mip thinks are connected. On top of that, requestVars also give the priority that the mip thinks this variable should have in this requestVarGroup.
What does it all mean? Well, I think the concept of default priority is independent of mip. Variables have default priorities before they are assigned to mips. The good news is that default priority does not vary with mips. It simply is an indicator of how important the variable overall has been considered by someone. On the other hand we have the concept of priorities within requestVarGroups (and yes, the same variable in the same table can have different priorities in different groups within the same mip; check out (Lmon, baresoilFrac)).
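To make the three levels concrete, here is a minimal sketch in plain Python. It is not the dreqPy API: the class and field names are invented for illustration, and the mip name, group names, priority values, and the var attributes are placeholders rather than values read from the data request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """MIP Variable: only standard name, units and some text."""
    label: str
    standard_name: str
    units: str


@dataclass(frozen=True)
class CMORvar:
    """CMOR Variable: ties a Var to a table and carries the default priority."""
    var: Var
    table: str
    default_priority: int  # set by the creator of the entry, independent of any mip


@dataclass(frozen=True)
class RequestVar:
    """Request variable: ties a CMORvar to a mip and a requestVarGroup."""
    cmor_var: CMORvar
    mip: str
    group: str
    priority: int  # priority the mip assigns within this group


def priorities_by_group(request_vars):
    """Map (mip, group) to the priority requested there."""
    return {(rv.mip, rv.group): rv.priority for rv in request_vars}


# Placeholder values throughout; only the table and variable name are from the thread.
var = Var("baresoilFrac", "placeholder_standard_name", "placeholder_units")
cmor_var = CMORvar(var, table="Lmon", default_priority=1)
requests = [
    RequestVar(cmor_var, mip="SomeMIP", group="group-A", priority=1),
    RequestVar(cmor_var, mip="SomeMIP", group="group-B", priority=2),
]
print(priorities_by_group(requests))
```

The point of the sketch is that default_priority lives on the CMORvar and stays fixed, while the same CMORvar can appear in several RequestVar records with different priorities.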
Noting further that the cmvmm files talk of Default Priority with a comment of
Default priority (generally overridden by settings in "requestVar" record)
whereas the cmvme files talk about Priority with a comment of
Lowest priority value set in request for this variable for this experiment
seems to suggest that the cmvme files carry the more relevant aggregate priority.
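As a rough illustration of what such an aggregate could look like, the sketch below takes the numerically lowest priority value per (table, variable) over a set of requests. Reading "lowest priority value" as the numerical minimum is my assumption, the function name is hypothetical, and the priority values in the sample data are made up.

```python
def aggregate_priorities(requests):
    """Return the lowest priority value per (table, variable).

    requests is an iterable of (table, variable, priority) tuples,
    where 1 is the most important priority.
    """
    result = {}
    for table, name, priority in requests:
        key = (table, name)
        # Keep the smaller of the stored value and the new one.
        result[key] = min(priority, result.get(key, priority))
    return result


# Made-up priorities for one experiment; not taken from the data request.
sample = [
    ("Lmon", "baresoilFrac", 2),
    ("Lmon", "baresoilFrac", 1),
    ("Amon", "tas", 1),
]
print(aggregate_priorities(sample))
```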
I hope/expect then that the default priority gives the highest occurring priority for this certain variable which is encountered among the experiments within one MIP (where 1 is the highest priority and 3 is the lowest priority), is this correct?
No, see eg c4PftFrac. I can't say for sure if this is intentional, but it might be and is almost certainly going to occur for some variable. As said above: Default priority as a concept is not applicable within a mip.
If this is not the case, I at least hope that the variables ending up in this cmvmm_ files are selected on this criterion?
Yes, this seems to be the case.
I have often seen higher numbers (2 and 3) in the "Priority" column in the cmvmm_ files in data request up to 01.00.26 while I requested for priority 1.
Probably that changed at some point (at 01.00.27?) in the sense that the cmvmm files don't contain a Priority column anymore; only a Default Priority column.
At SMHI we use the cmvme_ae.c4.cd.cf.cm.co.da.dc.dy.fa.ge.gm.hi.is.ls.lu.om.pa.pm.rf.sc.si.vi.vo_historical_1_1.xlsx files at the moment, and all things considered, I don't see a reason to change that. However, I also don't see why the cmvmm files would be any worse. The requestVol files are certainly not useful for this purpose.
I hope this long dribble helps a bit in the clarification. Cheers Klaus
After a good night's sleep and studying the dreqPy documentation (5.8) again, it seems clear that the cmvmm files aggregate by mip, whereas the cmvme files aggregate by experiment. I think we want the latter, because we want to cmorize the output of one experiment when it is finished, rather than partially cmorize a number of experiments according to the mips involved.
For the same reason I suggest using the cmvme file that contains all the mips, i.e. the long filename mentioned above. Then the last question is about tier and priority.
Do we want to consider also lower tier and priority variables or only 1 and 1?
It seems we have moved on from this discussion. @treerink you think this could be closed now?
Closing this issue after adding it to the Cold case issues.
Hi everyone,
I just updated my tables and drq (1.00.27) and ran some tests to generate the ppt and xml files for different MIPS.
When running
./generate-ec-earth-namelists.sh CMIP piControl 1 1
I get the following error right after drq2ppt:
Traceback (most recent call last):
  File "./drq2ppt.py", line 172, in <module>
    main()
  File "./drq2ppt.py", line 162, in main
    taskloader.load_targets(args.vars, active_components={"ifs": True, "nemo": False})
  File "/home/Earth/aamaral/cmorize/ece2cmor3/ece2cmor3/taskloader.py", line 55, in load_targets
    targetlist = load_targets_excel(varlist)
  File "/home/Earth/aamaral/cmorize/ece2cmor3/ece2cmor3/taskloader.py", line 129, in load_targets_excel
    priority_index = row.index(priority_colname)
ValueError: 'Priority' is not in list
After some debugging, I noticed that when drq2ppt is called as
./drq2ppt.py --vars cmip6-data-request/cmip6-data-request-m=CMIP-e=piControl-t=1-p=1/cmvmm_CMIP_TOTAL_1_1.xlsx
one of the column labels in that table is "Default Priority" and not "Priority", which caused the error. I changed the labels in each tab by hand and it generated the ppt and xml files just fine.
How can this issue be fixed? I only noticed it now and didn't have this problem before.