WCRP-CMIP / CMIP6_CVs

Controlled Vocabularies (CVs) for use in CMIP6
Creative Commons Attribution 4.0 International

experiment_id: es-doc, data request, CV "description" inconsistencies #339

Closed taylor13 closed 6 years ago

taylor13 commented 7 years ago

@martinjuckes - I would like to consider Martin's email (copied below) as an issue.

Hello Karl, Charlotte,

I'm trying to standardise the long names (titles) and descriptions of experiments in the data request. Once the experiment labels are synchronised, we have the following additional bits of information for each experiment:

data request: title, description
es-doc: long_name, description
CMIP6_CVs: experiment, description

For example, for piClim-ghg we have:

In es-doc: long_name: "effective radiative forcing by present day greenhouse gases", description: "An uncoupled (atmosphere and land) experiment with interactive vegetation in which sea suface temperature (SST) and sea ice concentrations (SIC) are fixed at model-specific pre-industrial control climatology. Greenhouse gases set to present day (2014) values, other forcing agents are specified at pre-industrial values. Run for 30 years.",

In CMIP6_CVs: "description":"As in RFMIP-ERF-PI-Cntrl but with present-day greenhouse gases", "experiment":"effective radiative forcing by present-day greenhouse gases".

I would like to use the ES-DOC "description" string for the data request description, as this appears to be generally complete and clearly stated. For the title, I'll take the CMIP6_CV "experiment" string, which appears to match the ES-DOC long_name. For consistency with variable long names I'd prefer to use "title" capitalisation (e.g. "Effective Radiative Forcing by Present-day Greenhouse Gases") -- but is there a reason why lower case has been used in the CVs and ES-DOC?

regards, Martin

taylor13 commented 7 years ago

@martinjuckes

Hi Martin and Charlotte,

I'm glad the three of us are converging on a consistent set of metadata for the "experiments".

I think we should minimize the dependency of the data request on es-docs. Both es-docs and the data request should obtain all available information from the CVs directly. Of course the data request also has to get input from the MIPs as to which variables they want (and all the metadata associated with the variables), but otherwise it should always be checked for consistency with the CVs (including, for example, table names, frequency options, realms, experiment and sub-experiment info, and activity_ids). Rather than allow inconsistencies, a request should be made to modify the CV. Similarly, I don't think es-doc should depend on the data request, but please correct me if I am wrong about this.
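A minimal sketch of the kind of consistency check described above, assuming the data request can be reduced to a list of records and each CV to a set of allowed values. The function name `find_cv_violations` and the sample values are illustrative only; they are not taken from the data request or from the CMIP6_CVs JSON files.

```python
from typing import Dict, List, Set, Tuple


def find_cv_violations(
    records: List[Dict[str, str]],
    cv: Dict[str, Set[str]],
) -> List[Tuple[int, str, str]]:
    """Return (record index, field, value) for every value missing from the CV."""
    violations = []
    for index, record in enumerate(records):
        for field, allowed in cv.items():
            value = record.get(field)
            # A field that is absent from the record is not checked here.
            if value is not None and value not in allowed:
                violations.append((index, field, value))
    return violations


# Illustrative subset only (assumption); the real CVs hold many more entries.
sample_cv = {
    "activity_id": {"CMIP", "RFMIP"},
    "frequency": {"mon", "day"},
    "realm": {"atmos", "ocean"},
}

# Hypothetical data request records; the second one uses a non-CV frequency.
sample_records = [
    {"activity_id": "RFMIP", "frequency": "mon", "realm": "atmos"},
    {"activity_id": "RFMIP", "frequency": "monthly", "realm": "atmos"},
]

violations = find_cv_violations(sample_records, sample_cv)
print(violations)
```

Each reported violation would then become either a fix in the data request or a request to modify the CV, rather than a tolerated inconsistency.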

The experiment_id and experiment found in the CMIP6_CVs are both consistent with the global attributes that will be found in the files, so I think that keeping es-doc and the data request consistent with them is essential. It would be even nicer if they were referred to in the same way by everyone instead of "title", "long_name" and "experiment" being aliases, but I guess we can live with that.

experiment_id and experiment have both undergone considerable scrutiny and the MIPs signed off on what is in the CMIP6_CVs, so let's not alter them. They were used in several of the GMD experiment description papers, so I would prefer not to even modify the "capitalization" scheme (which for us would be considerable work). Charlotte, are your values of experiment_id and experiment (i.e., long_name) identical to the CMIP6_CVs now? I note that Martin's example shows a difference of a "-" appearing in the CV version in "present-day". Was that a typo Martin?

As for "description", the values in the CV have undergone less scrutiny, but the MIPs did approve them. If Charlotte has circulated and gotten sign-off from the MIPs for more complete and clearer descriptions in es-doc, then I think we should consider replacing what's in the CV with what's in es-docs. Then we'll all be consistent. This will be quite a lot of work for PCMDI, so let's not do it unless the descriptions have been cleared by the MIPs.

thanks, Karl

martinjuckes commented 7 years ago

Hello Karl,

I'm happy to take the "experiment" string as-is from the CV to avoid extra work.

I want to check that start years and durations are consistent with ES-DOC, as I know that there has been some scrutiny there: cross-checking can help to highlight problems with some of these technical details.

The difference ("present-day" vs "present day") is not a typo ... that is what is in the version of ES-DOC experiment descriptions that I have.
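As a rough illustration of how such differences could be triaged, the sketch below classifies a pair of strings as identical, cosmetically different (case, hyphens, whitespace) or genuinely different. The normalisation rules and the function names are assumptions made for this example, not part of any ES-DOC or CMIP6_CVs tooling; the two strings are the piClim-ghg values quoted earlier in this thread.

```python
import re


def normalise(text: str) -> str:
    """Lower-case, turn hyphens into spaces and collapse runs of whitespace."""
    text = text.lower().replace("-", " ")
    return re.sub(r"\s+", " ", text).strip()


def classify_difference(a: str, b: str) -> str:
    """Return 'identical', 'cosmetic' or 'different' for a pair of strings."""
    if a == b:
        return "identical"
    if normalise(a) == normalise(b):
        return "cosmetic"
    return "different"


esdoc_long_name = "effective radiative forcing by present day greenhouse gases"
cv_experiment = "effective radiative forcing by present-day greenhouse gases"

result = classify_difference(esdoc_long_name, cv_experiment)
print(result)
```

A "cosmetic" result flags pairs that only need one side aligned with the CV, while "different" flags pairs that need a closer look.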

@charliepascoe replied to me by email "Hi Martin,

The ES-Doc long names should match the CMIP6 CV experiment titles. So it makes sense to use the CMIP6-CVs as your source in the experiment title in the data request. I don't have a view re capitalisation or otherwise.

I believe the intention is to use the ES-Doc experiment descriptions across the board so it would make sense for you to use the ES-Doc text in the data request. My aim when writing the experiment descriptions was that they should be understood as stand-alone entities. In the case of piClim-ghg the reference experiment RFMIP-ERF-PI-Ctrl (which has since been renamed: piclim-control) is listed as one of the related experiments."

I support the idea of using the ES-DOC descriptions: the description strings in the CV do not add significantly to the information in the long name/experiment attribute. In both cases there is a problem with the inclusion of old experiment names in the descriptive text. I've started looking at a spell checker which may help us to catch these.
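One possible way to catch old experiment names, sketched here under the assumption that a list of retired names is available from somewhere (no such list is defined in this thread). The function name `find_retired_names` is hypothetical. The description string is the CV text for piClim-ghg quoted above, and the retired name uses the spelling that appears in that text.

```python
import re
from typing import Dict, Iterable, List


def find_retired_names(
    descriptions: Dict[str, str],
    retired: Iterable[str],
) -> Dict[str, List[str]]:
    """Map each experiment_id to the retired names found in its description."""
    retired_names = list(retired)  # allow any iterable, including generators
    hits: Dict[str, List[str]] = {}
    for experiment_id, text in descriptions.items():
        found = []
        for name in retired_names:
            # Match the name only as a whole token, so that a name which is a
            # prefix of a longer hyphenated name is not reported.
            pattern = r"(?<![\w-])" + re.escape(name) + r"(?![\w-])"
            if re.search(pattern, text):
                found.append(name)
        if found:
            hits[experiment_id] = found
    return hits


descriptions = {
    "piClim-ghg": "As in RFMIP-ERF-PI-Cntrl but with present-day greenhouse gases",
}
retired = ["RFMIP-ERF-PI-Cntrl"]  # assumed input, spelled as in the CV text

hits = find_retired_names(descriptions, retired)
print(hits)
```

This only finds names that are already known to be retired; misspelt variants of a name would still need a spell checker or manual review.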

taylor13 commented 7 years ago

I generally agree that the CV "description" usually adds little to the CV "experiment". If Charlotte is working toward better "descriptions", we should either 1) replace the CV descriptions with the better ones, or 2) eliminate the "descriptions" from the CV. @charliepascoe, what do you recommend?

durack1 commented 7 years ago

@taylor13 @martinjuckes @charliepascoe I'd advocate that we update the CMIP6_CVs description, as fragmenting information across multiple sources/repos, rather than centralizing it in one place, isn't a good idea in my opinion. The CMIP6_CVs repo should be the definitive source of all the information that it is capturing; the other repos use these and augment where necessary. The schematic in the first slide in https://goo.gl/TdzjRC linked in #28 shows the current dependencies, at least how it makes sense to me.

taylor13 commented 7 years ago

Clearly not all documentation of the experiment protocol and models will be found in the CV and I'm not sure why it is needed in the data request, so another option would be to put the "description" in es-doc only along with other documentation, but I'm not sure this would be best.

charliepascoe commented 7 years ago

The experiment descriptions give a useful overview for each experiment. I'd advocate for harvesting the es-doc experiment descriptions into the CMIP6-CV.

taylor13 commented 7 years ago

@charliepascoe thanks for weighing in on this. After further reflection and discussion with @durack1, I agree. Paul can tell you how to pass us your descriptions. I'll review these and compare with our descriptions. We may have to iterate with a few of the MIPs, but then we should be able to replace the CV's current descriptions with yours, and we can designate the CMIP6-CV (experiment_id) as the reference for "description". @momipsl can then do his magic and make sure es-docs remains consistent with the CV (as it inevitably evolves). And @martinjuckes should also obtain the "description" from the CV because that would avoid any delay in updating following any change to the CV (not having to wait for it to propagate to es-docs).

If after that you find changes are needed to "description", then you (or whoever is requesting the edits) could submit an "issue" to the CV github site. Once we make the change to the CV, it will then propagate to es-docs and the data request.

Please let me know if you or anyone else would prefer to proceed differently.

thanks, Karl

durack1 commented 7 years ago

@charliepascoe @momipsl can you point us to the json files where these descriptions can be found?

I also wonder what the status of the "sign off" by the MIP chairs is on these, as we want to make sure that we are propagating validated info. @taylor13 went through the painstaking process of getting sign off in the first iteration, and in some instances with MIP chair guidance we have updated entries to correct problems that existed in the GMD papers themselves.

durack1 commented 7 years ago

@martinjuckes copied from duplicate #328

The experiment 1pctCO2-rad currently has the definition "Radiatively-coupled specified concentration simulation in which CO2 increases at a rate of 1% per year until quadrupling", which is slightly misleading because the 1pctCO2 experiment also has radiatively coupled CO2. The distinctive feature in 1pctCO2-rad is that there are two CO2 concentrations in play, one involved in the carbon cycle and one linked to the radiation code. 1pctCO2-rad and 1pctCO2-bgc form a pair: in the first the radiatively coupled CO2 is increasing, while the carbon cycle CO2 remains at control levels; in the second this is reversed. After discussion with Chris Jones, I suggest the following descriptions (designed to make sense when read in isolation):
1pctCO2-rad: "1 percent per year increasing CO2 coupled to the radiation code with control CO2 concentrations applied in the carbon cycle",
1pctCO2-bgc: "1 percent per year increasing CO2 coupled to the carbon-cycle code with control CO2 concentrations applied in the radiation".

durack1 commented 7 years ago

@charliepascoe are those descriptions finalized yet?

durack1 commented 6 years ago

From: Charlotte Pascoe
Date: Tuesday, January 23, 2018 at 2:48 AM
To: "Durack, Paul J."
Cc: "Taylor, Karl E."
Subject: Re: CMIP6_CVs - source_id cleanup

Hi Paul,

It'll be here:
https://github.com/ES-DOC/esdoc-docs/tree/master/cmip6/experiments/spreadsheet

Note that Mark says he'll actually be processing the current version of the experiments spreadsheet this week and expects to put the latest version in the github repo on Thursday 25th.

C

durack1 commented 6 years ago

@charliepascoe @momipsl @davidhassell there are now two more MIPs that have been endorsed by the CMIP panel, CDRMIP and PAMIP, and these are now included in CMIP6_activity_id.json.

As the new experiments will need to be registered and propagated across to ES-DOC, I'll close this issue as a duplicate of #455, and this description synchronization can be revisited once all the registered content is in place in this repo.

Please feel free to comment on the open issue #455 to keep the ES-DOC and CMIP6_CVs synchronization discussion alive.