WCRP-CMIP / CMIP6Plus_CVs

Controlled Vocabularies (CVs) for use in CMIP6Plus
Creative Commons Attribution 4.0 International

Removal of `product` attribute from CVs #45

Closed matthew-mizielinski closed 6 months ago

matthew-mizielinski commented 6 months ago

The product field is not used in the CVs and could be removed.

This issue is to provide a branch of the CVs for (a) testing changes in CMOR and (b) an eventual update to these CVs.

durack1 commented 6 months ago

Just noting where things have been defined across other projects:

input4MIPs (in table header, updated to input4MIPs): https://github.com/PCMDI/input4MIPs-cmor-tables/blob/master/Tables/input4MIPs_Omon.json#L11

obs4MIPs (in table header, unchanged from model-output): https://github.com/PCMDI/obs4MIPs-cmor-tables/blob/master/Tables/obs4MIPs_Omon.json#L11

For input4MIPs, the product is defined in user_input.json, which overwrites the table header entry. So we need to ensure that CMOR 3.8.x can deal with product being removed from the table header, and it needs to be provided by either project_CVs or user_input.json.

ping @matthew-mizielinski @wolfiex @taylor13 @mauzey1

mauzey1 commented 6 months ago

> So we need to ensure that CMOR 3.8.x can deal with product being removed from the table header, and it needs to be provided by either project_CVs or user_input.json

Will this mean that the product attribute will be removed from mip-cmor-tables' headers?

durack1 commented 6 months ago

@mauzey1 yes, I believe so. Apologies for the whiplash on this product discussion; I did not spend the time to think this through, and @taylor13 has noted that it's used to define what the data is, in addition to the project itself.

So the path forward will be to remove this from the table headers, so that table information can be used across projects (without the product being defined). We'll then depend on either the CVs (for a dataset with a single, consistent product) or user_input.json (for a dataset whose product varies) to define what this is.
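A minimal sketch of that lookup, for illustration only. It is not CMOR's implementation. The function name `resolve_product`, the flat `"product"` key in both mappings, and the precedence of user_input.json over the CVs are all assumptions made here; the thread only says that user_input.json overwrites the table header entry.

```python
from typing import Any, Mapping


def resolve_product(user_input: Mapping[str, Any], project_cvs: Mapping[str, Any]) -> str:
    """Return the product value without consulting any table header.

    Assumed precedence: user_input.json first (datasets whose product varies),
    then the project CVs (datasets with a single, consistent product).
    """
    # Both mappings are hypothetical flat dicts with an optional "product" key;
    # the real CMOR and CV file layouts may differ.
    for source in (user_input, project_cvs):
        value = source.get("product")
        if isinstance(value, str) and value:
            return value
    raise KeyError("product must be set in user_input.json or the project CVs")
```

With these assumptions, `resolve_product({"product": "observations"}, {"product": "model-output"})` returns `"observations"`, and an empty user input falls back to the CV value.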

We'll likely need a product CV defined in mip-cmor-tables to capture all valid entries: model-output, observations, reanalysis, derived, ...
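As a rough illustration of what checking against such a CV could look like, here is a sketch. The names `KNOWN_PRODUCTS` and `check_product` are hypothetical, and the set holds only the four entries named above; the full list is still to be agreed.

```python
# Only the entries named in this thread; the eventual CV may contain more.
KNOWN_PRODUCTS = frozenset({"model-output", "observations", "reanalysis", "derived"})


def check_product(product: str, allowed: frozenset = KNOWN_PRODUCTS) -> str:
    """Return product unchanged if it is in the allowed set, else raise ValueError."""
    if product not in allowed:
        raise ValueError(
            f"product {product!r} is not in the CV; allowed: {sorted(allowed)}"
        )
    return product
```

Passing `allowed` explicitly would let each project restrict the set to its own values.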

durack1 commented 6 months ago

Just trying to reconnect disparate threads, also see discussion at https://github.com/PCMDI/cmor/issues/723#issuecomment-1912797439 - reproducing below

product is meant to distinguish model_output from, for example, observations and forcing_dataset, so I think it should be included as a global attribute in the file and should be harvested by ESGF into its catalog. What would be the rationale for removing it?

Thanks for chiming in @taylor13. You make a very good point about having this to delineate between projects (or is it mip_era, or activity_id?). We do need to figure out how to manage all this. If we had CVs that captured these options in the mip-cmor-tables (highest-level "multi-verse") repo, it would make more sense to me: so model-output, observations, forcing, which was never defined in the CMIP6_CVs; in input4MIPs and obs4MIPs we have derived, observations, reanalysis.

@taylor13

Yes, projects/activity_id/mip_era have sometimes been used to indicate something about "project", and under each project we can have multiple output types. (In CMIP5 we distinguished between output1 and output2, but folks found that confusing.) The terms should be clearly defined in CVs, but the allowed values would be project-specific.