Closed larsbuntemeyer closed 7 months ago
These are global attributes from CORDEX-CMIP5:
"Conventions",
"contact",
"creation_date",
"experiment",
"experiment_id",
"driving_experiment",
"driving_model_id",
"driving_model_ensemble_member",
"driving_experiment_name",
"frequency",
"institution",
"institute_id",
"model_id",
"rcm_version_id",
"project_id",
"CORDEX_domain",
"product",
"references",
"tracking_id"
There are some obvious attributes required by CORDEX, e.g., `CORDEX_domain`, `driving_*`, etc., but there are lots of open questions here, since the current CORDEX archive specifications are outdated (based on CMIP5 vocabulary), e.g.:
- `model_id` to `source_id`?
- `institution_id` instead of `institute_id`?
- `driving_source_variant_label` instead of `driving_model_ensemble_member`?
- `driving_source_id` instead of `driving_model_id`?
- `driving_experiment_id` instead of `driving_experiment_name`?
- `activity_id` instead of `project_id`?

CORDEX-CMIP6 attribute | corresponding CORDEX-CMIP5 attribute | alternative CORDEX-CMIP6 attribute (using parent vocabulary) |
---|---|---|
source_id | model_id | |
institution_id | institute_id | |
driving_source_variant_label | driving_model_ensemble_member | parent_model_variant_label |
driving_source_id | driving_model_id | parent_source_id |
driving_experiment_id | driving_experiment_name | parent_experiment_id |
activity_id | project_id | |
See also here: https://docs.google.com/document/d/1h0r8RZr_f3-8egBMMh7aqLwy3snpD6_MrDz1q8n5XUk/edit
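The renames proposed in the table above could be applied to an existing set of CORDEX-CMIP5 attributes roughly as follows. This is only a sketch: the mapping mirrors the first two columns of the table, which are still under discussion, and the helper name `rename_attrs` and the example values are made up for illustration.

```python
# Proposed CMIP5 -> CMIP6 attribute renames, copied from the table above.
# These are open proposals, not a settled specification.
CMIP5_TO_CMIP6 = {
    "model_id": "source_id",
    "institute_id": "institution_id",
    "driving_model_ensemble_member": "driving_source_variant_label",
    "driving_model_id": "driving_source_id",
    "driving_experiment_name": "driving_experiment_id",
    "project_id": "activity_id",
}


def rename_attrs(attrs):
    """Return a copy of attrs with CMIP5-style names replaced by the proposed ones.

    Attributes without a proposed rename are passed through unchanged.
    """
    return {CMIP5_TO_CMIP6.get(name, name): value for name, value in attrs.items()}


# Hypothetical input; the values are placeholders, not registered identifiers.
old = {"model_id": "MY-RCM", "institute_id": "MY-INST", "frequency": "day"}
print(rename_attrs(old))
```

If the input already contains both an old and a new name (say `project_id` and `activity_id`), one value would silently overwrite the other, so a real migration would need to handle that case explicitly.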
There seems to be a problem in trying to avoid the `parent_experiment_id` attribute, which seems to be required by the `experiment_id` attribute in the CV. In general, if I look at the cmor source code, it seems to handle some attributes in a special way; I should write an extensive issue here...
UPDATE: This issue is solved here: https://github.com/PCMDI/cmor/issues/677
Update (30.01.2023)
Following our discussion, the preliminary set of global attributes:
- `Conventions` () - Climate and Forecast (CF) convention version (always 1.10 ???)
- `activity_id` (CV) - activity identifier (RCM, ESD and FPS ???)
- `contact` (free text) - contact information of the institution that is responsible for CORDEX simulations (avoid personal contact information)
- `creation_date` (build rules) - date when file was created in format YYYY-MM-DDTHH:MM:SSZ (e.g., “2023-01-15T14:30:23Z”)
- `domain` (CV) - name of the CORDEX region (link)
- `domain_id` (CV) - an identifier assigned to each CORDEX region including a flag for resolution (link)
- `driving_experiment` (CV) - short description of the CMIP6 experiments (link)
- `driving_experiment_id` (CV) - root identifier of the CMIP6 experiments (link)
- `driving_institution_id` (CV) - an identifier of the institution that is responsible for the driving CMIP6 simulation (link)
- `driving_source_id` (CV) - CMIP6 model identifier
- `driving_variant_label` - “variant” label of the driving CMIP6 simulation (e.g. “r1i1p1f1” etc.)
- `frequency` (CV) - sampling frequency (day, mon, 6hr, 3hr, 1hr)
- `institution` (CV) - full name of institution that is responsible for CORDEX simulations
- `institution_id` (CV) - an identifier of institution that is responsible for CORDEX simulations
- `mip_era` (CV) - determine what cycle of CMIP defines experiment and data specifications (always ‘CMIP6’)
- `product` (CV) - product type (‘model-output’ and ‘esd-output’ ???)
- `project_id` (CV) - project identifier (always CORDEX)
- `realm`
- `source` (some build rules) - full model name/version and components (aerosol, atmos, land etc.)
- `source_id` (CV) - model identifier (link)
- `source_type` (CV) - model configuration (RCM, ARCM ???)
- `source_configuration_id` ???
- `tracking_id` (structured form with some CV) - unique file identifier (note 15 in CMIP6 DRS)
- `variable_id` (CV) - variable identifier (link to the CMOR tables)
- `comment` (not mandatory)
- `history` (not mandatory)
- `title` (not mandatory)
- `references` (not mandatory)
- `attribution` (CV) - fixed attribute that attributes driving data ??.

`variable_id`: (CV) is the short name of the variable. The name is taken from the CORDEX-CMIP6 Variable List or CMOR tables (… link …).

`domain_id`: (CV) is the name assigned to each of the CORDEX regions and includes a flag for resolution as listed in (… link …).

`driving_source_id`: (CV) is an identifier of the driving data. The name consists of a model identifier. For reanalysis-driven runs this is the name of the reanalysis data (ERA5). For runs driven by CMIP6 model data this is the associated CMIP6 source_id, which can be found in the CMIP6 source id CV.

`driving_experiment_id`: (CV) is either “evaluation” for the ERA5-driven experiment, the value of the CMIP6 experiment_id from the ScenarioMIP activity, or “historical” for the historical experiment from CMIP. The values for experiment_id can be found in the CMIP6 experiment id CV.

`driving_variant_label`: (CV) identifies the ensemble member of the CMIP6 experiment that produced the forcing data. It has to have the same value as the CMIP6 variant_label. For the evaluation experiment it has to be “r1i1p1f1”.

`institution_id`: (CV) is an identifier for the institution that is responsible for generating and providing CORDEX simulations. All CORDEX institutions must be registered to publish their simulations on ESGF. Instructions on how to register an institution and the current state of the CV can be found at (… links …).

`source_id`: (CV to register) is an identifier (acronym) of the CORDEX RCM. All CORDEX RCMs have to be registered to publish their simulations on ESGF. Instructions on how to register an RCM and the current state of the CV can be found at (… links …).

`source_configuration_id`: (free string) identifies simulations with different combinations of parameterization schemes or changes in parameters for existing schemes. This DRS element can also be used to identify technical reruns related, for example, to configuration errors. Major upgrades and improvements should be reflected in the RCMModelName.

`frequency`: (CV) is the output frequency indicator: 1hr = 1-hourly, 3hr = 3-hourly, 6hr = 6-hourly, day = daily, mon = monthly, and fx = invariant fields.
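
Some of the rules stated above (the `creation_date` format, the `frequency` values, the shape of `driving_variant_label`, and the fixed “r1i1p1f1” for evaluation runs) are simple enough to check mechanically. The following is a minimal sketch under assumptions: the function names are made up, the frequency set is just the list from this thread rather than a published CV, and nothing here reflects how cmor itself validates attributes.

```python
import re
from datetime import datetime

# Frequency values as listed in this thread; the final CV may differ.
FREQUENCIES = {"1hr", "3hr", "6hr", "day", "mon", "fx"}

# CMIP6-style variant label, e.g. "r1i1p1f1".
VARIANT_LABEL = re.compile(r"r\d+i\d+p\d+f\d+")

# Strict shape of YYYY-MM-DDTHH:MM:SSZ (strptime alone accepts unpadded fields).
CREATION_DATE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z")


def is_valid_creation_date(value):
    """True if value has the exact shape and is a real calendar date/time."""
    if not CREATION_DATE.fullmatch(value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        return False
    return True


def check_attrs(attrs):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not is_valid_creation_date(attrs.get("creation_date", "")):
        problems.append("creation_date is not in YYYY-MM-DDTHH:MM:SSZ format")
    if attrs.get("frequency") not in FREQUENCIES:
        problems.append("frequency is not one of " + ", ".join(sorted(FREQUENCIES)))
    variant = attrs.get("driving_variant_label", "")
    if not VARIANT_LABEL.fullmatch(variant):
        problems.append("driving_variant_label does not look like r<N>i<N>p<N>f<N>")
    if attrs.get("driving_experiment_id") == "evaluation" and variant != "r1i1p1f1":
        problems.append("evaluation runs must use driving_variant_label r1i1p1f1")
    return problems
```

Checks against the registered CVs (`institution_id`, `source_id`, `domain_id`) are left out on purpose, since those lists are only linked, not given, in this thread.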
Additional dependent attributes that give more detailed meta info will be derived and can be filled automatically.
Here is the current table:
https://docs.google.com/spreadsheets/d/1xlbakqx3btSzT5Ke_q4GiQuK3Js80qNbQjSP4wUsAlY/edit?usp=sharing
I added `grid` to the required global attributes, so it is up to date with our published specs:
{
"required_global_attributes": [
"activity_id",
"contact",
"Conventions",
"creation_date",
"domain",
"domain_id",
"driving_experiment",
"driving_experiment_id",
"driving_institution_id",
"driving_source_id",
"driving_variant_label",
"frequency",
"grid",
"institution",
"institution_id",
"license",
"mip_era",
"product",
"project_id",
"source",
"source_id",
"source_type",
"tracking_id",
"variable_id",
"version_realization"
]
}
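A list like the one above can be used directly to report which required global attributes a file is missing. The sketch below embeds the same JSON and compares it against a plain dict of attributes; the function name `missing_global_attributes` is made up for illustration, and how the dict is obtained (for example from `xarray.Dataset.attrs`) is left to the caller.

```python
import json

# The required_global_attributes list from this thread, embedded as JSON.
SPEC = json.loads("""
{
  "required_global_attributes": [
    "activity_id", "contact", "Conventions", "creation_date",
    "domain", "domain_id", "driving_experiment", "driving_experiment_id",
    "driving_institution_id", "driving_source_id", "driving_variant_label",
    "frequency", "grid", "institution", "institution_id", "license",
    "mip_era", "product", "project_id", "source", "source_id",
    "source_type", "tracking_id", "variable_id", "version_realization"
  ]
}
""")


def missing_global_attributes(attrs, spec=SPEC):
    """Return the required attribute names absent from attrs, in spec order."""
    return [name for name in spec["required_global_attributes"] if name not in attrs]


# Hypothetical, incomplete set of attributes for demonstration.
example = {"activity_id": "RCM", "project_id": "CORDEX", "frequency": "day"}
print(missing_global_attributes(example))
```

This only checks presence, not whether the values are valid CV entries.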
It's not clear to me what global attributes should be required for CORDEX-CMIP6, e.g., for CMIP6, we have