NorESMhub / noresm2cmor

A command line tool for cmorizing NorESM output
http://noresmhub.github.io/noresm2cmor/
automatic selection of source_type #25

Closed IngoBethke closed 4 years ago

IngoBethke commented 4 years ago

source_type is currently defined in mod*.nml and set to the default value AOGCM. However, for atmosphere-only and ocean-only simulations, for example, the value should be set to AGCM and OGCM, respectively.

It could be an idea to use the experiment specific required_model_components information (the first entry if multiple entries exist) from CMIP6_CV.json to automatically set source_type in order to avoid QC errors.

Furthermore, the value could be overridden as a function of "modeling_realm", which is variable-specific (e.g. modeling_realm=ocnBgchem -> source_type:=BGC), provided that the corresponding source type exists in either "required_model_components" or "additional_allowed_model_components".
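The first suggestion above (deriving source_type from CMIP6_CV.json) could be sketched roughly as follows. This is only an illustration, not code from noresm2cmor: the function names, the "demo-exp" entry and its values are hypothetical, and the nesting CV -> experiment_id -> experiment name is an assumption about the layout of CMIP6_CV.json. The sketch joins all required components with spaces rather than taking only the first entry.

```python
import json


def load_cv(path):
    """Read a CMIP6_CV.json file into a dict (path supplied by the caller)."""
    with open(path) as f:
        return json.load(f)


def source_type_from_cv(cv, experiment_id, extra=()):
    """Build a space-separated source_type string for one experiment.

    Assumes the layout cv["CV"]["experiment_id"][experiment_id]; this is
    an assumption about CMIP6_CV.json, not something stated in the thread.
    """
    exp = cv["CV"]["experiment_id"][experiment_id]
    required = list(exp["required_model_components"])
    allowed = set(exp.get("additional_allowed_model_components", []))
    # Append only those extras that the CV permits for this experiment.
    chosen = required + [c for c in extra if c in allowed and c not in required]
    return " ".join(chosen)


# Hypothetical CV excerpt, for illustration only (not copied from the real file).
sample_cv = {
    "CV": {
        "experiment_id": {
            "demo-exp": {
                "required_model_components": ["AOGCM", "AER"],
                "additional_allowed_model_components": ["CHEM", "BGC"],
            }
        }
    }
}

print(source_type_from_cv(sample_cv, "demo-exp"))
print(source_type_from_cv(sample_cv, "demo-exp", extra=("BGC", "ISM")))
```

With the hypothetical entry above, the second call keeps "BGC" and drops "ISM", because only the former is listed as an additional allowed component.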

YanchunHe commented 4 years ago

source_type is now set manually in the exp*.nml files according to the values of "required_model_components" in CMIP6_CV.json. If multiple entries are specified in required_model_components, they are listed in source_type separated by spaces, e.g., source_type="AOGCM AER"

An example of a CMORized output file looks like this:

    $ ncdump -h /tos-project1/NS9034K/CMIP6/.cmorout/NorESM2-LM/hist-piAer/v20190917wap_Emon_NorESM2-LM_hist-piAer_r1i1p1f1_gn_191001-191912.nc
        :source_id = "NorESM2-LM" ;
        :source_type = "AOGCM AER" ;
        :sub_experiment = "none" ;
        :sub_experiment_id = "none" ;

commit 943dcc4b493c776d3ad4ba184d2268246c270e09

IngoBethke commented 4 years ago

Yanchun, are you sure that multiple values are allowed in source_type?

My understanding is that source_type should be exactly one of the values listed in required_model_components or additional_allowed_model_components, depending also on the variables (e.g. HAMOCC output should always be "BGC" and MICOM output either "AOGCM" or "OGCM").

IngoBethke commented 4 years ago

It seems you were right, sorry. https://www.earthsystemcog.org/site_media/projects/wip/CMIP6_global_attributes_filenames_CVs_v6.2.6.pdf says this about source_type:

"added partly because obs4MIPs defines this (e.g., “in-situ”); This should describe the model most directly responsible for the output (e.g., for dynamical downscaling output, it would describe the regional model, not the global model responsible for driving the regional model). Sometimes it is appropriate to list two (or more) model types here. Used in faceted searches."

Anyway, it does not seem to be an important metadata attribute, so I doubt it is worth redoing the cmorization just because of this attribute.

YanchunHe commented 4 years ago

The CMOR library does not complain at runtime about inconsistent components being specified. Therefore, I assume the tool can automatically pick out the required components from the space-separated list of components specified in source_type.

You can see, as I described above, the CMORized output shows multiple values of the components in the metadata of "source_type".

YanchunHe commented 4 years ago

Yes, I will use the updated "source_type" in future CMORization. I don't plan to read this information automatically from CMIP6_CV.json, since that would take me some time to implement. Setting it manually in exp*.nml is straightforward and takes only half a minute. It is not clear to me what you meant by the suggestion that the value could be overridden as a function of the variable-specific "modeling_realm". If this is not critical, I would like to ignore it.

IngoBethke commented 4 years ago

I was wrong about the last part. I now believe that source_type should be the same for the entire experiment, i.e. representative of the experimental configuration.

One thing you can consider is to also include values from additional_allowed_model_components, in particular "AER" and "BGC" because our atmosphere always uses prognostic aerosols and our ocean always runs with biogeochemistry. But it is up to you.
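A hand-written source_type could be checked against the two component lists with something like the sketch below. It encodes one reading of the rule discussed in this thread (every required component must be present, and nothing outside the required or additional allowed lists may appear). It is not the check performed by the CMOR library or by any QC tool, and the function name and example values are hypothetical.

```python
def check_source_type(source_type, required, additional=()):
    """Return (missing, unexpected) component lists for a source_type string.

    missing:    required components absent from source_type
    unexpected: components in source_type that are neither required
                nor additionally allowed
    """
    tokens = source_type.split()
    missing = [c for c in required if c not in tokens]
    unexpected = [c for c in tokens if c not in required and c not in additional]
    return missing, unexpected


# Illustrative values only; the real lists come from CMIP6_CV.json.
required = ["AOGCM", "AER"]
additional = ["CHEM", "BGC"]

print(check_source_type("AOGCM AER BGC", required, additional))
print(check_source_type("AOGCM", required, additional))
```

Two empty lists mean the string is consistent with the lists it was checked against.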

YanchunHe commented 4 years ago

OK, good point, thanks! I will add additional components when it is valid to do so.

YanchunHe commented 4 years ago

Added required_model_components and extra_allowed_model_components to the CMIP6_template experiment namelist files.
commit: 23f883ef7fee02990d857ec0b7a6a4d5d0204466
CMORized data in v20190815 still uses the old (fixed AOGCM) source_type, while the following versions (v20190909, v20190917 and onward) use the updated source_type.