Closed: cosunae closed this issue 1 year ago
I wonder how the values of scaleFactorOfFirstFixedSurface/scaledValueOfFirstFixedSurface and scaleFactorOfSecondFixedSurface/scaledValueOfSecondFixedSurface are translated within the typeOfLevel concept. This matters, for instance, when a vertical integration is performed between two surfaces with typeOfFirstFixedSurface = typeOfSecondFixedSurface = 150. When I ran tests for IDPI, I noticed that the information on one of the two surfaces was lost. Integrating between two surfaces of level type 150 may be irrelevant in practice, but we do have an example product that contains the integral mean of potential vorticity in a layer defined by two isobaric surfaces. In the eccodes FAQ, I found that the namespace "vertical" contains the keys "typeOfLevel", "level", "bottomLevel", "topLevel", and "pv" (see https://confluence.ecmwf.int/display/UDOC/What+are+namespaces+-+ecCodes+GRIB+FAQ). As far as I remember, GRIB_topLevel and GRIB_bottomLevel were not set automatically when reading vertical grid information with cfgrib, so this has to be double-checked.
Note that currently definitions/grib2/localConcepts/lssw points to definitions/grib2/localConcepts/edzw, i.e., we follow the COSMO GRIB2 policy here.
"All attributes that are named like GRIB are keys decoded from grib": One hast to check carefully, which of these GRIB keys are coded keys acccording to the GRIB2 standard, and which ones are derived or computed by eccodes. Nearly all GRIB_ keys in the list above are derived keys. I already mentioned the problems of the reduced information in GRIB_typeOfLevel (if topLevel and bottomLevel are actually not set, as I remarked above). I am also missing the information stored in the resolutionAndComponentFlags. When dealing with the horizontal wind components U and V of COSMO, these flags tell you if these components are defined with respect to the native rotated lat/lon grid, or to the geographic lat/lon grid. In fact, I do not see any information on that in the definition of the namespace vertical:rotated_II at https://apps.ecmwf.int/codes/grib/format/edition-independent/1/17/ But well, one may argue that this information is parameter-dependent. So I wonder where it is reflected. I wonder that "long_name" is not set to the value of "GRIB_name", as indicated in dataset.py:encode_cf_first, respectively, what are the conditions that this is the case.
"Historically Fieldextra kept a dictionary with all variables, mapping the name to the combination of parameter keys that represent in grib that variable": The mapping between short name and the tuple of GRIB2 keys that is defining a parameter is / or should be the same as in eccodes-cosmo-resources/definitions/grib2/localConcepts/
Thanks @petrabaumann, yes, we need to look in more detail at how scaleFactorOfFirstFixedSurface/scaledValueOfFirstFixedSurface should be mapped into typeOfLevel. It might be that the eccodes concept is not complete in edzw.
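For reference while checking this: in GRIB2 each fixed surface value is coded as a pair, and the physical value is scaledValue * 10^(-scaleFactor). Below is a minimal sketch of that decoding for a layer bounded by two surfaces. The helper names (`decode_fixed_surface`, `Layer`) and the two pressure values are made up for illustration and are not part of eccodes or cfgrib.

```python
from dataclasses import dataclass


def decode_fixed_surface(scale_factor: int, scaled_value: int) -> float:
    """Return scaledValue * 10**(-scaleFactor) as a float."""
    if scale_factor >= 0:
        # Divide by an exact integer power of ten to limit rounding noise.
        return scaled_value / 10**scale_factor
    return float(scaled_value * 10 ** (-scale_factor))


@dataclass
class Layer:
    """Hypothetical container that keeps BOTH surfaces of a layer."""

    type_of_first_fixed_surface: int
    first_value: float
    type_of_second_fixed_surface: int
    second_value: float


# Illustrative layer between two isobaric surfaces (type 100, values in Pa).
layer = Layer(
    type_of_first_fixed_surface=100,
    first_value=decode_fixed_surface(0, 70000),
    type_of_second_fixed_surface=100,
    second_value=decode_fixed_surface(0, 50000),
)
print(layer)
```

The point of keeping both values side by side is that a single typeOfLevel string only tells us which kind of surfaces bound the layer, not where they are.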
Proposal for conditional loading of GRIB keys here: https://github.com/MeteoSwiss-APN/icon_data_processing_incubator/blob/6ec3da0075d57e6102683ead5cf238697eed18e9/idpi/src/operators/flexpart.py#L32 A clean solution in cfgrib is currently not possible, so the proposal is a hack on top of cfgrib.
The resolution and component flags are documented here: https://apps.ecmwf.int/codes/grib/format/grib2/ctables/3/3
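As a small illustration of what that flag table encodes for U and V: bit 5 (counting from the most significant bit of the octet as bit 1) is 0 when the components are resolved relative to easterly and northerly directions, and 1 when they are relative to the defined grid. The sketch below tests that bit on the raw octet. The function name is hypothetical; as far as I know eccodes exposes the same bit as the key uvRelativeToGrid, but that should be verified.

```python
# Bit 5 of GRIB2 flag table 3.3, counted from the most significant bit (bit 1 = 128).
UV_RELATIVE_TO_GRID_MASK = 0b0000_1000


def uv_relative_to_grid(resolution_and_component_flags: int) -> bool:
    """True if u/v are defined relative to the (possibly rotated) grid axes."""
    return bool(resolution_and_component_flags & UV_RELATIVE_TO_GRID_MASK)


# 0b00111000: i and j increments given, u/v relative to the defined grid.
print(uv_relative_to_grid(0b0011_1000))
# 0b00110000: i and j increments given, u/v relative to east/north.
print(uv_relative_to_grid(0b0011_0000))
```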
cfgrib.open_datasets produces xarray datasets that follow the CF data model and conventions. They look like:
where the DataArrays look like:
All attributes that are named like GRIB_ are keys decoded from grib.
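To make the naming convention concrete, here is a minimal sketch that separates the GRIB_-prefixed attributes from the remaining (CF-style) ones. It works on a plain dict so it runs without cfgrib; with a real DataArray one would pass `da.attrs`. The helper name and the sample attribute values are illustrative assumptions, not output copied from cfgrib.

```python
GRIB_PREFIX = "GRIB_"


def split_grib_attrs(attrs: dict) -> tuple[dict, dict]:
    """Split attributes into (GRIB keys without prefix, all other attributes)."""
    grib_keys = {
        name[len(GRIB_PREFIX):]: value
        for name, value in attrs.items()
        if name.startswith(GRIB_PREFIX)
    }
    other = {
        name: value
        for name, value in attrs.items()
        if not name.startswith(GRIB_PREFIX)
    }
    return grib_keys, other


# Illustrative attributes, shaped like what a DataArray might carry.
sample_attrs = {
    "GRIB_paramId": 130,
    "GRIB_shortName": "t",
    "GRIB_typeOfLevel": "isobaricInhPa",
    "units": "K",
    "long_name": "Temperature",
}
grib_keys, cf_attrs = split_grib_attrs(sample_attrs)
print(sorted(grib_keys))
print(sorted(cf_attrs))
```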
Any DataArray or Dataset that follows this data model can in principle be written into a GRIB file by calling canonical_dataset_to_grib:
https://github.com/ecmwf/cfgrib/blob/8578f1012974e88d962f3fa13c0fcca973e49114/cfgrib/xarray_to_grib.py#L255
However, there are a few important remarks on the behaviour of the functionality that writes GRIB records. In the parameter namespace, writing a variable requires knowing the triplet discipline, parameterCategory and parameterNumber. For some variables, additional keys need to be specified. Historically Fieldextra kept a dictionary with all variables, mapping the name to the combination of parameter keys that represent in grib that variable: https://github.com/COSMO-ORG/fieldextra/blob/74641a1edf563c5478accbdf003d3aad94010ba1/resources/dictionary_cosmo.txt#L216 Following that approach would require maintaining a similar dictionary (in YAML) for iconarray. The eccodes approach is different, though: the local concepts map the various required parameter keys to a unique identifier (paramId). https://github.com/ecmwf/eccodes/blob/develop/definitions/grib2/localConcepts/edzw/paramId.def Each centre can have its own concepts.
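To illustrate the dictionary approach described above, here is a minimal sketch of a name-to-triplet lookup. The table is a hypothetical stand-in, not the Fieldextra dictionary or an eccodes concept file; it only contains three entries from the WMO GRIB2 code tables (temperature and the two horizontal wind components), and the short names are chosen for illustration.

```python
from typing import NamedTuple


class ParameterKeys(NamedTuple):
    discipline: int
    parameterCategory: int
    parameterNumber: int


# Hypothetical dictionary: short name -> GRIB2 parameter triplet (WMO tables).
PARAMETER_TABLE = {
    "t": ParameterKeys(0, 0, 0),  # temperature
    "u": ParameterKeys(0, 2, 2),  # u-component of wind
    "v": ParameterKeys(0, 2, 3),  # v-component of wind
}


def parameter_keys(short_name: str) -> dict:
    """Return the GRIB2 keys to set for a variable, or raise if it is unknown."""
    try:
        return PARAMETER_TABLE[short_name]._asdict()
    except KeyError:
        raise KeyError(f"no GRIB2 parameter mapping for {short_name!r}") from None


print(parameter_keys("u"))
```

A real table would also need the additional keys mentioned above for some variables, which is exactly the maintenance burden the local concepts are meant to avoid.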
The function cfgrib.canonical_dataset_to_grib will then require an attribute GRIB_paramId and use the local concepts to define the various parameter grib keys in the record. In order for that to work, GRIB_centre must be set appropriately.
Similarly, for specifying the vertical coordinates, various keys are often required. While Fieldextra encodes those in the same dictionary, eccodes uses another concept (typeOfLevel): https://github.com/ecmwf/eccodes/blob/develop/definitions/grib2/typeOfLevelConcept.def
And cfgrib will then use the key GRIB_typeOfLevel to determine the combination of keys.
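The following sketch mimics how such a concept resolves a name from the surface types. It is a simplified illustration written from memory, not a copy of typeOfLevelConcept.def: the real entries carry additional conditions (units, NV and so on), so the linked file stays authoritative. The function name and the reduced table are assumptions. What the sketch is meant to show is that the scaled values of the two surfaces play no role in the lookup, which is why the bounds of a layer must come from other keys.

```python
MISSING = 255  # GRIB2 code for "surface type not set"

# Simplified stand-in for the concept: (first type, second type) -> typeOfLevel.
TYPE_OF_LEVEL = {
    (1, MISSING): "surface",
    (100, MISSING): "isobaricInhPa",
    (100, 100): "isobaricLayer",
    (103, MISSING): "heightAboveGround",
    (150, 150): "generalVerticalLayer",
}


def resolve_type_of_level(first_type: int, second_type: int = MISSING):
    """Return the typeOfLevel name for a pair of surface types, or None."""
    return TYPE_OF_LEVEL.get((first_type, second_type))


# Two different layers of type 150 resolve to the same name: the values of the
# bounding surfaces are not part of the lookup at all.
print(resolve_type_of_level(150, 150))
print(resolve_type_of_level(100, 100))
```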
The proposal for iconarray is to look at those concepts in eccodes and make sure that they reproduce the behaviour of Fieldextra for COSMO/ICON data. If there are differences in behaviour, we should fix them in the concepts for our centre, lssw.