WCRP-CORDEX / cordex-cmip6-cv

Controlled Vocabulary (CV) for use in CORDEX
BSD 3-Clause "New" or "Revised" License
Do we need `CORDEX-CMIP6_native_resolution.json` #25

Closed gnikulin closed 6 months ago

gnikulin commented 11 months ago

The native_resolution global attribute is defined as "free form" and can also include a more detailed description of unstructured grids, for example. In this case, do we still need CORDEX-CMIP6_native_resolution.json?

jesusff commented 11 months ago

I'd say only those defined as CV require an enumeration of the possible values in the json files. Another matter would be to switch it to CV, as there might be only a few options in the end. This element could be added to the source_id registration and the groups suggest additional values if they are not covered. In this way, the global attribute in the files will be more consistent.

gnikulin commented 11 months ago

In most cases, for the continental-scale CORDEX domains, the values will be exactly the same as those now in the CV:

{
    "native_resolution": [
        "12.5 km",
        "25 km",
        "50 km",
        "0.11 degree",
        "0.22 degree",
        "0.44 degree"
    ]
}

I don't know about the native resolution for unstructured grids that will be remapped to one of the CORDEX common grids. Can it be just one value, or is there a need for a longer description?

Additional values can of course be added later when necessary, for example if a group, for any reason, runs simulations at 20 or 10 km.
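
For illustration only (this is not part of the repository): a minimal Python sketch of what treating native_resolution as a CV would mean in practice. It embeds the JSON quoted above and checks a candidate value against the enumeration. The function name `is_registered` is hypothetical.

```python
import json

# Contents of CORDEX-CMIP6_native_resolution.json as quoted in this comment.
CV_JSON = """
{
    "native_resolution": [
        "12.5 km",
        "25 km",
        "50 km",
        "0.11 degree",
        "0.22 degree",
        "0.44 degree"
    ]
}
"""


def is_registered(value, cv_text=CV_JSON):
    """Return True if `value` is one of the enumerated CV entries."""
    allowed = json.loads(cv_text)["native_resolution"]
    return value in allowed


# A value from the current list passes; a 20 km run would first need
# a new entry in the CV before it could pass such a check.
print(is_registered("0.11 degree"))
print(is_registered("20 km"))
```

With a free-form attribute no such check is possible, which is the trade-off discussed in this issue.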

csteger commented 11 months ago

I think a field for the native resolution would be good for unstructured grids like in ICON. We provide the data on the normal 12.5 km EUR-11 grid in the end, but the data user has no possibility to figure out (from the files) what the resolution of the model was. We could put the ICON grid description in the field ("native_resolution": ["R13B05"]) or the mesh size of this grid (~12.1 km). Is there already another attribute about the general type of the grid (e.g. "unstructured")? If not, we could also have something like "Unstructured ICON grid R13B05".

gnikulin commented 11 months ago

Currently, there is no global attribute describing the general type of the grid. In general, this info is available from the coordinate variable crs describing the coordinate reference system, but this is not the case for unstructured grids. The simplest way is to provide this info in the native_resolution attribute. In addition, I've got a comment that the native resolution (grid information) for ocean models should also be provided, and the native_resolution attribute as free text can be used for that.

larsbuntemeyer commented 11 months ago

> I'd say only those defined as CV require an enumeration of the possible values in the json files. Another matter would be to switch it to CV, as there might be only a few options in the end. This element could be added to the source_id registration and the groups suggest additional values if they are not covered. In this way, the global attribute in the files will be more consistent.

I totally agree: if we keep native_resolution as a required attribute, it should be controlled, i.e., not just any entry should be allowed, and it should be an attribute derived from domain_id. I guess we need to decide, if this native_resolution describes the actual grid on which the data is published.

@csteger

> Is there already another attribute about the general type of the grid (e.g. "unstructured")

I guess more detailed model-specific grid information is always welcome, although for now I would suggest making it part of the source_id using the source attribute, like, e.g., here https://github.com/PCMDI/cmip6-cmor-tables/blob/21f9732c22d226818c1a760570ac8955bc5eabb0/Tables/CMIP6_CV.json#L2780

jesusff commented 11 months ago

> I guess we need to decide, if this native_resolution describes the actual grid on which the data is published.

According to the specs (SOD), native_resolution provides information about resolution of native model grids in km or deg or more detailed description of unstructured grids. So this is the computational grid of the model, not the resolution of the grid on which the data is published. This distinction is important for the case raised by @csteger, where the model computational grid is not directly usable by standard tools and will need postprocessing to more standard grids. The resolution of the grid on which the data are provided is included (roughly) in the domain_id, and it could be explicitly mentioned in the domain attribute to make it unique for each domain_id, but I continue this in #32, where it fits better.

Christian, is this R13B05 some standard grid description for unstructured grids? Or is it really an ICON naming that cannot be generalized to other models?

csteger commented 11 months ago

R13B05 is ICON nomenclature. The formula for the distance between grid centers is ∆x ≈ 5050 / (n · 2^k) [km], where n and k are the coefficients of the grid (RnBk). For EURO-CORDEX we will use an R13B05 grid, which translates to an effective mesh size of 12.13942 km. In the postprocessing we interpolate the data to the "normal" EURO-CORDEX grid. This naming cannot be used for other unstructured grids (only for those derived from an icosahedron), but I would assume that other grids have a similar nomenclature.

If the attribute "native resolution" is already foreseen for this, we can just go with this attribute. We should maybe rename it to "native resolution atmosphere" to avoid overlaps with ocean grids in coupled models.
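
As a quick check of the arithmetic, here is a small Python sketch of the mesh-size formula from this comment, read as ∆x ≈ 5050 / (n · 2^k) km (this reading reproduces the 12.13942 km quoted for R13B05). The function name `icon_mesh_size_km` is made up for the example and is not part of ICON or any CORDEX tool.

```python
import re


def icon_mesh_size_km(grid_name):
    """Approximate mesh size in km for an ICON grid named RnBk."""
    match = re.fullmatch(r"R(\d+)B(\d+)", grid_name)
    if match is None:
        raise ValueError(f"not an ICON RnBk grid name: {grid_name!r}")
    n, k = int(match.group(1)), int(match.group(2))
    # Formula from the comment above: dx ~ 5050 / (n * 2**k) km
    return 5050.0 / (n * 2 ** k)


print(round(icon_mesh_size_km("R13B05"), 5))  # 12.13942
```

The regular expression only accepts the RnBk pattern, so names from other grid families (which this formula does not cover) raise an error instead of returning a misleading number.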

gnikulin commented 11 months ago

> I totally agree: if we keep native_resolution as a required attribute, it should be controlled, i.e., not just any entry should be allowed, and it should be an attribute derived from domain_id. I guess we need to decide, if this native_resolution describes the actual grid on which the data is published.

Even a required attribute can be "free form", as for example grid in CMIP6: "The “grid” global attribute can be used to describe the horizontal grid and regridding procedure. There is no standard form used to record this information, but it is suggested that when appropriate the following be indicated: brief description of native grid and resolution, and if data have been regridded, regridding procedure and description of target grid."

As I mentioned, native_resolution may include information about different RCM components: atmosphere, ocean, etc.

jesusff commented 11 months ago

OK, then in CMIP6 there is grid for a free form text to describe the grid actually provided in the file and how the native grid was transformed, if so, to get to it. Then, there is a grid_label as CV to have a loose indication of whether one is dealing with the native grid or some regridding.

In CORDEX, this native vs regridded info is in the domain_id. In this case, the "i" (e.g. "AUS-25i") indicates that the grid is interpolated, while the lack of "i" indicates the native grid is provided. However, there are exceptions to both cases. Native grids are sometimes not provided (e.g. in global stretched grid models) and the non-i domain_id is interpolated in this case. And the native grid can also be regular (non-rotated pole) in lon-lat (e.g. AFR-25) so the "i" version would not be really interpolated (should it be provided as AFR-25 or AFR-25i?)

However, the number in the CORDEX domain_id indicates the grid spacing only loosely. It is similar to the nominal_resolution in CMIP6, which is set to the closest of a limited number of preset values regardless of the "real" resolution. In this sense, we still need to define what the exact grid in the file is, but this is in the crs variable, which includes all details of the projection.
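
To make the convention concrete, a minimal Python sketch that splits a domain_id into its parts. It assumes domain_ids follow the pattern seen in this thread (uppercase region code, hyphen, number, optional trailing "i"); the regular expression and the function name `parse_domain_id` are assumptions for illustration, not an official definition.

```python
import re

# Assumed pattern, based only on the examples in this thread
# (AUS-25i, AFR-25, EUR-11, MED-12).
DOMAIN_ID = re.compile(r"^([A-Z]+)-(\d+)(i?)$")


def parse_domain_id(domain_id):
    """Split a domain_id into region, nominal number and 'i' label."""
    match = DOMAIN_ID.match(domain_id)
    if match is None:
        raise ValueError(f"unexpected domain_id: {domain_id!r}")
    region, number, suffix = match.groups()
    return {
        "region": region,
        # Only a loose indication of grid spacing, not the exact value.
        "nominal": int(number),
        # Reflects the label only; see the exceptions discussed above.
        "labelled_interpolated": suffix == "i",
    }


print(parse_domain_id("AUS-25i"))
print(parse_domain_id("AFR-25"))
```

The parsed values are labels: the exact grid still has to come from the crs variable, and the "i" flag does not by itself prove whether the data were interpolated.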

In summary, I would keep this attribute and maybe call it native_grid, instead of resolution, given that more than the resolution needs to be explained. Or just grid, as this is essentially the same as in CMIP6. We can leave it as free text, but control a bit the formatting of similar grids during registration. Some examples:

grid = "Lambert conic conformal with 25 km grid spacing"
grid = "Rotated-pole latitude-longitude with 0.22 degree grid spacing"
grid = "Rotated-pole latitude-longitude with 0.11 degree grid spacing, interpolated by 2nd order conservative remapping from the original unstructured icosahedral ICON grid R13B05 (~12.1 km)"
grid = "Rotated-pole latitude-longitude with 0.22 degree grid spacing; ocean grid Mediterranean Sea only, 9-12 km with a tilted and stretched grid at the Gibraltar Strait"

Regarding the ocean grid, I'm not sure if the concern is with providing this info as supplementary information (as the example above, if attached to an atmospheric variable), to highlight that this output comes from a coupled model; or rather that, in ocean variables, the output will be provided as e.g. MED-12 but the ocean model is typically run at a higher resolution. For this case, the idea would be the same as for the unstructured grid above:

grid = "Lambert conic conformal with 12 km grid spacing, interpolated by 1st order conservative remapping from the original 5-9 km stretched grid"

gnikulin commented 11 months ago

Somehow I completely missed grid in CMIP6 and indeed it fits better for describing both native and regular grids and resolution. We can skip native_resolution and use only grid as free text providing several examples.

We have a number of possible combinations:

ARCM: grid = "Atmos grid"

AGCM: grid = "Atmos grid, remapped from an unstructured grid ..."

AORCM, atmospheric variables: grid = "Atmos grid, Ocean grid"

AORCM, ocean variables remapped to atmos grid: grid = "Atmos grid, remapped from Ocean grid"

AORCM, ocean variables provided on the native ocean grid: grid = "Ocean grid, Atmos grid"

The first grid is the grid in the file, while the others are the grids of the different RCM components. Not sure about domain_id for ocean grids.

Should we also expect 3 or 4 different grids for different RCM components? Or should we consider only atmos and ocean here?
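
A rough Python sketch of the ordering rule described in this comment: the grid in the file comes first, an optional "remapped from" note follows, then the grids of the other components. The helper `compose_grid` and its parameter names are hypothetical, and the comma separator simply follows the combinations listed above; nothing here is an agreed format.

```python
def compose_grid(file_grid, remapped_from=None, other_components=()):
    """Build a free-text grid string with the file grid first."""
    parts = [file_grid]
    if remapped_from is not None:
        parts.append(f"remapped from {remapped_from}")
    # Grids of further model components; any number is accepted.
    parts.extend(other_components)
    return ", ".join(parts)


# ARCM
print(compose_grid("Atmos grid"))
# AORCM, atmospheric variables
print(compose_grid("Atmos grid", other_components=["Ocean grid"]))
# AORCM, ocean variables remapped to the atmos grid
print(compose_grid("Atmos grid", remapped_from="Ocean grid"))
# AORCM, ocean variables on the native ocean grid
print(compose_grid("Ocean grid", other_components=["Atmos grid"]))
```

Because `other_components` is a sequence, the same helper would cover 3 or 4 component grids if that turns out to be needed.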

gnikulin commented 11 months ago

> And the native grid can also be regular (non-rotated pole) in lon-lat (e.g. AFR-25) so the "i" version would not be really interpolated (should it be provided as AFR-25 or AFR-25i?)

The "i" version defines a number of prescribed grids. AFR-25i is the 0.25 deg regular lon-lat grid, while a simulation for AFR-25 on the rotated grid has a resolution of 0.22 deg, so interpolation is still necessary.

jesusff commented 11 months ago

I paste here a personal communication with Marcus Thatcher to keep the description of another grid (C384) that will contribute data to CORDEX-CMIP6:

> In CCAM's case we have a fx output variable called "grid" which basically indicates the underlying grid resolution at each interpolated grid box. But there is probably a more efficient way to represent that information.
>
> C384 is the designation of a cubic grid. CCAM was one of the first, but there are others of course like the GFDL's FV3. C384 means there are 384 x 384 grid points on each of the six cubic panels (384 x 384 x 6 horizontal grid points in total). We then use a Schmidt coordinate transform where S=2 means the grid resolution is approximately halved in the front panel at the expense of doubling the resolution on the rear panel (Schmidt transform preserves orthogonality, etc). But that approach would not help something like MPAS which has more of an unstructured grid that is optimised around a density function prescribed by the user (at least as I understand it). Maybe including an output variable with the underlying grid resolution would also work in that case?
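
To put numbers on the C384 description, a small Python sketch that counts the horizontal grid points (N x N x 6) and estimates a mean grid spacing. The spacing estimate is an assumption of this sketch, not something stated in the communication: it treats the points as equal-area cells on a sphere of radius 6371 km and ignores the Schmidt stretching, so it only describes the unstretched grid. Function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption of this sketch)


def cubic_grid_points(n):
    """Total horizontal grid points of a cubic grid CN: n x n on 6 panels."""
    return 6 * n * n


def mean_spacing_km(n):
    """Rough mean spacing, assuming equal-area cells and no stretching."""
    sphere_area = 4.0 * math.pi * EARTH_RADIUS_KM ** 2
    return math.sqrt(sphere_area / cubic_grid_points(n))


print(cubic_grid_points(384))  # 884736
print(mean_spacing_km(384))
```

With the S=2 Schmidt transform the local spacing differs from this mean between the front and rear panels, which is why a per-grid-box resolution field is suggested above.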

gnikulin commented 10 months ago

I think we agree to use grid and skip native_resolution? It's up to the modeling groups how to describe their grids, and we can provide several examples and some recommendations.

gnikulin commented 10 months ago

We may think about a new variable which indicates the underlying unstructured grid resolution at each interpolated grid box. I think there is nothing similar in CMIP. Another question is how useful this variable can be.