WCRP-CORDEX / cordex-cmip6-cv

Controlled Vocabulary (CV) for use in CORDEX
BSD 3-Clause "New" or "Revised" License

Do we need CORDEX-CMIP6_domain.json? #32

Closed jesusff closed 7 months ago

jesusff commented 1 year ago

This info is already in https://github.com/WCRP-CORDEX/cordex-cmip6-cv/blob/main/CORDEX-CMIP6_domain_id.json

Moreover, if no more info is to be added to this domain_id file, it could be simplified to:

{
  "domain_id": {
    "SAM-50": "South America",
    "CAM-50": "Central America",
    "NAM-50": "North America",
    "EUR-50": "Europe",
    "AFR-50": "Africa",
    "WAS-50": "South Asia",
    "EAS-50": "East Asia",
    "CAS-50": "Central Asia",
    "AUS-50": "Australasia",
    "ANT-50": "Antarctica",
    "ARC-50": "Arctic",
    "MED-50": "Mediterranean",
    "MNA-50": "Middle East and North Africa",
    "MNA-25": "Middle East and North Africa",
    "SAM-12": "South America",
    "CAM-12": "Central America",
    "NAM-12": "North America",
    "EUR-12": "Europe",
    "AFR-12": "Africa",
    "WAS-12": "South Asia",
    "EAS-12": "East Asia",
    "CAS-12": "Central Asia",
    "AUS-12": "Australasia",
    "ANT-12": "Antarctica",
    "ARC-12": "Arctic",
    "MED-12": "Mediterranean",
    "MNA-12": "Middle East and North Africa",
    "GAR-3": "Greater Alpine Region",
    "CEU-3": "Central Europe",
    "SAM-25": "South America",
    "CAM-25": "Central America",
    "NAM-25": "North America",
    "EUR-25": "Europe",
    "AFR-25": "Africa",
    "WAS-25": "South Asia",
    "EAS-25": "East Asia",
    "CAS-25": "Central Asia",
    "AUS-25": "Australasia",
    "SEA-25": "South East Asia",
    "SAM-50i": "South America",
    "CAM-50i": "Central America",
    "NAM-50i": "North America",
    "EUR-50i": "Europe",
    "AFR-50i": "Africa",
    "WAS-50i": "South Asia",
    "EAS-50i": "East Asia",
    "CAS-50i": "Central Asia",
    "AUS-50i": "Australasia",
    "ANT-50i": "Antarctica",
    "ARC-50i": "Arctic",
    "MED-50i": "Mediterranean",
    "MNA-50i": "Middle East and North Africa",
    "MNA-25i": "Middle East and North Africa  high res.",
    "EUR-12i": "Europe high res.",
    "SEA-25i": "South East Asia"
  }
}

Should we also make the domain names (domain) mention the spatial resolution?

gnikulin commented 1 year ago

We don't need 2 CVs with duplicate information, one is enough.

Currently domain is defined as "name of the CORDEX region" while domain_id as "an identifier assigned to each CORDEX region including a flag for resolution". I would keep domain as it is (geographical name) without information about spatial resolution. In addition, information about native resolution is provided in the native_resolution global attribute.

This CV must include only the 14 continental-scale CORDEX domains (14 domains x 3 resolutions, and the same for the regular grids with "i"). Other domains, such as "GAR-3" and "CEU-3", are not CORDEX domains but are used in some projects; they should be excluded.
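As a rough illustration of that rule, the expected set of domain_ids can be generated and compared against the CV. This is only a sketch: the 14 acronyms are copied from the list proposed earlier in this issue (assumed here to be the continental-scale domains meant above), and the helper names `expected_domain_ids` and `unexpected_entries` are hypothetical.

```python
from itertools import product

# Assumption: the 14 continental-scale domains are the acronyms that
# appear in the domain_id list proposed earlier in this issue.
DOMAINS = ["SAM", "CAM", "NAM", "EUR", "AFR", "WAS", "EAS",
           "CAS", "AUS", "ANT", "ARC", "MED", "MNA", "SEA"]
RESOLUTIONS = ["50", "25", "12"]   # resolution flags
GRID_SUFFIXES = ["", "i"]          # native grid / regular grid


def expected_domain_ids():
    """Return every domain_id implied by 14 domains x 3 resolutions x 2 grids."""
    return {f"{d}-{r}{s}"
            for d, r, s in product(DOMAINS, RESOLUTIONS, GRID_SUFFIXES)}


def unexpected_entries(cv):
    """Return the sorted domain_ids in a simplified CV dict that fall outside the rule."""
    return sorted(set(cv["domain_id"]) - expected_domain_ids())


cv = {"domain_id": {"EUR-12": "Europe",
                    "GAR-3": "Greater Alpine Region",
                    "CEU-3": "Central Europe"}}
print(unexpected_entries(cv))
```

With this rule, entries such as "GAR-3" and "CEU-3" are reported as falling outside the set.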

larsbuntemeyer commented 1 year ago

Yes, correct, CORDEX-CMIP6_domain.json is duplicated information. This is more or less a result of my discussion about adapting the CMIP6 cmor tables. They have a few key attributes, including institution_id, source_id and experiment_id, whose handling as attributes is hard-coded into the cmor library. E.g., if the CV contains an institution_id in the form of

{
  "institution_id": {
    "GERICS": "Climate Service Center"
  }
}

cmor will automatically add an entry called institution=Climate Service Center. However, this logic does not work in general for other attributes; e.g., a table including the domain_id

{
  "domain_id": {
    "EUR-11": "Europe"
  }
}

would not add the domain attribute. It's a bit of a pain, since the logic in the CMIP CV is difficult to understand in general. But since I guess we want to support the use of the cmor library, I would suggest keeping the domain_id table in the form that we have now (since we have domain as a required attribute):

{
    "domain_id": {
        "EUR-11": {
            "domain": "Europe",
            "domain_id": "EUR-11"
        }
    }
}

The same is actually true for driving_experiment_id, so the situation is probably the same there (#23). It's a bit of a pain, since cmor wasn't really designed with a more general CV in mind.

> This CV must include only the 14 continental-scale CORDEX domains (14 domains x 3 resolutions, and the same for the regular grids with "i"). Other domains, such as "GAR-3" and "CEU-3", are not CORDEX domains but are used in some projects; they should be excluded.

For me, it's OK to remove the non-continental-scale domain_ids, although I also saw them in some publications. But you are right, it's hard to find an exact definition of those grids.

gnikulin commented 1 year ago

I don't have a strong opinion on the CV format. However, the CMIP6 format may be preferable since it clearly defines the domain attribute and is compatible with CMOR.

gnikulin commented 1 year ago

And also our first idea was to use the CMIP6 CVs as much as possible.

jesusff commented 1 year ago

OK, this had some subtle details behind. @larsbuntemeyer, would it make sense to keep in this CV repo simplified json files for these paired attributes and have them automatically converted to the explicit form when inserted automatically into the CMOR table https://github.com/WCRP-CORDEX/cordex-cmip6-cmor-tables/blob/main/Tables/CORDEX_CV.json ?

larsbuntemeyer commented 1 year ago

@jesusff

> this had some subtle details behind.

Yes, it's still a pain, but my hope is that this could be resolved in the future...

> would it make sense to keep in this CV repo simplified json files for these paired attributes and have them automatically converted to the explicit form when inserted automatically

Yes, that could work and could make sense.
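A minimal sketch of such a conversion, assuming the simplified file maps each id directly to its name and the explicit form repeats both attributes per entry, as in the examples above. The function name `expand_paired_cv` and its parameters are hypothetical and not part of cmor or of any existing repo tooling.

```python
import json


def expand_paired_cv(simple_cv, id_key="domain_id", name_key="domain"):
    """Expand {"domain_id": {"EUR-12": "Europe"}} into the explicit form
    {"domain_id": {"EUR-12": {"domain": "Europe", "domain_id": "EUR-12"}}}.
    """
    return {
        id_key: {
            ident: {name_key: name, id_key: ident}
            for ident, name in simple_cv[id_key].items()
        }
    }


simple = {"domain_id": {"EUR-12": "Europe", "SAM-50": "South America"}}
print(json.dumps(expand_paired_cv(simple), indent=4))
```

Passing different `id_key` and `name_key` values would allow the same helper to be reused for other paired attributes, if they follow the same pattern.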

jesusff commented 1 year ago

> I would keep domain as it is (geographical name) without information about spatial resolution. In addition, information about native resolution is provided in the native_resolution global attribute.

Depending on the model (see #25) the native_resolution may not match the one in the domain_id. We could think of a building rule for the domain attribute that includes the resolution and the expected projection, e.g.:

"EUR-12": "Europe (at roughly 12.5 km grid spacing on a curvilinear projection)"
"AFR-12i": "Africa (on the 0.125-degree regular latitude-longitude grid)"

gnikulin commented 12 months ago

> Depending on the model (see #25) the native_resolution may not match the one in the domain_id.

Yes, that was the main point of introducing native_resolution, since the resolution in domain_id is the nearest grid spacing in km of the 3 resolutions used in CORDEX-CMIP5 and CORDEX-CMIP6 (50, 25 and 12 km). If an RCM generates simulations at 10 or 15 km, 12 has to be used as the resolution flag.
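The nearest-resolution rule can be sketched as follows. This assumes the flag "12" stands for a nominal 12.5 km spacing (as in the "about 12.5km" wording used elsewhere in this thread); the name `resolution_flag` is hypothetical, and the behaviour for exact ties is not defined anywhere in this discussion.

```python
# Nominal grid spacing in km for each resolution flag.
# Assumption: "12" stands for 12.5 km.
NOMINAL_KM = {"50": 50.0, "25": 25.0, "12": 12.5}


def resolution_flag(grid_spacing_km):
    """Return the flag whose nominal spacing is closest to the native one."""
    return min(NOMINAL_KM, key=lambda flag: abs(NOMINAL_KM[flag] - grid_spacing_km))


for km in (10, 15):
    print(km, "->", resolution_flag(km))
```

Both 10 km and 15 km are closer to 12.5 km than to 25 km, so both map to the "12" flag.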

Perhaps information about resolution could be helpful in domain; there were some questions about the different resolution flags. It's very straightforward for the regular grids with "i" (12.5, 25 and 50 are exact resolutions in km): it can simply be "Africa (12.5 degree regular latitude-longitude grid)". Descriptions for native grids (e.g. rotated, LCC, Mercator) are not exactly the same. The rotated grid for Africa (no rotation) is not curvilinear. The LCC and Mercator projections define the resolution exactly in km (e.g. 12.5 km), so it's not "roughly". The question is how to formulate it.

jesusff commented 12 months ago

For the "native" one, we could also fall back to the original suggestion from @larsbuntemeyer of a loose:

"EUR-50": "Europe (low resolution)"
"EUR-25": "Europe (intermediate resolution)"
"EUR-12": "Europe (high resolution)"

and let the crs variable and the native_resolution/native_grid/grid (see #25) attribute provide the details.

gnikulin commented 12 months ago

Low, intermediate and high are not so informative without a reference. Assuming that grids can differ across RCMs, it may simply be:

"EUR-50": "Europe (about 50km resolution)"
"EUR-25": "Europe (about 25km resolution)"
"EUR-12": "Europe (about 12.5km resolution)"

The same template can be used for other grids as for example "ALP-3 (about 3km resolution)"

For the regular grids the description can be more specific as in the above comment: "Africa (12.5 degree regular latitude-longitude grid)"
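If the "about N km" template for native grids were adopted, a building rule could look roughly like the sketch below. The function name `describe_native_domain` is hypothetical, the mapping of the "12" flag to "12.5" follows the examples above, and regular "i" grids are deliberately left out.

```python
# Text used for each resolution flag in the description.
# Assumption: "12" is rendered as "12.5"; other flags are used as they are.
NOMINAL_KM_TEXT = {"50": "50", "25": "25", "12": "12.5"}


def describe_native_domain(domain_id, name):
    """Build a domain description such as 'Europe (about 12.5km resolution)'."""
    _, flag = domain_id.split("-")
    km = NOMINAL_KM_TEXT.get(flag, flag)
    return f"{name} (about {km}km resolution)"


print(describe_native_domain("EUR-50", "Europe"))
print(describe_native_domain("EUR-12", "Europe"))
```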

larsbuntemeyer commented 7 months ago

Right now, we don't indicate any resolution information in the domain attribute. I think this should go into the free-form grid attribute, right? So could we close this (after reformatting)?

gnikulin commented 7 months ago

Yes, the domain attribute only provides geographical names for the 14 CORDEX domains, domain_id is the 3-letter domain acronym with the (approximate) resolution flag, and grid provides detailed information about native grids, resolution, regridding etc. I hope this combination provides all the necessary information.
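To illustrate how the three attributes fit together, here is a small sketch that splits a domain_id into its parts. The pattern (three uppercase letters, a numeric flag, an optional "i") is inferred from the ids listed in this issue rather than from a formal specification, and `DomainId` and `parse_domain_id` are hypothetical names. The attribute values are examples only.

```python
import re
from typing import NamedTuple


class DomainId(NamedTuple):
    acronym: str          # 3-letter domain acronym, e.g. "EUR"
    resolution_flag: str  # approximate resolution flag, e.g. "12"
    regular_grid: bool    # True when the id ends with "i"


# Assumed pattern, inferred from the ids in this issue.
_PATTERN = re.compile(r"^([A-Z]{3})-(\d+)(i?)$")


def parse_domain_id(domain_id):
    """Split a domain_id such as 'EUR-12' or 'AFR-50i' into its components."""
    match = _PATTERN.match(domain_id)
    if match is None:
        raise ValueError(f"not a recognised domain_id: {domain_id!r}")
    return DomainId(match.group(1), match.group(2), match.group(3) == "i")


# Example global attributes; the grid text is a free-form placeholder.
attrs = {
    "domain": "Europe",
    "domain_id": "EUR-12",
    "grid": "free-form description of the native grid",
}
print(parse_domain_id(attrs["domain_id"]))
```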