Open martinjuckes opened 2 months ago
Just tagging that the https://github.com/PCMDI/cmip6-cmor-tables/ (CMIP6) and https://github.com/PCMDI/mip-cmor-tables/ (CMIP6Plus) repositories are relevant references for the discussion above.
It's useful to put the logic in the same place so we can have a meaningful discussion.
For context, this is how the total number of tables per mip_era has evolved: cmip3-cmor-tables = 6, cmip5-cmor-tables = 18, cmip6-cmor-tables = 43 and CMIP6Plus/mip-cmor-tables = 81, the latter remapping all entries (including the 12 CMIP6 E* tables) into 9 identifiers: AC = Atmosphere Chemistry, AE = Atmosphere Aerosol, AP = Atmosphere Physics, GI = Greenland Icesheet, LI = Land Ice, LP = Land Physics, OB = Ocean Biogeochemistry, OP = Ocean Physics, SI = Sea Ice.
For historical context, the tables originally targeted the model components (or history files) that the data would be written from. This evolved a lot in CMIP5 and subsequent phases, with the CF tables (the cfmip1-cmor-tables, generated in parallel to the CMIP3 tables) being merged into CMIP5.
To further these discussions, I wonder whether it would be useful to know how many variables and duplicates exist across these tables? In the variable registry discussions that we've been having for months, this was the intended next step, which I do believe @wolfiex had started - not sure where we're up to.
ping @matthew-mizielinski @taylor13
Regarding duplication of variables: of the 2062 CMIP6 variables, about 25% differ only by the sampling interval ("frequency") or the region (Greenland, Antarctica, global) they were requested at. All the variable attributes and all the CF-convention global attributes are the same across duplicate variables.
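As a rough illustration of how such duplicates could be counted, here is a minimal sketch. It assumes each table has already been loaded into a dict mapping variable name to its attributes (similar in spirit to the `variable_entry` section of the CMOR table JSON files). The function name `find_duplicates`, the `IGNORED` set and the sample data are hypothetical, and only `frequency` is allowed to differ here. Regions are implicit in CMIP6 table names, and real entries may also differ in other frequency-dependent attributes, so this would undercount on the real tables.

```python
from collections import defaultdict

# Attributes that may differ between entries still counted as "duplicates".
# Assumption: only the sampling interval is ignored in this sketch.
IGNORED = {"frequency"}


def find_duplicates(tables):
    """Return [(variable, [tables...])] for entries repeated across tables.

    `tables` maps table name -> {variable name -> {attribute -> string value}}.
    Two entries are treated as duplicates when they share a variable name and
    all attributes except those listed in IGNORED.
    """
    groups = defaultdict(list)
    for table_name, entries in tables.items():
        for var_name, attrs in entries.items():
            # Build a hashable fingerprint of the non-ignored attributes.
            fingerprint = tuple(
                sorted((k, v) for k, v in attrs.items() if k not in IGNORED)
            )
            groups[(var_name, fingerprint)].append(table_name)
    return sorted(
        (var_name, sorted(names))
        for (var_name, _), names in groups.items()
        if len(names) > 1
    )


# Tiny made-up sample, not taken from the real tables.
sample_tables = {
    "Amon": {
        "tas": {"frequency": "mon", "units": "K", "standard_name": "air_temperature"},
        "pr": {"frequency": "mon", "units": "kg m-2 s-1", "standard_name": "precipitation_flux"},
    },
    "day": {
        "tas": {"frequency": "day", "units": "K", "standard_name": "air_temperature"},
    },
}

duplicates = find_duplicates(sample_tables)
print(duplicates)
```

On the sample data, `tas` is reported as shared between `Amon` and `day`, while `pr` appears only once and is not reported.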
@taylor13 , @durack1 :
Some background on the list of identifiers that I've suggested above:
· E.... table identifiers. These were introduced for technical reasons in CMIP6 and are not needed. Variables can use existing identifiers such as Amon etc.
· Amon, Omon, CFmon and many others. These well-established names help a broad user base to engage with the request without navigating technical details.
· Ofx, Efx, AERfx: these are all about the model configuration and there is no need for separate identifiers.
· Rationalise use of modifiers Clim, Pt, etc.
The main differences from the larger list proposed in CMIP6Plus are:
· Amon, Omon: to avoid making changes to well used variables.
Data Request WIP meeting on Oct 1st 2024 concluded with a decision to stay with the CMIP6 table names.
Meeting notes (restricted access): here
The CMIP6Plus approach offers many improvements over CMIP6, but it would also disrupt the naming of the most widely used CMIP variables, which are generally in tables whose names have been stable for multiple CMIP eras.
The CMIP6Plus table list could be modified to reduce the disruption caused by unneeded changes to long-standing terms. This would involve replacing “AP...” with “A”, “OP...” with “O”, and removing the realm prefix from sub-daily and fixed fields. The “GI....” and “AI....” tables are also redundant in the CMIP AR7 Fast track request, as the region specification has been moved from being implicit in the table name to being a separate element in the variable group specification. The modified list retains the “CF..” tables, which are widely used, and omits the “AC..” tables, which have minimal content, and the “AE..” tables, which have taken some of the “CF...” content.
· CFday, CFmon, CFmonLev, CFmonZ, CFsubhrPt, CFsubhrPtSite,
· 1hr, 1hrPt, 3hr, 3hrPt, 3hrPtLev, 6hr, 6hrPt, 6hrPtLev, 6hrPtZ, fx,
· Aday, AdayLev, AdayZ, Amon, AmonClim, AmonClimLev, AmonDiurnal, AmonLev, AmonZ, AsubhrPt,
· LIday, LImon,
· Lday, Lmon, Lyr, LyrPt,
· Oday, Odec, OdecLev, OdecZ, Omon, OmonClim, OmonClimLev, OmonLev, OmonZ, Oyr, OyrLev,
· SIday, SImon, SImonPt.
This still results in more MIP tables (46) than in CMIP6, but considerably fewer than in CMIP6Plus, while retaining a more systematic approach than CMIP6. It also retains continuity with CMIP6 and earlier CMIP phases for the most widely used variables.
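To make the proposal easier to check, here is a minimal sketch that counts the tables listed above and applies the renaming rules described in the text. The function `proposed_name` and the input names used in the usage note (e.g. `APmon`, `AP3hrPt`) are hypothetical illustrations, not taken from the CMIP6Plus repository. It also assumes that "sub-daily" means the hourly tables (1hr, 3hr, 6hr), since AsubhrPt keeps its prefix in the list, and it leaves any prefix not mentioned in the text unchanged.

```python
import re
from typing import Optional

# Table list exactly as given in the comment above, grouped as in its bullets.
PROPOSED_TABLES = {
    "CF": ["CFday", "CFmon", "CFmonLev", "CFmonZ", "CFsubhrPt", "CFsubhrPtSite"],
    "no realm prefix": ["1hr", "1hrPt", "3hr", "3hrPt", "3hrPtLev",
                        "6hr", "6hrPt", "6hrPtLev", "6hrPtZ", "fx"],
    "A": ["Aday", "AdayLev", "AdayZ", "Amon", "AmonClim", "AmonClimLev",
          "AmonDiurnal", "AmonLev", "AmonZ", "AsubhrPt"],
    "LI": ["LIday", "LImon"],
    "L": ["Lday", "Lmon", "Lyr", "LyrPt"],
    "O": ["Oday", "Odec", "OdecLev", "OdecZ", "Omon", "OmonClim",
          "OmonClimLev", "OmonLev", "OmonZ", "Oyr", "OyrLev"],
    "SI": ["SIday", "SImon", "SImonPt"],
}
ALL_PROPOSED = [name for group in PROPOSED_TABLES.values() for name in group]

DROPPED_PREFIXES = ("GI", "AI", "AC", "AE")   # redundant or omitted tables
RENAMED_PREFIXES = {"AP": "A", "OP": "O"}     # restore long-standing names
# Assumption: a two-letter realm prefix followed by an hourly frequency or "fx".
PREFIXLESS = re.compile(r"^[A-Z]{2}(\d+hr\w*|fx)$")


def proposed_name(cmip6plus_table: str) -> Optional[str]:
    """Map a CMIP6Plus-style table name to the proposed name (None if dropped)."""
    if cmip6plus_table.startswith(DROPPED_PREFIXES):
        return None
    if cmip6plus_table.startswith("CF"):
        return cmip6plus_table                # CF tables are retained as-is
    match = PREFIXLESS.match(cmip6plus_table)
    if match:
        return match.group(1)                 # strip realm prefix: hourly and fx
    for old, new in RENAMED_PREFIXES.items():
        if cmip6plus_table.startswith(old):
            return new + cmip6plus_table[len(old):]
    return cmip6plus_table                    # prefixes not named in the text


print(len(ALL_PROPOSED))
```

With these rules, `proposed_name("APmon")` gives `"Amon"`, `proposed_name("AP3hrPt")` gives `"3hrPt"`, and `proposed_name("GIyr")` gives `None`. The count printed at the end is 46, matching the figure quoted above.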