ESMValGroup / ESMValTool

ESMValTool: A community diagnostic and performance metrics tool for routine evaluation of Earth system models in CMIP
https://www.esmvaltool.org
Apache License 2.0

Moving datasets from Tier 3 to Tier 2 when approved by providers #3331

Open remi-kazeroni opened 1 year ago

remi-kazeroni commented 1 year ago

There are a number of Tier 3 datasets (see status in the table below) for which we (@hb326 and myself) got approval from the data providers to allow access for our users on shared machines (e.g. DKRZ, Jasmin, ...). At the moment, CMORized data stored on Levante are restricted to group members (i.e. core developers) if Tier 3, and readable by anyone if Tier 2. To ease data access for our users, we first need to check with data providers whether more Tier 3 datasets could be labelled Tier 2 instead. This would allow us to open up access at DKRZ and include more datasets in the synchronization pipeline between the main HPC centers where ESMValTool is developed (see #2630 for a separate discussion).

Changing the Tier of a dataset may lead to various backward compatibility issues. Recipes using that dataset will need to be updated. Users who keep their own local pool of CMORized data will also need to move some datasets to Tier2 themselves. One way to avoid this could be to make the tier key in recipes optional. See https://github.com/ESMValGroup/ESMValCore/issues/2112 for a separate discussion.
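To make the two manual steps above more concrete, here is a minimal sketch of what a user-side migration could look like. It is not part of ESMValTool: the function names and the `MOVED` set are hypothetical, the recipe rewrite only handles dataset entries written on a single line, and the directory move assumes a local pool laid out as `<obs_root>/Tier3/<dataset>`.

```python
import re
import shutil
from pathlib import Path

# Hypothetical selection of relabelled datasets; adjust to the real list.
MOVED = {"MERRA2", "FLUXCOM"}


def retier_recipe_text(text, moved=MOVED):
    """Rewrite `tier: 3` to `tier: 2` on lines that name a moved dataset.

    Assumption: each dataset entry sits on one line, e.g.
    `- {dataset: MERRA2, project: OBS6, tier: 3}`.
    """
    out = []
    for line in text.splitlines(keepends=True):
        match = re.search(r"\bdataset:\s*([\w.-]+)", line)
        if match and match.group(1) in moved:
            line = re.sub(r"\btier:\s*3\b", "tier: 2", line)
        out.append(line)
    return "".join(out)


def move_local_pool(obs_root, moved=MOVED, dry_run=True):
    """Move <obs_root>/Tier3/<dataset> to <obs_root>/Tier2/<dataset>.

    Returns the list of (source, destination) pairs. With dry_run=True
    nothing is moved, so the plan can be reviewed first.
    """
    obs_root = Path(obs_root)
    planned = []
    for name in sorted(moved):
        src = obs_root / "Tier3" / name
        dst = obs_root / "Tier2" / name
        # Skip datasets that are absent or already present under Tier2.
        if src.is_dir() and not dst.exists():
            planned.append((src, dst))
            if not dry_run:
                dst.parent.mkdir(parents=True, exist_ok=True)
                shutil.move(str(src), str(dst))
    return planned
```

Running `move_local_pool(path)` with the default `dry_run=True` only reports what would be moved; pass `dry_run=False` once the plan looks right. Recipes that spread a dataset entry over several lines would need a proper YAML parser instead of the line-based regex used here.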

On the Tool side, a number of things will need to be done when moving a dataset from Tier3 to Tier2. Here is a tentative checklist:

remi-kazeroni commented 1 year ago

Here is the status of our efforts to contact data providers of our Tier3 datasets to check if those could be made Tier2:

| Tier3 dataset | Provider contacted | Answer | Remarks | Moved to Tier2 |
|---|---|---|---|---|
| APHRO-MA | Yes | N/A | looks possible based on their website | - |
| AURA-TES | Yes | N/A | - | - |
| CALIPSO-ICECLOUD | Yes | Yes | - | - |
| CDS-SATELLITE-ALBEDO | No | - | Licenses to be checked for CDS datasets | - |
| CDS-SATELLITE-LAI-FAPAR | No | - | - | - |
| CDS-SATELLITE-SOIL-MOISTURE | No | - | - | - |
| CDS-UERRA | No | - | - | - |
| CDS-XCH4 | No | - | - | - |
| CDS-XCO2 | No | - | - | - |
| CERES-SYN1deg | No | - | - | - |
| CLARA-AVHRR | No | - | - | - |
| CLOUDSAT-L2 | No | - | - | - |
| ERA-Interim | No | - | Data license should allow it | - |
| ERA-Interim-Land | No | - | Data license should allow it | - |
| ERA5 | No | - | Data license should allow it (done by DKRZ, CEDA and other HPCs) | - |
| ESACCI-WATERVAPOUR | No | - | only preliminary version currently supported | - |
| FLUXCOM | Yes | Yes | - | - |
| GRACE | No | - | - | - |
| HWSD | No | - | - | - |
| JMA-TRANSCOM | No | - | - | - |
| LAI3g | No | - | - | - |
| LandFlux-EVAL | No | - | - | - |
| MAC-LWP | Yes | Yes | - | - |
| MERRA2 | Yes | Yes | - | - |
| MERRA | No | - | Same answer as for MERRA2? | - |
| MLS-AURA | Yes | Yes | - | - |
| MODIS | No | - | - | - |
| MTE | Yes | Yes | - | - |
| NDP | No | - | - | - |
| NIWA-BS | Yes | Yes | - | - |
| NSIDC-0116-* | No | - | - | - |
| UWisc | Yes | Yes | predecessor of MAC-LWP | - |

(in bold font: datasets that could be moved to Tier2 right away)
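As a small illustration of how the status table can be queried, the sketch below lists the contacted datasets whose provider answered Yes. The rows are copied from the table; treating an answer of Yes as the criterion for a Tier2 candidate is an assumption, and the names `CONTACTED` and `tier2_candidates` are hypothetical.

```python
# (dataset, answer) for the rows where the provider was contacted,
# copied from the status table.
CONTACTED = [
    ("APHRO-MA", "N/A"),
    ("AURA-TES", "N/A"),
    ("CALIPSO-ICECLOUD", "Yes"),
    ("FLUXCOM", "Yes"),
    ("MAC-LWP", "Yes"),
    ("MERRA2", "Yes"),
    ("MLS-AURA", "Yes"),
    ("MTE", "Yes"),
    ("NIWA-BS", "Yes"),
    ("UWisc", "Yes"),
]


def tier2_candidates(rows):
    """Return datasets whose provider answered Yes (assumed to mean approval)."""
    return [name for name, answer in rows if answer == "Yes"]


if __name__ == "__main__":
    for name in tier2_candidates(CONTACTED):
        print(name)
```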

remi-kazeroni commented 1 year ago

Attention: @rswamina. This is related to our ongoing discussion on observational datasets in ESMValTool (https://github.com/ESMValGroup/Community/discussions/70)

alistairsellar commented 7 months ago

I was thinking about some of the redistribution implications of this. Two thoughts in particular: