Open remi-kazeroni opened 1 year ago
Here is the status of our efforts to contact data providers of our Tier3 datasets to check if those could be made Tier2:
Tier3 dataset | Provider contacted | Answer | Remarks | Moved to Tier2 |
---|---|---|---|---|
APHRO-MA | Yes | N/A | looks possible based on their website | - |
AURA-TES | Yes | N/A | - | - |
CALIPSO-ICECLOUD | Yes | Yes | - | - |
CDS-SATELLITE-ALBEDO | No | - | Licenses to be checked for CDS datasets | - |
CDS-SATELLITE-LAI-FAPAR | No | - | - | - |
CDS-SATELLITE-SOIL-MOISTURE | No | - | - | - |
CDS-UERRA | No | - | - | - |
CDS-XCH4 | No | - | - | - |
CDS-XCO2 | No | - | - | - |
CERES-SYN1deg | No | - | - | - |
CLARA-AVHRR | No | - | - | - |
CLOUDSAT-L2 | No | - | - | - |
ERA-Interim | No | - | Data license should allow it | - |
ERA-Interim-Land | No | - | Data license should allow it | - |
ERA5 | No | - | Data license should allow it (done by DKRZ, CEDA and other HPCs) | - |
ESACCI-WATERVAPOUR | No | - | only preliminary version currently supported | - |
FLUXCOM | Yes | Yes | - | - |
GRACE | No | - | - | - |
HWSD | No | - | - | - |
JMA-TRANSCOM | No | - | - | - |
LAI3g | No | - | - | - |
LandFlux-EVAL | No | - | - | - |
MAC-LWP | Yes | Yes | - | - |
MERRA2 | Yes | Yes | - | - |
MERRA | No | - | Same answer as for MERRA2? | - |
MLS-AURA | Yes | Yes | - | - |
MODIS | No | - | - | - |
MTE | Yes | Yes | - | - |
NDP | No | - | - | - |
NIWA-BS | Yes | Yes | - | - |
NSIDC-0116-* | No | - | - | - |
UWisc | Yes | Yes | predecessor of MAC-LWP | - |
(in bold font: datasets that could be moved to Tier2 right away)
Attention: @rswamina. This is related to our ongoing discussion on observational datasets in ESMValTool (https://github.com/ESMValGroup/Community/discussions/70)
I was thinking about some of the redistribution implications of this. Two thoughts in particular:
There are a number of Tier 3 datasets (see status in the table above) for which we (@hb326 and I) got approval from the data providers to give our users access on shared machines (e.g. DKRZ, JASMIN, ...). At the moment, CMORized data stored on Levante are restricted to group members (i.e. core developers) if Tier 3 and readable by anyone if Tier 2. To ease data access for our users, we first need to check with data providers whether more Tier 3 datasets could be labelled Tier 2 instead. This would allow us to open up access at DKRZ and include more datasets in the synchronization pipeline between the main HPC centers where ESMValTool is developed (see #2630 for separate discussion).
Changing the Tier of a dataset may lead to various backward-incompatibility issues. Recipes using that dataset will need to be updated. Users with a local pool of CMORized data will also need to move some datasets to Tier2 on their own. One way to circumvent this could be to make the `tier` key in recipes optional. See https://github.com/ESMValGroup/ESMValCore/issues/2112 for separate discussion.

On the Tool side, a number of things will need to be done when moving a dataset from Tier3 to Tier2. Here is a tentative checklist:

- […] `tier` attribute in the metadata).
- […] `tier` key for affected datasets
- […] `OBS/Tier2` (specific to DKRZ)
- […] `OBS/Tier3` to avoid breaking recipes under development (specific to DKRZ)
- […] `OBS/Tier2` data are mirrored (specific to DKRZ)
- […] `RAWOBS/Tier2` (specific to DKRZ)
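
The DKRZ-specific steps above, and the idea of an optional `tier` key, could be sketched roughly as follows. This is only an illustration under assumptions, not an existing ESMValTool utility: it assumes CMORized data live under `<obs_root>/Tier<N>/<dataset>/`, and that leaving a symlink at the old `OBS/Tier3` location is an acceptable way to keep recipes under development working. The function names `promote_dataset` and `find_tier` are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def promote_dataset(obs_root: Path, dataset: str,
                    old_tier: int = 3, new_tier: int = 2) -> Path:
    """Move a dataset directory to the new tier and leave a symlink behind.

    Assumed layout (not confirmed by the issue): obs_root/Tier<N>/<dataset>/.
    """
    src = obs_root / f"Tier{old_tier}" / dataset
    dst = obs_root / f"Tier{new_tier}" / dataset
    # Refuse to act on a missing directory or on an already migrated one.
    if src.is_symlink() or not src.is_dir():
        raise FileNotFoundError(f"{src} is not a regular dataset directory")
    if dst.exists():
        raise FileExistsError(f"{dst} already exists")
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dst))
    # Old Tier3 path keeps resolving, so recipes still saying "tier: 3" work.
    src.symlink_to(dst.resolve(), target_is_directory=True)
    return dst


def find_tier(obs_root: Path, dataset: str, tiers=(2, 3)):
    """Hypothetical lookup for an optional tier key: first tier holding the dataset."""
    for tier in tiers:
        if (obs_root / f"Tier{tier}" / dataset).is_dir():
            return tier
    return None


if __name__ == "__main__":
    # Demo in a throwaway directory; "example.nc" is a placeholder file name.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "OBS"
        (root / "Tier3" / "MERRA2").mkdir(parents=True)
        (root / "Tier3" / "MERRA2" / "example.nc").write_text("dummy")
        print(promote_dataset(root, "MERRA2"))
        print(find_tier(root, "MERRA2"))
```

Because the symlink is left in place, both the old and the new path resolve to the same files, so recipe updates and the data move do not have to happen at the same time. The lookup searches Tier2 first so that a migrated dataset is reported under its new tier.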