Open rbeucher opened 1 year ago
Hi @rbeucher,
Thanks for your question. Indeed, creating a local pool of ERA5 data is not the easiest part when deploying ESMValTool on a shared machine. Since ESMValTool does not contain a downloader for ERA5 data, one needs to use an external program to get the data. At DKRZ, the pool of ERA5 data (/work/bd0854/DATA/ESMValTool2/RAWOBS/Tier3/ERA5/v1/) has been created using both cdsapi and era5cli, afaik. Since I started to maintain this pool, I'm only using era5cli. I don't think data are reformatted in the download. The key to enable usage of ERA5 data in the tool (via the so-called on-the-fly CMORization) is to put the downloaded file (e.g. era5_v_component_of_wind_1990_monthly.nc) into the right directory tree (e.g. /work/bd0854/DATA/ESMValTool2/RAWOBS/Tier3/ERA5/v1/mon/va). There is no file to do the mapping between CMOR variables and ERA5 ones.
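To make the directory convention above concrete, here is a minimal Python sketch that works out where a downloaded file would go. It assumes the filename pattern era5_<era5 name>_<year>_<frequency>.nc seen in the example, and the layout Tier3/ERA5/v1/<frequency>/<short name> read off the example path. The function name `target_dir` and both lookup tables are hypothetical; only the va entry and the "monthly" to "mon" pairing come from this thread.

```python
from pathlib import Path

# Only the first entry is taken from the thread; the other two are
# illustrative assumptions and should be checked against the CMOR tables.
ERA5_TO_CMOR = {
    "v_component_of_wind": "va",
    "u_component_of_wind": "ua",
    "2m_temperature": "tas",
}

# Assumed link between the filename suffix and the frequency directory.
FREQ_DIRS = {"monthly": "mon"}


def target_dir(filename: str, pool_root: Path) -> Path:
    """Return the directory a downloaded ERA5 file should be placed in."""
    stem = Path(filename).stem  # e.g. era5_v_component_of_wind_1990_monthly
    prefix, rest = stem.split("_", 1)
    if prefix != "era5":
        raise ValueError(f"unexpected file name: {filename}")
    # The ERA5 name itself contains underscores, so split from the right.
    era5_name, _year, freq = rest.rsplit("_", 2)
    return (
        pool_root / "Tier3" / "ERA5" / "v1"
        / FREQ_DIRS[freq] / ERA5_TO_CMOR[era5_name]
    )


if __name__ == "__main__":
    root = Path("/work/bd0854/DATA/ESMValTool2/RAWOBS")
    print(target_dir("era5_v_component_of_wind_1990_monthly.nc", root))
```

Unknown variables or frequencies raise a KeyError here on purpose, so a file is never silently filed under the wrong directory.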
Some clusters, like DKRZ-Levante or CEDA-Jasmin, provide a much larger collection of ERA5 data, sometimes coming from tapes (see discussions in https://github.com/ESMValGroup/ESMValTool/discussions/2183 and https://github.com/ESMValGroup/ESMValCore/issues/1991). We could start to support such large collections in ESMValTool. But we also need to keep in mind that users working on their laptops or small clusters would never have the possibility to create a large collection of ERA5 data locally. That is why we have, up to now, maintained our own collection of ERA5 data at DKRZ. This pool contains all data required to run all recipes in ESMValTool.
Let me know if you have further doubts or questions on this.
Btw, you might want to add an entry for your institution in the config-user.yml file. This should help your users to configure ESMValTool easily when working on your systems.
Thanks @remi-kazeroni . That was my suspicion. I'll work on this and will get back to you.
R
Hi @remi-kazeroni, regarding "We could start to support such large collections in ESMValTool": how much effort do you think this would be? We really do not want to keep a second copy of the data at NCI, as we already have a centrally maintained ERA5 replica; ideally we want to be able to CMORise for ESMValTool on the fly if needed. I think NCI would also very much frown upon any additional replication of ERA5 data. Maybe it would be possible for @rbeucher to create a symlink tree with the ESMValTool-compliant directory structure pointing to the ERA5 files?
Yes. My idea is to try a symlink tree.
It is indeed better to make use of already available ERA5 data instead of replicating them. An alternative to symlink trees could be to add an entry for NCI in the config-developer.yml file for native6 to reflect the directory structure of your ERA5 data pool, if that makes sense in your case.
The problem with this is that ERA5 doesn't use CMIP vocabulary, so it's hard to map the variables. Or am I missing something?
BTW, what do you do with derived variables? Do you add a new NetCDF file to the pool after calculating the values?
Any idea where I can find a mapping between CMIP variable names and ERA5 variable names?
OK I have made some progress. The symlink tree does work and is not too hard to set up. I have mapped 90% of the variables. Still a few issues. I will document the process and share a link for reference.
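For reference, a symlink tree of this kind can be sketched in a few lines of Python. This is only an illustration under assumptions, not the setup actually used at NCI: the function name `link_into_tree` is hypothetical, the layout Tier3/ERA5/v1/<frequency>/<short name> is copied from the DKRZ example path earlier in the thread, and the link keeps the original file name, which may or may not match what ESMValTool expects on a given system.

```python
from pathlib import Path


def link_into_tree(source_file: Path, tree_root: Path,
                   frequency: str, short_name: str) -> Path:
    """Create a symlink to source_file inside an ESMValTool-style tree.

    The directory layout is an assumption based on the DKRZ example path.
    """
    link_dir = tree_root / "Tier3" / "ERA5" / "v1" / frequency / short_name
    link_dir.mkdir(parents=True, exist_ok=True)
    link = link_dir / source_file.name
    # Replace a stale link so the function can be re-run safely.
    if link.is_symlink():
        link.unlink()
    link.symlink_to(source_file.resolve())
    return link
```

Calling it once per file of the central ERA5 replica, with the CMOR short name and frequency for that file, builds the tree without copying any data.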
BTW, what do you do with derived variables? Do you add a new NetCDF file to the pool after calculating the values?
If both ERA5 variables needed for derivation are defined in CMOR tables, you could use derive: true in the recipe and derive the variables on-the-fly. Afaik, we don't store derived ERA5 variables in our DKRZ data pool.
Any idea where I can find a mapping between CMIP variable names and ERA5 variable names?
I wish I could answer that... For ERA5 variables supported in ESMValTool, you can take a look at the example recipe creating daily data in which you can see the mapping with the era5_name keys.
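As a rough sketch of how that mapping could be pulled out of a recipe, the Python function below walks an already-parsed recipe (a nested dict, e.g. as returned by a YAML loader) and collects every variable entry that carries an era5_name key. The function name `collect_era5_names` and the sample dict are hypothetical; the sample only mimics the general shape of a recipe and its values are illustrative, not copied from the real one.

```python
def collect_era5_names(node, found=None):
    """Recursively collect {short_name: era5_name} pairs from a parsed recipe."""
    if found is None:
        found = {}
    if isinstance(node, dict):
        for key, value in node.items():
            # A variable entry is any mapping that has an era5_name key.
            if isinstance(value, dict) and "era5_name" in value:
                found[key] = value["era5_name"]
            collect_era5_names(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_era5_names(item, found)
    return found


if __name__ == "__main__":
    # Illustrative stand-in for a parsed recipe, not the real file.
    sample = {
        "diagnostics": {
            "daily": {
                "variables": {
                    "tas": {"mip": "day", "era5_name": "2m_temperature"},
                    "clt": {"mip": "day", "era5_name": "total_cloud_cover"},
                }
            }
        }
    }
    print(collect_era5_names(sample))
```

Searching recursively rather than hard-coding the nesting keeps the sketch independent of the exact recipe layout.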
I will document the process and share a link for reference.
That'd be great, thanks!
Thank you @remi-kazeroni . Yes I have found the era5 recipe example very useful to do the mapping.
Hi All,
We (ACCESS-NRI) are trying to set up the ESMValTool recipes on our system at NCI. There are quite a few recipes using the OBS6 ERA5 data which, from what I understand, are cmorised ERA5 data from the native6 raw data you have available on Mistral (#2396).
What confuses me is the name of the variables used in the cmoriser recipes, which are CMIP variable names. I was wondering how the native6 raw data were organised on Mistral? @remi-kazeroni @alistairsellar How do you use ERA5 on your system?
Does native6 actually contain native ERA5 data, or is it somehow an already reformatted version of the native ERA5? Are you using symlinks to the native ERA5 files?
The ERA5 collection is huge, as you know, so ideally we would like to use what is already available at NCI:
Any help is appreciated!