Open mgrover1 opened 3 years ago
Thanks for posting Max. Via email you mentioned that you would be ready to start on this in late summer. I think that aligns well with our timeline.
Here is the repository hosting the code that was used to zarrify the CESM-LE dataset, which is on AWS, as well as a few other NCAR-related datasets:
https://github.com/NCAR/data-zarrification
The different workflows are in the notebooks directory
We're already off to the races in #53, but for future reference just dropping this additional context document here, in case it's useful: https://hackmd.io/L_JRtOExSaKP7xg-DKeU5w
Also, here is the drafted documentation JupyterBook for this project https://ncar.github.io/cesm2-le-aws/model_documentation.html
CESM2-LE Dataset
The CESM2 Large Ensemble was generated in partnership with the IBS Center for Climate Physics in South Korea. When completed, the CESM2 Large Ensemble will consist of 100 members at 1 degree spatial resolution covering the period 1850-2100 under CMIP6 historical and SSP370 future radiative forcing scenarios. Data sets from this ensemble will be made available via the Climate Data Gateway on June 14th, 2021, with the data stored on the GLADE file system at NCAR
We are still waiting for the files to be CMOR-ized before uploading, but we can test this workflow on the non-CMORized datasets currently available.
Transformation / Alignment / Merging
These files should be combined into groups of similar ensemble members (e.g. 1-10, 11-80, etc.) and by their respective variable and frequency, then chunked in time. The CESM-LENS dataset is an example of this layout.
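As a rough illustration of the grouping described above, here is a minimal sketch in plain Python. The `HistoryFile` record, the `MEMBER_GROUPS` ranges, and the helper names are all hypothetical and only mirror the "1-10, 11-80" example; the real groupings and file metadata would come from the dataset itself.

```python
from collections import defaultdict
from typing import NamedTuple


class HistoryFile(NamedTuple):
    """Hypothetical record describing one model output file."""
    member: int      # ensemble member number, 1-based
    variable: str    # e.g. "TREFHT"
    frequency: str   # e.g. "monthly" or "daily"
    path: str


# Assumed member ranges, taken from the "1-10, 11-80" example only.
MEMBER_GROUPS = [(1, 10), (11, 80)]


def member_group(member: int) -> str:
    """Return a label such as '1-10' for the range containing `member`."""
    for first, last in MEMBER_GROUPS:
        if first <= member <= last:
            return f"{first}-{last}"
    raise ValueError(f"member {member} is not in any configured group")


def group_files(files):
    """Map (member group, frequency, variable) to paths ordered by member."""
    groups = defaultdict(list)
    for record in files:
        key = (member_group(record.member), record.frequency, record.variable)
        groups[key].append(record)
    # Sorting the records orders them by member number first.
    return {key: [r.path for r in sorted(records)] for key, records in groups.items()}
```

Each resulting group would then presumably become one output store, concatenated along the member dimension and chunked along time.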
Here is an example of the potential experiment groupings
Output Dataset
The output dataset should be in zarr format.