pangeo-forge / staged-recipes

A place to submit pangeo-forge recipes before they become fully fledged pangeo-forge feedstocks
https://pangeo-forge.readthedocs.io/en/latest/
Apache License 2.0

Example pipeline for CM2.6 #2

Open rabernat opened 4 years ago

rabernat commented 4 years ago
## Source Dataset

CM2.6 is a high-resolution global climate model run by GFDL. There are two scenarios: a preindustrial control and a 1% CO2 increase. We already have some CM2.6 data in Google Cloud, which I created manually: https://catalog.pangeo.io/browse/master/ocean/GFDL_CM2_6/

## Transformation / Alignment / Merging

In general, we want to concatenate the files in time. However, different variables in different files have different time resolutions (monthly, 5-day, daily).
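
As a rough sketch of the first step, the files could be grouped by time resolution before any concatenation. The file names and the frequency tokens below are hypothetical; the real CM2.6 naming scheme may differ.

```python
from collections import defaultdict

# Hypothetical file names, for illustration only.
FILES = [
    "ocean.0182.monthly.nc",
    "ocean.0181.monthly.nc",
    "ocean_surface.0181.daily.nc",
    "ocean_budgets.0181.5day.nc",
]

# Assumed frequency tokens, matching the resolutions mentioned above.
KNOWN_FREQUENCIES = ("monthly", "5day", "daily")


def group_by_frequency(paths):
    """Group file paths by the frequency token found in their name."""
    groups = defaultdict(list)
    for path in paths:
        tokens = path.split(".")
        # Fall back to "unknown" when no recognised token is present.
        freq = next((t for t in tokens if t in KNOWN_FREQUENCIES), "unknown")
        groups[freq].append(path)
    # Sort each group so files are in a stable order for concatenation.
    return {freq: sorted(members) for freq, members in groups.items()}


groups = group_by_frequency(FILES)
```

Each group would then be concatenated in time on its own, so monthly, 5-day, and daily data never end up on the same time axis.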

Getting the files to concatenate cleanly required some manual tweaks (dropping variables and overwriting coordinates). There are weird glitches and inconsistencies between different files from the same output set. Some workflows are documented in this repo.
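
To illustrate the kind of tweaks involved, here is a minimal sketch using xarray on synthetic data. It is not the actual workflow: the variable name `glitchy_diag`, the coordinate name `xt`, and the tiny coordinate mismatch are all made up to stand in for the inconsistencies described above.

```python
import numpy as np
import xarray as xr


def make_file(start, bad_coord=False, extra_var=False):
    """Build a small synthetic dataset standing in for one output file."""
    time = np.arange(start, start + 3)
    xt = np.array([0.0, 1.0, 2.0, 3.0])
    if bad_coord:
        xt = xt + 1e-6  # simulated coordinate glitch between files
    data_vars = {"temp": (("time", "xt"), np.random.rand(3, 4))}
    if extra_var:
        # Simulated variable that only appears in some files.
        data_vars["glitchy_diag"] = (("time",), np.zeros(3))
    return xr.Dataset(data_vars, coords={"time": time, "xt": xt})


datasets = [make_file(0), make_file(3, bad_coord=True, extra_var=True)]
reference = datasets[0]


def preprocess(ds):
    """Drop the inconsistent variable and overwrite the spatial coordinate."""
    ds = ds.drop_vars("glitchy_diag", errors="ignore")
    # Force every file onto the reference coordinate values.
    return ds.assign_coords(xt=reference["xt"].values)


combined = xr.concat([preprocess(ds) for ds in datasets], dim="time")
```

Without the `preprocess` step, the mismatched `xt` values would be treated as distinct points and the concatenated result would be padded with NaNs.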

## Output Dataset

I think we would like one Zarr dataset for each set of variables that share the same grid and temporal resolution, chunked in time. For 3D data, we also need to chunk in space; the vertical dimension probably makes the most sense.
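
A small sketch of how such a chunking plan could be computed. The dimension names (`time`, `depth`, `yt`, `xt`), the grid sizes, the 4-byte item size, and the 100 MiB target are all assumptions for illustration, not settings taken from the existing datasets.

```python
def plan_chunks(dims, itemsize=4, target_bytes=100 * 2**20):
    """Pick chunk sizes: full horizontal slabs, one vertical level, N time steps.

    dims maps dimension name -> length and must contain "time".
    """
    chunks = dict(dims)
    if "depth" in chunks:
        # Chunk 3D data in space along the vertical dimension.
        chunks["depth"] = 1
    # Bytes needed for one time step of one chunk.
    per_step = itemsize
    for name, size in chunks.items():
        if name != "time":
            per_step *= size
    # As many time steps as fit in the target, at least 1, at most all.
    chunks["time"] = max(1, min(dims["time"], target_bytes // per_step))
    return chunks


# Assumed grid sizes, for illustration only.
chunks_2d = plan_chunks({"time": 240, "yt": 2700, "xt": 3600})
chunks_3d = plan_chunks({"time": 240, "depth": 50, "yt": 2700, "xt": 3600})
```

The resulting mapping has the shape that xarray's `Dataset.chunk` accepts, so it could be applied before writing with `Dataset.to_zarr`.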

rabernat commented 3 years ago

This probably requires https://github.com/pangeo-forge/pangeo-forge/issues/93.