deepclimatenyc / seminar

Notes for the weekly seminar on deep learning for climate modeling
https://columbiauniversity.zoom.us/j/389986495

QG Model Data Online #8

Open rabernat opened 5 years ago

rabernat commented 5 years ago

As promised, I have output some datasets from our quasigeostrophic model, pyqg. I have placed them online in Google Cloud Storage in zarr format. To access the data, you need the following Python packages installed:

conda install -c conda-forge xarray dask zarr gcsfs

To open a dataset, do something like this:

import xarray as xr
import gcsfs
# if you get an error the first time, just run the next line again
# this is a weird gcsfs bug (https://github.com/dask/gcsfs/issues/117)
ds = xr.open_zarr(gcsfs.GCSMap('pangeo-data/pyqg/barotropic/beta_00.zarr'))
ds

This will "lazily" load a dataset with the following structure:

<xarray.Dataset>
Dimensions:  (time: 200, x: 512, y: 512)
Coordinates:
  * time     (time) float64 0.2 0.4 0.6 0.8 1.0 1.2 ... 39.2 39.4 39.6 39.8 40.0
  * x        (x) float64 0.006136 0.01841 0.03068 0.04295 ... 6.253 6.265 6.277
  * y        (y) float64 0.006136 0.01841 0.03068 0.04295 ... 6.253 6.265 6.277
Data variables:
    psi      (time, y, x) float32 dask.array<shape=(200, 512, 512), chunksize=(10, 512, 512)>
    q        (time, y, x) float32 dask.array<shape=(200, 512, 512), chunksize=(10, 512, 512)>
    u        (time, y, x) float32 dask.array<shape=(200, 512, 512), chunksize=(10, 512, 512)>
    v        (time, y, x) float32 dask.array<shape=(200, 512, 512), chunksize=(10, 512, 512)>

The data is downloaded only when it's actually needed, e.g. for plotting or feeding to ML training.
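To make the lazy behavior concrete, here is a minimal sketch that runs without network access. It assumes a small synthetic dask-backed dataset as a stand-in for the real store (same variable name `q` and time chunking, but a smaller grid and all zeros), and `load_snapshot` is a hypothetical helper name, not part of xarray or pyqg.

```python
import dask.array as da
import xarray as xr

# Synthetic stand-in for the real zarr store: same dimension order and
# time chunking as shown above, but a 64x64 grid filled with zeros.
q = da.zeros((200, 64, 64), chunks=(10, 64, 64), dtype="float32")
ds = xr.Dataset({"q": (("time", "y", "x"), q)})


def load_snapshot(dataset, variable, time_index):
    """Select one time step lazily, then compute only that slice.

    Returns (lazy, loaded): `lazy` is still backed by a dask array,
    `loaded` holds the values in memory as a NumPy array.
    """
    lazy = dataset[variable].isel(time=time_index)  # no data read yet
    loaded = lazy.compute()  # triggers the read for this slice only
    return lazy, loaded


lazy, loaded = load_snapshot(ds, "q", 3)
print(type(lazy.data).__name__, type(loaded.data).__name__)
```

With the real dataset the same pattern should apply: selecting with `isel` stays lazy, and only the chunks covering the requested slice are fetched when `compute()` (or a plot call) needs the values.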

There are 10 different datasets, beta_00.zarr through beta_09.zarr, corresponding to different values of the beta parameter.
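If you want to loop over all ten runs, a sketch like the following builds the store paths. It assumes the other runs live next to beta_00.zarr under the same bucket prefix, which I have only inferred from the naming; `beta_store_paths` is a hypothetical helper name.

```python
# Prefix taken from the beta_00.zarr path used earlier in this thread.
BUCKET_PREFIX = "pangeo-data/pyqg/barotropic"


def beta_store_paths(n_runs=10):
    """Return the zarr store paths beta_00.zarr ... beta_{n_runs-1}.zarr."""
    return [f"{BUCKET_PREFIX}/beta_{i:02d}.zarr" for i in range(n_runs)]


paths = beta_store_paths()
print(paths[0])
print(paths[-1])

# Opening them follows the same call shown earlier (needs network access):
# import gcsfs
# import xarray as xr
# datasets = [xr.open_zarr(gcsfs.GCSMap(p)) for p in paths]
```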

I have several ideas for ML projects with this data, including:

rabernat commented 5 years ago

Here's what I put together so far:

https://gist.github.com/rabernat/0e0f8e1faec22106dd450f3d74b58654

rabernat commented 5 years ago

You will also need xbatcher

https://xbatcher.readthedocs.io/en/latest/?badge=latest

pip install git+https://github.com/rabernat/xbatcher.git