I have been using the lat-lon dataset at gs://gcp-public-data-arco-era5/ar/full_37-1h-0p25deg-chunk-1.zarr-v3. It has been extremely helpful for setting up different projects. I am wondering if it would be possible to rechunk the pressure-level data. Currently, all pressure levels are in a single chunk, so if we want to subsample levels we end up fetching the entire chunk, which wastes bandwidth and can slow reads significantly. Ideally, given that this is in object storage, we could use much smaller chunks, with each chunk holding just the lat-lon grid. What do you think?
This is a lot of data, so I don't think we're going to store another duplicate version of this dataset. But there are a number of tools for rechunking the data yourself, e.g., rechunker or xarray-beam.
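To put rough numbers on the cost described in the question, here is a back-of-the-envelope sketch in plain Python. It assumes a 721 x 1440 grid (0.25 degree global), 37 levels, a time chunk of 1, and float32 values; the dtype and the example level indices are assumptions, and the figures are uncompressed sizes, so actual transfer sizes will be smaller.

```python
# Rough estimate of uncompressed bytes read per time step when selecting
# a few pressure levels, for two chunking layouts along the level axis.
# Assumptions (not confirmed by the dataset docs): float32 values,
# a 721 x 1440 lat-lon grid, 37 levels, and one time step per chunk.
N_LEVELS = 37
N_LAT, N_LON = 721, 1440
BYTES_PER_VALUE = 4  # assumed float32


def chunk_bytes(levels_per_chunk: int) -> int:
    """Uncompressed size of one chunk holding `levels_per_chunk` levels."""
    return levels_per_chunk * N_LAT * N_LON * BYTES_PER_VALUE


def bytes_fetched(level_indices: list[int], levels_per_chunk: int) -> int:
    """Bytes read to get the requested levels for a single time step.

    Every chunk that contains at least one requested level is read whole.
    Assumes levels_per_chunk divides N_LEVELS, so all chunks are full.
    """
    touched = {i // levels_per_chunk for i in level_indices}
    return len(touched) * chunk_bytes(levels_per_chunk)


wanted = [10, 20, 30]  # hypothetical level indices, for illustration only
all_levels_layout = bytes_fetched(wanted, levels_per_chunk=N_LEVELS)
per_level_layout = bytes_fetched(wanted, levels_per_chunk=1)

MIB = 1024 ** 2
print(f"all levels in one chunk: {all_levels_layout / MIB:.1f} MiB per time step")
print(f"one level per chunk:     {per_level_layout / MIB:.1f} MiB per time step")
print(f"reduction factor:        {all_levels_layout / per_level_layout:.1f}x")
```

Under these assumptions, the saving scales with the fraction of levels you actually need, which is the kind of target chunking you would aim for when rechunking your own copy with one of the tools mentioned above.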