google-research / arco-era5

Recipes for reproducing Analysis-Ready & Cloud Optimized (ARCO) ERA5 datasets.
https://cloud.google.com/storage/docs/public-datasets/era5
Apache License 2.0

Switch ARCO-ERA5 to update daily with a ~5 day delay #82

Open shoyer opened 2 months ago

shoyer commented 2 months ago

Currently, ARCO-ERA5 is updated monthly with a 3-month delay. This tracks the release of the "final" version of ERA5, which becomes available with a delay of ~2 months.

However, there is also a preliminary version of ERA5 called ERA5T, which is updated daily with a ~5 day delay. In theory, ERA5T could look different from the final version of ERA5, but according to ECMWF this has only happened once.

In practice, I think we could safely switch ARCO-ERA5 to update much more frequently, which would be quite useful for downstream operations. Let's look into this after solving the current race condition involved in resizing datasets: https://github.com/google-research/arco-era5/issues/81
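
For a rough sense of what the switch would buy, here is a minimal sketch that compares the newest date available under each schedule. The function names and the exact interpretation of the delays are assumptions made for illustration (in particular, that during month M the newest complete month is M minus 3); they are not taken from the actual ARCO-ERA5 pipeline.

```python
from datetime import date, timedelta


def latest_daily(today: date, delay_days: int = 5) -> date:
    """Newest day available if data lands daily, `delay_days` behind (assumed)."""
    return today - timedelta(days=delay_days)


def latest_monthly(today: date, delay_months: int = 3) -> date:
    """Last day of the newest complete month under a monthly update.

    Assumption: during month M, the newest complete month is M - delay_months.
    """
    # Count months on a single axis so that year rollovers are handled.
    target = today.year * 12 + (today.month - 1) - delay_months
    # The last day of the target month is the day before the next month starts.
    next_year, next_month0 = divmod(target + 1, 12)
    return date(next_year, next_month0 + 1, 1) - timedelta(days=1)


if __name__ == "__main__":
    today = date(2024, 2, 15)  # fixed date, so the example is reproducible
    daily = latest_daily(today)
    monthly = latest_monthly(today)
    print("daily schedule reaches:  ", daily)
    print("monthly schedule reaches:", monthly)
    print("extra days of data:      ", (daily - monthly).days)
```

Under these assumptions, a daily ERA5T-based schedule would make roughly two to three additional months of data available at any given time.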

dgilford commented 2 months ago

Quick note to check my own understanding. Which specific files are updated monthly with that delay?

shoyer commented 2 months ago

I think all of the Zarr files listed in the README should be updated with that cadence.

sahildabhi0101 commented 1 month ago

@shoyer @dgilford One small correction: all of the Zarr files mentioned in README.md, except 'gs://gcp-public-data-arco-era5/ar/model-level-1h-0p25deg.zarr-v1', are updated on a monthly cadence.