openclimatefix / nwp

Tools for downloading and processing numerical weather predictions
MIT License

Add archiving script for ICON Model #17

Closed (jacobbieker closed 1 year ago)

jacobbieker commented 1 year ago

DWD's ICON model doesn't seem to have an archive that I can find. We can get the live forecast, but would need to archive it ourselves. If we want to train on it in the future, then it seems good to start trying to archive it as soon as we can.

Detailed Description

The plan would be to store it on Hugging Face in Zarr format, so that it's accessible to all (and free to store). While the forecast goes out to 120 hours, to cut down on size we would only archive the first 72 hours, which have an hourly resolution.
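The 72-hour cutoff could be applied as a simple filter on forecast lead times before anything is written out. This is only a sketch: `steps_to_archive` is a hypothetical helper, and the step layout in the example (hourly to 72 hours, then 3-hourly to 120 hours) is an assumption for illustration, not something confirmed in this thread.

```python
from datetime import timedelta

# Cutoff taken from the plan above: only the first 72 hours are archived.
MAX_LEAD = timedelta(hours=72)


def steps_to_archive(available_steps, max_lead=MAX_LEAD):
    """Return the forecast lead times that fall inside the archive window."""
    return [step for step in available_steps if step <= max_lead]


# Assumed step layout (illustrative only): hourly out to 72 h,
# then 3-hourly out to 120 h.
available = [timedelta(hours=h) for h in range(0, 73)]
available += [timedelta(hours=h) for h in range(75, 121, 3)]

kept = steps_to_archive(available)
print(f"{len(kept)} of {len(available)} steps fall inside the archive window")
```

The same predicate would carry over to whatever array library ends up doing the Zarr writing, since it only depends on the lead time of each step.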

Context

We want to try ICON for Europe and global forecasts since it is supposedly better than GFS, is also free to use, and has a higher temporal and spatial resolution.

Possible Implementation

Base the script off of https://github.com/guidocioni/icon_forecasts

And for Global ICON: https://github.com/guidocioni/icon_globe

@devsjc any thoughts on how to do this? Would want to start it as soon as possible. I have other scripts running, like the one backing up the CAMS forecast, just to archive until things can be made more formal.

devsjc commented 1 year ago

Hi Jacob, sorry only just getting round to my mentions!

I can see that there are the beginnings of both a global and an EU dataset on Hugging Face (https://huggingface.co/datasets/openclimatefix/dwd-icon-eu/tree/main). How did you end up implementing this?

jacobbieker commented 1 year ago

Hi,

I did end up using the repos linked above to make the one here, which runs as a cron job on both donatello and leonardo so that each run gets synced up to Hugging Face. It sometimes seems to miss a run, but it's been pretty consistent so far.
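One way to spot the runs that get missed would be to compare the initialisation times already in the archive against the ones expected for a given window. A minimal sketch, assuming a fixed run cadence (the 6-hour default here is an assumption, and `expected_runs` and `missing_runs` are hypothetical helpers, not part of the actual script):

```python
from datetime import datetime, timedelta, timezone


def expected_runs(start, end, interval_hours=6):
    """List every model initialisation time in [start, end] at a fixed cadence."""
    runs = []
    current = start
    while current <= end:
        runs.append(current)
        current += timedelta(hours=interval_hours)
    return runs


def missing_runs(archived, start, end, interval_hours=6):
    """Return the expected initialisation times that are not in the archive."""
    archived_set = set(archived)
    return [
        run
        for run in expected_runs(start, end, interval_hours)
        if run not in archived_set
    ]


# Illustrative data only: one day of runs with the 12:00 UTC run absent.
start = datetime(2023, 1, 1, 0, tzinfo=timezone.utc)
end = datetime(2023, 1, 2, 0, tzinfo=timezone.utc)
archived = [
    datetime(2023, 1, 1, 0, tzinfo=timezone.utc),
    datetime(2023, 1, 1, 6, tzinfo=timezone.utc),
    datetime(2023, 1, 1, 18, tzinfo=timezone.utc),
    datetime(2023, 1, 2, 0, tzinfo=timezone.utc),
]

gaps = missing_runs(archived, start, end)
for run in gaps:
    print("missing run:", run.isoformat())
```

The list of archived times would have to come from wherever the uploads are tracked, for example the file listing of the dataset repo.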

jacobbieker commented 1 year ago

Although it does still sometimes fail: for some reason it hasn't updated in the last day, so a more robust way of doing it would probably be better.
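One possible direction for making it more robust is to wrap the download and upload steps in a retry with exponential backoff, so a transient failure doesn't lose a whole run. This is just a sketch of that general technique, not what the cron job currently does; `run_with_retries` and `flaky_upload` are hypothetical names, and the failure is simulated.

```python
import time


def run_with_retries(task, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call task() until it succeeds, doubling the wait after each failure."""
    last_error = None
    for attempt in range(attempts):
        try:
            return task()
        except Exception as error:  # broad on purpose: any failure triggers a retry
            last_error = error
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"task failed after {attempts} attempts") from last_error


# Simulated flaky step: fails twice, then succeeds.
calls = {"count": 0}


def flaky_upload():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated transient failure")
    return "uploaded"


# Record the delays instead of actually sleeping, to keep the example fast.
delays = []
result = run_with_retries(flaky_upload, attempts=4, base_delay=1.0, sleep=delays.append)
print(result, "after", calls["count"], "attempts; waits were", delays)
```

Retries only cover transient errors within a single cron invocation; a run that is missed entirely would still need some kind of backfill check.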