Open peterdudfield opened 5 months ago
Processed data is currently being stored here: https://huggingface.co/datasets/openclimatefix/eumetsat-iodc, although it will eventually be collated and pushed to the GCP Public Dataset.
Do we know which years of data have been collected?
No, it's currently filling in random timesteps from the whole archive, 2017 to now. Once dagster has its external assets tracker thing, it'll be able to know what's on HF and we can fill in the dataset more systematically. It has some data from all years though.
Currently at 16,000/~175,000.
Could speed it up by dropping the Europe 15-minute data. The range is currently 2017 to 2024; we could shrink this down to 2019 to 2024.
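For a rough sense of how much a shorter range would save, here is a minimal sketch that counts timesteps between two dates. The 15-minute cadence, the 1 January endpoints, and the helper name `count_timesteps` are all assumptions for illustration, not taken from the actual pipeline.

```python
from datetime import datetime, timedelta


def count_timesteps(start: datetime, end: datetime,
                    step: timedelta = timedelta(minutes=15)) -> int:
    """Count timesteps in the half-open interval [start, end).

    Hypothetical helper: the default 15-minute step is an assumed
    cadence, so pass a different `step` if the archive differs.
    """
    if end <= start:
        return 0
    # Ceiling division so a trailing partial step still counts its start time.
    return -((start - end) // step)


if __name__ == "__main__":
    # Assumed endpoints: 1 January of the first year to 1 January of the last.
    full = count_timesteps(datetime(2017, 1, 1), datetime(2024, 1, 1))
    short = count_timesteps(datetime(2019, 1, 1), datetime(2024, 1, 1))
    print(f"2017 to 2024: {full} timesteps")
    print(f"2019 to 2024: {short} timesteps")
    print(f"Reduction: {1 - short / full:.0%}")
```

Under these assumptions, the shorter range removes a bit over a quarter of the timesteps; the real saving depends on the actual cadence and any gaps in the archive.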
It seems the vast majority of this is now available on Hugging Face: https://huggingface.co/datasets/openclimatefix/eumetsat-iodc. This might be enough to mark the task as done, @peterdudfield?
Also, for future reference: as part of the handover from Jacob, I've been instructed to include this in the dataset hosted on Google Public Datasets from now on, rather than updating the one hosted on Hugging Face. This is because GCP has faster reads for Zarr and can handle non-zipped yearly Zarr folders, as opposed to zipped Zarrs per timestep. See https://github.com/openclimatefix/Satip/issues/223
The long-range forecast for RUVNL requires extended ECMWF data (https://github.com/openclimatefix/nwp-consumer/issues/138) and a new Meteomatics archive, and it also necessitates the collection of satellite data for India.
[Part of RUVNL]