openclimatefix / satflow

Satellite Optical Flow with machine learning models
https://satflow.readthedocs.io/en/stable/
MIT License

Pretrain on data from Weather4cast #100

Open jacobbieker opened 2 years ago

jacobbieker commented 2 years ago

Detailed Description

Weather4cast (https://www.iarai.ac.at/weather4cast/, on GitHub here: https://github.com/iarai/weather4cast) has some nice weather data from around the world, including cloud masks, all at 4 km resolution, so similar to what EUMETSAT gives. This could be useful for pretraining any of the models before fine-tuning on our specific data.

Context

Pretraining has been shown to help large models quite a bit, so it might help here too.

Interestingly, they use MSE as the loss function for all of it, even though it's a video prediction task and they want the next 8 hours of data predicted. So maybe MSE isn't the worst?
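For reference, here is a minimal sketch of what an MSE loss over a predicted frame sequence looks like. It is not taken from the Weather4cast code. The 32-frame horizon assumes 15-minute imagery over the 8-hour window, and the `(time, channels, height, width)` layout and the function names are illustrative assumptions.

```python
import numpy as np

# Assumption: 8 hours of 15-minute frames gives 32 target frames.
FORECAST_HOURS = 8
MINUTES_PER_FRAME = 15
NUM_TARGET_FRAMES = FORECAST_HOURS * 60 // MINUTES_PER_FRAME


def sequence_mse(pred, target):
    """Mean squared error averaged over every frame, channel and pixel."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if pred.shape != target.shape:
        raise ValueError(f"shape mismatch: {pred.shape} vs {target.shape}")
    return float(np.mean((pred - target) ** 2))


def per_frame_mse(pred, target):
    """MSE per lead time; returns an array of shape (time,)."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if pred.shape != target.shape:
        raise ValueError(f"shape mismatch: {pred.shape} vs {target.shape}")
    diff = pred - target
    # Average over every axis except the leading time axis.
    return np.mean(diff ** 2, axis=tuple(range(1, diff.ndim)))


# Toy example: a constant offset of 0.5 on every pixel.
target = np.zeros((NUM_TARGET_FRAMES, 1, 4, 4))
pred = target + 0.5
overall = sequence_mse(pred, target)
by_lead_time = per_frame_mse(pred, target)
```

The per-frame variant is useful for seeing how the error grows with lead time, which a single averaged number hides.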

Possible Implementation

Download the data, and run some models on it.

jacobbieker commented 2 years ago

Actually, this is EUMETSAT data! Including some of the optimum cloud masks, and I am assuming the 15-minute full disk images. So for transfer learning this would probably be quite useful?

jacobbieker commented 2 years ago

Yeah, that sounds great!

On Tue, Mar 8, 2022, 9:53 AM codeastra2 wrote:

If no one is working on this, I would like to take this as a first issue!


codeastra2 commented 2 years ago

So to confirm: the steps in https://github.com/iarai/weather4cast/blob/master/utils/1.%20Onboarding.ipynb are what needs to be done, and then the h5 files need to be uploaded? If so, where should they be uploaded?

jacobbieker commented 2 years ago

Yeah, that would be great! Ideally, they could be uploaded to HuggingFace Datasets so they are easily available for anyone to use.
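A rough sketch of how the upload could go, assuming the `huggingface_hub` package is installed and you are already logged in. The helper names, the local directory and the `repo_id` are all hypothetical placeholders, not an agreed location.

```python
from pathlib import Path


def plan_uploads(local_dir, suffix=".h5"):
    """Return sorted (local_path, path_in_repo) pairs for every matching file."""
    root = Path(local_dir)
    return [
        (path, path.relative_to(root).as_posix())
        for path in sorted(root.rglob(f"*{suffix}"))
        if path.is_file()
    ]


def upload_h5_files(local_dir, repo_id):
    """Upload every .h5 file under local_dir to a Hugging Face dataset repo."""
    # Imported here so plan_uploads works without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi()
    # Create the dataset repo if it does not exist yet.
    api.create_repo(repo_id=repo_id, repo_type="dataset", exist_ok=True)
    for local_path, path_in_repo in plan_uploads(local_dir):
        api.upload_file(
            path_or_fileobj=str(local_path),
            path_in_repo=path_in_repo,
            repo_id=repo_id,
            repo_type="dataset",
        )


# Hypothetical usage; the directory and repo_id are placeholders:
# upload_h5_files("data/weather4cast", "your-org/weather4cast")
```

Keeping the directory layout as the path inside the repo means the files can be found the same way after download.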