openclimatefix / graph_weather

PyTorch implementation of Ryan Keisler's 2022 "Forecasting Global Weather with Graph Neural Networks" paper (https://arxiv.org/abs/2202.07575)
MIT License
199 stars, 51 forks

Perform Data Assimilation with model #6

Open jacobbieker opened 2 years ago

jacobbieker commented 2 years ago

Detailed Description

One of the most compute-intensive steps in NWP is data assimilation, which creates the starting conditions for running the simulation. This type of model could theoretically speed that up if it can integrate lots of observations and essentially interpolate them onto the regular grid used for NWP initialization.
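To make "interpolate observations to the regular grid" concrete, here is a minimal sketch of a classical baseline, inverse-distance weighting, in plain Python. It is not the graph network proposed in this issue and not part of the graph_weather API; the function names (`great_circle_km`, `idw_to_grid`), the `(lat, lon, value)` observation format, and the sample values are all assumptions made for illustration.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))


def idw_to_grid(observations, grid_lats, grid_lons, power=2.0):
    """Interpolate scattered (lat, lon, value) observations onto a regular grid.

    Returns a nested list where grid[i][j] is the estimate at
    (grid_lats[i], grid_lons[j]).
    """
    if not observations:
        raise ValueError("need at least one observation")
    grid = []
    for lat in grid_lats:
        row = []
        for lon in grid_lons:
            num = den = 0.0
            exact = None
            for olat, olon, val in observations:
                d = great_circle_km(lat, lon, olat, olon)
                if d < 1e-9:
                    # Grid point coincides with an observation: use it directly.
                    exact = val
                    break
                w = 1.0 / d ** power
                num += w * val
                den += w
            row.append(exact if exact is not None else num / den)
        grid.append(row)
    return grid


# Hypothetical temperature observations in kelvin (made-up values).
obs = [(50.0, 0.0, 280.0), (52.0, 2.0, 284.0)]
grid = idw_to_grid(obs, [50.0, 51.0, 52.0], [0.0, 1.0, 2.0])
```

A learned model would replace the fixed `1 / d ** power` weights with something that depends on the observation type and the surrounding state, but the input and output shapes would look much the same.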

Context

Possible Implementation

NOAA makes the raw observations public, as well as the analysis files that the NWPs are started from.

JackKelly commented 2 years ago

Sounds good!

I'm really out of my depth here but I'm dimly aware that this might be quite an active area of research in academia. If you haven't done so already, it might be worth doing a quick literature search on the topic of using ML for data assimilation. My understanding is that this is currently unsolved, although I should emphasise that I'm really out of my depth!

JackKelly commented 2 years ago

Although I think the reason it's largely unsolved is perhaps that NWP models absolutely require the output of data assimilation to be physically coherent (e.g. to strictly observe conservation of energy and water over time). So it's really hard to use ML for data assimilation if you're feeding its output into an NWP model: the NWP model will blow up if the output from the ML data assimilation model isn't strictly physically coherent, and ML models are notoriously bad at strictly adhering to physical laws. But if you're feeding the output of the data assimilation into another ML model, then maybe it'll work, because the "forecasting" ML model doesn't need its input data to be strictly physically coherent.
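As a rough illustration of what checking "physical coherence" could look like, the sketch below compares the area-weighted global mean of a gridded field before and after a model step. It is only a crude diagnostic under stated assumptions: a regular latitude/longitude grid, cos(latitude) area weights, and a single 2D field. Real conservation of energy or water needs vertical integration and mass weighting. The names `area_weighted_mean` and `relative_drift` and the sample values are hypothetical.

```python
import math


def area_weighted_mean(field, lats):
    """Global mean of field[i][j] on a regular lat/lon grid.

    Each row i sits at latitude lats[i] (degrees) and is weighted by
    cos(latitude), which is proportional to grid-cell area.
    """
    total = weight_sum = 0.0
    for row, lat in zip(field, lats):
        w = math.cos(math.radians(lat))
        for v in row:
            total += w * v
            weight_sum += w
    return total / weight_sum


def relative_drift(before, after, lats):
    """Relative change in the area-weighted mean between two states."""
    m0 = area_weighted_mean(before, lats)
    m1 = area_weighted_mean(after, lats)
    return abs(m1 - m0) / abs(m0)


# Made-up 3x2 fields (e.g. column water) before and after one step.
lats = [-60.0, 0.0, 60.0]
before = [[10.0, 10.0], [20.0, 20.0], [10.0, 10.0]]
after = [[10.0, 10.0], [20.2, 20.2], [10.0, 10.0]]
drift = relative_drift(before, after, lats)
```

A small drift here does not prove the state is physically consistent, but a large one is a cheap early warning that the output would likely upset an NWP model.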

jacobbieker commented 2 years ago

Yeah, a literature search will be my plan before I start on it! This issue is fairly low on the priority list; I definitely want to get forecasting the next steps of the NWPs working first. And yeah, it makes sense that it might not work as assimilation for NWPs, but I am hoping that we could then feed it into the forecasting network and get reasonable forecasts that way, much faster than assimilation currently takes.

jacobbieker commented 2 years ago

From the ECMWF Conference, there are apparently a lot more issues involved than just taking the observations and doing assimilation. One example approach is here: https://github.com/DL-WG/LatentAssimilation, but we might need to be more careful or add more physical constraints. Data assimilation also apparently doesn't currently use a lot of the data available, especially from satellite observations, so adding those could be quite useful.