openclimatefix / graph_weather

PyTorch implementation of Ryan Keisler's 2022 "Forecasting Global Weather with Graph Neural Networks" paper (https://arxiv.org/abs/2202.07575)
MIT License

[Paper] FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators #17

Open · JackKelly opened this issue 2 years ago

JackKelly commented 2 years ago

Another interesting-looking paper!

https://arxiv.org/abs/2202.11214

To quote the abstract:

FourCastNet, short for Fourier Forecasting Neural Network, is a global data-driven weather forecasting model that provides accurate short to medium-range global predictions at 0.25° resolution. FourCastNet accurately forecasts high-resolution, fast-timescale variables such as the surface wind speed, precipitation, and atmospheric water vapor. It has important implications for planning wind energy resources, predicting extreme weather events such as tropical cyclones, extra-tropical cyclones, and atmospheric rivers. FourCastNet matches the forecasting accuracy of the ECMWF Integrated Forecasting System (IFS), a state-of-the-art Numerical Weather Prediction (NWP) model, at short lead times for large-scale variables, while outperforming IFS for variables with complex fine-scale structure, including precipitation. FourCastNet generates a week-long forecast in less than 2 seconds, orders of magnitude faster than IFS. The speed of FourCastNet enables the creation of rapid and inexpensive large-ensemble forecasts with thousands of ensemble-members for improving probabilistic forecasting. We discuss how data-driven deep learning models such as FourCastNet are a valuable addition to the meteorology toolkit to aid and augment NWP models.
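The architectural core named in the title is the Adaptive Fourier Neural Operator (AFNO): a ViT-style backbone whose spatial token mixing is done in Fourier space. A rough PyTorch sketch of that spectral-mixing idea (my own simplification, not the authors' implementation; the real AFNO adds block-diagonal channel MLPs and soft-thresholding of the frequency modes) might look like:

```python
import torch
import torch.nn as nn


class SpectralMixer(nn.Module):
    """Toy sketch of AFNO-style token mixing: FFT over the spatial grid,
    a learned per-frequency complex weighting, then inverse FFT."""

    def __init__(self, channels: int, height: int, width: int):
        super().__init__()
        # One complex weight (stored as 2 reals) per channel and retained frequency mode
        self.weight = nn.Parameter(
            torch.randn(channels, height, width // 2 + 1, 2) * 0.02
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        x_freq = torch.fft.rfft2(x, norm="ortho")       # complex spectrum
        w = torch.view_as_complex(self.weight)          # (C, H, W//2 + 1)
        x_freq = x_freq * w                             # per-mode mixing
        return torch.fft.irfft2(x_freq, s=x.shape[-2:], norm="ortho")


if __name__ == "__main__":
    mixer = SpectralMixer(channels=8, height=64, width=128)
    out = mixer(torch.randn(2, 8, 64, 128))
    print(out.shape)  # torch.Size([2, 8, 64, 128])
```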

jacobbieker commented 2 years ago

Just skim-read this paper, and it seems really relevant, with some of the same insights as the graph paper: finer-grained data and more variables let the model learn more of the physics and give better results. The code doesn't seem to be public, unfortunately, but the building blocks of the model seem to be. ERA5 seems to be the dataset most of these papers tend to use, since it gives more years of reanalysis and better temporal resolution, although both this paper and the graph paper only select every 3rd or 6th hour of data.
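For what it's worth, that temporal subsampling is a one-liner with xarray; a minimal sketch (the Zarr path is hypothetical, and it assumes strictly hourly ERA5 data starting at 00 UTC):

```python
import xarray as xr

# Hypothetical Zarr store; ERA5 is natively hourly at 0.25 degrees.
ds = xr.open_zarr("era5.zarr")

# Keep every 6th timestep (00/06/12/18 UTC if the record starts at 00 UTC),
# matching the 6-hourly subsampling these papers use.
ds_6h = ds.isel(time=slice(0, None, 6))

# A 3-hourly subset would instead be:
ds_3h = ds.isel(time=slice(0, None, 3))
```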

Adding precipitation as a separate model, trained on the output of the other variables, seems like a neat way to deal with the problem that most precipitation values are zero. They also used a log scale to help with that issue.
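A minimal sketch of that kind of log transform for total precipitation (the epsilon value here is an assumption for illustration, not taken from the paper):

```python
import torch

EPS = 1e-5  # small constant to keep the log well behaved; value is an assumption


def log_transform_precip(tp: torch.Tensor) -> torch.Tensor:
    """Compress the heavy zero/near-zero mass of total precipitation
    before training a diagnostic precipitation model."""
    return torch.log1p(tp / EPS)


def inverse_transform_precip(tp_log: torch.Tensor) -> torch.Tensor:
    """Map model outputs back to physical precipitation units."""
    return EPS * torch.expm1(tp_log)
```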

The large-ensemble forecasts track exactly with what we are hoping to do with this model, and I think that validates the idea here too.
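As a rough illustration of the perturbed-initial-condition ensemble idea (the model handle, noise scale, and rollout length below are placeholders, not the paper's setup):

```python
import torch


def run_ensemble(model, initial_state, n_members=50, noise_std=1e-3, steps=4):
    """Toy sketch: perturb the initial condition with Gaussian noise and
    roll each member forward autoregressively, then summarise the spread."""
    members = []
    for _ in range(n_members):
        state = initial_state + noise_std * torch.randn_like(initial_state)
        for _ in range(steps):
            with torch.no_grad():
                state = model(state)
        members.append(state)
    stacked = torch.stack(members)          # (n_members, ...)
    return stacked.mean(dim=0), stacked.std(dim=0)
```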

Overall, really cool paper! It might be worth trying to reimplement it and see how it compares to the graph model; this model seems slightly slower, and potentially larger? In either case, we have 6-hourly analysis files we could train on, which are at 0.25 degrees and have around 288 variables, so much larger than the inputs to either of these models, as well as forecasts which have fewer variables but more timesteps.

MunumButt commented 2 years ago

This paper has a preliminary release of code here: https://github.com/NVlabs/FourCastNet

Personally I find this paper much less relevant than the authors claim it to be, largely due to:

  1. Massive computational requirements, making it near impossible to verify the results: the authors report a 16-hour training time on a cluster of 64 A100s. The only way to get access to that kind of hardware would be some kind of special project allocation at an HPC centre, and why would anybody spend precious HPC time trying to get code that is (in its current form) poorly maintained to run?

  2. Relatively small out-of-sample testing period. The authors claim:

"The training dataset consists of data from the year 1979 to 2015 (both included). The validation dataset contains data from the years 2016 and 2017. The out-of-sample testing dataset consists of the years 2018 and beyond."

As far as I can see, it is not made explicit exactly which years are included, but I would assume through 2021 inclusive, as that is the last complete year. In my opinion, that is too small a period to make any kind of generalised performance claim.

  3. Panels showing FourCastNet prediction maps vs. truth, but no mention of the maps the IFS produces for the same lead times. Yes, there are the ACC and RMSE graphs (standard definitions of those metrics are sketched below), but those are generalised means and fail to show localised effects. To account for this, the authors do define a metric called the "relative quantile error", on which the IFS performs considerably better, but it seems odd that they did not include the actual forecast maps and instead described the comparison in writing. Going back to the difficulty of verifying the research, we simply have to accept what the researchers claim about the relative performance.
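For reference, the latitude-weighted RMSE and ACC that those graphs report can be computed along these lines (a sketch of the conventional formulas, not the paper's code; the relative quantile error is not reproduced here):

```python
import numpy as np


def lat_weights(lat_deg: np.ndarray) -> np.ndarray:
    """cos(latitude) weights normalised to mean 1, as used for
    latitude-weighted global verification metrics."""
    w = np.cos(np.deg2rad(lat_deg))
    return w / w.mean()


def weighted_rmse(pred, truth, lat_deg) -> float:
    # pred, truth: (n_lat, n_lon) fields for one variable and lead time
    w = lat_weights(lat_deg)[:, None]
    return float(np.sqrt((w * (pred - truth) ** 2).mean()))


def weighted_acc(pred, truth, climatology, lat_deg) -> float:
    # Anomaly correlation coefficient against a climatological mean field
    w = lat_weights(lat_deg)[:, None]
    pa, ta = pred - climatology, truth - climatology
    return float((w * pa * ta).sum()
                 / np.sqrt((w * pa ** 2).sum() * (w * ta ** 2).sum()))
```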

I am not entirely against research that utilises computational resources beyond the means of most researchers in the field, but I am against tiny out-of-sample validation periods, biased performance reporting, and the lack of a proper public code release!