pangeo-data / WeatherBench

A benchmark dataset for data-driven weather forecasting
MIT License

What's the performance level of ACNet? #34

Closed mountain closed 3 years ago

mountain commented 3 years ago

Hi @raspstephan,

I read the recent PRL paper, Enforcing Analytic Constraints in Neural Networks Emulating Physical Systems. Did you try the approach on WeatherBench? And what is the performance level of ACNet? I skimmed the paper and only found performance reported as MSE on dimensionless variables. I am interested in the performance by the standards used here, i.e., the 3-day and 5-day prediction error on t850 and z500.
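For reference, the error measure meant here can be sketched as a latitude-weighted RMSE. This is a minimal NumPy sketch, not the repository's scoring code: the function name, the array layout `(n_forecasts, n_lat, n_lon)`, and the choice to normalize the cosine weights so they average to one are assumptions made for illustration.

```python
import numpy as np

def lat_weighted_rmse(forecast, truth, lat_deg):
    """Latitude-weighted RMSE, averaged over forecasts.

    forecast, truth: arrays of shape (n_forecasts, n_lat, n_lon)
    lat_deg: latitudes in degrees, shape (n_lat,)
    (Hypothetical helper; the layout is an assumption.)
    """
    # Grid cells shrink toward the poles, so weight by cos(latitude).
    weights = np.cos(np.deg2rad(lat_deg))
    weights = weights / weights.mean()  # weights now average to 1
    sq_err = (forecast - truth) ** 2 * weights[None, :, None]
    # One RMSE per forecast, then the mean over all forecasts.
    per_forecast = np.sqrt(sq_err.mean(axis=(1, 2)))
    return per_forecast.mean()
```

Applied to z500 or t850 fields at a 3-day or 5-day lead time, this gives a single number per variable and lead time in physical units.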

Thanks in advance.

raspstephan commented 3 years ago

Hi, it's a very interesting thought to bring the two papers together. However, it's not that easy. The PRL paper requires you to write down a closed set of equations that conserve energy, moisture, and mass; only then can those constraints be enforced. In the WeatherBench setup this would probably be very difficult, but maybe not impossible.
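To make "enforce those constraints" a bit more concrete, here is a minimal sketch of one generic option: adding a penalty to the loss for violating a linear constraint of the form `C @ y = 0`. The function name, the constraint matrix `C`, and the weight `alpha` are hypothetical and chosen for illustration; this is not the code from the paper.

```python
import numpy as np

def penalized_loss(pred, target, C, alpha=0.1):
    """MSE plus a penalty on violations of the linear constraint C @ y = 0.

    pred, target: arrays of shape (n_samples, n_outputs)
    C: constraint matrix of shape (n_constraints, n_outputs)
    alpha: penalty weight (hypothetical default)
    """
    mse = np.mean((pred - target) ** 2)
    # Residual of each constraint for each sample; zero if satisfied.
    residual = pred @ C.T
    penalty = np.mean(residual ** 2)
    return mse + alpha * penalty
```

A penalty like this only encourages the constraint; it does not guarantee it holds exactly, which is one reason a closed set of equations matters so much.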

Specifically, you would need to figure out how to compute the total energy, moisture and mass of the atmospheric state from the WeatherBench variables (so static energy, kinetic energy, etc.). Most likely it won't be possible to do this exactly because the ocean and land interactions are not taken into account and we only have a coarse vertical grid. However, it might still be a good first order approximation. Then you would also have to predict all of the variables required to compute the total energy, moisture and mass. Then it would be possible to add a constraint. I don't really know how much this would help but maybe it would, particularly for iterative models. But it's a lot of hard work. Hope this helps.
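As a rough illustration of the first step, a column-integrated quantity on pressure levels can be approximated as (1/g) ∫ x dp. The sketch below uses the trapezoidal rule so that unevenly spaced levels are handled; the function name and array layout are assumptions, and with only a few levels the result is a coarse approximation.

```python
import numpy as np

G = 9.80665  # standard gravity in m s^-2

def column_integral(field, p_hpa):
    """Approximate (1/g) * integral of `field` over pressure.

    field: array of shape (..., n_levels), e.g. specific humidity in kg/kg
    p_hpa: pressure levels in hPa, shape (n_levels,), increasing
    Returns an array of shape (...), e.g. in kg m^-2 for specific humidity.
    (Hypothetical helper; the layout is an assumption.)
    """
    field = np.asarray(field, dtype=float)
    p_pa = np.asarray(p_hpa, dtype=float) * 100.0  # hPa -> Pa
    dp = np.diff(p_pa)                             # uneven layer thickness
    mid = 0.5 * (field[..., 1:] + field[..., :-1]) # layer-mean value
    return (mid * dp).sum(axis=-1) / G
```

The same helper could be applied to energy terms once they are computed per level, though the missing surface and ocean exchanges mean the budget will not close exactly.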

mountain commented 3 years ago

Hi, Stephan. Thanks for your explanation. I have two more questions.

(1) The first question is about the number of constraints. I noticed that the paper uses 'column-integrated' conservation laws. Does that mean the constraints are only considered in a coarse, integrated way? Is it possible to extend the constraints spatially and temporally, which would lead to PDE-like constraints? That would be the finest-grained way to apply the conservation laws. Even if this understanding is correct, the finest-grained approach would come with a very large computational cost.

I am also not sure how this design would affect training and generalization.

(2) I still think WeatherBench is very valuable, much like MNIST in computer vision. Maybe we can expand the discussion a little to encourage more exploration in the direction the PRL paper opens up. My background is in computer science and math, so my opinion might be wrong.

Maybe the easiest first step is to verify how well the data in the dataset satisfy the conservation laws:

a) Mass flux and the non-uniform coordinate along altitude

The detailed form of the conservation laws depends on the coordinate system. Spherical coordinates are not a very difficult problem, but the coordinate is non-uniform along altitude, which might cause problems when calculating the water flux. Meanwhile, the air mass flux does not seem to be a problem, because of the balance between the Coriolis force and the pressure gradient?

b) Energy balance between the surface and the atmosphere

The WeatherBench data offer some interesting beginner-level questions to study.
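One simple way to start on such a check might be to track the area-weighted global mean of a column-integrated quantity over time and see how much it drifts. The sketch below assumes a regular latitude-longitude grid and a `(n_time, n_lat, n_lon)` layout; both function names are hypothetical.

```python
import numpy as np

def global_mean(field, lat_deg):
    """Area-weighted global mean on a regular lat-lon grid.

    field: array of shape (n_time, n_lat, n_lon)
    lat_deg: latitudes in degrees, shape (n_lat,)
    Returns an array of shape (n_time,).
    """
    w = np.cos(np.deg2rad(lat_deg))  # cell area scales with cos(latitude)
    zonal = field.mean(axis=-1)      # average over longitude first
    return (zonal * w).sum(axis=-1) / w.sum()

def relative_drift(series):
    """Largest deviation from the time mean, relative to the time mean."""
    mean = series.mean()
    return np.abs(series - mean).max() / abs(mean)
```

A quantity that is well conserved in the data should show a small relative drift; sources and sinks at the surface would show up as larger values.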

mountain commented 3 years ago

Thanks.

raspstephan commented 3 years ago

To answer your questions:

1) In the PRL paper we only consider constraints in a single atmospheric column. This is because we are only emulating atmospheric convection, and the underlying super-parameterization simulations act on each column individually, so there is no need to consider conservation across columns. If you want to apply something similar to WeatherBench, you would have to consider the spatial transfer of energy, etc., as well.

2) These are very interesting directions. Let me know if you end up working on any of this; I would be very interested.