mllam / neural-lam

Neural Weather Prediction for Limited Area Modeling
MIT License
64 stars 24 forks source link

Grid Dimension Ordering #46

Closed bet20ICL closed 3 weeks ago

bet20ICL commented 4 weeks ago

Hi, I ran into some issues using create_mesh.py for my dataset. I'm wondering if there might be a bug in the documentation for the code?

The code comments consistently say that a sample has dimensions (N_t', dim_x, dim_y, d_features'). https://github.com/mllam/neural-lam/blob/9d558d1f0d343cfe6e0babaa8d9e6c45b852fe21/neural_lam/weather_dataset.py#L88-L90

Similarly, nwp_xy.npy, the coordinates of each grid point also has dimensions in the same order (2, N_x, N_y). https://github.com/mllam/neural-lam/blob/9d558d1f0d343cfe6e0babaa8d9e6c45b852fe21/create_grid_features.py#L29-L32

Now, these lines in create_mesh.py imply that the first slice of nwp_xy.npy hold the x coordinates, and vice versa. https://github.com/mllam/neural-lam/blob/9d558d1f0d343cfe6e0babaa8d9e6c45b852fe21/create_mesh.py#L110-L112

I now refer to nwp_xy.npy as grid_xy. Therefore, given the dimension ordering (2, N_x, N_y), I would reason that grid_xy[0][0] is a length N_y vector holding the x coordinates of vertical slice of the dataset. I would expect grid_xy[0][0] to have the same x coordinate. However, I see that isn't the case.

grid_xy = np.load("data/meps_example/static/nwp_xy.npy")
print(grid_xy.shape)
print(grid_xy[0][0].shape)
print(grid_xy[0][0])

>>>
(2, 268, 238)
(238,)
array([-1.0601221e+06, -1.0501221e+06, -1.0401222e+06, -1.0301222e+06,
...

Trying to create a mesh with my own dataset with the ordering (2, N_x, N_y) gives me a buggy graph. However, when changing the ordering to (2, N_y, N_x), the graph seems fine.

Is this an issue with the code comments? Should the correct dimension ordering be N_y, N_x in all cases instead of N_x, N_y?

joeloskarsson commented 4 weeks ago

Hi, thanks for pointing this out and pointing to the different places in the code where it is used. This is indeed inconsistent and unclear. I think that when I wrote WeatherDataset and create_grid_features.py I did not put much thought into what dimension to name x or y. The result of this is that if we follow:

https://github.com/mllam/neural-lam/blob/9d558d1f0d343cfe6e0babaa8d9e6c45b852fe21/neural_lam/weather_dataset.py#L19-L21

then xy = np.load(nwp_xy.npy) is a (2, dim_x, dim_y) array with xy[0] containing y-coordinates and xy[0] containing x-coordinates. That does not make much sense.

I think that this also goes against the actual coordinate system used in MEPS. Will check with Tomas that this

https://github.com/mllam/neural-lam/blob/9d558d1f0d343cfe6e0babaa8d9e6c45b852fe21/neural_lam/weather_dataset.py#L19-L21

should actually be

dim_y = 268
dim_x = 238
N_grid = 268x238 = 63784

I also think @sadamov had the same issue at some point? I remember the x/y ordering causing issues when you made your first graph?

This documentation issue should be easy to fix by changing all the comments in the files. In the long run, I hope that these kinds of issues will be avoided as we are moving to xarray-based input data handling.

TomasLandelius commented 3 weeks ago

Yes,

dim_y = 268
dim_x = 238
N_grid = 268x238 = 63784

is the correct description.

joeloskarsson commented 3 weeks ago

Great! I'll take on the task to update these comments to the correct ordering. I'll try to go through and check if there are additional places (maybe in the plotting?) where this is also inconsistent.

@bet20ICL would you be up for reviewing that pull request? Since you noted this and have an understanding for the problem already. I can add you to the organization so you are able to.

bet20ICL commented 3 weeks ago

Yes I am happy to!

leifdenby commented 3 weeks ago

@bet20ICL I just want to say welcome too! And thanks for your contribution :)

I don't know if you are aware of it, but there is a little community that has formed around the neural-lam codebase and we have monthly calls discussing our progress. You would be very welcome to join us if you would like to. It would be really cool to hear what you've been doing with neural-lam. You can find more details in our "development plan" google doc: http://bit.ly/mllam-plan, you can find the details of the next meeting there too.

bet20ICL commented 3 weeks ago

Thanks so much for the invite @leifdenby! I will definitely stop by to have a chat.