Closed bet20ICL closed 3 weeks ago
Hi, thanks for pointing this out and pointing to the different places in the code where it is used. This is indeed inconsistent and unclear. I think that when I wrote WeatherDataset
and create_grid_features.py
I did not put much thought into what dimension to name x or y. The result of this is that if we follow:
then xy = np.load(nwp_xy.npy)
is a (2, dim_x, dim_y)
array with xy[0]
containing y-coordinates and xy[0]
containing x-coordinates. That does not make much sense.
I think that this also goes against the actual coordinate system used in MEPS. Will check with Tomas that this
should actually be
dim_y = 268
dim_x = 238
N_grid = 268x238 = 63784
I also think @sadamov had the same issue at some point? I remember the x/y ordering causing issues when you made your first graph?
This documentation issue should be easy to fix by changing all the comments in the files. In the long run, I hope that these kinds of issues will be avoided as we are moving to xarray-based input data handling.
Yes,
dim_y = 268
dim_x = 238
N_grid = 268x238 = 63784
is the correct description.
Great! I'll take on the task to update these comments to the correct ordering. I'll try to go through and check if there are additional places (maybe in the plotting?) where this is also inconsistent.
@bet20ICL would you be up for reviewing that pull request? Since you noted this and have an understanding for the problem already. I can add you to the organization so you are able to.
Yes I am happy to!
@bet20ICL I just want to say welcome too! And thanks for your contribution :)
I don't know if you are aware of it, but there is a little community that has formed around the neural-lam codebase and we have monthly calls discussing our progress. You would be very welcome to join us if you would like to. It would be really cool to hear what you've been doing with neural-lam. You can find more details in our "development plan" google doc: http://bit.ly/mllam-plan, you can find the details of the next meeting there too.
Thanks so much for the invite @leifdenby! I will definitely stop by to have a chat.
Hi, I ran into some issues using
create_mesh.py
for my dataset. I'm wondering if there might be a bug in the documentation for the code?The code comments consistently say that a sample has dimensions (N_t', dim_x, dim_y, d_features'). https://github.com/mllam/neural-lam/blob/9d558d1f0d343cfe6e0babaa8d9e6c45b852fe21/neural_lam/weather_dataset.py#L88-L90
Similarly,
nwp_xy.npy
, the coordinates of each grid point also has dimensions in the same order (2, N_x, N_y). https://github.com/mllam/neural-lam/blob/9d558d1f0d343cfe6e0babaa8d9e6c45b852fe21/create_grid_features.py#L29-L32Now, these lines in
create_mesh.py
imply that the first slice ofnwp_xy.npy
hold the x coordinates, and vice versa. https://github.com/mllam/neural-lam/blob/9d558d1f0d343cfe6e0babaa8d9e6c45b852fe21/create_mesh.py#L110-L112I now refer to
nwp_xy.npy
asgrid_xy
. Therefore, given the dimension ordering (2, N_x, N_y), I would reason thatgrid_xy[0][0]
is a length N_y vector holding the x coordinates of vertical slice of the dataset. I would expectgrid_xy[0][0]
to have the same x coordinate. However, I see that isn't the case.Trying to create a mesh with my own dataset with the ordering (2, N_x, N_y) gives me a buggy graph. However, when changing the ordering to (2, N_y, N_x), the graph seems fine.
Is this an issue with the code comments? Should the correct dimension ordering be N_y, N_x in all cases instead of N_x, N_y?