Closed dafyddstephenson closed 2 months ago
update: removing the call to checkdims
leads to
ERROR::
5
ncread read var= h
NetCDF: Start+count exceeds dimension bound
ERROR STOP
which suggests that the variable shapes are indeed incorrect. I'm not sure why, the equivalent files I produced with MATLAB and partit have the same dimension sizes.
Is it possible that the labels .0.nc
, .1.nc
, ... are mixed up?
Just ran a quick script to print the shape of h
on each tile for the MATLAB/partit and python/roms-tools files.
Seems like the arrays are transposed?
NODE 0:
MATLAB SHAPE: (9, 9) | PYTHON SHAPE: (9, 9)
NODE 1:
MATLAB SHAPE: (9, 8) | PYTHON SHAPE: (8, 9)
NODE 2:
MATLAB SHAPE: (9, 9) | PYTHON SHAPE: (9, 9)
NODE 3:
MATLAB SHAPE: (8, 9) | PYTHON SHAPE: (9, 8)
NODE 4:
MATLAB SHAPE: (8, 8) | PYTHON SHAPE: (8, 8)
NODE 5:
MATLAB SHAPE: (8, 9) | PYTHON SHAPE: (9, 8)
NODE 6:
MATLAB SHAPE: (9, 9) | PYTHON SHAPE: (9, 9)
NODE 7:
MATLAB SHAPE: (9, 8) | PYTHON SHAPE: (8, 9)
NODE 8:
MATLAB SHAPE: (9, 9) | PYTHON SHAPE: (9, 9)
... however the dimensions themselves aren't transposed:
TILE 0:
MATLAB SIZES: {'eta_rho': 9, 'xi_rho': 9} | PYTHON SIZES: {'eta_rho': 9, 'xi_rho': 9}
TILE 1:
MATLAB SIZES: {'eta_rho': 9, 'xi_rho': 8} | PYTHON SIZES: {'eta_rho': 8, 'xi_rho': 9}
TILE 2:
MATLAB SIZES: {'eta_rho': 9, 'xi_rho': 9} | PYTHON SIZES: {'eta_rho': 9, 'xi_rho': 9}
TILE 3:
MATLAB SIZES: {'eta_rho': 8, 'xi_rho': 9} | PYTHON SIZES: {'eta_rho': 9, 'xi_rho': 8}
TILE 4:
MATLAB SIZES: {'eta_rho': 8, 'xi_rho': 8} | PYTHON SIZES: {'eta_rho': 8, 'xi_rho': 8}
TILE 5:
MATLAB SIZES: {'eta_rho': 8, 'xi_rho': 9} | PYTHON SIZES: {'eta_rho': 9, 'xi_rho': 8}
TILE 6:
MATLAB SIZES: {'eta_rho': 9, 'xi_rho': 9} | PYTHON SIZES: {'eta_rho': 9, 'xi_rho': 9}
TILE 7:
MATLAB SIZES: {'eta_rho': 9, 'xi_rho': 8} | PYTHON SIZES: {'eta_rho': 8, 'xi_rho': 9}
TILE 8:
MATLAB SIZES: {'eta_rho': 9, 'xi_rho': 9} | PYTHON SIZES: {'eta_rho': 9, 'xi_rho': 9}
Seems like the arrays are transposed?
I didn't realise ROMS would care about transposition. We can change partition
to transpose before saving.
... however the dimensions themselves aren't transposed
In my mental model dimensions are an unordered set.
EDIT: Wait your second printout there implies that the dimensions aren't just transposed, the names are assigned incorrectly?
In my mental model dimensions are an unordered set.
Typically ROMS reads netCDF with the dimensions in a particular order. I was initially suprised to see that this wasn't the problem here, although it makes sense given that checkdims
is one of the few routines in ROMS that explicitly checks dimensions rather than just reading data as they appear.
the names are assigned incorrectly?
Yes, that's the impression I get. I assumed the dimensions would have the correct size but appear in the wrong order (see above), but in fact it seems they're being defined incorrectly altogether.
I doubt that xarray swaps dimension names when doing a slicing operation. As stated above, my suspicion that the ordering of the nodes is done inconsistently with what ROMS expects. Taking this example:
TILE 0:
MATLAB SIZES: {'eta_rho': 9, 'xi_rho': 9} | PYTHON SIZES: {'eta_rho': 9, 'xi_rho': 9}
TILE 1:
MATLAB SIZES: {'eta_rho': 9, 'xi_rho': 8} | PYTHON SIZES: {'eta_rho': 8, 'xi_rho': 9}
TILE 2:
MATLAB SIZES: {'eta_rho': 9, 'xi_rho': 9} | PYTHON SIZES: {'eta_rho': 9, 'xi_rho': 9}
TILE 3:
MATLAB SIZES: {'eta_rho': 8, 'xi_rho': 9} | PYTHON SIZES: {'eta_rho': 9, 'xi_rho': 8}
TILE 4:
MATLAB SIZES: {'eta_rho': 8, 'xi_rho': 8} | PYTHON SIZES: {'eta_rho': 8, 'xi_rho': 8}
TILE 5:
MATLAB SIZES: {'eta_rho': 8, 'xi_rho': 9} | PYTHON SIZES: {'eta_rho': 9, 'xi_rho': 8}
TILE 6:
MATLAB SIZES: {'eta_rho': 9, 'xi_rho': 9} | PYTHON SIZES: {'eta_rho': 9, 'xi_rho': 9}
TILE 7:
MATLAB SIZES: {'eta_rho': 9, 'xi_rho': 8} | PYTHON SIZES: {'eta_rho': 8, 'xi_rho': 9}
TILE 8:
MATLAB SIZES: {'eta_rho': 9, 'xi_rho': 9} | PYTHON SIZES: {'eta_rho': 9, 'xi_rho': 9}
Maybe the python partitioning tool swapped tiles 1 and 5, etc.
Looks like we use row-wise (C-style) ordering. Maybe ROMS expects column-wise (Fortran-style) ordering?
I bet @sdbachman knows the answer to the question in the previous post because his OTT code uses the same tiling convention (I think).
My OTT does doesn't do tiling at all, but I think it is sensible to expect that ROMS expects column-major ordering. Have you tried just making a column-major example and seeing if it works?
I was hoping that we can get a definite answer to what ROMS expects first. The danger with just trying without knowing is that it may "work" (i.e., no errors) but ROMS will compute in a domain whose tiles sit in the wrong places. 🧩
I think we will have to look at the partit
generated tiles to get that definite answer. Who is working on this? @TomNicholas?
file_number = i + (j * nx)
I managed to get it working by flipping i
and j
here. The output looks fine, the mask is assembled correctly and the tracer patterns look reasonable
@dafyddstephenson your solution works for nx = ny
but to switch to column-major ordering for the general case, we will have to do
for i in range(nx):
for j in range(ny):
file_number = j + (i * ny)
I will go ahead and issue a PR with that solution. But we need to test it for several nx
and ny
and verify that the mask is still assembled correctly.
Ah yeah, good point. I'll admit I didn't think much about the broader implications once the first thing I tried worked lol, lesson learned. I'll try 8 CPUs with nx=4
, ny=2
OK I can get it to run with your second suggestion, but only if NP_XI
in ROMS corresponds to ny
in roms-tools
and NP_ETA
in ROMS corresponds to nx
in roms-tools
. I found this in the partition
docstring, suggesting this is intentional (@TomNicholas ?):
nx : int, optional
The number of partitions along the `eta` (latitude) direction. Must be a positive integer. Default is 1.
ny : int, optional
The number of partitions along the `xi` (longitude) direction. Must be a positive integer. Default is 1.
sorry if I've just had too long a week, but this seems the wrong way around to me? I associate x with long and y with lat. Perhaps to avoid any ambiguity we should use/accept the ROMS parameter names of np_xi
and np_eta
?
Your suggestion makes sense to me @dafyddstephenson! Are you submitting a PR, or do you want me to do it?
I haven't actually forked it yet so it'd be quickest for now if you did I think
sorry if I've just had too long a week, but this seems the wrong way around to me? I associate x with long and y with lat.
I probably just got confused and messed this up! My tests were really checking that we got the correct [10, 9, 9, ..., 9, 10] kind of sizes.
After running
roms_grd.save('roms_grd.nc',nx=3,ny=3)
I get this output from ROMS: