Closed MarkWieczorek closed 4 years ago
And here is a simple script that demonstrates this problem:
# create a grid that spans 0 359 E, and -89 to 90 N.
longitude = np.arange(0, 360, 1)
latitude = np.arange(-89, 91, 1)
x = np.sin(np.deg2rad(longitude))
y = np.linspace(start=0, stop=1, num=180)
data = y[:, np.newaxis] * x
# create a DataArray, and then export this as a netcdf file
dataarray = xr.DataArray(data, coords=[('latitude', latitude,
{'units': 'degrees_north'}),
('longitude', longitude,
{'units': 'degrees_east'})],
attrs = {'actual_range': [-1, 1]})
dataset = dataarray.to_dataset(name='dataarray')
dataset.to_netcdf('test.grd')
fig = pygmt.Figure()
# create a projected image using the DataArray in memory and the netcdf file
fig.grdimage(dataset.dataarray, region='g', projection='A0/0/6i')
fig.grdimage('test.grd', region='g', projection='A0/0/6i', X='6.2i')
This might be related to pixel registered vs grid node registered grids. GMT makes a lot of internal assumptions when reading the netCDF file that we don't when converting the xarray.DataArray.
If I use the -V
option with the file test.grd
, gmt warns me that it assumes it is gridline registered. If I define the longitude attribute actual_range = [0, 359]
, then I don't get this warning (in this case, the gmt code confirms that it is gridline and doesn't need to make any guesses). In either case, defining actual_range
or not produces the same output.
As for pygmt, grdimage creates a virtual file using
file_context = lib.virtualfile_from_grid(grid)
And here is the code for this function
matrix, region, inc = dataarray_to_matrix(grid)
family = "GMT_IS_GRID|GMT_VIA_MATRIX"
geometry = "GMT_IS_SURFACE"
gmt_grid = self.create_data(
family, geometry, mode="GMT_CONTAINER_ONLY", ranges=region, inc=inc
)
self.put_matrix(gmt_grid, matrix)
args = (family, geometry, "GMT_IN|GMT_IS_REFERENCE", gmt_grid)
with self.open_virtual_file(*args) as vfile:
yield vfile
If you look at create_data()
, you will see that there is an optional keyword registration
. However, this keyword is NOT used in the routine. There is a variable registration_int
, but this appears to be hardcoded to be GMT_GRID_NODE_REG
So, to the best of my knowledge, pygmt is assuming that all xarrays are gridline registered. If this is the case, this is a bug.
There are probably one of three ways to resolve this problem:
pygmt
: pygmt should first explicitly allow the user to define if the data are gridline or pixel registered. If they are gridline registered, it should then check if the redundant boundary condition is provided, and if not to add an extra row to the internal data. I am not sure where the code should be modified for this, but perhaps dataarray_to_matrix
would be the best.pygmt
: The virtualfile_from_grid
should return a virtual netcdf
file (which gmt can parse just fine). My assumption is that the virtual file is just an arbitrary binary datafile that gmt
is having a hard time understanding how to decode.
gmt
: gmt's grdimage appears to have a bug when parsing pygmt's virtual_file
input. If this is the case, we should open an issue there. I don't know enough about how gmt reads such generic files to debug this myself. Perhaps it is possible to specify in the vitrual file that the redundant longitude is missing.Edit: by the way, I think that pygmt routines should have a keyword verbose=True
which corresponds to -V
Could you let me know how gmt reads the pygmt virtual files? I looked through the gmt code a little, and I am going to guess that the file format is id=bf: GMT native, C-binary format (32-bit float)
. If I had to guess, this file format probably gets parsed by the function gmtio_bin_input
that is in the file gmt_io.c
. After confirmation, I will open an issue at gmt to see if this is something that they should fix.
Also, it appears that this doesn't affect all map projections. Here the same image as above, but in a mollweide projection:
Ah, the ol' gridline vs pixel based registration problem. I think you hit the nail on the head there @MarkWieczorek. I'm pretty sure xarray
uses centre-based coordinates (i.e. pixel-based registration) and if pygmt defaults to gridline registration, then this is definitely a serious issue. I've hit this time and time again before (e.g. at https://github.com/weiji14/deepbedmap/pull/150) and it just keeps coming back to haunt me...
Not sure if setting the default registration to pixel-based would solve the issue (since we're using xarray, it might be necessary?), but I'll try and look into it tomorrow if I have some mental bandwidth to do so (running a "FOSS4G Community Day" for PyGMT) Edit: Putting the Windows installation issue on a higher priority. If you've found a fix though, feel free to submit a Pull Request and I can review it.
I don't think that this is a pixel related issue. pygmt uses the routine dataarray_to_matrix
to extract the data maxtrix from an xarray DataArray and then returns this unmodified as a numpy array. As far as I can tell, pygmt assumes that all DataArrays are gridline registered, and the example I provide above is for gridline registered data.
As you say, xarray used center-based coordinates, but this corresponds to gridline registration in gmt. (The coordinates are at the grid line intersections that occur at the center of the pixel, not the edges).
It is possible that gmt is misinterpreting the virtual file as pixel registered, and if this is the case, this is a bug with gmt. It is conceivable that gmt uses the ration of nlat and nlon to guess the registration, and since my grid doesn't include the redundant longitude at 360E, I could perhaps see this being a problem if gmt has to guess the registration.
Hi @MarkWieczorek, sorry for not following up on this soon enough. I think we've found the issue, and it's probably because the virtualfile mechanism assumes all xarray grids as Cartesian instead of Geographic (see https://github.com/GenericMappingTools/pygmt/pull/476#issuecomment-647755091).
We're working on allowing users to tell PyGMT whether their grid is Cartesian/Geographic, and Pixel/Gridline registered. This might take some time to implement as there's a lot of issues to iron out, and the team will be busy with GMT 6.1.0. But you can provide some feedback on #487 which decides what the recommended default will be going forward.
grdimage
of pygmt does not correctly plot xarray DataArrays when they do not contain the redundant column of data at 360 E.Using pygmt directly with a netcdf file provides the correct output, such as with a command like:
However, here is the same output generated by pygmt when using an xarray in memory: