Open seisman opened 4 years ago
It may be a GMTDataArrayAccessor bug, or maybe we should update
grdcut()
to work with GMTDataArrayAccessor. The problem is that the temporary netCDF file is already deleted when GMTDataArrayAccessor tries to access it.
Good spotting, this won't be specific to just grdcut
, but the other grd*
modulles as well. It will likely require some modification to these lines:
There's 2 solutions I can see:
with
block, so that the registration
and gtype
information is stored before the file disappears.Which would you prefer? Or is there a third way to resolve this?
Don't delete the tempfile, maybe keep it somewhere while the session is still running.
We probably don't want to maintain a list of temporary files.
Trigger the GMTDataArrayAccessor inside the with block, so that the registration and gtype information is stored before the file disappears.
It sounds a good way, but how to trigger it?
BTW, at least we need to check if the netCDF source exists in GMTDataArrayAccessor.
It sounds a good way, but how to trigger it?
Simply by calling grid.gmt.registration
inside the with
block. That would then store the registration information in the class.
BTW, at least we need to check if the netCDF source exists in GMTDataArrayAccessor.
Yes, the error message could be improved, will see how best to do this in the PR.
Simply by calling
grid.gmt.registration
inside thewith
block. That would then store the registration information in the class.
Do you mean the following code?
with xr.open_dataarray(outgrid) as dataarray:
result = dataarray.load()
result.gmt.registration
result.gmt.gtype
It's a little weird.
I think just result.gmt
will do, but yes it's weird (and I'm sure pylint will complain about some useless-statement
error). We'll also need to apply this to every grd*
function that loads a temporary netcdf file. Maybe there's a better solution?
I think just
result.gmt
will do, but yes it's weird (and I'm sure pylint will complain about someuseless-statement
error). We'll also need to apply this to everygrd*
function that loads a temporary netcdf file. Maybe there's a better solution?
Perhaps we can add result.gmt
as a temporary workaround and see how it can be improved in the future.
Issues #496 and #489 can't be fixed without the workaround.
Ok, I'll try and do that tonight after dinner.
Keep the issue open, since we don't really like the workaround.
It still doesn't work for a slice of a grid. For example:
In [1]: import pygmt
In [2]: grid = pygmt.grdcut("@earth_relief_01d", region="0/90/0/90")
In [3]: grid.gmt.registration
Out[3]: 1
In [4]: grid2 = grid[10:41,30:101]
In [5]: grid2
Out[5]:
<xarray.DataArray 'z' (lat: 31, lon: 60)>
array([[ 482. , 435. , 391.5, ..., -3402.5, -3340.5, -3281. ],
[ 731. , 571. , 393. , ..., -3303.5, -3228.5, -3176. ],
[ 543. , 490. , 395.5, ..., -3166.5, -3132. , -3066.5],
...,
[ 1312.5, 1136.5, 1058. , ..., 1514.5, 3121. , 3689.5],
[ 1104.5, 993.5, 1084.5, ..., 875.5, 815.5, 894.5],
[ 623. , 1156. , 1329. , ..., 861.5, 836. , 822. ]],
dtype=float32)
Coordinates:
* lon (lon) float64 30.5 31.5 32.5 33.5 34.5 ... 85.5 86.5 87.5 88.5 89.5
* lat (lat) float64 10.5 11.5 12.5 13.5 14.5 ... 36.5 37.5 38.5 39.5 40.5
Attributes:
long_name: elevation (m)
actual_range: [-5122. 5651.5]
In [6]: grid2.gmt.registration
grdinfo [ERROR]: File /var/folders/mz/d3wtwjtx4fg8qlds5lgt1c240000gn/T/pygmt-15alnpr7.nc was not found
grdinfo [ERROR]: Cannot find file /var/folders/mz/d3wtwjtx4fg8qlds5lgt1c240000gn/T/pygmt-15alnpr7.nc
grdinfo [ERROR]: Must specify one or more input files
Can't say I didn't see this one coming. I'm tempted to go with Solution 2 "Don't delete the tempfile, maybe keep it somewhere while the session is still running." now, since there's no easy way for to transfer the registration/gtype info from the original xarray
grid to the sliced xarray
grid.
Another option is to check explicitly if the *.nc
file exists as you mentions. But if it doesn't, and we fallback on the default gridline registration/Cartesian setting, that's not ideal either. We'll need to come up with a better solution somehow.
The ultimate goal is to avoid temporary files. It's only possible if PyGMT supports GMT data types
That's definitely a long term thing (for PyGMT > v0.3.0). Would need someone familiar with both Python and C I think, the new postdoc might the one to do it :laughing:
I'll need to do more investigation, but there's some discussion in xarray
about moving towards keeping attrs (attributes) by default for slice operations. See https://github.com/pydata/xarray/issues/1614 and https://github.com/pydata/xarray/issues/3891. Right now, it might be possible for users to use xr.set_options(keep_attrs=True)
(not tested, might not work, see http://xarray.pydata.org/en/v0.16.2/generated/xarray.set_options.html), and get the gmt accessor attributes preserved on slicing?
Anyways, I think approving PR #873 and solving #870 is fine for now. But we'll see if users come up with problems down the line.
Cross-referencing a related bug reported at https://forum.generic-mapping-tools.org/t/how-to-get-metadata-and-plot-grid-from-xarray-dataset/2010. In that case, an xarray.Dataset
was sliced into an xarray.DataArray
by selecting one coordinate, but passing the sliced xarray.DataArray
throws up grdinfo [WARNING]: Detected a data cube
because it is still referencing the original 3D netCDF data cube file, instead of the 2D xarray.DataArray
grid slice.
I ran into the following today when trying to use some data via OPeNDAP. The try/except strategy in the accessor leads to a successful figure, but gmt's error messages are confusing. If we are expecting the call to fail every once in a while and have a fall back, we could add verbose="q"
to the grdinfo call as a short term solution. https://github.com/GenericMappingTools/pygmt/blob/951d30aa9578b4108b768983c09208c021ec1d1f/pygmt/accessors.py#L30-L39
time = '2011-01-20T18:00:00'
param = 'vwnd'
level = 250
ds = xr.open_dataset(f'https://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/ncep.reanalysis/pressure/{param}.{time[:4]}.nc').sel({'level': level, 'time': time})
fig = pygmt.Figure()
fig.grdimage(ds['vwnd'], frame=True, projection="H15c")
fig.show()
Error messages:
grdinfo [ERROR]: Libcurl Error: HTTP response code said error
grdinfo [WARNING]: You can turn remote file download off by setting GMT_DATA_UPDATE_INTERVAL to "off"
grdinfo [ERROR]: Cannot find file vwnd.2011.nc
grdinfo [ERROR]: Cannot find file vwnd.2011.nc
grdinfo [ERROR]: File vwnd.2011.nc not found
[Session pygmt-session (51)]: Error returned from GMT API: GMT_FILE_NOT_FOUND (16)
For xarray.DataArray objects, currently PyGMT relies on the output of the grdinfo
call to determine the registration and gtype. If the grdinfo fails, PyGMT fallbacks to the registration=0 and gtype=0.
See what GMT does at https://github.com/GenericMappingTools/gmt/blob/e63b2590b68969908efc393d008a9d024617eed2/src/gmt_nc.c#L656.
It seems GMT has its own logic to detect a netCDF file's grid registration. I think PyGMT should follow the same logic. It would be easier if GMT can provide an API function for this.
Description of the problem
It may be a GMTDataArrayAccessor bug, or maybe we should update
grdcut()
to work with GMTDataArrayAccessor. The problem is that the temporary netCDF file is already deleted when GMTDataArrayAccessor tries to access it.Full code that generated the error
Full error message
System information
Please paste the output of
python -c "import pygmt; pygmt.show_versions()"
: