GenericMappingTools / gmt

The Generic Mapping Tools
https://www.generic-mapping-tools.org
Other
858 stars 359 forks source link

grdinterpolate not working from externals #5202

Closed joa-quim closed 3 years ago

joa-quim commented 3 years ago

Should have tried this before.

julia> G = grdinterpolate("c:/progs_cygw/GMTdev/gmt5/master/test/grdinterpolate/cube.nc", T=4, Vd=1)
        grdinterpolate c:/progs_cygw/GMTdev/gmt5/master/test/grdinterpolate/cube.nc  -n+a -T4
grdinterpolate [ERROR]: Option -G: Must specify output grid file

and

julia> G = grdinterpolate("c:/progs_cygw/GMTdev/gmt5/master/test/grdinterpolate/cube.nc", T=4, G="lixo.grd", Vd=1)
        grdinterpolate c:/progs_cygw/GMTdev/gmt5/master/test/grdinterpolate/cube.nc  -n+a -Glixo.grd -T4
ERROR: get_grid: programming error, output matrix is empty

Changing the KEYs to ``"<G{+,GG}" lets last command work but then the first one (without the G) crashes.

joa-quim commented 3 years ago

And

julia> G = grdinterpolate("c:/progs_cygw/GMTdev/gmt5/master/test/grdinterpolate/cube.nc", S="5/5", Vd=1)
        grdinterpolate c:/progs_cygw/GMTdev/gmt5/master/test/grdinterpolate/cube.nc  -n+a -S5/5
> Location 5,5
5       5       25      1
5       5       27.5    2
5       5       30      3
5       5       35      5
Vector{GMT.GMTdataset} with 0 segments

julia> G
Vector{GMT.GMTdataset} with 0 segments

Here output seems to be printed directly in stdout.

PaulWessel commented 3 years ago

Will check after tennis

-- Paul Wessel, Professor and Chair Dept. of Earth Sciences SOEST, U of Hawaii at Manoa

On May 8, 2021 at 9:05:13 AM, Joaquim @.***) wrote:

And

julia> G = grdinterpolate("c:/progs_cygw/GMTdev/gmt5/master/test/grdinterpolate/cube.nc", S="5/5", Vd=1) grdinterpolate c:/progs_cygw/GMTdev/gmt5/master/test/grdinterpolate/cube.nc -n+a -S5/5

Location 5,5 5 5 25 1 5 5 27.5 2 5 5 30 3 5 5 35 5 Vector{GMT.GMTdataset} with 0 segments

julia> G Vector{GMT.GMTdataset} with 0 segments

Here output seems to be printed directly in stdout.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/GenericMappingTools/gmt/issues/5202#issuecomment-835474866, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGJ7IX4DLLSZJKW3W5FG2I3TMWDOTANCNFSM44NWIMBQ .

PaulWessel commented 3 years ago

Apparently freeing the wrong memory but does not crash on macOS, making it hard to debug for me.

PaulWessel commented 3 years ago

Just to keep something in this issue as well, this apparently crashes Matlab on WIndows:

G = gmt('grdinterpolate cube.nc -T4');

with cube.nc found in the test/grdinterpolate folder.

PaulWessel commented 3 years ago

Hi @joa-quim, does it also crash if we actually request a subset cube instead of a single grid layer?

C = gmt('grdinterpolate cube.nc -T2/4/0.5');

PaulWessel commented 3 years ago

I am thinking that we are writing a cube with one layer, so a grid, and perhaps there is confusion in that. I do not think (not checked yet) that we switch the KEYS from cube to grid if only one layer is found, and I am not sure we can since that is detected much later in the processing.

PaulWessel commented 3 years ago

OK, I think I understand the problem, at least for the example above: Thegrdinterpolate cube.nc -T4 command is just pulling out a grid slice for z = 4, but grdinterpolate calls

if (GMT_Write_Data (API, GMT_IS_CUBE, GMT_IS_FILE, GMT_IS_VOLUME, GMT_CONTAINER_AND_DATA, NULL, Ctrl->G.file, C[GMT_OUT]))
    Return (EXIT_FAILURE);

So it is writing a one-layer cube. Yet, _GMT_EncodeOptions changes the KEYS to have a type of grid (G) when it finds one layer:

    if (E == NULL && S == NULL) {   /* Interpolating onto a single grid or a new cube */
        if (T == NULL && n_files > 1)   /* Repackaging multiple grids as a single cube */
            type = 'U';
        else if (T && (strchr (T->arg, '/') || strchr (T->arg, ',')))   /* Making more than one output layer */
            type = 'U';
        else
            type = 'G'; /* Making a single grid layer */
    }

So a cube is passed but a grid is expected, and a crash ensues. I think one solution is for grdinterpolate to write a grid when there is only one layer, since then it matches the external expectation. However, I can also see someone saying this is wrong since in general (with a -Tmin/max/inc) we will be writing a cube, so why make a special case for the one-layer cube. This would also affect PyGMT when they wrap grdinterpolate. So (unfortunately) we are dealing with a module that can return either a dataset, a grid, or a cube. What solution is reasonable here, @joa-quim and @seisman ? Perhaps the externals will want to wrap this module in different ways depending on purpose, e.g, grd2cube could be the one that simply makes a cube from grids, cubelayer returns a grid, while cubeinterpolate returns a cube, and cubesample returns a dataset?

PaulWessel commented 3 years ago

Just confirmed that grdinterpolate cube.nc -T4 on the command line will write a regular grid since there is only one layer. This is not what happens when we want the result back to the external caller since in that case we pass a cube structure. The problem then is the mixup of data types. Because a cube is represented by a grid header an a stack of grid I can see why this still worked in Matlab on macos but could crash (and probably should) on other OS.

We need to decide what to do here. I do not know if GMT.jl or PyGMT have data types they want to use to hold a cube.

seisman commented 3 years ago

Currently, PyGMT doesn't define any special data types to hold a grid, cube or dataset, so the decision should make no difference for PyGMT, but ping @weiji14 to see if you have more experience with the xarray package.

weiji14 commented 3 years ago

Currently, PyGMT doesn't define any special data types to hold a grid, cube or dataset, so the decision should make no difference for PyGMT, but ping @weiji14 to see if you have more experience with the xarray package.

So Python xarray allows for 2D grids (M x N x 1, xarray.DataArray) and 3D or n-dimensional cubes (M x N x C, xarray.Dataset). I.e. an xarray.Dataset can hold a stack of 2D xarray.DataArray grids. The Python xarray.Dataset is modelled after netCDF so should handle just about any n-dimensional data array structure.

For the purposes of wrapping grdinterpolate in PyGMT, we could probably just throw things into an xarray.Dataset structure that can hold one grid or a stack of grids (i.e. a cube). So as @seisman said, the output format of grdinterpolate shouldn't matter too much for PyGMT :crossed_fingers:

PaulWessel commented 3 years ago

OK. good. So it comes down to how GMT.jl and GMTMEX wraps things. I think gmtmex has not even added support for receiving a cube but should be straight forward. It seems it has to return a cube even if there is only one layer, @joa-quim.

joa-quim commented 3 years ago

I don't think return one-layer cube is the best idea. First, singleton dimensions are always a headache to deal with, second if we are doing a cross section we are certainly not expecting to get a cube back.

But the cause of the crash, as I said in the other topic, it's because at the time of writing the grid back to the caller the GH->extra has trash that I was not able o find out where it comes from. In the consequence of that trash it tries to free some memory that does not exist and boom. This should be reproducible in MacMatlab as well.

joa-quim commented 3 years ago

Oops I wrote that in the morning but did not push the button. I have implemented the the CUBE in Julia and it now crashes when freeing gmt_M_free (GMT, U->hidden); or gmt_M_free (GMT, G->hidden);

PaulWessel commented 3 years ago

OK, and I am implementing CUBE in mex. But surely, passing a cube structure back to the external yet calling it a grid is the root of the crash.

PaulWessel commented 3 years ago

So I am doing the same as you, probably: If the received cube has one layer then I return the mex grid structure instead of the (new) cube structure. But family zoom is interfering with coding again.

joa-quim commented 3 years ago

This receives a cube but crashes too (in gmtlib_free_cube_ptr)

C = grdinterpolate("cube.nc", T="2/4/0.5");
PaulWessel commented 3 years ago

First attempt before family zoom shows a promising warning and the return of a grid:

>> U = gmt ('grdinterpolate cube.nc -T4')
Matlab (gmt_grdio.c:3499(gmtlib_free_cube_ptr)): Wrongly tries to free item

U = 

  struct with fields:

               z: [11x11 single]
               x: [0 1 2 3 4 5 6 7 8 9 10]
               y: [0 1 2 3 4 5 6 7 8 9 10]
           range: [0 10 0 10 NaN NaN]
             inc: [1 1]
    registration: 0
          nodata: NaN
           title: ''
         comment: ''
         command: ''
        datatype: 'float32'
          x_unit: 'x'
          y_unit: 'y'
          z_unit: 'z'
          layout: 'TCF'
           proj4: ''
             wkt: ''
PaulWessel commented 3 years ago

And because I get the same warning running GMT CLI I can debug that memory issue without max.

joa-quim commented 3 years ago

layout: 'TCF'

What's F?

Different implementations. When I try

G = grdinterpolate("cube.nc", T=4)

It blows in gmtlib_free_grid_ptr() but before it warned that (same as in cube case)

image

PaulWessel commented 3 years ago

F? Not sure, I did copy/paste from the grid function. Maybe typo? Sorry, family zoom still ongoing.

PaulWessel commented 3 years ago

I am doing this wrong, perhaps?. You know the answer to this - what should I change?

    /* Create a MATLAB struct to hold this cube [matrix will be a float (mxSINGLE_CLASS)]. */
    U_struct = mxCreateStructMatrix (1, 1, N_MEX_FIELDNAMES_CUBE, GMTMEX_fieldname_cube);
    dim[0] = U->header->n_bands;    dim[1] = U->header->n_columns;  dim[2] = U->header->n_rows;
    mxptr[0]  = mxCreateNumericArray (ndim, dim, mxSINGLE_CLASS, mxREAL);
joa-quim commented 3 years ago

It's been a while since I looked into the Matlab C API. What's wrong with it?

PaulWessel commented 3 years ago

It may be correct. I think the first dim is the slowest moving (layer), then columns, then rows in Matlab. I get

>> U = gmt ('grdinterpolate cube.nc -T3/4/0.25')

U = 

  struct with fields:

               w: [5x11x11 single]
               x: [0 1 2 3 4 5 6 7 8 9 10]
               y: [0 1 2 3 4 5 6 7 8 9 10]
               z: [3 3.2500 3.5000 3.7500 4]
           range: [0 10 0 10 3 4 0 120]
             inc: [1 1 0.2500]
    registration: 0
          nodata: NaN
           title: ''
         comment: ''
         command: 'grdinterpolate cube.nc -T3/4/0.25 -G@GMTAPI@-S-O-U-U-U-N-000009'
        datatype: 'float32'
          x_unit: 'x'
          y_unit: 'y'
          z_unit: ''
          w_unit: 'z'
          layout: 'TCF'
           proj4: ''
             wkt: ''

and when I type U.w it splits things up this way:

>> U.w

  5x11x11 single array

ans(:,:,1) =

         0         0         0    4.8000   10.8000    7.2000   19.2000    7.2000   25.2000    4.8000   28.8000
         0         0         0    6.0000   12.0000    9.6000   21.6000   10.8000   28.8000    9.6000   33.6000
         0         0    1.2000    7.2000         0   12.0000   24.0000   14.4000   32.4000   14.4000   38.4000
         0         0    2.4000    8.4000    2.4000   14.4000         0   18.0000   36.0000   19.2000   43.2000
         0         0    3.6000    9.6000    4.8000   16.8000    3.6000   21.6000         0   24.0000   48.0000

I was expecting 5 layers, not 11 whatevers.

joa-quim commented 3 years ago

I think it should be

dim[2] = U->header->n_bands; dim[1] = U->header->n_columns; dim[0] = U->header->n_rows;

PaulWessel commented 3 years ago

Yep, did that and it is correct.

PaulWessel commented 3 years ago

Let me know if you still have any mex or Julia problems with grdinterpolate after the latest GMT API updates (one PR pending your review). I think those should have fixed the problems with the cube/grid pointer and the freeing.

joa-quim commented 3 years ago

I get rubish but no more crashes. Will look at the subish tomorrow.

PaulWessel commented 3 years ago

Was that mex or Julia? In Julia you have to make sure you receive what grdinterpolate returns using a GMT_CUBE even if it is a single layer, unless you do -E or -S.

PaulWessel commented 3 years ago

Pinging @joa-quim for an update.

joa-quim commented 3 years ago

I can get the data in Julia. Just have to decide how to deal with it. I don't want the pad to force me to make a cube copy.

PaulWessel commented 3 years ago

Thanks, I will then close this comment since fixed on the GMT side.