GenericMappingTools / gmt

The Generic Mapping Tools
https://www.generic-mapping-tools.org

Running info on vector memory inputs return UNIX timestamps instead of ISO datetimes #4241

Closed weiji14 closed 3 years ago

weiji14 commented 4 years ago

Description of the problem

UNIX timestamps (e.g. 1577836800) are returned instead of ISO datetimes (e.g. 2020-01-01T00:00:00) when running info on virtual tables. Originally noticed at https://github.com/GenericMappingTools/pygmt/pull/619#discussion_r491636286, check also https://github.com/GenericMappingTools/gmt/pull/3396 where datetime inputs were originally accepted into GMT_Put_Vector.

Full script that generated the error

Sorry that I can't write C code, but given an input like so:

z time
10 2020-01-01T00:00:00
13 2020-01-02T00:00:00
12 2020-01-03T00:00:00
15 2020-01-04T00:00:00
14 2020-01-05T00:00:00

and then running info from PyGMT from the branch at https://github.com/GenericMappingTools/pygmt/pull/619 (each column is passed into C API using GMT_Put_Vector)

import pygmt

# create data array 'table' and pass it into `info`
pygmt.info(table=table, V="d")

Full error message

gmtinfo [INFORMATION]: Processing input table data
gmtinfo [DEBUG]: gmtapi_init_import: Passed family = Data Table and geometry = Point|Line|Poly
gmtinfo (gmtapi_init_import): tried to free unallocated memory
gmtinfo [DEBUG]: gmtapi_init_import: Added 1 new sources
gmtinfo [DEBUG]: GMT_Init_IO: Returned first Input object ID = 0
gmtinfo [DEBUG]: gmtapi_init_export: Passed family = Data Table and geometry = Point|Line|Poly
gmtinfo [DEBUG]: Object ID 1 : Registered Data Table File /tmp/pygmt-gykf5n3t.txt as an Output resource with geometry Point|Line|Poly [n_objects = 2]
gmtinfo [DEBUG]: gmtapi_init_export: Added 1 new destination
gmtinfo [DEBUG]: GMT_Init_IO: Returned first Output object ID = 1
gmtinfo [DEBUG]: GMT_Begin_IO: Mode value 1 not considered (ignored)
gmtinfo [DEBUG]: GMT_Begin_IO: Initialize record-by-record access for Input
gmtinfo [DEBUG]: gmtapi_next_io_source: Selected object 0
gmtinfo [INFORMATION]: Reading Data Table from Input memory location via vector
gmtinfo [DEBUG]: GMT_Begin_IO: Input resource access is now enabled [record-by-record]
gmtinfo [DEBUG]: GMT_Begin_IO: Initialize record-by-record access for Output
gmtinfo [DEBUG]: gmtapi_next_io_source: Selected object 1
gmtinfo [INFORMATION]: Writing Data Table to file /tmp/pygmt-gykf5n3t.txt
gmtinfo [DEBUG]: GMT_Begin_IO: Output resource access is now enabled [record-by-record]
gmtinfo [DEBUG]: Number of numerical output columns has been set to 0
gmtinfo [DEBUG]: GMT_End_IO: Input resource access is now disabled
gmtinfo [DEBUG]: GMT_End_IO: Output resource access is now disabled
gmtinfo (GMT_gmtinfo): tried to free unallocated memory
gmtinfo (GMT_gmtinfo): tried to free unallocated memory
gmtinfo [DEBUG]: gmtlib_unregister_io: Unregistering object no 1 [n_objects = 1]
gmtinfo (gmtlib_free_tmp_arrays): tried to free unallocated memory

Actual outcome

<vector memory>: N = 5 <10/15> <1577836800/1578182400>

Expected outcome

<vector memory>: N = 5 <10/15> <2020-01-01T00:00:00/2020-01-05T00:00:00>
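
For reference, the numbers in the actual outcome are the expected datetimes expressed as seconds since 1970-01-01. A minimal check with Python's standard library, assuming the datetimes are in UTC (the helper name `iso_to_unix` is made up for illustration and is not part of GMT or PyGMT):

```python
from datetime import datetime, timezone


def iso_to_unix(iso_string):
    """Convert an ISO 8601 datetime (assumed to be UTC) to whole UNIX seconds."""
    moment = datetime.fromisoformat(iso_string).replace(tzinfo=timezone.utc)
    return int(moment.timestamp())


print(iso_to_unix("2020-01-01T00:00:00"))  # 1577836800
print(iso_to_unix("2020-01-05T00:00:00"))  # 1578182400
```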

System information

PaulWessel commented 4 years ago

Hm, yes since what is returned is a text record I agree there is no good reason it should not be formatted as you expect here. If -C was used then you would get a numerical record and it would have UNIX time of course, but here it is text. I will try to have a look as to why it is not doing the formatting.

PaulWessel commented 4 years ago

I would like to add another section to the DEVELOPER RESOURCES sidebar on debugging, covering how to debug PyGMT via Xcode on macOS. It has been a month since the last time and I cannot find all the steps that @seisman helped with. Since I am a complete Python rookie there is not much mental memory of even the basics, so I need that recipe, which also had some issues with setting shared-dir and possibly library paths for it all to work. Maybe @seisman could resurrect his instructions and I can turn them into a section in the developer RST?

PaulWessel commented 4 years ago

And @joa-quim has given similar instructions for setting up GMT.jl so I can debug cases he finds directly in Xcode as well?

weiji14 commented 4 years ago

I would like to add another section to the DEVELOPER RESOURCES sidebar on debugging, covering how to debug PyGMT via Xcode on macOS. It has been a month since the last time and I cannot find all the steps that @seisman helped with. Since I am a complete Python rookie there is not much mental memory of even the basics, so I need that recipe, which also had some issues with setting shared-dir and possibly library paths for it all to work. Maybe @seisman could resurrect his instructions and I can turn them into a section in the developer RST?

Issue is at #3778. I should probably start to learn C too at some point :laughing:

seisman commented 4 years ago

The steps to debug PyGMT via Xcode were discussed in #3829.

Here is a summary:

  1. Install PyGMT following the official instructions, except for the environment-creation step noted below:

    # add conda-forge channel
    conda config --prepend channels conda-forge
    # NOTE: the next step is different from the PyGMT official instructions, because we want to use the GMT dev version
    conda create --name pygmt python=3.8 pip numpy pandas xarray netcdf4 packaging
    # activate the pygmt environment
    conda activate pygmt
    # install pygmt in editable/development mode
    cd pygmt
    make install

  2. Compile GMT using Xcode.
  3. Tell PyGMT where to find the GMT library by setting the environment variable `GMT_LIBRARY_PATH`:

    export GMT_LIBRARY_PATH=~/Gits/gmt/gmt/build/xcode/src/Debug

  4. Open Xcode.
  5. Run a Python console, attach its process ID in Xcode, and run PyGMT code in the Python console; Xcode will stop at the breakpoint.

joa-quim commented 4 years ago

Issue is at #3778. I should probably start to learn C too at some point

Please, please join the Force.

Debugging from any of the environments (Matlab, Julia, Python) should all follow the same procedure.

  1. Open the shell (ML IDE, Julia or Py REPL), type the command that will trigger the program to debug, but HOLD ON
  2. Open the debugger and Attach to process (this is VS talk, Xcode does similar but in its own way). The process is either the running Matlab process or the Julia or Py shell.
  3. Open the source code of the program to be debugged and set breakpoints.
  4. Go back to 1. and hit RETURN
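
For the Python case, step 1 is easier if the console reports which process to attach to. A small sketch using only the standard library; `pause_for_debugger` is a hypothetical helper, not part of PyGMT:

```python
import os


def pause_for_debugger(wait=input):
    """Show this interpreter's process ID, then block until RETURN is pressed.

    `wait` defaults to the built-in input(); it is a parameter only so the
    pause can be replaced in non-interactive runs.
    """
    pid = os.getpid()
    wait(f"Attach the debugger to process {pid}, set breakpoints, then press RETURN: ")
    return pid


# Typical use in a Python console, right before the call to be debugged:
#   pause_for_debugger()
#   pygmt.info(...)  # the breakpoint in the C library should be hit here
```
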
seisman commented 4 years ago

@weiji14 Perhaps we should merge https://github.com/GenericMappingTools/pygmt/pull/619 first (mark the tests as xfail) and provide a minimal Python script to reproduce the issue. Your example above is incomplete, and @PaulWessel will need to rewrite a working script before debugging.

joa-quim commented 4 years ago

Ofc, on Unix everything is more complicated :smiling_imp:

weiji14 commented 4 years ago

Please, please join the Force.

http://www.nooooooooooooooo.com/

Ofc, on Unix everything is more complicated :smiling_imp:

Yeah, I'm on Linux which doesn't seem to have Xcode :shrug:

@weiji14 Perhaps we should merge GenericMappingTools/pygmt#619 first (mark the tests as xfail) and provide a minimal Python script to reproduce the issue. Your example above is incomplete, and @PaulWessel will need to rewrite a working script before debugging.

Yep, give me a few minutes to finish up that PR.

PaulWessel commented 4 years ago

Yes, it would be great if a snippet of Python code demonstrating the problem is submitted since that is not something I will know how to do. Same with Julia of course, and @joa-quim usually posts a Julia command or mex command. We just want to lower the threshold for all of us to do these things. Even C-novices should be able to step into the C library and at least detect where a crash happens - often that is enough of a clue for the C-team to figure things out.

joa-quim commented 4 years ago

Good question for Linux. I guess that gcc should have an equivalent mechanism to Attach to process ID and from that on, the doing should be the same ... except that using gdb is an act of masochism.

PaulWessel commented 4 years ago

On Linux, I have done lots of debugging using ddd, and even on macOS before Apple switched to clang and it became a bit harder (and I have not paid too much attention). But I don't know if you can connect to the process in ddd the same way we do in Xcode or Visual Studio.

PaulWessel commented 4 years ago

ddd is just a graphical front-end to gdb which I assume can connect to other processes.

PaulWessel commented 4 years ago

@joa-quim, do you mind posting the usual setup for a newbie to install GMT.jl from the developer version, including installing Julia? I know you have some of that on your site but it would help me put the docs together if you can (like @seisman) cut/paste it in here while I am thinking about this.

weiji14 commented 4 years ago

Ok, https://github.com/GenericMappingTools/pygmt/pull/619 has been merged, and I'm starting to have a go at debugging using gdbgui (installed via pip install gdbgui). Will see if I can figure this out. Below is the Python code to run after following PyGMT installation instructions (pip install https://github.com/GenericMappingTools/pygmt/archive/master.zip), and launching Python:

import pygmt
import pandas as pd

table = pd.DataFrame(
    data={
        "z": [10, 13, 12, 15, 14],
        "time": pd.date_range(start="2020-01-01", periods=5),
    }
)
output = pygmt.info(table=table, V="d")
print(output)
# <vector memory>: N = 5 <10/15> <1577836800/1578182400>

PaulWessel commented 4 years ago

Hi @seisman, since I have done these before and have a pygmt directory:

# add conda-forge channel
conda config --prepend channels conda-forge
# NOTE: the next step is different from the PyGMT official instructions, because we want to use the GMT dev version
conda create --name pygmt python=3.8 pip numpy pandas xarray netcdf4 packaging
# activate the pygmt environment
conda activate pygmt

is there a simpler step now to just pull new pygmt and then proceed to the make install step?

seisman commented 4 years ago

is there a simpler step now to just pull new pygmt and then proceed to the make install step?

Just run the following commands to update your local pygmt:

cd pygmt 
git pull

You don't even need to install pygmt again.

then open a Python console, and type:

import pygmt
pygmt.show_versions()

PaulWessel commented 4 years ago

As for GMT/MEX there are some quirks we need to figure out and probably improve (this is for macOS - for Windows I think it is simpler because @joa-quim sticks the mex builds in with the main GMT - and these are also in the official installer). Below, xbuild is the top of my build directory with Xcode:

I can think of a few options for this mess:

  1. Include gmtmex.c and gmtmex_parser.[ch] in the main GMT repo since they are C codes that must be compiled. This would let us set a stop point in the gmtmex.c program which is called by gmt.m from matlab.
  2. Only build these if the builder has set some info (MATLAB, other flags) in CmakeConfigAdvanced.cmake
  3. Not sure, but perhaps gmtmex only retains the Matlab *.m files and tests, with building happening in the GMT tree.
  4. Perhaps this would simplify the release of the GMT bundle if we included things here as well. As a few of you know, it is hard to get the macOS gmtmex distribution to find the right libraries (e.g., netcdf, hdf), since MATLAB also ships these, and when libgmt calls netcdf, which calls hdf, we go boom in the wrong lib (which is much older than ours). The share/tools/gmt_prepmex.sh has all the sordid details.

Maybe gmtmex should be a supplement of GMT, @seisman and @joa-quim ?

PaulWessel commented 4 years ago

You don't even need to install pygmt again.

then open a Python console, and type:

import pygmt
pygmt.show_versions()

Must be a missing step:

cd pygmt
git pull [lots of updates]
make install 
pip install --no-deps -e .
Obtaining file:///Users/pwessel/GMTdev/pygmt
Installing collected packages: pygmt
  Attempting uninstall: pygmt
    Found existing installation: pygmt 0.1.2+30.gbe75223
    Uninstalling pygmt-0.1.2+30.gbe75223:
      Successfully uninstalled pygmt-0.1.2+30.gbe75223
  Running setup.py develop for pygmt
Successfully installed pygmt
(base) pwessel@macnut:~/GMTdev/pygmt-> export GMT_LIBRARY_PATH=/Users/pwessel/GMTdev/gmt-dev/xbuild/src/Debug
(base) pwessel@macnut:~/GMTdev/pygmt-> python
Python 3.8.3 (default, May 19 2020, 13:54:14) 
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pygmt
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/pwessel/GMTdev/pygmt/pygmt/__init__.py", line 15, in <module>
    from .session_management import begin as _begin, end as _end
  File "/Users/pwessel/GMTdev/pygmt/pygmt/session_management.py", line 4, in <module>
    from .clib import Session
  File "/Users/pwessel/GMTdev/pygmt/pygmt/clib/__init__.py", line 8, in <module>
    from .session import Session
  File "/Users/pwessel/GMTdev/pygmt/pygmt/clib/session.py", line 10, in <module>
    from packaging.version import Version
ModuleNotFoundError: No module named 'packaging'

weiji14 commented 4 years ago

Hmm, can you try pip install packaging (in the command-line) to get the missing dependency? Do the same for other packages if it still complains.
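
To see every missing dependency at once rather than one ModuleNotFoundError at a time, something like the following could be run first. It is only a sketch: the module list is an assumption taken from the conda create line earlier in this thread (note that the netcdf4 package is imported as `netCDF4`), and `missing_modules` is an illustrative name.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names assumed from the conda packages listed earlier in the thread.
needed = ["numpy", "pandas", "xarray", "netCDF4", "packaging"]
print(missing_modules(needed))  # an empty list means nothing is missing
```

Each name that is printed can then be installed with pip install, as suggested above.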

PaulWessel commented 4 years ago

Thanks that works. I forgot I have miniconda so presumably there is a difference in the install vs full conda. Yet, this used to run a month ago...

PaulWessel commented 4 years ago

Can now recreate your case and have it stop in the C in Xcode again. But pushing my bedtime...

weiji14 commented 4 years ago

No hurry, can do it tomorrow :wink:

PaulWessel commented 4 years ago

OK, so quick to figure out why this happens: Some modules read an entire dataset into memory (e.g., gmtconvert) while others read record-by-record (gmtinfo). The dataset-readers are carefully scanning the first record and would figure out what data type is in each column (in the case you did not use -f). Here, there is no such scanning and there is no -f1T passed to gmt info to tell it to expect absolute time in col 1. If I add f="1T" then I get

'<vector memory>: N = 5 <10/15> <2020-01-01T00:00:00/2020-01-05T00:00:00>\n'

but also some memory WARNINGS from the lib since I compile with a MEM_DEBUG setting.

That some modules require -f while others can figure it out on their own is an inconsistency we should work to fix.
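
The effect of the column-type hint can be illustrated with a toy model. This is not GMT's code: `summarize` and `format_value` are hypothetical names, and the sketch only mimics the idea that a record-by-record reader prints raw numbers unless it is told (as -f1T does) that column 1 holds absolute time.

```python
from datetime import datetime, timezone


def format_value(value, col_type=None):
    """Format one value; col_type "T" means absolute time stored as UNIX seconds."""
    if col_type == "T":
        moment = datetime.fromtimestamp(value, tz=timezone.utc)
        return moment.strftime("%Y-%m-%dT%H:%M:%S")
    return str(value)


def summarize(columns, col_types=None):
    """Build an info-like min/max summary; col_types maps column index to a type code."""
    col_types = col_types or {}
    ranges = []
    for index, column in enumerate(columns):
        kind = col_types.get(index)
        ranges.append(f"<{format_value(min(column), kind)}/{format_value(max(column), kind)}>")
    return f"N = {len(columns[0])} " + " ".join(ranges)


z = [10, 13, 12, 15, 14]
times = [1577836800 + 86400 * day for day in range(5)]  # 2020-01-01 .. 2020-01-05 (UTC)

print(summarize([z, times]))            # N = 5 <10/15> <1577836800/1578182400>
print(summarize([z, times], {1: "T"}))  # N = 5 <10/15> <2020-01-01T00:00:00/2020-01-05T00:00:00>
```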

joa-quim commented 4 years ago

do you mind posting the usual setup for a newbie to install GMT.jl from the developer version,

You mean the master version?

Installing GMT.jl is pretty simple

  1. Install Julia from https://julialang.org/
  2. In the Julia console type ] add GMT

Note, the ] changes the prompt to the pkg> mode, and the add GMT installs the last registered version. If one wants to go to the developing version we must install the master version. That is achieved with add GMT#master. Just not sure on how to come back to a registered version without removing ] rm GMT and installing again (with no #master).

PaulWessel commented 4 years ago

I think we install julia via brew and port, so more like port install julia I think. I am writing for macOS first. I would only care about master. Developers will only look for bugs in master anyway. Even you probably just use master most of the time. OK, so typing GMT#master. Any need to set path to the Xcode GMT library probably?

PaulWessel commented 4 years ago

@joa-quim I cannot do

julia> push!(Libdl.DL_LOAD_PATH, "$GMT_LIBRARY_PATH")

Can I use a shell variable in this command? It works with copy/paste of the actual path of course.

joa-quim commented 4 years ago

Ah, you are talking about GMT master, not GMT.jl master. GMT.jl doesn't care about that. It uses whatever the system tells gmt --show-library, but for debugging in Mac you need to set the env variable GMT_LIBRARY pointing to the debug lib. Note that's not the LIB_PATH but the dylib itself.

In Julia you can run shell commands using the syntax run(`the command`) but that's not what you need here.

PaulWessel commented 4 years ago

OK, so for Julia I can set things like export GMT_LIBRARY=/Users/pwessel/GMTdev/gmt-dev/xbuild/src/Debug/libgmt.dylib while for PyGMT I must set export GMT_LIBRARY_PATH=/Users/pwessel/GMTdev/gmt-dev/xbuild/src/Debug instead. Would be nice if it was the same, but not big deal.
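
To keep the two conventions straight, both values can be derived from one build directory. A sketch only: `debug_env` is a hypothetical helper, and the default file name `libgmt.dylib` is the macOS name used in this thread.

```python
from pathlib import PurePosixPath


def debug_env(debug_dir, lib_name="libgmt.dylib"):
    """Return both environment variables for one Xcode Debug build directory."""
    debug_dir = PurePosixPath(debug_dir)
    return {
        # PyGMT expects the directory that contains the library.
        "GMT_LIBRARY_PATH": str(debug_dir),
        # GMT.jl expects the path of the library file itself.
        "GMT_LIBRARY": str(debug_dir / lib_name),
    }


for name, value in debug_env("/Users/pwessel/GMTdev/gmt-dev/xbuild/src/Debug").items():
    print(f"export {name}={value}")
```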

Yes, I was coming at this from the C side, so we want to use GMT master of course. As for GMT.jl master or not, I will need your input on that, but presumably you don't want bug reports for older GMT.jl? So OK to use GMT#master for this?

PaulWessel commented 4 years ago

Also need your thoughts on the MEX. I think we can save many headaches by making gmtmex a GMT supplement.

PaulWessel commented 4 years ago

BTW, both my PyGMT and GMT.jl debug write up now works for me.

joa-quim commented 4 years ago

Would be nice if it was the same, but not big deal.

I think that could be possible if pygmt used the equivalent of string(chop(read(`gmt --show-library`, String))) As a bonus it would probably release the need of setting a GMT_LIBRARY_PATH

Currently GMT.jl#master has a ton of improvements. The idea was to release a 1.0 but they are too many now. Need a longer testing period. And yes, use the #master

Regarding the MEX, yes, making it a supplement is a good idea.

seisman commented 4 years ago

I think that could be possible if pygmt used the equivalent of string(chop(read(`gmt --show-library`, String))) As a bonus it would probably release the need of setting a GMT_LIBRARY_PATH

PyGMT is trying its best to avoid system calls, but gmt --show-library sounds like a good idea to me. I'll open a PyGMT issue for more discussion.

PaulWessel commented 4 years ago

So not out of the woods with Julia. I made sure I rebuilt the Xcode libgmt. I have the new Xcode 12.0, released a few days ago. I follow the recipe of attaching to julia etc. Running one of Joaquim's examples (using GMT, then a few calcs, then a call to coast) takes me to the stop point I have set in the API in GMT_Call_Module. But there it sits, and I cannot examine any variables nor can I click step to get into the function. So unable to debug. I have done this before with an earlier Xcode so the recipe should work, but maybe 12.0 has a glitch. Yet, the PyGMT debug seemed to work. I do not know what the issue might be though. Update: Works with PyGMT.

joa-quim commented 4 years ago

Funny, since sometime now I notice that in VS when I try to access the call stack it hangs for a while (and processors are working hard doing no-shit), but then after that while things resume to work fine. This happens only if I try to access the stack, otherwise all works fine.

PaulWessel commented 4 years ago

Yes, so it gets to the breakpoint but the little console window in Xcode that shows the stack variables just has a spinning thingy indicating it is working hard. I will let it sit but this is not just a few seconds - I don't think it will finish. So clicking step in won't work because it is "busy". Notice all the things listed under the Thread in the side-bar. I don't remember all that before.

[Screenshot: xcode]

PaulWessel commented 4 years ago

So I cannot get out of this spin cycle and thus cannot debug Julia anymore. Not cool.

PaulWessel commented 4 years ago

Will try Xcode 11.7. Only 7.5 Gb zip to download and expand, etc..

PaulWessel commented 4 years ago

Xcode 11.7 works so will make a note in the debug notes.

joa-quim commented 4 years ago

Good, but damn it.

PaulWessel commented 4 years ago

I think the initial issue here got buried in debugging discussion. Hi @weiji14, please confirm that this is still a problem..

weiji14 commented 4 years ago

Yes, still a problem on GMT master. Had to silence the error messages two days ago at https://github.com/GenericMappingTools/pygmt/pull/668 as it was getting annoying.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had activity in the last 90 days. It will be closed if no further activity occurs within 7 days. Thank you for your contributions.

maxrjones commented 3 years ago

@weiji14 the output appears as you wanted it - is this because of workarounds implemented in PyGMT? Would you be able to provide an example that still demonstrates the bug for debugging purposes?

import pygmt
import pandas as pd

table = pd.DataFrame(
    data={
        "z": [10, 13, 12, 15, 14],
        "time": pd.date_range(start="2020-01-01", periods=5),
    }
)
output = pygmt.info(table=table, V="d")
print(output)

# <vector memory>: N = 5 <10/15> <2020-01-01T00:00:00/2020-01-05T00:00:00>

weiji14 commented 3 years ago

Hmm, you might be right! I see that these two unit tests at https://github.com/GenericMappingTools/pygmt/blob/v0.3.1/pygmt/tests/test_info.py#L67-L107 are now passing on GMT master (i.e. 6.2.0) in the GMT Latest tests (https://github.com/GenericMappingTools/pygmt/runs/2104362367?check_suite_focus=true#step:14:324). Not sure when this got fixed :sweat_smile:

Just to double check, could you try this example out (passing in only one datetime column from a numpy array).

import pandas as pd
import pygmt

table = pd.date_range(start="2020-01-01", periods=5).to_numpy()
output = pygmt.info(table=table, per_column=True, V="d")
print(output)
# ['2020-01-01T00:00:00' '2020-01-05T00:00:00']

maxrjones commented 3 years ago

Hmm, you might be right! I see that these two unit tests at https://github.com/GenericMappingTools/pygmt/blob/v0.3.1/pygmt/tests/test_info.py#L67-L107 are now passing on GMT master (i.e. 6.2.0) in the GMT Latest tests (https://github.com/GenericMappingTools/pygmt/runs/2104362367?check_suite_focus=true#step:14:324). Not sure when this got fixed 😅

Just to double check, could you try this example out (passing in only one datetime column from a numpy array).

import pandas as pd
import pygmt

table = pd.date_range(start="2020-01-01", periods=5).to_numpy()
output = pygmt.info(table=table, per_column=True, V="d")
print(output)
# ['2020-01-01T00:00:00' '2020-01-05T00:00:00']

Output looks good, I think:

gmtinfo [INFORMATION]: Processing input table data
gmtinfo [DEBUG]: gmtapi_init_import: Passed family = Data Table and geometry = Point|Line|Poly
gmtinfo [DEBUG]: gmtapi_init_import: Added 1 new sources
gmtinfo [DEBUG]: GMT_Init_IO: Returned first Input object ID = 0
gmtinfo [DEBUG]: gmtapi_init_export: Passed family = Data Table and geometry = Point|Line|Poly
gmtinfo [DEBUG]: Object ID 1 : Registered Data Table File /var/folders/48/71lnqfxj77j8wrw6k0rtklv00000gq/T/pygmt-57pz3p76.txt as an Output resource with geometry Point|Line|Poly [n_objects = 2]
gmtinfo [DEBUG]: gmtapi_init_export: Added 1 new destination
gmtinfo [DEBUG]: GMT_Init_IO: Returned first Output object ID = 1
gmtinfo [DEBUG]: GMT_Begin_IO: Mode value 1 not considered (ignored)
gmtinfo [DEBUG]: GMT_Begin_IO: Initialize record-by-record access for Input
gmtinfo [DEBUG]: gmtapi_next_io_source: Selected object 0
gmtinfo [INFORMATION]: Reading Data Table from Input memory location via vector
gmtinfo [DEBUG]: GMT_Begin_IO: Input resource access is now enabled [record-by-record]
gmtinfo [DEBUG]: GMT_Begin_IO: Initialize record-by-record access for Output
gmtinfo [DEBUG]: gmtapi_next_io_source: Selected object 1
gmtinfo [INFORMATION]: Writing Data Table to file /var/folders/48/71lnqfxj77j8wrw6k0rtklv00000gq/T/pygmt-57pz3p76.txt
gmtinfo [DEBUG]: GMT_Begin_IO: Output resource access is now enabled [record-by-record]
gmtinfo [DEBUG]: GMT_End_IO: Input resource access is now disabled
gmtinfo [DEBUG]: GMT_End_IO: Output resource access is now disabled
==> 2 API Objects at end of GMTAPI_Garbage_Collection entry
--------------------------------------------------------
K.. ID RESOURCE.... FAMILY.... ACTUAL.... DIR... S O M L
--------------------------------------------------------
* 0  0 7fb7aa6283a0 Data Table Vector     Input  2 Y N 0
* 1  1            0 Data Table Data Table Output 2 Y N 1
--------------------------------------------------------
gmtinfo [DEBUG]: gmtlib_unregister_io: Unregistering object no 1 [n_objects = 1]
==> 1 API Objects at end of GMTAPI_Garbage_Collection exit
--------------------------------------------------------
K.. ID RESOURCE.... FAMILY.... ACTUAL.... DIR... S O M L
--------------------------------------------------------
* 0  0 7fb7aa6283a0 Data Table Vector     Input  2 Y N 0
--------------------------------------------------------
['2020-01-01T00:00:00' '2020-01-05T00:00:00']

weiji14 commented 3 years ago

Nice, I'll close this issue then and wait for GMT 6.2.0 to be released. Thanks everyone!

maxrjones commented 3 years ago

Fixed by https://github.com/GenericMappingTools/gmt/pull/4849