PDAL / python

PDAL's Python Support
Other
116 stars 34 forks source link

python 3.9 / pdal 3.0.2 / MacOSX pdal --drivers --showjson segfault #108

Closed davelajoie closed 2 years ago

davelajoie commented 2 years ago

Hello!

I have random error whenever importing PDAL in my python environment.

>>> import pdal
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/dave.lajoie/Miniconda3/lib/python3.9/site-packages/PDAL-3.0.2-py3.9-macosx-10.15-x86_64.egg/pdal/__init__.py", line 8, in <module>
    inject_pdal_drivers()
  File "/Users/dave.lajoie/Miniconda3/lib/python3.9/site-packages/PDAL-3.0.2-py3.9-macosx-10.15-x86_64.egg/pdal/drivers.py", line 62, in inject_pdal_drivers
    drivers = json.loads(
  File "/Users/dave.lajoie/Miniconda3/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/Users/dave.lajoie/Miniconda3/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Users/dave.lajoie/Miniconda3/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 480 column 22 (char 16384)

I can reproduce the problem with this simple python file

import json
import subprocess
drivers = json.loads( subprocess.run(["pdal", "--drivers", "--showjson"], capture_output=True).stdout )

where you run it multiple times, and you will get an error, randomly. Seems to be related to dummy character appearing at the end of the json being streamed with

>pdal --drivers --showjson
    snip....
    },
    {
        "description": "Write data using TileDB.",
        "extensions": [],
        "link": "http://pdal.io/stages/drivers.tiledb.writer.html",
        "name": "writers.tiledb",
        "streamable": true
    }
]%
davelajoie commented 2 years ago

From what I can see the stream is being broken, hence json parsing fails I have printed the

subprocess.run(["pdal", "--drivers", "--showjson"], capture_output=True).stdout

before being processed by

json.loads()
"description": "Provide a hook for a simple point-by-point callback.",\n        "extensions": [],\n        "link": "",\n        "name": "filters.streamcallback",\n        "streamable": true\n    },\n    {\n        "description": "Return N points from end of the point cloud.",\n        "extensions": [],\n        "link": "http://pdal.io/stages/filters.tail.html",\n        "name": "filters.tail",\n        "streamable": false\n    },\n    {\n        "description": "Transform each point using a 4x4 transformation matrix",\n        "extensions": [],\n        "link": "http://pdal.io/stages/filters.transformation.html",\n        "name": "filters.transformation",\n        "streamable": true\n    },\n    {\n        "description": "Voxel Center Nearest Neighbor Filter",\n        "extensions":'
 Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/dave.lajoie/Miniconda3/lib/python3.9/site-packages/PDAL-3.0.2-py3.9-macosx-10.15-x86_64.egg/pdal/__init__.py", line 8, in <module>
    inject_pdal_drivers()
  File "/Users/dave.lajoie/Miniconda3/lib/python3.9/site-packages/PDAL-3.0.2-py3.9-macosx-10.15-x86_64.egg/pdal/drivers.py", line 67, in inject_pdal_drivers
    drivers = json.loads( driversStr )
  File "/Users/dave.lajoie/Miniconda3/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/Users/dave.lajoie/Miniconda3/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Users/dave.lajoie/Miniconda3/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 480 column 22 (char 16384)
davelajoie commented 2 years ago

I have updated the pdal.drivers.inject_pdal_drivers() to use the following code

driversStr = subprocess.check_output( [ "pdal", "--drivers", "--showjson" ] )
drivers = json.loads( driversStr.decode('utf-8') )

and I got SIGSEGV

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/dave.lajoie/Miniconda3/lib/python3.9/site-packages/PDAL-3.0.2-py3.9-macosx-10.15-x86_64.egg/pdal/__init__.py", line 8, in <module>
    inject_pdal_drivers()
  File "/Users/dave.lajoie/Miniconda3/lib/python3.9/site-packages/PDAL-3.0.2-py3.9-macosx-10.15-x86_64.egg/pdal/drivers.py", line 63, in inject_pdal_drivers
    driversStr = subprocess.check_output( [ "pdal", "--drivers", "--showjson" ] )
  File "/Users/dave.lajoie/Miniconda3/lib/python3.9/subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/Users/dave.lajoie/Miniconda3/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['pdal', '--drivers', '--showjson']' died with <Signals.SIGSEGV: 11>.
> pdal --drivers --showjson
zsh: segmentation fault  pdal --drivers --showjson
davelajoie commented 2 years ago

Here is my current hacked workaround...

for _ in range( 5 ):
    try:
        import pdal
        break
    except:
        pass
hobu commented 2 years ago

I have seen the extra % character getting printed in --showjson output on only OSX from time to time as well. I have not been able to track down what combination of shell and configuration settings cause it, however.

You can always set your pin back to an older version of the PDAL bindings if you need to get things going. A newer version of the PDAL bindings will not be calling out to shell and will instead build up the drivers/options dictionaries directly, so this issue should go away with that update in the next version.

davelajoie commented 2 years ago

Tx Howard! I don't think the % is the issue. I don't get it if I use subprocess.check_output(). I think the SEGV is more concerning

hobu commented 2 years ago

I think the SEGV is more concerning

Yes, I've been having that one too. I get a stack trace that looks like this in RelWithDebugInfo builds. Can you confirm?

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x13c12c8a0)
  * frame #0: 0x000000013c12c8a0 libc++.1.0.dylib`std::__1::moneypunct<char, false>::do_positive_sign() const
    frame #1: 0x00000001c3403f84 libc++.1.dylib`std::__1::basic_ostream<char, std::__1::char_traits<char> >::operator<<(unsigned long long) + 400
    frame #2: 0x000000014cd40010 libpdalcpp.13.0.0.dylib`pdal::LasWriter::finishOutput(this=0x0000000101023600) at LasWriter.cpp:1151:45
    frame #3: 0x000000014cd3fdf8 libpdalcpp.13.0.0.dylib`pdal::LasWriter::doneFile(this=0x0000000101023600) at LasWriter.cpp:1136:5
    frame #4: 0x000000014cb9e50c libpdalcpp.13.0.0.dylib`pdal::FlexWriter::done(this=0x0000000101023600, table=0x0000600003b017a0) at FlexWriter.hpp:142:13
davelajoie commented 2 years ago

sure, I am not building my own pdal, I am using conda/mamba to setup my environment. do you have a dylib I could try? or an env var?

hobu commented 2 years ago

that's ok. I'm pretty sure it's the same thing. Our suspicion is it is an SDK behavior difference in OSX, or maybe we're mucking up the memory management of the LogPtr in the Python bindings when stuff cleans up. We haven't had time to run it down.

davelajoie commented 2 years ago

Tx for the update. let me know how I can help. I can create a setup to test PR and validate if needs be.

hobu commented 2 years ago

Can you pin back to an older Python bindings for now? I don't know when we might get to this.

abellgithub commented 2 years ago

Perhaps the base error is because PDAL doesn't flush standard out when it exits? This isn't normally necessary, but perhaps this an ordering thing. stdout gets flushed at exit, but perhaps the pipe that Python is using is closed before stdout so the output data is lost.

I have no clue about the crash and think I tried to recreate and failed. It appears that the log is invalid at the time a write occurs. Are you running with --debug? Because it looks like the crash is a log line that only happens when --debug is on.

davelajoie commented 2 years ago

looks like it, when it segfaults, the last character is missing ] here is the output with --debug, whenever it is crashing PDAL_python_issue_108.txt

abellgithub commented 2 years ago

Thinking more, I can't see how Python wouldn't get the full contents of the output to stderr unless maybe the pipe that was used as FD 1 was full and PDAL isn't waiting around to exit? That kind of makes sense. @hobu is right that doing something other than capturing standard out would be more robust.

abellgithub commented 2 years ago

Does it crash without --debug?

davelajoie commented 2 years ago

crash with and without --debug

rdesparbes commented 2 years ago

Hello! It seems that I encountered a similar behavior on Windows 10:

Traceback (most recent call last):
  File "C:\Users\despa\Desktop\test.py", line 1, in <module>
    import pdal
  File "C:\Users\despa\miniconda3\envs\test\lib\site-packages\pdal\__init__.py", line 8, in <module>
    inject_pdal_drivers()
  File "C:\Users\despa\miniconda3\envs\test\lib\site-packages\pdal\drivers.py", line 77, in inject_pdal_drivers
    drivers = json.loads(output)
  File "C:\Users\despa\miniconda3\envs\test\lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "C:\Users\despa\miniconda3\envs\test\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\despa\miniconda3\envs\test\lib\json\decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 760 column 11 (char 24576)
hobu commented 2 years ago

master has the patch for this if you can install from it.

BC14w commented 2 years ago

I was not succesful using master branch for this. once I run setup.py, and try to import pdal again, I get this error: ImportError: cannot import name 'libpdalpython' from partially initialized module 'pdal' (most likely due to a circular import) Has anyone run into this issue too?

hobu commented 2 years ago

pdal 3.1.2 Python bindings has a fix for this and no longer calls out to the pdal command. Please reopen if issues still persist.

davelajoie commented 2 years ago

Tx!