PDAL / python

PDAL's Python Support

PDAL python should protect its pipeline object from manipulation in PDAL #163

Open kylemann16 opened 5 months ago

kylemann16 commented 5 months ago

PDAL's readers.stac (and readers.tindex) manipulate the incoming pipeline to allow for adding readers as inputs while processing. This means that the incoming pipeline will be changed from what the user inputs when using python, and makes it impossible to run an execute on the same pipeline twice and get the same result.

Expected result: Created pipeline object should always match what the user inputs

Actual result: The front-facing pipeline doesn't change, but the underlying pipeline object that's being operated on does change.

Bug illustrated here: https://gist.github.com/kylemann16/ab92ba35f45c6b59420fabedc61591f8
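For readers without access to the gist, the reported behavior can be sketched with a toy model that does not use PDAL at all. Everything here is hypothetical: `execute`, the `toy.catalog` and `toy.reader` stage types, and the file names are stand-ins for a stage that adds readers to the pipeline while it runs. The sketch only shows why handing the same mutable object to the engine twice gives different results, and how a fresh copy per run would avoid that.

```python
import copy


def execute(stages):
    """Toy stand-in for an engine run that expands a catalog-style stage in place."""
    # Iterate over a snapshot so that inserting into `stages` is safe.
    for stage in list(stages):
        if stage["type"] == "toy.catalog":
            # Hypothetical expansion: one concrete reader per listed item.
            for item in stage["items"]:
                stages.insert(0, {"type": "toy.reader", "filename": item})
    return len(stages)


# What the user wrote: a single catalog-style stage (assumed names).
user_pipeline = [{"type": "toy.catalog", "items": ["a.las", "b.las"]}]

# Unprotected: the same underlying object is reused for every run,
# so readers added by the first run are still there for the second.
shared = copy.deepcopy(user_pipeline)
unprotected_counts = [execute(shared), execute(shared)]

# Protected: each run receives a fresh deep copy of the user's input.
protected_counts = [execute(copy.deepcopy(user_pipeline)) for _ in range(2)]
```

In this toy model the unprotected runs see a growing stage list, while the protected runs start from the same input each time and `user_pipeline` itself is never modified.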

hobu commented 4 months ago

> Expected result: Created pipeline object should always match what the user inputs

This has never been how PDAL works. The pipeline is dynamic until initialize is called, and stages are allowed to modify the pipeline up until that point, including inserting additional stages, setting their options, and adjusting metadata. The Python bindings are a mirror of the stages and their relationships, and they are handed off to PDAL as JSON. The JSON that PDAL might give back to Python is not necessarily going to be the same.
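The hand-off described above can be sketched in plain Python as a rough illustration. This is not PDAL's API: `toy_engine_prepare` and the `toy.catalog` and `toy.reader` stage types are hypothetical names, and the expansion rule is assumed. The point is only that the Python-side mirror is serialized to JSON, the engine is free to edit its own copy before initialization, and the JSON it gives back need not equal what was sent.

```python
import json


def toy_engine_prepare(pipeline_json):
    """Hypothetical engine step: parse the hand-off, let stages edit it, return JSON."""
    stages = json.loads(pipeline_json)["pipeline"]
    expanded = []
    for stage in stages:
        if stage.get("type") == "toy.catalog":
            # A stage may insert additional stages before initialization.
            expanded.extend(
                {"type": "toy.reader", "filename": name} for name in stage["items"]
            )
        expanded.append(stage)
    return json.dumps({"pipeline": expanded})


# Python-side mirror of the stages, exactly as the user wrote them (assumed names).
mirror = {"pipeline": [{"type": "toy.catalog", "items": ["a.las", "b.las"]}]}

handed_off = json.dumps(mirror)               # what goes to the engine
given_back = toy_engine_prepare(handed_off)   # what the engine reports afterwards

# The mirror is untouched, but the returned JSON has extra stages.
same_as_input = json.loads(given_back) == mirror
```

Because the hand-off crosses a serialization boundary in this sketch, the mirror stays as written while the engine's view gains two reader stages.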

> This means that the incoming pipeline will be changed from what the user inputs when using python, and makes it impossible to run an execute on the same pipeline twice and get the same result.

I'm not sure what you mean here. If readers.stac is not providing the same results with repeated runs from pipeline or Python, that's a bug in readers.stac, not in PDAL or the PDAL Python bindings.