PDAL / python


How to pass SRS and other metadata from one pipeline to another #112

Closed: digital-idiot closed this issue 1 year ago

digital-idiot commented 2 years ago

I am trying to read a georeferenced point cloud, manually perform some processing on the array, and then write the processed point cloud to a file. My code looks roughly like the following:

import json
import pdal

import_template = [
    {
        "type": "readers.las",
        "filename": str(file_path)
    }
]
import_pipeline = pdal.Pipeline(json.dumps(import_template))
import_pipeline.execute()
pc_arr = import_pipeline.get_arrays()[0]

# Here I do the processing (this does not change the shape or dtype of pc_arr)

# I want to export the processed array with same metadata attributes such as SRS etc
export_template = [
    {
        "filename": str(out_path),
        "type": "writers.las",
        "a_srs": <a_srs>  # note: writers.las takes a_srs (out_srs belongs to filters.reprojection)
    }
]

I could not find any clue in the documentation on how to achieve this. I tried to assign the metadata from import_pipeline to export_pipeline.metadata, but it seems Pipeline.metadata is not modifiable this way, as it raises an AttributeError.

abellgithub commented 2 years ago

Hi Abhisek,

I'm afraid there's no easy way to get the SRS with the Python interface, though it is available from the pipeline as part of the metadata. You can call get_metadata() on the pipeline to return the pipeline metadata as a JSON string. You should be able to extract the SRS from that, which is stored as WKT.
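For example, a rough sketch using your import_pipeline (the key layout here follows what readers.las produces, as discussed later in this thread):

import json

# After import_pipeline.execute(), the metadata is available as a JSON string.
meta = json.loads(import_pipeline.get_metadata())

# The reader's SRS lives under its stage key, stored as WKT.
srs_wkt = meta["metadata"]["readers.las"]["srs"]["compoundwkt"]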

Hope that helps,


abellgithub commented 2 years ago

I should add that your code can probably be restructured into a single pipeline if you like. The mystery bit in the middle can be done with filters.python.
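For example, a minimal sketch of that restructuring; my_filter.py and process are placeholder names, and within a single pipeline the writer should pick up the reader's SRS automatically:

single_template = [
    {
        "type": "readers.las",
        "filename": str(file_path)
    },
    {
        "type": "filters.python",
        "script": "my_filter.py",  # placeholder: file holding the processing code
        "module": "anything",      # name the script is imported under
        "function": "process"      # def process(ins, outs): ... return True
    },
    {
        "type": "writers.las",
        "filename": str(out_path)  # the reader's SRS carries through the pipeline
    }
]
pipeline = pdal.Pipeline(json.dumps(single_template))
pipeline.execute()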

hobu commented 2 years ago

I'll add that in the PDAL Python master branch I have added support for QuickInfo (https://github.com/PDAL/python/pull/109), which should allow you to get the SRS and other metadata for LAS files without doing a full read of the data. This hasn't been released yet (I am waiting until PDAL 2.4.0 is released before issuing the PDAL Python bindings release), but once it is, you might find it useful for this task.
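Once released, usage might look roughly like this (a sketch based on that PR; the exact dictionary layout may differ):

pipeline = pdal.Pipeline(json.dumps(import_template))
info = pipeline.quickinfo   # preview metadata without reading all the points
print(info["readers.las"])  # per-stage info, including the SRS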

digital-idiot commented 2 years ago

I should add that your code can probably be restructured to be a single pipeline if you like. The mystery bit in the middle can be done with filters.python.

Hi @abellgithub, yes, I am aware that I can pass functions to filters in the pipeline and reduce the whole thing to a single pipeline. However, in my case the processing part is a bit complicated: the point cloud array is passed to a deep learning model on the GPU, and I am not sure whether all of this can be managed through filters, so I decided to implement it this way.

You should be able to extract the SRS, which is stored as WKT.

I noticed that the metadata contains several WKT strings. For example, under srs the following keys appear to contain WKT: compoundwkt, horizontal, and wkt. There are also prettycompoundwkt and prettywkt, which I believe are just formatted versions of their non-pretty counterparts. In my case srs -> vertical was empty, probably because no vertical CRS was set in this particular case. What I observed is that all of these WKT strings refer to the same CRS. I tested this using pyproj:

import json
from pyproj.crs import CRS

m = json.loads(pipeline.get_metadata())
# each WKT variant parses to the same CRS
crs1 = CRS.from_wkt(m['metadata']['readers.las']['srs']['wkt'])
...

I understand this is specific to a particular file, but I am a bit confused about why there are so many WKT strings in the metadata. If I simply want the SRS, which one should I use?

abellgithub commented 2 years ago

We provide the SRS broken up as a convenience so a user doesn't have to feed the text through another application to extract the various bits. wkt is there for historical reasons -- it's always the same as horizontal. If you just want to pass through whatever was on the original source, use compoundwkt.
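Putting that together with the original two-pipeline code, a sketch of the export side might look like this, passing the extracted WKT to the writer via its a_srs option:

meta = json.loads(import_pipeline.get_metadata())
srs_wkt = meta["metadata"]["readers.las"]["srs"]["compoundwkt"]

export_template = [
    {
        "type": "writers.las",
        "filename": str(out_path),
        "a_srs": srs_wkt  # pass through whatever was on the original source
    }
]
# Feed the processed array in alongside the pipeline definition.
export_pipeline = pdal.Pipeline(json.dumps(export_template), arrays=[pc_arr])
export_pipeline.execute()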

hobu commented 1 year ago

f635e955fb5e53bc667b2506f923b5f508337301 adds a srswkt2 property to fetch the SRS (as WKT2) from a pipeline. You can pass that into pyproj to get a projection object you can do stuff with.
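For example, with a build that includes that commit:

from pyproj.crs import CRS

pipeline = pdal.Pipeline(json.dumps(import_template))
pipeline.execute()
crs = CRS.from_wkt(pipeline.srswkt2)  # WKT2 string exposed by the new property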