PDAL / python

PDAL's Python Support

Dataframe dtypes are changed after including a `filters.assign` to the pipeline #174

Closed kylemann16 closed 3 weeks ago

kylemann16 commented 2 months ago

What's happening: The PDAL Python library seems to change data types when a pipeline includes filters.assign. In the example below, NumberOfReturns and ReturnNumber both change from uint8 to float64 after being manipulated by filters.assign.

What's expected: Data types shouldn't change unless that is explicitly requested somewhere.

I haven't looked into this much yet, so I'm not sure how deep it goes. I tried adding a writers.copc stage at the end and ran pdal info on the output, and the data types were as expected (uint8). It's possible, though, that PDAL is manipulating these dimensions and then casting them back when writing out.

Minimal Example:

import pdal
url = 'https://github.com/PDAL/data/raw/master/autzen/autzen-classified.copc.laz'

def just_reader():
    reader = pdal.Reader(filename=url)
    p = reader.pipeline()
    p.execute()
    df = p.get_dataframe(0)
    print('Just the reader: ')
    print(df[['ReturnNumber','NumberOfReturns']].dtypes)

def rn_nor_added():
    reader = pdal.Reader(filename=url)
    # Each assign filter only rewrites values below 1, so the dimension types should stay uint8.
    rn = pdal.Filter.assign(value="ReturnNumber = 1 WHERE ReturnNumber < 1")
    nor = pdal.Filter.assign(value="NumberOfReturns = 1 WHERE NumberOfReturns < 1")
    p = reader | rn | nor
    p.execute()
    df = p.get_dataframe(0)
    print('With assign filters: ')
    print(df[['ReturnNumber','NumberOfReturns']].dtypes)

just_reader()
rn_nor_added()

kylemann16 commented 2 months ago

logs:

Just the reader: 
ReturnNumber       uint8
NumberOfReturns    uint8
dtype: object
With assign filters: 
ReturnNumber       float64
NumberOfReturns    float64
dtype: object

abellgithub commented 2 months ago

This is a bug in the base PDAL code. I'll try to fix tomorrow.
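
Until a fix is available in base PDAL, one possible Python-side workaround is to cast the affected columns back after calling get_dataframe. The sketch below is only an illustration: restore_dtypes is a hypothetical helper (not part of the pdal package), and the small DataFrame stands in for the one the pipeline returns. It assumes the float64 columns still hold whole numbers that fit in uint8, and it raises instead of silently truncating when they do not.

```python
import numpy as np
import pandas as pd


def restore_dtypes(df, expected):
    """Return a copy of df with the listed columns cast to the expected dtypes.

    Hypothetical helper, not part of the pdal package. Raises ValueError if a
    cast would change any value (for example 1.5 -> 1 or an out-of-range value).
    """
    out = df.copy()
    for name, dtype in expected.items():
        original = out[name]
        restored = original.astype(dtype)
        # Round-trip check: casting back must reproduce the original values.
        if not np.array_equal(restored.astype(original.dtype), original):
            raise ValueError(f"casting {name!r} to {dtype} would change values")
        out[name] = restored
    return out


# Stand-in for the DataFrame returned by p.get_dataframe(0) in the report above.
df = pd.DataFrame(
    {
        "ReturnNumber": [1.0, 2.0, 1.0],
        "NumberOfReturns": [1.0, 2.0, 3.0],
    }
)

fixed = restore_dtypes(df, {"ReturnNumber": "uint8", "NumberOfReturns": "uint8"})
print(fixed.dtypes)
```

With the real pipeline, the same call would be applied to the result of p.get_dataframe(0). This only masks the symptom in the DataFrame; the underlying dimension types are still determined by PDAL itself.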