tmontaigu / pylas

⚠️ pylas was merged into laspy 2.0: https://github.com/laspy/laspy ⚠️
BSD 3-Clause "New" or "Revised" License

Regression: SubFieldView does not support some numpy operations #29

Closed rdesparbes closed 3 years ago

rdesparbes commented 3 years ago

In pylas==0.4.3 for example, it was possible to concatenate fields and to apply several numpy operations to the fields of the point clouds. Unfortunately, some operations are not possible anymore in pylas==0.5.0a1, probably because of the new SubFieldView class. Here are some examples of operations that crash with pylas==0.5.0a1 but worked perfectly with pylas==0.4.3:

>>> import pylas
>>> import numpy as np
>>> cloud = pylas.read("pylastests/simple.las")
>>> np.concatenate([cloud.return_number, cloud.return_number])
...
ValueError: zero-dimensional arrays cannot be concatenated
>>> cloud.return_number[0] + 1
...
TypeError: unsupported operand type(s) for +: 'SubFieldView' and 'int'
>>> field = np.zeros(len(cloud.points), dtype=np.uint8)
>>> field[:] = cloud.return_number[:]
...
TypeError: __array__() takes 1 positional argument but 2 were given
>>> ordered = np.argsort(cloud.gps_time)
>>> cloud.return_number[:] = cloud.return_number[ordered]
...
TypeError: unsupported operand type(s) for <<: 'SubFieldView' and 'int'

tmontaigu commented 3 years ago

The SubFieldView class was introduced to fix an inconsistency between 'subfields' (bit fields that are stored in less than a byte, such as return_number) and regular fields (e.g. gps_time, user_data, point_source_id).
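
To make the notion of a sub field concrete, here is a minimal NumPy-only sketch. It assumes the bit layout of LAS point formats 0 to 5, where return_number occupies the three low bits of a shared byte; the array contents and the names `packed`, `points` and `RETURN_NUMBER_MASK` are made up for illustration and are not part of pylas.

```python
import numpy as np

# Hypothetical flag bytes for three points, assuming the LAS point format 0-5
# layout: bits 0-2 = return_number, bits 3-5 = number_of_returns.
packed = np.array([0b00001001, 0b00010010, 0b00011011], dtype=np.uint8)
RETURN_NUMBER_MASK = 0b00000111

# Extracting a sub field needs a mask, which allocates a new array.
return_number = packed & RETURN_NUMBER_MASK
print(return_number)  # [1 2 3]

# Writing into that array leaves the packed bytes untouched.
return_number[:] = 7
print(packed & RETURN_NUMBER_MASK)  # [1 2 3]

# A regular field of a structured array is a view: writes propagate.
points = np.zeros(3, dtype=[("bit_fields", "u1"), ("user_data", "u1")])
user_data = points["user_data"]
user_data[:] = 5
print(points["user_data"])  # [5 5 5]
```

The masked result cannot share memory with the packed bytes, which is why a sub field naturally comes back as a copy unless a dedicated view object is placed in between.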

In pylas < 0.5, accessing a sub field such as las.return_number returned a copy, meaning that modifications would not propagate.


import numpy as np
import pylas

las = pylas.read("pylastests/simple.las")

print(las.return_number)
# array([1, 1, 1, ..., 1, 1, 1], dtype=uint8)

# argsort sorts ascending; reversing the indices gives descending order
descending_order = np.argsort(las.return_number)[::-1]
print(las.return_number[descending_order])
# array([4, 4, 4, ..., 1, 1, 1], dtype=uint8)

# In pylas < 0.5 this writes into a temporary copy, so nothing changes
las.return_number[:] = las.return_number[descending_order]
print(las.return_number)
# array([1, 1, 1, ..., 1, 1, 1], dtype=uint8) # big oof

# To actually update the field, assign to the attribute itself
rn = las.return_number[descending_order]
las.return_number = rn
print(las.return_number)
# array([4, 4, 4, ..., 1, 1, 1], dtype=uint8)

Whereas with pylas >= 0.5, the same script behaves more consistently, since sub fields are handled like regular fields.

The SubFieldView tries to behave as much as possible like a np.ndarray; however, some things may be impossible to imitate.
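
For readers wondering what such a view involves, below is a rough sketch of a write-through view over a bit field. The class `BitFieldView`, its attributes and the sample bytes are hypothetical and written only for this illustration; this is not pylas's SubFieldView, whose implementation may differ substantially. The sketch assumes that `__array__` accepts the optional `dtype` (and, for newer NumPy versions, `copy`) arguments that NumPy may pass when converting the object.

```python
import numpy as np


class BitFieldView:
    """Hypothetical write-through view over a bit field packed in uint8 bytes.

    Illustration only; this is not pylas's SubFieldView.
    """

    def __init__(self, packed, mask):
        self.packed = packed
        self.mask = np.uint8(mask)
        # Position of the lowest set bit of the mask = shift of the field.
        self.shift = (int(mask) & -int(mask)).bit_length() - 1

    def __array__(self, dtype=None, copy=None):
        # `copy` is accepted for newer NumPy versions and ignored in this sketch.
        values = (self.packed & self.mask) >> self.shift
        return values if dtype is None else values.astype(dtype)

    def __len__(self):
        return len(self.packed)

    def __getitem__(self, index):
        # Reads return plain NumPy data extracted from the packed bytes.
        return self.__array__()[index]

    def __setitem__(self, index, values):
        # Writes clear the field's bits, then store the new values in place.
        values = np.asarray(values, dtype=np.uint8)
        cleared = self.packed[index] & ~self.mask
        self.packed[index] = cleared | ((values << self.shift) & self.mask)


packed = np.array([0b00001001, 0b00010010, 0b00011011], dtype=np.uint8)
return_number = BitFieldView(packed, 0b00000111)
number_of_returns = BitFieldView(packed, 0b00111000)

# In-place assignment now reaches the underlying bytes.
return_number[:] = return_number[::-1]
print(np.asarray(return_number))      # [3 2 1]
print(np.asarray(number_of_returns))  # [1 2 3], neighbouring bits untouched

# Assigning the view into a regular array goes through __array__.
field = np.zeros(len(return_number), dtype=np.uint8)
field[:] = return_number
print(field)  # [3 2 1]
```

Arithmetic, comparisons and functions such as np.concatenate would each need further support in a real class, which gives an idea of why matching np.ndarray completely is hard.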

As for the errors you mentioned:

>>> np.concatenate([cloud.return_number, cloud.return_number])
...
ValueError: zero-dimensional arrays cannot be concatenated

^ Should be fixable

>>> cloud.return_number[0] + 1
...
TypeError: unsupported operand type(s) for +: 'SubFieldView' and 'int'

^ Should be fixable

>>> ordered = np.argsort(cloud.gps_time)
>>> cloud.return_number[:] = cloud.return_number[ordered]
...
TypeError: unsupported operand type(s) for <<: 'SubFieldView' and 'int'

^ Is already fixed on master

>>> field = np.zeros(len(cloud.points), dtype=np.uint8)
>>> field[:] = cloud.return_number[:]
...
TypeError: __array__() takes 1 positional argument but 2 were given

^ I don't think it is fixable. However, you can copy the SubFieldView into a proper numpy array:

field = np.array(cloud.return_number)
print(field)
# [4 4 4 ... 1 1 1]
# If field is modified, write it back through the view:
cloud.return_number[:] = field[:]