natcap / pygeoprocessing

Geoprocessing operations for Python
https://pygeoprocessing.readthedocs.io/en/latest/
Other
78 stars 8 forks source link

handle `numpy` dtypes for scalar values such as in `SetNoDataValue()` #330

Open davemfish opened 1 year ago

davemfish commented 1 year ago

As noted in https://github.com/natcap/invest/issues/992, GDAL does not consider numpy Float32/Float64 objects as valid doubles. And that can lead to exceptions like this:

  File "natcap\invest\coastal_vulnerability.py", line 2216, in warp_and_mask_bathymetry
  File "pygeoprocessing\geoprocessing.py", line 426, in raster_calculator
  File "osgeo\gdal.py", line 3847, in SetNoDataValue
TypeError: in method 'Band_SetNoDataValue', argument 2 of type 'double'

I think it would be nice for pygeoprocessing to handle this instead of forcing the user to remember the details of GDAL and its C implementation. pygeoprocessing encourages users to pair it with numpy in all sorts of ways, so I think we can do a bit better here.

There are a few places where we already handle this in pygeoprocessing. There are 5 instances of calling SetNoDataValue and a few different ways of handling things.

raster_calculator:

value is passed as-is from a function argument

new_raster_from_base: (I propose adopting this solution for all the rest)

uses item() to return a Python scalar representation https://numpy.org/doc/stable/reference/generated/numpy.ndarray.item.html

        try:
            target_band.SetNoDataValue(nodata_value.item())
        except AttributeError:
            target_band.SetNoDataValue(nodata_value)

create_raster_from_bounding_box:

always does band.SetNoDataValue(float(target_nodata)) regardless of the target datatype for the raster

convolve_2d:

value is passed as-is from a function argument unless it is None, then casts to Python float, regardless of target datatype.

numpy_array_to_raster:

value is passed as-is from a function argument

Are there any other cases of numpy Float types being rejected by GDAL, besides SetNoDataValue() that we should consider?

phargogh commented 1 year ago

Are there any other cases of numpy Float types being rejected by GDAL, besides SetNoDataValue() that we should consider?

I just yesterday found that using feature.SetField(name, float_value) where name is a field of ogr.OFTReal expects that float_value is also a double.

In working on that bug (https://github.com/natcap/invest/issues/1350), I also discovered:

Maybe this is consistent with SetNoDataValue? I don't know that I've ever actually tried setting a numpy.float64 object before.