corteva / rioxarray

geospatial xarray extension powered by rasterio
https://corteva.github.io/rioxarray

Check data type and/or better error msg for `rioxarray.clip` #717

Open szwiep opened 7 months ago

szwiep commented 7 months ago

Description

When clipping an int32 DataArray x with x.rio.clip(), the following warning is raised:

python3.12/site-packages/xarray/core/duck_array_ops.py:201: RuntimeWarning: invalid value encountered in cast
  return data.astype(dtype, **kwargs)

The same call works fine with float32 data, so I assume this happens because rioxarray fills cells outside the geometry with NaN, which xarray cannot cast to int32. If that assumption is correct, and integer dtypes are not meant to be supported by clip, it would help to check whether self._obj.dtype can represent NaN and raise a useful error message if it cannot. Something along the lines of:

ValueError: int32 data is not supported for clip, data type must be float.
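
A minimal sketch of what such a check could look like, using only NumPy. The helper name `ensure_nan_friendly` is hypothetical and not part of rioxarray. It assumes that "NaN-friendly" means a floating or complex dtype.

```python
import numpy as np


def ensure_nan_friendly(dtype):
    """Hypothetical helper (not rioxarray API): reject dtypes that cannot hold NaN."""
    dtype = np.dtype(dtype)
    # np.inexact covers floating and complex types, which can represent NaN;
    # integer and boolean types cannot.
    if not np.issubdtype(dtype, np.inexact):
        raise ValueError(
            f"{dtype} data is not supported for clip, data type must be float."
        )


ensure_nan_friendly(np.float32)  # passes silently

try:
    ensure_nan_friendly(np.int32)
except ValueError as err:
    message = str(err)
```

In rioxarray itself, a check like this would presumably run on self._obj.dtype before any masking happens, but where exactly it belongs is left open here.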

Example

import numpy as np
import xarray as xr
import rioxarray  # registers the .rio accessor on xarray objects

x = np.arange(2)
idata = np.random.randint(0, 3, (2, 2))
ida = xr.DataArray(data=idata,
                   dims=['y', 'x'],
                   coords={'x': x,
                           'y': x})
ida.rio.write_crs('4326', inplace=True)

fdata = idata.astype(float)
fda = xr.DataArray(data=fdata,
                   dims=['y', 'x'],
                   coords={'x': x,
                           'y': x})
fda.rio.write_crs('4326', inplace=True)

geometries = [
    {
        'type': 'Polygon',
        'coordinates': [[
            [0, 0],
            [0, 1],
            [1, 1],
            [1, 0],
            [0, 0]
        ]]
    }
]
ida.rio.clip(geometries, drop=False)
# > ...\xarray\core\duck_array_ops.py:191: RuntimeWarning: invalid value encountered in cast
# >      return data.astype(dtype, **kwargs)
fda.rio.clip(geometries, drop=False)
# > <xarray.DataArray (y: 2, x: 2)>
# > array([[nan,  1.],
# >        [nan, nan]])
# > Coordinates:
# >   * x            (x) int32 0 1
# >   * y            (y) int32 0 1
# >     spatial_ref  int32 0
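
The suspected cause can be shown with NumPy alone. This is a sketch under the assumption stated above (masked cells become NaN and the result is then cast back to the original dtype); it does not reproduce rioxarray's internals. The mask values and the -9999 sentinel are made up for illustration.

```python
import warnings

import numpy as np

idata = np.array([[0, 1], [2, 1]], dtype=np.int32)
# Illustrative mask: True where a cell is kept, False where it is clipped away.
mask = np.array([[False, True], [False, False]])

# Filling with NaN forces promotion to a floating dtype.
masked = np.where(mask, idata, np.nan)

# Casting the NaN-filled array back to int32 is the problematic step.
# Recent NumPy versions emit "invalid value encountered in cast" here.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cast_back = masked.astype(np.int32)

# Possible alternative: fill with an integer sentinel so the dtype is preserved.
sentinel = np.int32(-9999)  # hypothetical nodata value
masked_int = np.where(mask, idata, sentinel)
```

Until the behaviour is settled, casting the data to float before calling clip, as fda does in the example above, avoids the cast back to an integer dtype.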