corteva / rioxarray

geospatial xarray extension powered by rasterio
https://corteva.github.io/rioxarray
Other
507 stars 80 forks source link

`_NODATA_DTYPE_MAP` causing bug in `rio.reproject()` #618

Closed caldwellst closed 1 year ago

caldwellst commented 1 year ago

Example code

rio.reproject generates errors when the input data array is int64.

import xarray as xr
import rioxarray
from rasterio.enums import Resampling

da = xr.DataArray(
        [[1, 2, 3], [4, 5, 6]],
        dims=("y", "x"),
        coords={"y": [1.5, 0.5], "x": [0.5, 1.5, 2.5]},
    ).rio.write_crs("EPSG:4326").astype("int64")

# da.rio.set_nodata(-99, inplace=True) solves the issue

da.rio.reproject(
    da.rio.crs,
    shape = (2 * da.rio.height, 2 * da.rio.width),
    resampling = Resampling.nearest
)

Cause

rasterio.dtypes.dtype_rev now returns keys 12 and 13 for int64 and uint64 respectively, and 14 for int8 (if GDAL is above a certain version). However, the above example generates a key error because _NODATA_DTYPE_MAP needs to be updated to include these new types, and currently only has keys to 1 to 11. The version of the GDAL reference table originally used has been updated with defaults for these new types (although they are all None).

Note that there is another difference between the newer reference table and what's in _NODATA_DTYPE_MAP. The cint and cfloat default values are now None.

Proposal

Easiest proposal would be to simply update _NODATA_DTYPE_MAP to include 12, 13, and 14, and adjust cint and cfloat. It would now be (followed CONTRIBUTING.rst, but don't have permission to push and make PR):

_NODATA_DTYPE_MAP = {
    1: 255,  # GDT_Byte
    2: 65535,  # GDT_UInt16
    3: -32768,  # GDT_Int16
    4: 4294967293,  # GDT_UInt32
    5: -2147483647,  # GDT_Int32
    6: 3.402823466e38,  # GDT_Float32
    7: 1.7976931348623158e308,  # GDT_Float64
    8: None,  # GDT_CInt16
    9: None,  # GDT_CInt32
    10: 3.402823466e38,  # GDT_CFloat32
    11: 1.7976931348623158e308,  # GDT_CFloat64
    12: None,  # GDT_Int64
    13: None,  # GDT_UInt64
    14: None,  # GDT_Int8
}

However, I am unaware why this is necessary (although not an expert in these things). If the existing nodata value is None, why only put to default during reprojection? Maybe worth considering dropping this to prevent needing to manually align again in the future?

snowman2 commented 1 year ago

don't have permission to push and make PR

To make a PR, fork this repository, push the changes to your fork, and then make a pull request.

For more details: https://docs.github.com/en/get-started/quickstart/contributing-to-projects