corteva / rioxarray

geospatial xarray extension powered by rasterio
https://corteva.github.io/rioxarray

rasterio DatasetReader complex_int16 incompatible with auto chunking of dask (rioxarray.open_rasterio) #542

Closed cdubos-fr closed 2 years ago

cdubos-fr commented 2 years ago

Code Sample, a copy-pastable example if possible

import numpy as np
import rasterio
import rioxarray

filename = "complex_int16.tiff"

with rasterio.open(filename, mode='w', width=16, height=16, count=1, dtype="complex_int16") as writer:
    writer.write(np.array([[[ i*j for j in range(16)] for i in range(16)]]))

rioxarray.open_rasterio(filename, chunks=True)

Problem description

When a file is opened with rioxarray.open_rasterio and chunks=True, the rasterio.DatasetReader can report a "complex_int16" dtype. That dtype is incompatible with the dtype parameter of dask.array.core.normalize_chunks (line 682 in rioxarray._io) and causes TypeError: data type 'complex_int16' not understood.
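The error message itself can be reproduced with NumPy alone, which suggests the failure comes from the dtype name being handed to NumPy unchanged. The sketch below assumes that the chunk-normalization step ends up calling `np.dtype` on the value it receives; `numpy_understands` is a hypothetical helper written for this illustration.

```python
import numpy as np


def numpy_understands(dtype_name: str) -> bool:
    """Return True if NumPy can build a dtype from this name (illustrative helper)."""
    try:
        np.dtype(dtype_name)
    except TypeError:
        # NumPy raises TypeError ("data type ... not understood") for unknown names.
        return False
    return True


# "complex_int16" is a rasterio/GDAL name, not a NumPy dtype name.
complex_int16_ok = numpy_understands("complex_int16")
complex64_ok = numpy_understands("complex64")
print(complex_int16_ok, complex64_ok)
```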

rasterio handles complex_int16 when reading data: it uses rasterio.dtypes._getnpdtype and rasterio.dtypes._is_complex_int to translate its internal complex_int16 to np.complex64.
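As a side note on why np.complex64 is a reasonable target for this translation: each component of a complex64 value is a float32, which can represent every int16 value exactly. The snippet below is only a NumPy illustration of that property and does not call rasterio; the sample values are arbitrary.

```python
import numpy as np

# Sample int16 values for the real and imaginary parts, including the extremes.
real = np.array([-32768, -1, 0, 1, 32767], dtype=np.int16)
imag = np.array([32767, 2, 0, -3, -32768], dtype=np.int16)

# Store the integer pairs in a complex64 array (two float32 components each).
as_complex = np.empty(real.shape, dtype=np.complex64)
as_complex.real = real
as_complex.imag = imag

# Converting back to int16 recovers the original values without loss.
real_back = as_complex.real.astype(np.int16)
imag_back = as_complex.imag.astype(np.int16)
round_trip_ok = np.array_equal(real_back, real) and np.array_equal(imag_back, imag)
```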

Expected Output

rioxarray.open_rasterio should succeed and return an xarray.DataArray with np.complex64 dtype when the rasterio file has complex_int16 dtype.

Environment Information

Installation method

snowman2 commented 2 years ago

This is likely the line causing the trouble: https://github.com/corteva/rioxarray/blob/570150fbb0d359c5567a08a71adb5efc908164f0/rioxarray/_io.py#L683

I believe this can be fixed by wrapping with this function: https://github.com/corteva/rioxarray/blob/570150fbb0d359c5567a08a71adb5efc908164f0/rioxarray/_io.py#L339-L347
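A minimal sketch of what such a wrapping could look like, for anyone picking this up. This is not the linked rioxarray function: `to_numpy_dtype` and `RASTERIO_ONLY_DTYPES` are hypothetical names, and `band_dtypes` stands in for the per-band dtype names reported by an open rasterio dataset. Only the complex_int16 to complex64 mapping comes from this issue.

```python
import numpy as np

# Hypothetical table of rasterio-only dtype names and their NumPy equivalents.
# Only the complex_int16 -> complex64 pair is taken from the issue text.
RASTERIO_ONLY_DTYPES = {"complex_int16": "complex64"}


def to_numpy_dtype(rasterio_dtype: str) -> np.dtype:
    """Return a NumPy dtype for a rasterio dtype name (illustrative helper)."""
    # Names NumPy already knows pass through unchanged.
    return np.dtype(RASTERIO_ONLY_DTYPES.get(rasterio_dtype, rasterio_dtype))


# Stand-in for the dtype names of a single-band complex_int16 dataset.
band_dtypes = ("complex_int16",)

# Translate before handing the dtype to any code that expects a NumPy dtype.
chunk_dtype = to_numpy_dtype(band_dtypes[0])
print(chunk_dtype)  # complex64
```

The translated value, rather than the raw rasterio name, would then be passed as the dtype argument when computing the automatic chunks.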

A PR with the fix is welcome.