corteva / rioxarray

geospatial xarray extension powered by rasterio
https://corteva.github.io/rioxarray

ValueError: cannot convert float NaN to integer on `open_rasterio` #666

Closed JosefWN closed 1 year ago

JosefWN commented 1 year ago

Code Sample, a copy-pastable example if possible

import rioxarray as rx
import xarray as xr

with xr.open_rasterio('https://seaice.uni-bremen.de/data/amsr2/asi_daygrid_swath/n3125/2021/oct/Arctic3125/asi-AMSR2-n3125-20211001-v5.4.tif') as img:
  print(img)

print('----')

with rx.open_rasterio('https://seaice.uni-bremen.de/data/amsr2/asi_daygrid_swath/n3125/2021/oct/Arctic3125/asi-AMSR2-n3125-20211001-v5.4.tif') as img:
  print(img)

Problem description

/Users/xyz/Desktop/test.py:4: DeprecationWarning: open_rasterio is Deprecated in favor of rioxarray. For information about transitioning, see: https://corteva.github.io/rioxarray/stable/getting_started/getting_started.html
  with xr.open_rasterio('https://seaice.uni-bremen.de/data/amsr2/asi_daygrid_swath/n3125/2021/oct/Arctic3125/asi-AMSR2-n3125-20211001-v5.4.tif') as img:
<xarray.DataArray (band: 1, y: 3584, x: 2432)>
[8716288 values with dtype=uint8]
Coordinates:
  * band     (band) int64 1
  * y        (y) float64 5.848e+06 5.845e+06 5.842e+06 ... -5.345e+06 -5.348e+06
  * x        (x) float64 -3.848e+06 -3.845e+06 ... 3.745e+06 3.748e+06
Attributes: (12/18)
    transform:                 (3125.0, 0.0, -3850000.0, 0.0, -3125.0, 585000...
    crs:                       +init=epsg:3413
    res:                       (3125.0, 3125.0)
    is_tiled:                  0
    nodatavals:                (nan,)
    scales:                    (1.0,)
    ...                        ...
    x#long_name:               x
    y#actual_range:            {0.5,3583.5}
    y#long_name:               y
    z#actual_range:            {0,120}
    z#long_name:               z
    z#_FillValue:              nan
----
Traceback (most recent call last):
  File "/Users/xyz/Desktop/test.py", line 9, in <module>
    with rx.open_rasterio('https://seaice.uni-bremen.de/data/amsr2/asi_daygrid_swath/n3125/2021/oct/Arctic3125/asi-AMSR2-n3125-20211001-v5.4.tif') as img:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/rioxarray/_io.py", line 1241, in open_rasterio
    result.attrs["_FillValue"] = result.dtype.type(result.attrs["_FillValue"])
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: cannot convert float NaN to integer

Expected Output

I would expect behavior in line with xarray.open_rasterio, which opens the file successfully, rather than a crash.

Environment Information

rioxarray (0.14.0) deps:
  rasterio: 1.3.6
    xarray: 2023.3.0
      GDAL: 3.5.3
      GEOS: 0.0.0
      PROJ: 9.0.1
 PROJ DATA: /opt/homebrew/lib/python3.11/site-packages/rasterio/proj_data
 GDAL DATA: /opt/homebrew/lib/python3.11/site-packages/rasterio/gdal_data

Other python deps:
     scipy: 1.10.0
    pyproj: 3.5.0

System:
    python: 3.11.3 (main, Apr  7 2023, 20:13:31) [Clang 14.0.0 (clang-1400.0.29.202)]
executable: /opt/homebrew/opt/python@3.11/bin/python3.11
   machine: macOS-13.3.1-arm64-arm-64bit

Installation method

Homebrew

Kirill888 commented 1 year ago

The TIFF image contains the netCDF attributes of the original data, which had _FillValue=nan, but the image itself is RGBA (using a palette), so the band type is Byte:

Band 1 Block=2432x3 Type=Byte, ColorInterp=Palette
  NoData Value=nan
  Metadata:
    actual_range={0,120}
    long_name=z
    NETCDF_VARNAME=z
    _FillValue=nan
  Color Table (RGB with 256 entries)

Coercing the _FillValue attribute to the actual output dtype, uint8, then fails because NaN cannot be converted to an integer:

https://github.com/corteva/rioxarray/blob/4c0aef19ad0c4ff81d94395eb62b605738a7b26c/rioxarray/_io.py#L1239-L1241
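
For illustration, the failing cast can be reproduced without the file, using NumPy alone. This is a minimal sketch: `fill_value` and `target_dtype` are stand-ins for `result.attrs["_FillValue"]` and `result.dtype` from the traceback, not names taken from rioxarray.

```python
import numpy as np

# Stand-ins for the values seen in the traceback (names are illustrative only).
fill_value = float("nan")         # plays the role of result.attrs["_FillValue"]
target_dtype = np.dtype("uint8")  # plays the role of result.dtype

caught = None
try:
    # Same shape as the failing line: result.dtype.type(result.attrs["_FillValue"])
    target_dtype.type(fill_value)
except ValueError as error:
    caught = error

print(repr(caught))
```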

I would put a try/except guard on that conversion and leave the attribute unchanged if the cast fails for whatever reason, maybe with a warning, maybe silently.
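
A rough sketch of what such a guard could look like, written as a standalone helper rather than as a patch to rioxarray. The function name `coerce_fill_value` and the choice of caught exceptions are assumptions made for this example; the warning variant is shown, and the silent variant would simply drop the `warnings.warn` call.

```python
import warnings

import numpy as np


def coerce_fill_value(attrs, dtype):
    """Cast attrs["_FillValue"] to dtype in place; keep it unchanged if the cast fails.

    Hypothetical helper for illustration, not part of rioxarray.
    """
    if "_FillValue" not in attrs:
        return attrs
    target = np.dtype(dtype)
    try:
        attrs["_FillValue"] = target.type(attrs["_FillValue"])
    except (ValueError, OverflowError, TypeError) as error:
        # Leave the attribute as it was and tell the user why.
        warnings.warn(
            f"Could not cast _FillValue {attrs['_FillValue']!r} to {target}: {error}"
        )
    return attrs


# NaN cannot be represented as uint8, so the attribute is left untouched.
attrs = coerce_fill_value({"_FillValue": float("nan")}, "uint8")
print(attrs)
```

With this shape, a fill value that does fit the dtype (for example 255.0 for uint8) is still converted as before; only the failing case changes behavior.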

snowman2 commented 1 year ago

This is an unexpected scenario. Sounds like you have an invalid file.

I am leaning towards addressing this issue by removing the _FillValue if riods.nodata is None here: https://github.com/corteva/rioxarray/blob/4c0aef19ad0c4ff81d94395eb62b605738a7b26c/rioxarray/_io.py#L657-L659

snowman2 commented 1 year ago

if riods.nodata is not None:
    # The nodata values for the raster bands
    attrs["_FillValue"] = riods.nodata
else:
    attrs.pop("_FillValue", None)

snowman2 commented 1 year ago

See #667