corteva / rioxarray

geospatial xarray extension powered by rasterio
https://corteva.github.io/rioxarray
Other
511 stars 81 forks source link

to_raster() save function changes original data #574

Closed arojas314 closed 1 year ago

arojas314 commented 2 years ago

to_raster() function modifying original variable values on output

I added a reproducible script below and attached a link to the data.

Code Sample

import rioxarray
import xarray as xr
import numpy as np
from pyproj import CRS

# Open data
fpath = "./2020/002/00/OR_ABI-L2-LSTM1-M6_G16_s20200020000291_e20200020000349_c20200020001006.nc"
ds_xarr = xr.open_dataset(fpath, decode_coords="all")

# Lets convert coords to latitude and longitude
cc = CRS.from_cf(ds_xarr.goes_imager_projection.attrs)
ds_xarr.rio.write_crs(cc.to_string(), inplace=True)
sat_height = ds_xarr.goes_imager_projection.attrs["perspective_point_height"]
ds_xarr['x'] = ds_xarr.x.values * sat_height
ds_xarr['y'] = ds_xarr.y.values * sat_height
ds_xarr4326 = ds_xarr.rio.reproject("epsg:4326")

# lets view the min and max values
print("original min value: ", np.nanmin(ds_xarr4326['LST'].data)) # 255.795
print("original max value:" , np.nanmax(ds_xarr4326['LST'].data)) # 287.8375

# Save as raster file
ds_xarr4326['LST'].rio.to_raster("./GOES16_LST.tif",driver="GTiff")

# Lets read in output file and view the min max values (why are these different than original data?)
outds = xr.open_dataset("./GOES16_LST.tif")
print("outds min value: ", np.nanmin(outds['band_data'].data)) # 108.08
print("outds maxvalue: ", np.nanmax(outds['band_data'].data)) # 271.91748

test data location: https://github.com/arojas314/data-sharing/blob/main/OR_ABI-L2-LSTM1-M6_G16_s20200020000291_e20200020000349_c20200020001006.nc

Problem description

When saving the xarray as a raster (GeoTiff file), the original data values are modified. See output for min and max values above for more insight. Originally, min max values were [255.795, 287.8375]. After saving to .tif file, new min max values are [108.08, 271.91748].

Expected Output

Environment Information

Installation method

Conda environment information (if you installed with conda):


Environment (conda list):

![image](https://user-images.githubusercontent.com/57734585/188782225-288c21e5-a081-4337-9f25-301fe17de82e.png)
snowman2 commented 2 years ago

nodata, scale & offset values in encoding:

ds_xarr4326.LST.encoding
 '_Unsigned': 'true',
 '_FillValue': 65535,
 'scale_factor': 0.0025,
 'add_offset': 190.0,
outds.band_data.encoding
 'scale_factor': 0.0024999999441206455,
 'add_offset': 190.0,
 '_FillValue': -1.0,
snowman2 commented 1 year ago

With #575

import rioxarray
import xarray
import numpy

fpath = "OR_ABI-L2-LSTM1-M6_G16_s20200020000291_e20200020000349_c20200020001006.nc"

xds = xarray.open_dataset(fpath, decode_coords="all")

print("original min value: ", numpy.nanmin(xds['LST'])) # 255.795
print("original max value:" , numpy.nanmax(xds['LST'])) # 287.8375

rds = rioxarray.open_rasterio(fpath, mask_and_scale=True, variables="LST")

print("rio min value: ", numpy.nanmin(xds['LST'])) # 255.795
print("rio max value: ", numpy.nanmax(xds['LST'])) # 287.8375

sat_height = xds.goes_imager_projection.attrs["perspective_point_height"]
xds['x'] = xds.x.values * sat_height
xds['y'] = xds.y.values * sat_height
ds_xarr4326 = xds.rio.reproject("epsg:4326")

# Save as raster file
ds_xarr4326['LST'].rio.to_raster("./GOES16_LST.tif",driver="GTiff")

outds = xarray.open_dataset("./GOES16_LST.tif")
print("outds min value: ", numpy.nanmin(outds['band_data'])) # 255.795
print("outds max value: ", numpy.nanmax(outds['band_data'])) # 287.8375