opendatacube / eo-datasets

Easily write, validate and convert EO datasets and metadata.
Apache License 2.0

Important fix proposal: Implement setting "nan" nodata value for floating point rasters with no nodata attribute #351

Closed by robbibt 4 months ago

robbibt commented 5 months ago

Hi all, as discussed internally within DEA, we recently encountered an issue where floating point rasters are not displayed correctly in ESRI software because they lack an explicitly set nodata attribute of "nan". This differs from open-source software such as QGIS and GDAL, which correctly interpret nodata by implicitly treating a missing nodata attribute as equivalent to "nan".

While this issue has been resolved by patching previously generated data on prod S3, a more sustainable long-term fix would be to set nodata attributes in our GeoTIFF writing code, to ensure that any floating point raster without a custom nodata attribute is assigned a nodata attribute of "nan".

EO datasets is an important place to fix this, as it is used to generate derivative DEA products that often include floating point rasters. A possible fix would likely involve a minor change similar to the pseudocode below:

if (raster is floating point dtype) AND (no nodata attribute set):
    raster.nodata = "nan"

Testing within DEA has revealed no downstream impacts of this fix, and it successfully solves the issue for ESRI users (a large proportion of our user base).
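
For illustration, the pseudocode above could be sketched in Python roughly as follows. This is only a minimal sketch, not the actual eodatasets3 code: it assumes the writer holds its GeoTIFF creation options in a rasterio-style profile dict with "dtype" and "nodata" keys, and the helper name `with_default_nodata` is hypothetical.

```python
import math
from typing import Any, Dict

# Assumption: dtypes are given as names, as in a rasterio-style profile.
FLOAT_DTYPES = {"float16", "float32", "float64"}


def with_default_nodata(profile: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `profile` with nodata set to nan when the raster is
    floating point and no nodata value has been set.

    Hypothetical helper for illustration only.
    """
    updated = dict(profile)
    is_float = str(updated.get("dtype")) in FLOAT_DTYPES
    if is_float and updated.get("nodata") is None:
        updated["nodata"] = math.nan
    return updated


# Float raster without nodata: nan is filled in.
print(with_default_nodata({"driver": "GTiff", "dtype": "float32"})["nodata"])  # nan

# Integer raster without nodata: left unset.
print(with_default_nodata({"driver": "GTiff", "dtype": "uint8"}).get("nodata"))  # None

# Float raster with a custom nodata value: left unchanged.
print(with_default_nodata({"dtype": "float64", "nodata": -999.0})["nodata"])  # -999.0
```

Returning a copy rather than mutating the input keeps the caller's profile untouched, and a custom nodata value is never overwritten.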

robbibt commented 5 months ago

From what I can tell, a possible fix for GeoTIFF writing in eodatasets would occur somewhere here: https://github.com/opendatacube/eo-datasets/blob/develop/eodatasets3/images.py#L562-L789