locationtech / geotrellis

GeoTrellis is a geographic data processing engine for high performance applications.
http://geotrellis.io

Int32 cells interpreted as float32 cells #3377

Open thomas-maschler opened 3 years ago

thomas-maschler commented 3 years ago

Describe the bug

When reading GeoTIFFs saved as UInt32 with GDALRasterSource, the tile is interpreted as Float32 and every cell comes back as NoData.

To Reproduce

Provide as able:

Presigned link to input file (expires April 15, happy to update link if required). This file is saved using datatype=UInt32 and nbits=17

https://gfw-data-lake.s3.amazonaws.com/idn_forest_area/v201709/raster/epsg-4326/10/40000/type/gdal-geotiff/10N_110E.tif?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAV3FRM4Z6FWH7NCUZ/20210408/us-east-1/s3/aws4_request&X-Amz-Date=20210408T202851Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=e5dfee2ab0a845b833a0411c4062d3a2c06f7c50095241d2595d9289940ba2c4

Same file but using datatype=UInt32 and nbits=32 (same result)

https://gfw-data-lake.s3.amazonaws.com/idn_forest_area/v201709/raster/epsg-4326/10/40000/type/geotiff/10N_110E.tif?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAV3FRM4Z6FWH7NCUZ/20210408/us-east-1/s3/aws4_request&X-Amz-Date=20210408T203058Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=763389297b6cf4574f86610a86903a379cf6a3cc186413f425eb720a81857908

Uint32 tile with correct pixel values

Expected behavior

Return Uint32 tile with correct pixel values

Screenshots

Annotated screenshot for relevant code section with debugger on. https://wri-users.s3.amazonaws.com/tmaschler/geotrellis/screenshots/uint32_debug.png

Environment

pomadchin commented 3 years ago

Hey @thomas-maschler, I would not say it is a bug, it is a GeoTrellis feature. It is not specific to GDAL or pure Java Readers, they both behave consistently the same way and that is an expected type for GeoTrellis.

See GeoTiffInfo and GDALUtils

For historical reasons we encode UInt32 TIFFs as Float32 ArrayTiles.

Theoretically we could back UInt32 TIFFs by Longs, but in practice that would mean adding lots of Long methods (for all .get calls, map, foreach, combine, etc. operations) in addition to the existing Int and Double ones (get, getDouble, map, mapDouble, combine, combineDouble, etc.).
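To make the trade-off above concrete, here is a minimal, library-free sketch in plain Scala. It does not use any GeoTrellis API; `UInt32Demo`, `toUnsignedLong`, and `survivesFloat32` are hypothetical names chosen for illustration. It shows that the full UInt32 range does not fit in a signed Int, that a Long can hold it, and that Float32 can only represent integers exactly up to 2^24.

```scala
object UInt32Demo {
  // Reinterpret the 32 raw bits of a signed Int as an unsigned value held in a Long.
  def toUnsignedLong(raw: Int): Long = raw.toLong & 0xFFFFFFFFL

  // True when an integer value survives a round trip through Float32 unchanged.
  // Float32 has a 24-bit significand, so large integers get rounded.
  def survivesFloat32(value: Long): Boolean = value.toFloat.toLong == value
}

// The bit pattern 0xFFFFFFFF is -1 as a signed Int but 4294967295 as UInt32.
val maxUInt32: Long = UInt32Demo.toUnsignedLong(-1)

// Integers up to 2^24 are exact in Float32; 2^24 + 1 is not.
val exactAtLimit: Boolean = UInt32Demo.survivesFloat32(1L << 24)
val exactPastLimit: Boolean = UInt32Demo.survivesFloat32((1L << 24) + 1)
```

Under these assumptions, a Float32-backed tile keeps small UInt32 values intact but rounds values above 2^24, while an Int-backed tile would wrap values above Int.MaxValue into negative numbers.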

P.S. Just a side note about the number of bits: GT can't really interpret the number of bits set per sample (the only case it can handle is the BitCellType (NBITS = 1)).

thomas-maschler commented 3 years ago

Ok, thanks for clarifying @pomadchin. But should I get the correct pixel values, even if the cell type is float?

As a side note: we use nbits for better compression of our files in cases where the value range permits us to do so. Files stored with nbits=17 are about 40% smaller compared to the full 32 bits. In general, GDALRasterSource seems to be fine handling those files.
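As a rough arithmetic check of the question above, the following plain-Scala sketch computes the value range of a 17-bit sample and tests whether every value in that range round-trips through Float32. `NBitsDemo` and its members are hypothetical names, and the check says nothing about how GDAL or GeoTrellis actually decode the file; it only shows that a Float32 cell type by itself should not lose 17-bit values.

```scala
object NBitsDemo {
  // Largest unsigned value that fits in `nbits` bits (assumes 0 < nbits < 63).
  def maxValue(nbits: Int): Long = (1L << nbits) - 1

  // Check every value in [0, maxValue(nbits)] for an exact Float32 round trip.
  // Only intended for small nbits, since it enumerates the whole range.
  def allValuesExactInFloat32(nbits: Int): Boolean =
    (0L to maxValue(nbits)).forall(v => v.toFloat.toLong == v)
}

val max17: Long = NBitsDemo.maxValue(17)                  // 131071
val exact17: Boolean = NBitsDemo.allValuesExactInFloat32(17)
```

Since 131071 is far below 2^24, all 17-bit values are exactly representable, so the all-NoData result would have to come from somewhere other than Float32 precision.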

moradology commented 2 years ago

Looks like we don't currently support nbits. However, this is potentially the spot to begin investigating a strategy for providing support: https://github.com/locationtech/geotrellis/blob/master/raster/src/main/scala/geotrellis/raster/io/geotiff/BandTypes.scala#L38-L51