locationtech / geotrellis

GeoTrellis is a geographic data processing engine for high performance applications.
http://geotrellis.io
Other
1.33k stars 360 forks source link

Striped BitGeoTiffs incorrect support #3112

Open pomadchin opened 4 years ago

pomadchin commented 4 years ago

PR https://github.com/locationtech/geotrellis/pull/3102 addresses Tiled BitGeoTiffs support, and allows to read such tiles, convert, transform, and to perform all operations with Tiled BitGeoTiffs.

However, the following code would fail:

// put it into `BitGeoTiffTileSpec` and run via 
// `testOnly geotrellis.raster.io.geotiff.BitGeoTiffTileSpec -- -z "should convert striped tiffs"` cmd
it("should convert striped tiffs") {
  val tiff = SinglebandGeoTiff(geoTiffPath("bilevel.tif"))
  val tile = tiff.tile.toArrayTile()

  // check that it is possible to convert bit cellType to bit cellType
  // it will fail on the conversion step
  val tiffTile = tile.toGeoTiffTile(GeoTiffOptions.DEFAULT.copy(storageMethod = Striped)).convert(BitCellType)
  assertEqual(tiffTile.toArrayTile(), tile.toArrayTile)
}

It turned out, the problem is more deep, and it looks like we don't have a fully correct support of StripedBitGeoTiffs`. Even if we will try to persist tiff using the following code:

val stripedOptions =  GeoTiffOptions.DEFAULT.copy(storageMethod = Striped)
val tiffTile = tile.toGeoTiffTile(stripedOptions)
SinglebandGeoTiff(tiffTile, tiff.extent, tiff.crs, Tags.empty, stripedOptions)
  .write("/tmp/bilevel-striped.tiff")

The output would look incorrect:

image

The original image:

image

This is caused by the way we compute rowsPerStrip: https://github.com/locationtech/geotrellis/blob/b9ca38cffc86fb846534a51436f71629ea2c37eb/raster/src/main/scala/geotrellis/raster/io/geotiff/StorageMethod.scala#L32-L43

In this particular example with the bilevel.tif the result of this function computation would be 8 (rowsPerStrip(rows, Bit) == 8). If we'll force the output of this function to equal 1 (1 row per strip: if(ris == 0) 1 else math.min(ris, 1)) the result image would look correct, but there would still be issues with performing any operations on such BitGeoTiffTile: convert, foreach, map, etc - it looks like the BitGeoTiff paddings logic works only for the tiled case.

The task is to fix Striped BitGeoTiffs support. Also it may be the case, that we have a not fully working general Striped TIFFs support and it just worked because we usually never face a case when we had computed SegmentLayouts with more than a single row per strip.

As a reference related to why QGIS fails reads Striped BitTiffs created by GeoTrellis: mb there is smth in how we are setting RowsPerStrip TIFF Tag.

This guess was done that in this case GeoTiffReader reads tags as it has 72 rows per strip, but gdalinfo and exiftool read it as 8 (8 is a correct number).

An example of a .convert(BitCellType) stack trace using the unit test posted in the beginning of this post (segment layout is incorrect? or in case of a striped layout paddings should be calculated differently?):

[info] - should convert striped tiffs *** FAILED *** (1 second, 184 milliseconds)
[info]   java.lang.ArrayIndexOutOfBoundsException: 910
[info]   at geotrellis.raster.io.geotiff.BitGeoTiffSegment.get(BitGeoTiffSegment.scala:49)
[info]   at geotrellis.raster.io.geotiff.BitGeoTiffSegment.getInt(BitGeoTiffSegment.scala:36)
[info]   at geotrellis.raster.io.geotiff.GeoTiffSegment$class.convert(GeoTiffSegment.scala:53)
[info]   at geotrellis.raster.io.geotiff.BitGeoTiffSegment.convert(BitGeoTiffSegment.scala:24)
[info]   at geotrellis.raster.io.geotiff.GeoTiffTile$$anonfun$convert$1.apply(GeoTiffTile.scala:324)
[info]   at geotrellis.raster.io.geotiff.GeoTiffTile$$anonfun$convert$1.apply(GeoTiffTile.scala:323)
[info]   at scala.collection.Iterator$class.foreach(Iterator.scala:891)
[info]   at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
[info]   at geotrellis.raster.io.geotiff.GeoTiffTile.convert(GeoTiffTile.scala:323)
[info]   at geotrellis.raster.io.geotiff.BitGeoTiffTileSpec$$anonfun$1$$anonfun$apply$mcV$sp$7.apply$mcV$sp(BitGeoTiffTileSpec.scala:102)
pomadchin commented 4 years ago

Potentially the issue is caused by the bits padding we do. I could fix partially tiff conversion by using paddedCols to compute pixel position in a segment during the conversion step, but it also introduced a slight image shift (and this is true only in some cases, depending on how the segment layout is sliced).

The difference between tiled and striped tiffs in this case is that all tiled segments require no padding and they are already 'padded'. Striped tiffs are occupying the minimum required space to encode bits.

Padding here means to adjust number of array size that backs segment to fit bits but in bytes ((cols / 7) * 8)

Leaving a GDAL reference here: https://github.com/OSGeo/gdal/blob/master/gdal/frmts/gtiff/geotiff.cpp#L6594

echeipesh commented 4 years ago

There is a partial fix to this in #3129. Full solution issue us going to the backlog for a little while as our priorities shift. Since we're shifting to expectations that GeoTiffs we're working with are COGs (tiled, not striped) and bit geotiff is relatively uncommon.