Closed perrygeo closed 8 years ago
I'm gonna jump on this @perrygeo. The strategy I am thinking is:
Why is this different? I'm not sure. Failing test case here: #59
IDK what is happening here
So I made a little geojson util that might help visualize internal tiling issues: https://github.com/mapbox/rio-block-features.
Noticed two things:
src.block_windows
on untiled rasters yields windows that are 16 px height, full width. (untiled tiffs are 16px "strips" by default)Still not sure what the solution is but iterating processing these 16px tall blocks appears to be related.
@perrygeo ya, the 16px height blocksize comes from it being jpeg compressed, I think.
I manually stopped any filling from occurring and just returned the window, and am still seeing the shift:
This behavior probably has something to do with either (a) boundless reading of full-width windows in rasterio, (b) or my padding + unpadding logic (but why fail only on these)?
Grasping at straws here, wonder if it has anything to do with this: https://github.com/mapbox/rasterio/issues/520#issue-115078838
tldr: writing to raster where blocksize is > image size causes shifting.
Maybe blockxsize == width causes the same issue?
@dnomadb question about the test fixtures, How did you create the untiled fixture? The tiled and untiled are different right off the bat
$ raster-tester compare --no-error seams_4band.tif seams_4band_untiled.tif
NOT OK - Band 1 has 1235 pixels that vary by more than 16
$ rio info seams_4band_untiled.tif
{"count": 4, "crs": "EPSG:3857", "interleave": "pixel", "dtype": "uint8", "driver": "GTiff", "transform": [8.0, 0.0, -13550756.3744, 0.0, -8.0, 6315533.02503], "lnglat": [-121.70652486688023, 49.22476090053553], "height": 612, "width": 612, "shape": [612, 612], "tiled": false, "res": [8.0, 8.0], "nodata": null, "bounds": [-13550756.3744, 6310637.02503, -13545860.3744, 6315533.02503], "compress": "jpeg"}
If I use gdal_translate, I can get a pixel-perfect copy
$ gdal_translate -co TILED=FALSE seams_4band.tif seams_4band_untiled2.tif
$ raster-tester compare --no-error seams_4band.tif seams_4band_untiled2.tif
OK - seams_4band.tif is similar to within 0 pixels of seams_4band_untiled2.tif
$ rio info seams_4band_untiled2.tif
{"count": 4, "crs": "EPSG:3857", "interleave": "pixel", "dtype": "uint8", "driver": "GTiff", "transform": [8.0, 0.0, -13550756.3744, 0.0, -8.0, 6315533.02503], "lnglat": [-121.70652486688023, 49.22476090053553], "height": 612, "width": 612, "shape": [612, 612], "tiled": false, "res": [8.0, 8.0], "nodata": null, "bounds": [-13550756.3744, 6310637.02503, -13545860.3744, 6315533.02503]}
Still, even this one fails with the weird shifting behavior. But we should update the test fixture to reduce the number of extraneous variables here.
not sure if this is relevant
The strip height of the untiled raster created by gdal_translate is 3px height. Found this in the GDAL docs
By default stripped TIFF files are created... strip height defaults to a value such that one strip is 8K or less
Here's a summary of my observations so far. This is weird. I'm perplexed, flummoxed and bewildered.
1 + max_search_distance
pixels to the right:confused: There's a blocksize > imagesize issue in rasterio, so maybe they share an underlying bug?
I've isolated the problem, no fix yet but at least we have a specific failing test case... There appears to be a bug in rasterio boundless reads.
Check out the test case. In a nutshell, reading an unpadded window should be the same as reading a padded window then subtracting the padding with an index. But it's not in the case of untiled block windows.
@dnomadb Two things before I post it to the rasterio issue tracker
seams_4band_untiled.tif
from this repo./cc @sgillies
A slightly more tidy test case.
>>> src.width
612
>>> src.read(boundless=True, window=((100, 101), (0, 612)))[0, 0, 0:9]
array([68, 45, 63, 67, 73, 78, 87, 79, 81], dtype=uint8)
>>> src.read(boundless=True, window=((100, 101), (-1, 612)))[0, 0, 0:9]
array([ 0, 68, 45, 63, 67, 73, 78, 87, 79], dtype=uint8)
>>> src.read(boundless=True, window=((100, 101), (-1, 613)))[0, 0, 0:9]
array([ 0, 0, 68, 45, 63, 67, 73, 78, 87], dtype=uint8) # WTF
Notice the last one, we shift the row stop (window[1][1]
) to > image width, and it shifts the start row.
Also important to note: we see this behavior even on tiled tiffs. So it's a math error related to windows and image size, not internal tiling!
Moving bug fix discussion to rasterio https://github.com/mapbox/rasterio/issues/532
closing this since it's not a nodata
bug
Once again with the internal tile blocks...
Take two rasters that are pixel-identical but one is tiled and the other is not. Run them through nodata blob with tiled output and the results differ:
I think we can get around it in pxm by forcing tiling beforehand but we should address it here as well.
Here's the script I was using to recreate.