Closed vincentsarago closed 6 years ago
afaik there isn't a limit on the header size for either tiff or bigtiff, as the header can have a flexible number of tags and these tags can have a different number of values. Picking a large enough chunk size and being prepared to go back for more seems to be an approach. Ideally chunk size should be configurable.
I update most of the change and it should work now. I need to add more test to make sure everything is fine
cc @normanb
@normanb I'm not sure I can make test coverage better. I think this PR is ready for final review.
Just noting that by default we fetch the first 1024
bytes
https://github.com/mapbox/COGDumper/blob/35de35bd4bfa929a72674755e17521065eb6c950/cogdumper/cog_tiles.py#L217
while GDAL fetch the first 16k. We could change that to make it more GDAL like and/or are a environment variable to define the size of the first fetch.
@normanb in https://github.com/mapbox/COGDumper/pull/5/commits/7d9849258fa86649b52cea3cc3f106ea266ae72c I added a environment variable COG_INGESTED_BYTES_AT_OPEN
used to configure the size of the first read as GDAL 2.3 is doing http://gdal.org/gdal_virtual_file_systems.html#gdal_virtual_file_systems_vsicurl.
I also added a logging option to 👀 the number and size read.
cogdumper file --file cog_deflate.tif --xyz 0 0 0 -v
INFO:cogdumper.filedumper:Reading bytes: 0 to 16383
INFO:cogdumper.filedumper:Reading bytes: 11434145 to 11831775
I guess if this is 👌 you can do a final review and merge it :-)
+1 Thanks @vincentsarago for these changes, it greatly improves performance.
@normanb should we ship a new release with the change implemented here ?
closes #4 This PR try to implement
chunk
reading for header retrieval to limit the number ofread
callsTo Do
Right now I'm reading the first 8000 bytes of a file and decode the header. It works for normal TIFF but not for BIGTIFF. @normanb do you now if there is a size limit for GeoTIFF headers ?