OSGeo / gdal

GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.
https://gdal.org

rio-cogeo needed more than 20 GB of memory on an 18 MB GeoTIFF image #10875

Closed lawrencenika closed 2 days ago

lawrencenika commented 2 days ago

What is the bug?

For some reason, these lines https://github.com/cogeotiff/rio-cogeo/blob/d79b0d6ce79f68349a27047d61d19b87e6055541/rio_cogeo/cogeo.py#L746-L750 cause rio-cogeo to need more than 20 GB of memory. This causes the

Steps to reproduce the issue

Follow the steps specified in https://github.com/developmentseed/titiler/discussions/992

Versions and provenance

The latest version used by the titiler repo is GDAL 3.8, but in testing this also happens with GDAL 3.9.

Additional context

This was found when trying to validate a GeoTIFF file using titiler, which uses the rio-cogeo library.

jratike80 commented 2 days ago

Maybe the same issue makes gdalinfo use a lot of CPU and memory when it reads the metadata:

gdalinfo "Sentinel-2 L2A False color.tiff (1)" -listmdd --debug on

lawrencenika commented 2 days ago

Maybe the same issue makes gdalinfo use a lot of CPU and memory when it reads the metadata:

gdalinfo "Sentinel-2 L2A False color.tiff" -listmdd --debug on

Should I try to debug with this line? What can I do to help investigate further?

jratike80 commented 2 days ago

There must be something uncommon in the source file. Debugging the code may be useless before understanding what makes the file special. Even this command takes a long time:

gdal_translate "Sentinel-2 L2A False color (1).tiff" remove_me.tif -co copy_src_mdd=no
Input file size is 1250, 1221
Warning 1: TIFFReadDirectory:Ignoring TransferFunction because BitsPerSample=32>24
Warning 1: TIFFReadDirectory:Ignoring TransferFunction because BitsPerSample=32>24

lawrencenika commented 2 days ago

Yeah, agreed. In retrospect, some important context may help. The original source file is actually https://storage.googleapis.com/raster-datasources/Sentinel-2%20L2A%20False%20color.tiff, a 5 MB file downloaded from Sentinel Hub. But as our team was sending the file to each other via Slack, the file uploaded to and then downloaded from Slack became an 18 MB file (we were not sure why), which later started to crash titiler for us.

vincentsarago commented 2 days ago

FYI, Slack modifies TIFFs (it creates non-valid COGs, for example). When sharing a file via Slack you should .zip it.

lawrencenika commented 2 days ago

Is there a way for us to tell whether a file has been corrupted by Slack or other software, instead of letting it crash the whole server? Ideally everyone would just .zip it, but I can't guarantee that all users will do that.
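
One possible way to keep a pathological file from taking down the whole server, sketched here under assumptions rather than taken from titiler or rio-cogeo: probe the upload in a child process whose address space is capped, and only pass the file on if the probe succeeds. The helper name `run_with_memory_cap` and the 2 GiB default are hypothetical, and `resource.setrlimit` is POSIX-only, so this would not work on Windows.

```python
import resource
import subprocess


def run_with_memory_cap(cmd, max_bytes=2 * 1024**3, timeout=60):
    """Run cmd in a child process with a capped address space (POSIX only).

    Returns the child's exit code. A non-zero code means the command failed,
    possibly (but not necessarily) because it hit the memory cap.
    """
    # Never try to raise the soft limit above an existing hard limit.
    _, hard = resource.getrlimit(resource.RLIMIT_AS)
    cap = max_bytes if hard == resource.RLIM_INFINITY else min(max_bytes, hard)

    def limit_memory():
        # Runs in the child just before exec; the parent is unaffected.
        resource.setrlimit(resource.RLIMIT_AS, (cap, hard))

    result = subprocess.run(
        cmd,
        preexec_fn=limit_memory,  # not safe in multi-threaded parents
        capture_output=True,
        timeout=timeout,
    )
    return result.returncode
```

For example, a non-zero result from `run_with_memory_cap(["gdalinfo", path])` could be treated as a reason to reject the upload. It does not say why the command failed, so the captured output would still need to be logged and inspected.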

rcoup commented 2 days ago

FYI, Slack modifies TIFFs (it creates non-valid COGs, for example). When sharing a file via Slack you should .zip it.

Wait, what? They rewrite TIFF files?

Apparently, yes: https://digitalflapjack.com/blog/slack-bad-for-gis-rasters/

heads off to file a support ticket ✔️

rouault commented 2 days ago

This is a bug in multiple pieces of software. The generating software should (likely, though I'm not totally sure) not include the following tags when generating a 32-bit file:

Whitepoint (318) RATIONAL (5) 2<0.3127 0.329>
PrimaryChromaticities (319) RATIONAL (5) 6<0.64 0.33 0.3 0.6 0.15 0.06>

It is also a libtiff bug to try to allocate 1 << 32 bytes when synthesizing a transfer function from those. Fix queued in https://gitlab.com/libtiff/libtiff/-/merge_requests/665
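
As a rough back-of-envelope check of why this gets so large: TIFF 6.0 defines a TransferFunction table as `1 << BitsPerSample` entries of 16 bits each. The sketch below only does that arithmetic. The function name is hypothetical, and the assumption that one full-size table is built per colour channel (three for an RGB image) is mine, not something stated in this thread.

```python
def transfer_function_bytes(bits_per_sample, tables=1, entry_bytes=2):
    """Estimate the memory needed for full TransferFunction lookup tables.

    Assumes (per TIFF 6.0) 1 << bits_per_sample entries per table, each
    entry_bytes wide. `tables` is an assumption about the channel count.
    """
    entries = 1 << bits_per_sample
    return entries * entry_bytes * tables


for bits in (8, 16, 32):
    size = transfer_function_bytes(bits, tables=3)
    print(f"{bits:>2}-bit samples: {size} bytes ({size / 1024**3:.6f} GiB) for 3 tables")
```

Under those assumptions, 32-bit samples come out at 8 GiB per table and 24 GiB for three tables, which is at least the same order of magnitude as the "more than 20 GB" reported at the top of this issue.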

rouault commented 2 days ago

And for good measure GDAL fix in 61555de240

jratike80 commented 2 days ago

But as our team was sending the file to each other via Slack, the file uploaded to and then downloaded from Slack became an 18 MB file (we were not sure why), which later started to crash titiler for us.

I do not know what your team is doing and if having georeferenced images is an advantage, but did you notice that this procedure also drops the georeferencing?

lawrencenika commented 2 days ago

But as our team was sending the file to each other via Slack, the file uploaded to and then downloaded from Slack became an 18 MB file (we were not sure why), which later started to crash titiler for us.

I do not know what your team is doing and if having georeferenced images is an advantage, but did you notice that this procedure also drops the georeferencing?

We weren't doing anything fancy. It was the first time we had tried to send a GeoTIFF file to each other before uploading it to titiler: one person was having trouble with titiler and sent the file to another to see if the problem could be resolved. But the second person then broke the titiler server because the file had been transmitted via Slack. Yes, it does drop the georeferencing, as I saw in the titiler error log.

hobu commented 2 days ago

I do not know what your team is doing and if having georeferenced images is an advantage, but did you notice that this procedure also drops the georeferencing?

Slack modifies TIFFs and strips off tags including geolocation information.
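
A cheap pre-check along these lines could flag such files before they reach GDAL. The following is a minimal sketch, not a validator: it reads only the first IFD of a classic (non-BigTIFF) file, and the function names `inspect_tiff` and `warnings_for` are hypothetical. It looks for the standard GeoTIFF tags (33550, 33922, 34264, 34735) and for the WhitePoint (318) and PrimaryChromaticities (319) tags mentioned above on files with more than 24 bits per sample, the threshold taken from the libtiff warning earlier in this thread.

```python
import struct

GEO_TAGS = {33550: "ModelPixelScale", 33922: "ModelTiepoint",
            34264: "ModelTransformation", 34735: "GeoKeyDirectory"}
COLOR_TAGS = {318: "WhitePoint", 319: "PrimaryChromaticities"}
BITS_PER_SAMPLE = 258


def inspect_tiff(data):
    """Return (max BitsPerSample or None, set of tag ids) for the first IFD."""
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("only classic TIFF (magic 42) is handled here")
    (n_entries,) = struct.unpack_from(order + "H", data, ifd_offset)
    tags, bits = set(), None
    for i in range(n_entries):
        entry = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
        tag, typ, count = struct.unpack_from(order + "HHI", data, entry)
        tags.add(tag)
        if tag == BITS_PER_SAMPLE and typ == 3 and count >= 1:  # 3 = SHORT
            # Up to two SHORTs fit inline; otherwise the field is an offset.
            if count <= 2:
                pos = entry + 8
            else:
                (pos,) = struct.unpack_from(order + "I", data, entry + 8)
            bits = max(struct.unpack_from(order + f"{count}H", data, pos))
    return bits, tags


def warnings_for(data):
    """List human-readable reasons to be suspicious of this TIFF."""
    bits, tags = inspect_tiff(data)
    notes = []
    if not tags.intersection(GEO_TAGS):
        notes.append("no GeoTIFF georeferencing tags found")
    if bits is not None and bits > 24 and tags.intersection(COLOR_TAGS):
        notes.append("colorimetry tags on a file with more than 24 bits per sample")
    return notes
```

It could be called as `warnings_for(open(path, "rb").read())`. A missing georeferencing tag does not prove that Slack touched the file (plain TIFFs and files with sidecar world files look the same), so the result is a hint to reject or re-request the file rather than a diagnosis.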