pywikibot-catfiles / file-metadata

A python package to analyze files and provide useful metadata
MIT License

PIL: DecompressionBombWarning #33

Closed AbdealiLoKo closed 8 years ago

AbdealiLoKo commented 8 years ago

From https://travis-ci.org/AbdealiJK/file-metadata/jobs/136607124

445 . Analyzing File:Herman Moll. The Turkish Empire in Europe, Asia and Africa. 1752.jpg
WARNING: /home/travis/miniconda/envs/py2.7/lib/python2.7/site-packages/PIL/Image.py:2238: DecompressionBombWarning: Image size (104956452 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
WARNING:py.warnings:/home/travis/miniconda/envs/py2.7/lib/python2.7/site-packages/PIL/Image.py:2238: DecompressionBombWarning: Image size (104956452 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
>           raise ValueError(error_message)
E           ValueError: Could not load "" 
E           Reason: "image file is truncated (66 bytes not processed)"
E           Please see documentation at: http://pillow.readthedocs.org/en/latest/installation.html#external-libraries

AbdealiLoKo commented 8 years ago

It seems PIL is just trying to tell us that this file is HUGE once decoded, even though it is small on disk because of compression. So, if PIL decompresses this file into memory, it may hang or crash your system.

For example, the file https://commons.wikimedia.org/wiki/File:Grand_paris_express.png is 17,688 × 14,315 pixels but only 4.78 MB on disk (it was generated from an SVG). Loaded into a 2D int array, its 253,203,720 pixels would need about 1 GB of memory (assuming a 4-byte int) just to open the file.
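To make the arithmetic concrete, here is a minimal sketch of the estimate. The helper name `decoded_size_bytes` and the 4-bytes-per-pixel default are assumptions for illustration; the pixel limit is the one quoted in the warning from the Travis log.

```python
# Limit quoted in the DecompressionBombWarning from the Travis log.
PIL_PIXEL_LIMIT = 89478485


def decoded_size_bytes(width, height, bytes_per_pixel=4):
    """Memory needed to hold the fully decoded image (hypothetical helper)."""
    return width * height * bytes_per_pixel


# Dimensions of Grand_paris_express.png as given above.
width, height = 17688, 14315
pixels = width * height
size_bytes = decoded_size_bytes(width, height)

print(pixels)                    # 253203720
print(size_bytes)                # size in bytes at 4 bytes per pixel
print(pixels > PIL_PIXEL_LIMIT)  # whether the image is over the limit
```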

In such cases, we should probably throw a better error saying "This file is huge." and provide an override flag to force PIL to read the image.

Reference : http://stackoverflow.com/questions/25705773/image-cropping-tool-python

jayvdb commented 8 years ago

If I understand correctly, you have two things here: a Python warning, and then a ValueError, probably caused by an out-of-memory error.

Ideally you need to catch the warning and then, by default, not proceed to process the file, as it will kill the user's workstation (Linux is not very good at managing a process that unexpectedly requests very large chunks of memory). A parameter or config option could be used to override this and try to load the file anyway.

AbdealiLoKo commented 8 years ago

`warnings.simplefilter('error', Image.DecompressionBombWarning)` can be used to convert the warning into an error, which can then be caught.
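A rough sketch of how that could be combined with the override flag discussed above. `HugeImageError`, `open_guarded` and the `force` parameter are hypothetical names, not part of PIL or file-metadata. To keep the example runnable without Pillow, the opener and warning class are passed in and a stand-in is used for the demo; with Pillow one would presumably pass `Image.open` and `Image.DecompressionBombWarning`.

```python
import warnings


class HugeImageError(Exception):
    """Hypothetical error raised instead of the raw PIL warning."""


def open_guarded(opener, path, bomb_warning, force=False):
    """Open `path`, refusing huge images unless `force` is set."""
    with warnings.catch_warnings():
        # 'error' turns the warning into an exception; 'ignore' silences it.
        action = 'ignore' if force else 'error'
        warnings.simplefilter(action, bomb_warning)
        try:
            return opener(path)
        except bomb_warning as exc:
            raise HugeImageError('This file is huge: {0}'.format(exc))


# Stand-ins so the sketch runs without Pillow installed (assumptions).
class FakeBombWarning(RuntimeWarning):
    pass


def fake_open(path):
    warnings.warn('Image size (104956452 pixels) exceeds limit',
                  FakeBombWarning)
    return 'image:' + path


try:
    open_guarded(fake_open, 'map.jpg', FakeBombWarning)
except HugeImageError as err:
    print(err)

print(open_guarded(fake_open, 'map.jpg', FakeBombWarning, force=True))
```

Using `warnings.catch_warnings()` keeps the filter change local to this call, so the rest of the program still sees the warning with its default behaviour.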

drtrigon commented 8 years ago

And soon after the first analysis, scale it down if possible.
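One way to pick the scaled-down size is to shrink both sides by the same factor until the pixel count fits a budget. This is only a sketch of the arithmetic: `fit_within` is a hypothetical helper, and using the limit from the warning as the budget is an assumption. It computes target dimensions only; the image still has to be decoded once before it can be resized.

```python
import math


def fit_within(width, height, max_pixels):
    """Return (width, height) scaled to at most `max_pixels`, keeping aspect ratio."""
    pixels = width * height
    if pixels <= max_pixels:
        return width, height
    # Scaling each side by sqrt(ratio) scales the area by ratio.
    scale = math.sqrt(max_pixels / float(pixels))
    # int() rounds down, so the product stays within the budget.
    return max(1, int(width * scale)), max(1, int(height * scale))


new_width, new_height = fit_within(17688, 14315, 89478485)
print(new_width, new_height)
```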
