Closed OmlineEditor closed 3 weeks ago
If I run pngcheck over your image, I get
CRC error in chunk pHYs (computed eee74573, expected c76fa864)
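For anyone without pngcheck to hand, the same kind of check can be sketched in plain Python. This is a minimal illustration, not how pngcheck or Pillow implement it; `iter_chunk_crcs` and `bad_chunks` are hypothetical helper names. It relies only on the PNG chunk layout: a 4-byte length, a 4-byte type, the data, then a 4-byte CRC over the type and data.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunk_crcs(data):
    """Yield (chunk_type, stored_crc, computed_crc) for each complete chunk."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 12 <= len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        end = pos + 12 + length
        if end > len(data):
            break  # the last chunk is cut short; stop instead of guessing
        chunk_type = data[pos + 4:pos + 8]
        body = data[pos + 8:pos + 8 + length]
        (stored,) = struct.unpack(">I", data[end - 4:end])
        # The CRC covers the chunk type and data, but not the length field.
        computed = zlib.crc32(chunk_type + body) & 0xFFFFFFFF
        yield chunk_type, stored, computed
        pos = end


def bad_chunks(data):
    """Return the types of all chunks whose stored CRC does not match."""
    return [t for t, stored, computed in iter_chunk_crcs(data) if stored != computed]
```

Reading the file with `open("bug.png", "rb").read()` and passing the bytes to `bad_chunks` should list the chunk types that pngcheck complains about.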
To skip the check in Pillow, use
from PIL import Image, ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True
image_path = "bug.png"
image = Image.open(image_path)
Same issue with convert, although macOS Preview opens it.
% convert bug.png bug.png
convert: pHYs: CRC error `bug.png' @ warning/png.c/MagickPNGWarningHandler/1526.
Actually, convert fixes it:
% convert bug.png bug.png
convert: pHYs: CRC error `bug.png' @ warning/png.c/MagickPNGWarningHandler/1526.
% pngcheck bug.png
OK: bug.png (579x864, 24-bit RGB, non-interlaced, 57.7%).
% convert bug.png bug.png
%
ImageFile.LOAD_TRUNCATED_IMAGES = True
This code helps solve the issue, but it's crucial to ensure there won't be any issues when processing the image further. Could this code affect the functionality, potentially causing problems down the line?
Apart from skipping some checks with PNGs, the other behaviour of LOAD_TRUNCATED_IMAGES is to try and load images that end prematurely.
No, the internal Pillow data will not be in a corrupted state; all operations on the loaded image will be as valid as they ever were. The setting just ignores the fact that the pixels being read from the image are perhaps not what they are supposed to be.
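Because LOAD_TRUNCATED_IMAGES is a module-level setting, it applies to every image opened afterwards. If you only want to tolerate the known-bad scanner files, one option is to enable it for a single open and restore it afterwards. A minimal sketch, assuming single-threaded use; `temporarily_set` and `open_lenient` are hypothetical helpers, not part of Pillow.

```python
from contextlib import contextmanager


@contextmanager
def temporarily_set(obj, name, value):
    """Set obj.<name> to value inside the with-block, then restore the old value."""
    old = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield
    finally:
        setattr(obj, name, old)


def open_lenient(path):
    """Open and fully load one image with LOAD_TRUNCATED_IMAGES enabled."""
    # Imported here so temporarily_set stays usable without Pillow installed.
    from PIL import Image, ImageFile

    with temporarily_set(ImageFile, "LOAD_TRUNCATED_IMAGES", True):
        image = Image.open(path)
        image.load()  # decode the pixels while the flag is still set
    return image
```

Calling `open_lenient("bug.png")` would then skip the check for that file only, while other calls to `Image.open` keep the default behaviour.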
Okay, thanks for the help. The problem is in the scanner, which cannot correctly calculate the checksum for the file. Could you change the code so that, instead of an error, a message is shown saying that the file is damaged and does not have the right CRC? A message about the CRC problem, rather than an error, would be better and easier to understand.
You're requesting that we only raise a warning in this situation?
If the image is corrupted or ends prematurely, I think we both agree that users should know there is something wrong. Whether the user would want to continue using a flawed image anyway is a matter of personal preference, and so there is a setting for it. I'd like there to be a stronger argument before changing Pillow's default setting.
The meaning behind UnidentifiedImageError is documented, specifically mentioning this PNG behaviour - https://pillow.readthedocs.io/en/stable/PIL.html#PIL.UnidentifiedImageError
As some background, the error behaviour has been here since the fork from PIL. It was only #1991 that allowed LOAD_TRUNCATED_IMAGES to work around it.
You might be interested to know that
from PIL import PngImagePlugin
PngImagePlugin.PngImageFile("bug.png")
will show you the SyntaxError directly.
Traceback (most recent call last):
File "demo.py", line 6, in <module>
PngImagePlugin.PngImageFile("bug.png")
File "PIL/ImageFile.py", line 137, in __init__
self._open()
File "PIL/PngImagePlugin.py", line 733, in _open
self.png.crc(cid, s)
File "PIL/PngImagePlugin.py", line 209, in crc
raise SyntaxError(msg)
SyntaxError: broken PNG file (bad header checksum in b'pHYs')
Agree this is an error and we're not going to change to warning. Also super-interesting that the PngImagePlugin raises SyntaxError and reveals the bad checksum. The only change I'd consider making here is to add an option similar to LOAD_TRUNCATED_IMAGES to enable more verbose output from Pillow when the image plugin fails to return an open image to ImagePlugin._open. Not sure what that would look like or if there are any existing verbose options in Pillow, but something like --show-me-what-really-happened.
Okay, let it show an error, but not just “cannot identify image file”. Let there be a more detailed and understandable error, just change only the text of the error message to: “cannot identify image file, the file is damaged, the file has an incorrect CRC signature”
That's not as easy as it sounds.
By default, Pillow checks your image against multiple formats. Some formats can be easily rejected because your image data does not start with the required identifier, but not all.
So if I adjust Pillow to print out the errors raised by any formats against your image
diff --git a/src/PIL/Image.py b/src/PIL/Image.py
index c65cf3850..ab41f525f 100644
--- a/src/PIL/Image.py
+++ b/src/PIL/Image.py
@@ -3333,10 +3333,11 @@ def open(
im = factory(fp, filename)
_decompression_bomb_check(im.size)
return im
- except (SyntaxError, IndexError, TypeError, struct.error):
+ except (SyntaxError, IndexError, TypeError, struct.error) as e:
# Leave disabled by default, spams the logs with image
# opening failures that are entirely expected.
# logger.debug("", exc_info=True)
+ print(i+": "+str(e))
continue
except BaseException:
if exclusive_fp:
I get
PNG: broken PNG file (bad header checksum in b'pHYs')
IM: Syntax error in IM header: �PNG
IMT: not identified by this driver
IPTC: invalid IPTC/NAA file
MPEG: not an MPEG file
PCD: not a PCD file
SPIDER: not a valid Spider file
TGA: not a TGA file
I imagine you don't want to see all of that.
That makes it clearer. Let there be more messages, so that one can understand where the error is and how to fix it.
Those messages would show even if the image opened successfully, because all of the other attempted formats would print their failures.
I imagine you don't want to see all of that.
I think I'd like to be able to say Image.verbose = True and see all that, but I expect that also may not be as easy as it sounds to implement.
It looks like warnings are added to a list that gets shown at the end if the image can't be opened. Exception messages could probably be treated similarly.
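The idea of collecting the per-format messages and showing them only on failure can be illustrated with a toy model. This is not Pillow's code: `open_with_diagnostics`, `UnidentifiedError` and the two opener functions are made up for the example, and the real plugins do far more than check a prefix.

```python
class UnidentifiedError(OSError):
    """Stand-in for PIL.UnidentifiedImageError in this toy model."""


def open_with_diagnostics(data, openers):
    """Try each (name, opener) in turn; report failures only if all of them fail."""
    failures = []
    for name, opener in openers:
        try:
            return opener(data)  # success: collected failures are discarded
        except (SyntaxError, IndexError, TypeError) as e:
            failures.append(f"{name}: {e}")
    msg = "cannot identify image data"
    if failures:
        msg += "\nThe following errors were raised while attempting to open it:\n"
        msg += "\n".join(failures)
    raise UnidentifiedError(msg)


def open_tga(data):
    # Toy opener that rejects everything.
    raise SyntaxError("not a TGA file")


def open_png(data):
    # Toy opener: accepts the PNG prefix unless a fake corruption marker is present.
    if not data.startswith(b"\x89PNG"):
        raise SyntaxError("not a PNG file")
    if b"BADCRC" in data:
        raise SyntaxError("broken PNG file (bad header checksum)")
    return "PNG image"
```

With this shape, a file that opens successfully produces no extra output even though other formats rejected it, while a file that no format accepts carries every rejection reason in the final error.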
I've created https://github.com/python-pillow/Pillow/pull/8033 to allow Image.open("bug.png", warn_possible_formats=True) to show the various exceptions as warnings, but only if the image is not able to be opened successfully. See what you think.
I'm not sure about the scalability of adding Boolean flags here and there.
How about adding it to a logger?
I feel the concern about scalability, but as for a logger, as @nulano pointed out, this is something that previously existed, but was removed in #1423.
I am cautious about making decisions and then undoing them. @wiredfool, as the author of #1423, do you have any thoughts on this?
This is a "nice to have" so I wouldn't add anything for logging or to increase verbose output unless "no other way forward". In this case, it's unfortunate to not get the appropriate information right away, but certainly not critical for us to fix it.
While I'm not sure we should do either of these, I have thought of two options:
- Append all detected issues to the raised UnidentifiedImageError:
File "/usr/local/lib/python3.9/dist-packages/PIL/Image.py", line 3339, in open
raise UnidentifiedImageError(msg)
PIL.UnidentifiedImageError: cannot identify image file '/var/www/python_for_site/bug.png'
The following warnings were raised while attempting to open the file:
PNG: broken PNG file (bad header checksum in b'pHYs')
IM: Syntax error in IM header: �PNG
IMT: not identified by this driver
IPTC: invalid IPTC/NAA file
MPEG: not an MPEG file
PCD: not a PCD file
SPIDER: not a valid Spider file
TGA: not a TGA file
- Add a global setting (similar to MAX_IMAGE_PIXELS) - I agree that a new function parameter for debugging is not very scalable, but a global setting (perhaps even reused from other functions) would not complicate the interface too much.
Right, global setting is what I suggested here too.
Append all detected issues to the raised UnidentifiedImageError
If you append based on the global setting, probably OK. If not, probably not.
I've created #8063 with Image.WARN_POSSIBLE_FORMATS
What did you do?
img = Image.open(path) # I open a file in a script
What did you expect to happen?
The script opens the file and continues execution.
What actually happened?
an error occurs:
What are your OS, Python and Pillow versions?
My Code:
I have worked with a lot of files, but I can't open this one even though it looks normal. I can't open any file that I scan on my scanner in PNG format.
cannot_identify_image_file.zip