Closed ckanaar closed 1 year ago
Hi. Looking at your traceback, I think we actually have a fix in progress for this already - #7274. Are you able to test that and see if it solves your problem?
Thank you for the reply. How can I test this fix though, as it seems to be implemented in the source code? (I'm rather new to this, so apologies for any obvious questions).
No worries, not a silly question at all.
You could
a) download the code from https://github.com/radarhere/Pillow/tree/jpeg_xmp (the branch of the PR) and install Pillow from source out of that directory, or
b) since it is just a Python change, find the path to your installed copy of Pillow, open Image.py in your favourite editor, and change the code directly.
But the simplest, least invasive method would be c) just monkeypatch Pillow - meaning, replace the code at runtime.
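In case the term is new, here is a minimal sketch of what monkeypatching means. The class and function names are made up for illustration and have nothing to do with Pillow. Assigning a new function to a class attribute changes the behaviour of every instance, including ones created before the patch.

```python
class Greeter:
    """Hypothetical stand-in for a class that lives inside a library."""

    def greet(self):
        return "hello"


def patched_greet(self):
    # Replacement implementation, defined entirely outside the "library"
    return "hello, patched"


g = Greeter()               # instance created before the patch
before = g.greet()          # uses the original method

# The monkeypatch: rebind the method on the class at runtime
Greeter.greet = patched_greet

after = g.greet()           # the existing instance now uses the replacement
print(before, "->", after)
```

Nothing on disk is modified, so the change only lasts for the current Python process.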
So, for Option C, just run the following code and kindly report back if it works.
import re
import warnings

from PIL import Image

# Image.py falls back to None when defusedxml is not installed
try:
    from defusedxml import ElementTree
except ImportError:
    ElementTree = None


def _getxmp(self, xmp_tags):
    def get_name(tag):
        # Strip the leading "{namespace}" from a tag or attribute name
        return re.sub("^{[^}]+}", "", tag)

    def get_value(element):
        # Start from the element's attributes, then merge in child elements
        value = {get_name(k): v for k, v in element.attrib.items()}
        children = list(element)
        if children:
            for child in children:
                name = get_name(child.tag)
                child_value = get_value(child)
                if name in value:
                    # Repeated names are collected into a list
                    if not isinstance(value[name], list):
                        value[name] = [value[name]]
                    value[name].append(child_value)
                else:
                    value[name] = child_value
        elif value:
            if element.text:
                value["text"] = element.text
        else:
            return element.text
        return value

    if ElementTree is None:
        warnings.warn("XMP data cannot be read without defusedxml dependency")
        return {}
    else:
        root = ElementTree.fromstring(xmp_tags)
        return {get_name(root.tag): get_value(root)}


# Monkeypatch: replace the method on the class at runtime
Image.Image._getxmp = _getxmp

image = Image.open("lib/filename_example_1.jpg")
raw_xmp_info = image.getxmp()
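To see the shape of the dictionary this parsing logic produces without needing a real image, the same two helpers can be run on a small XMP packet. This is only a sketch: it swaps in the standard library's xml.etree.ElementTree for defusedxml so that it runs without extra dependencies, and the packet contents are invented for illustration.

```python
import re
from xml.etree import ElementTree  # assumption: stand-in for defusedxml.ElementTree


def get_name(tag):
    # ElementTree reports names as "{namespace}local"; keep only "local"
    return re.sub(r"^\{[^}]+\}", "", tag)


def get_value(element):
    # Attributes first, then child elements merged in by name
    value = {get_name(k): v for k, v in element.attrib.items()}
    children = list(element)
    if children:
        for child in children:
            name = get_name(child.tag)
            child_value = get_value(child)
            if name in value:
                if not isinstance(value[name], list):
                    value[name] = [value[name]]
                value[name].append(child_value)
            else:
                value[name] = child_value
    elif value:
        if element.text:
            value["text"] = element.text
    else:
        return element.text
    return value


# Invented sample packet, not taken from either image in this issue
sample_xmp = (
    '<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    '<rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/" rdf:about="">'
    "<dc:format>image/jpeg</dc:format>"
    "</rdf:Description>"
    "</rdf:RDF>"
    "</x:xmpmeta>"
)

root = ElementTree.fromstring(sample_xmp)
result = {get_name(root.tag): get_value(root)}
print(result)
```

The top-level key is the root tag with its namespace stripped, which matches the 'xmpmeta' key that getxmp() is expected to return.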
Thank you for your reply. I have tested my JPEG using the code snippet you provided and it successfully loads the XMP data, so the fix works.
What did you do?
I'm loading a .jpg image using Pillow in an attempt to extract the XMP metadata:
What did you expect to happen?
I'm expecting getxmp() to return a dictionary containing the image's XMP metadata: {'xmpmeta': {<metadata>}}.
What actually happened?
Pillow can't extract the XMP metadata from the image:
What are your OS, Python and Pillow versions?
Additional information.
Printing image.info reveals:
Due to confidentiality reasons, I cannot provide the image that I'm trying to process. However, I will provide the anonymised XMP metadata of the image that causes the error (filename_example_1.jpg) and one which doesn't cause an error (filename_example_2.jpg). This XMP metadata was extracted using https://www.imgonline.com.ua/. This shows that the metadata is present for both the working and the erroneous files.
filename_example_1.jpg (erroneous):
filename_example_2.jpg (successfully loaded by Pillow):
If anyone is able to provide some insights into this bug, please let me know!