LeoHsiao1 / pyexiv2

Read and write image metadata, including EXIF, IPTC, XMP, ICC Profile.
GNU General Public License v3.0

Auto register additional XMP namespaces #128

Closed. molexx closed this 6 months ago

molexx commented 7 months ago

For some images, modifying XMP data through the dict fails: the call to modify_xmp() raises an error like:

RuntimeError: No namespace info available for XMP prefix `Item'

Manually adding a call to registerNs() fixes this.

Perhaps pyexiv2 could automatically register any namespaces it encounters during read_xmp() please?

Here's a failing example. GooglePixel8.jpg is from https://exiftool.org/Google.tar.gz

import pyexiv2

with pyexiv2.Image("GooglePixel8.jpg") as pyexiv2_imagedata:

    # read xmp data as a dict
    xmp_dict = pyexiv2_imagedata.read_xmp()

    xmp_raw = pyexiv2_imagedata.read_raw_xmp()
    print('xmp_raw:\n' + xmp_raw)

    # uncomment the line below to make modify_xmp() succeed
    # pyexiv2.registerNs('http://ns.google.com/photos/1.0/container/item/', 'Item')

    # write the xmp dict back to the ImageData
    pyexiv2_imagedata.modify_xmp(xmp_dict)
LeoHsiao1 commented 7 months ago

Hi! According to the documentation (https://exiv2.org/manpage.html#78-xmp-namespaces), exiv2 has the standard prefixes and namespaces built in. So the error "No namespace info available for XMP prefix" reminds users to check that they are writing the correct key. For example, some users may pass EXIF or IPTC keys when calling img.modify_xmp(). Therefore, I prefer to ask the user to call pyexiv2.registerNs() manually.

Similarly, when the user modifies EXIF or IPTC metadata, exiv2 does not allow writing non-standard keys or even registering custom keys.

molexx commented 7 months ago

Hi and thanks.

I see your point that it is good to make a user explicitly create a non-standard tag.

But in this case a user is making a minor edit, perhaps just the description, to an existing image. They don't know what the existing tags are, but they want them kept. There is also the case where we convert an uploaded image and just copy the existing metadata over to the new image. If read_xmp() discovers a non-standard namespace, could it assume the namespace is valid? Perhaps this could be an option that defaults to False?

If that's not right for pyexiv2, I'd appreciate your thoughts on the best way to do it. My idea: between every read_xmp() and modify_xmp(), also call read_raw_xmp(), search the XML for namespace declarations (hopefully doable with a regex, to avoid the cost of parsing the whole XML), and call registerNs() for every namespace found.
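
A minimal sketch of that regex idea, for illustration only. The names find_xmp_namespaces and register_namespaces are hypothetical and not part of pyexiv2. The sketch assumes namespaces appear as plain xmlns:prefix="uri" attributes in the raw packet, and it skips the x, rdf and xml prefixes on the assumption that they belong to the packet wrapper rather than to properties. The register argument is any callable taking (namespace, prefix), the same argument order as the registerNs() call in the example above.

```python
import re

# Matches declarations such as
#   xmlns:Item="http://ns.google.com/photos/1.0/container/item/"
# with either single or double quotes around the URI.
_XMLNS_RE = re.compile(r"""\bxmlns:([A-Za-z_][\w.\-]*)\s*=\s*(["'])(.*?)\2""")

# Assumption: these prefixes belong to the XMP/RDF wrapper, not to properties.
_SKIP_PREFIXES = {"x", "rdf", "xml"}


def find_xmp_namespaces(xmp_raw):
    """Return {prefix: namespace_uri} for each xmlns declaration in the packet."""
    found = {}
    for prefix, _quote, uri in _XMLNS_RE.findall(xmp_raw):
        if prefix not in _SKIP_PREFIXES:
            # Keep the first URI seen for a prefix.
            found.setdefault(prefix, uri)
    return found


def register_namespaces(xmp_raw, register):
    """Call register(namespace_uri, prefix) for every declared namespace."""
    namespaces = find_xmp_namespaces(xmp_raw)
    for prefix, uri in namespaces.items():
        register(uri, prefix)
    return namespaces
```

With pyexiv2 this would presumably be called as register_namespaces(img.read_raw_xmp(), pyexiv2.registerNs) between read_xmp() and modify_xmp(). A regex will also match xmlns text that appears inside property values, so a real XML parser is the safer choice if that matters.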

LeoHsiao1 commented 7 months ago

If you are just copying existing metadata to a new image, the following code is recommended:

data = img1.read_raw_xmp()
img2.modify_raw_xmp(data)

This way, exiv2 only reads and writes a piece of text in XML format. It does not parse which keys the text contains or check whether the format is correct. Another advantage is that handling the raw XMP directly is faster, because the XML text is never parsed.
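
For completeness, a slightly fuller sketch of the same copy. The helper names copy_raw_xmp and copy_raw_xmp_between_files are hypothetical, and the file paths are whatever the caller supplies. It only uses read_raw_xmp() and modify_raw_xmp() as shown above, and it assumes an empty packet means the source has no XMP to copy.

```python
def copy_raw_xmp(src_img, dst_img):
    """Copy the XMP packet verbatim from src_img to dst_img and return it."""
    packet = src_img.read_raw_xmp()
    if packet:  # assumption: an empty string means there is nothing to copy
        dst_img.modify_raw_xmp(packet)
    return packet


def copy_raw_xmp_between_files(src_path, dst_path):
    """Open both files with pyexiv2 and copy the raw XMP across."""
    import pyexiv2  # imported here so copy_raw_xmp() has no hard dependency

    with pyexiv2.Image(src_path) as src, pyexiv2.Image(dst_path) as dst:
        return copy_raw_xmp(src, dst)
```

Because the packet is passed through as text, no namespace needs to be registered for this path.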

molexx commented 6 months ago

Thanks. The copy works and standard keys can be updated using the dict so I'm good for now.

At some point in the future I'm going to need to let a user edit any of the existing fields, and it'll be easier to use the dict rather than the XML for that, so when I get to that point I'll implement something as described above.