LeoHsiao1 / pyexiv2

Read and write image metadata, including EXIF, IPTC, XMP, ICC Profile.
GNU General Public License v3.0
202 stars 39 forks

Expose `ExifKey` from `libexiv2` for easy interoperability #147

Open cmahnke opened 3 weeks ago

cmahnke commented 3 weeks ago

As far as I can see, it's currently not possible to use Exif tag numbers directly. A use case would be better interoperability with other modules like Pillow. A solution could be the following:

This way it would be possible to get better interoperability with fine-grained control (no need to copy everything using a byte buffer).

LeoHsiao1 commented 3 weeks ago

Well, most users don't care about tag numbers, so pyexiv2 never reads or writes them.

  1. Reading the tag number is quite simple: I just need to call the exiv2 API tag(). I can read numbers like this:

    [tag name  ] Exif.Image.Artist
    [tag number] 315
    [tag name  ] Exif.Image.Rating
    [tag number] 18246
    [tag name  ] Exif.Image.RatingPercent
    [tag number] 18249
    [tag name  ] Exif.Image.Copyright
    [tag number] 33432
    [tag name  ] Exif.Image.ExifTag
    [tag number] 34665

    I'll add this feature in the next release of pyexiv2.

  2. Writing metadata based on the tag number can be tricky. Most users enter only the tag name, not the tag number, so pyexiv2 would have to determine the tag number corresponding to each tag name automatically. That requires storing a mapping table in the pyexiv2 source code. But I don't understand why Pillow needs to write metadata based on the tag number. Also, to keep the code simple, I don't even respect the tag type: Exif tags have multiple data types, but I usually write them as str. https://github.com/LeoHsiao1/pyexiv2/blob/cbc6765da9d9ea89486e43e0532e43f7822e0431/pyexiv2/convert.py#L103-L104

  3. As for the byte buffer: when exiv2 opens an image, it must call img->readMetadata() to load all the metadata and discover the byte offset of each tag. So it can't read or write only one tag.
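
The mapping table mentioned in item 2 could look roughly like the sketch below. It is only an illustration: `TAG_NUMBERS` and `tag_number()` are hypothetical names, not part of pyexiv2, and the table holds just the five tags listed in item 1. The key is split into family, group and tag name, following the `Exif.Image.Artist` pattern.

```python
# Hypothetical name-to-number table; only the five tags listed above.
TAG_NUMBERS = {
    ("Image", "Artist"): 315,
    ("Image", "Rating"): 18246,
    ("Image", "RatingPercent"): 18249,
    ("Image", "Copyright"): 33432,
    ("Image", "ExifTag"): 34665,
}


def tag_number(key: str) -> int:
    """Return the tag number for a full key such as 'Exif.Image.Artist'."""
    family, group, name = key.split(".")
    if family != "Exif":
        raise ValueError(f"not an Exif key: {key!r}")
    # Raises KeyError for tags missing from this small table.
    return TAG_NUMBERS[(group, name)]


print(tag_number("Exif.Image.Artist"))  # 315
```

A real table would need an entry for every tag exiv2 knows, which is where the maintenance cost comes from.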

cmahnke commented 3 weeks ago

Well, thanks for looking into it. My proposal was a bit simpler: just provide a translation table...

A made-up example would be:

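# image is assumed to be a PIL.Image.Image opened elsewhere.
img_exif = image.getexif()  # maps numeric tag IDs to values
exiv2_dict = {}
for k, v in img_exif.items():
    tag = ExifKey.from_tag(k)  # proposed API; does not exist in pyexiv2
    exiv2_dict[tag.key] = v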

Note: This example doesn't type-check the value; that would be the user's responsibility.

Where ExifKey would have the following methods:

This way there would be two ways to interoperate: by number and by name, since it's possible to use names with Pillow, but these don't follow the hierarchy.

References:

Using this approach would avoid the need to support writing by number; it's up to the programmer to do the mapping for writing...
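
To make the translation-table idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `ExifKey` class below is a stand-in for the proposed API (pyexiv2 has no such class), `IMAGE_TAG_NAMES` covers only three tags from the `Image` group, and a plain dict with made-up values stands in for the mapping returned by Pillow's `image.getexif()`.

```python
from dataclasses import dataclass

# Hypothetical translation table for the "Image" group only.
IMAGE_TAG_NAMES = {
    315: "Artist",
    33432: "Copyright",
    34665: "ExifTag",
}


@dataclass(frozen=True)
class ExifKey:
    """Stand-in for the proposed class; not an existing pyexiv2 API."""
    tag: int
    group: str
    name: str

    @property
    def key(self) -> str:
        # Full hierarchical name, e.g. 'Exif.Image.Artist'.
        return f"Exif.{self.group}.{self.name}"

    @classmethod
    def from_tag(cls, tag: int) -> "ExifKey":
        # This sketch only knows the "Image" group.
        return cls(tag, "Image", IMAGE_TAG_NAMES[tag])


# Plain dict with made-up values, standing in for image.getexif().
img_exif = {315: "Jane Doe", 33432: "(c) Jane Doe"}
exiv2_dict = {ExifKey.from_tag(k).key: v for k, v in img_exif.items()}
print(exiv2_dict)
```

Note that `from_tag()` here silently assumes the `Image` group, which only works as long as tag numbers are unique within the table.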

LeoHsiao1 commented 3 weeks ago

I tried calling ExifKey() of exiv2:

#include <exiv2/exiv2.hpp>
#include <iostream>
int main() {
    // Build a key from a tag number plus the group it belongs to.
    Exiv2::ExifKey key(34665, "Image");
    std::cout << key.key()        << std::endl;
    std::cout << key.familyName() << std::endl;
    std::cout << key.groupName()  << std::endl;
    std::cout << key.tagName()    << std::endl;
    std::cout << key.tag()        << std::endl;
}

It outputs:

Exif.Image.ExifTag
Exif
Image
ExifTag
34665

I noticed that exiv2 uses decimal for the tag number, while Pillow uses hexadecimal, but that's easy to convert. This code can be wrapped in a Python function and then used with Pillow. It works on one tag at a time, without opening the image.
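
For reference, a small sketch of the decimal/hexadecimal conversion in Python. The helper names `to_hex()` and `from_hex()` are made up for this example. Both notations denote the same integer, so the conversion only matters for display or for parsing text.

```python
def to_hex(tag: int) -> str:
    """Format a tag number as four hexadecimal digits, e.g. '0x8769'."""
    return f"0x{tag:04X}"


def from_hex(text: str) -> int:
    """Parse a hexadecimal tag string such as '0x8769' into an int."""
    return int(text, 16)


print(to_hex(34665))       # 0x8769
print(from_hex("0x8769"))  # 34665
print(0x8769 == 34665)     # True
```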

However, here's the bad news. When calling ExifKey(), you need to pass not only the tag number but also the groupName. Because there are multiple standards for Exif, the same tag number can appear in more than one group. For example:

0x0001  Exif.Canon.CameraSettings
0x0001  Exif.Nikon1.Version
0x0001  Exif.Samsung2.Version

This is a paradox: if you already know the groupName, you should already know the full tag name as well, so you don't need to convert the tag number into the tag name.
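
The ambiguity can be shown with the three entries above. This is a sketch using plain dicts, not an exiv2 or pyexiv2 API: a lookup by number alone returns several candidates, while a lookup by (group, number) is unique.

```python
from collections import defaultdict

# The three entries quoted above, as (group, tag number, tag name).
ENTRIES = [
    ("Canon", 0x0001, "CameraSettings"),
    ("Nikon1", 0x0001, "Version"),
    ("Samsung2", 0x0001, "Version"),
]

by_number = defaultdict(list)   # tag number -> every matching key
by_group_and_number = {}        # (group, tag number) -> exactly one key
for group, number, name in ENTRIES:
    key = f"Exif.{group}.{name}"
    by_number[number].append(key)
    by_group_and_number[(group, number)] = key

print(len(by_number[0x0001]))                   # 3
print(by_group_and_number[("Nikon1", 0x0001)])  # Exif.Nikon1.Version
```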

Alternatively, we can manually save all possible Exif tags and their corresponding tag numbers as a Python dict. But this dict would need to be updated frequently to stay synchronized with exiv2. https://github.com/Exiv2/exiv2/blob/main/src/tags_int.cpp https://exiv2.org/metadata.html