LeoHsiao1 / pyexiv2

Read and write image metadata, including EXIF, IPTC, XMP, ICC Profile.
GNU General Public License v3.0

Copying exif data containing duplicate keys #133

Closed: molexx closed this issue 5 months ago

molexx commented 5 months ago

We have an image that for some reason (?) contains two Exif.Image.Orientation tags. pyexiv2 handles this by returning an array with the two values, which is fine, but then when I try to copy the exif data to a new image:

pyexiv2_imagedata_target.modify_exif(pyexiv2_imagedata.read_exif())

modify_exif() raises: AttributeError: 'list' object has no attribute 'encode'.

Here's a snippet of exiv2's output on the image:

$ exiv2 -pa pr containsTwoExifOrientations.jpg 
Exif.Image.Make                              Ascii       6  Apple
Exif.Image.Model                             Ascii      10  iPhone 4S
Exif.Image.Orientation                       Short       1  top, left
Exif.Image.Orientation                       Short       1  top, left
Exif.Image.XResolution                       Rational    1  72
Exif.Image.YResolution                       Rational    1  72
Exif.Image.ResolutionUnit                    Short       1  inch
Exif.Image.Software                          Ascii      16  QuickTime 7.7.1
Exif.Image.DateTime                          Ascii      20  2013:07:08 22:51:49
Exif.Image.HostComputer                      Ascii      16  Mac OS X 10.8.4
Exif.Image.YCbCrPositioning                  Short       1  Centered

It doesn't mention orientation at all if I don't specify the -pa option.

Here's a snippet of the dict from read_exif():

{
 'Exif.Image.Orientation': ['1', '1'],
}

I'm not sure what the expected behaviour is here, or whether I should guard against it on my side. Any thoughts would be appreciated.
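One possible guard, for versions where modify_exif() rejects list values, is to flatten the dict before writing it. The following is a minimal sketch and not part of pyexiv2: `collapse_duplicate_exif` is a hypothetical helper, it assumes that any list in the dict comes from a duplicated key, and keeping the last value is an arbitrary choice that is only harmless when the duplicates agree, as in the Orientation example above.

```python
def collapse_duplicate_exif(exif, pick=-1):
    """Return a copy of an EXIF dict in which list values are reduced to one item.

    `pick` is the index of the value to keep from each list (-1 keeps the last).
    This helper is hypothetical and only works on plain dicts.
    """
    flattened = {}
    for key, value in exif.items():
        if isinstance(value, (list, tuple)):
            if not value:
                continue  # an empty list gives nothing to write, so skip the key
            value = value[pick]
        flattened[key] = value
    return flattened


exif = {
    'Exif.Image.Make': 'Apple',
    'Exif.Image.Orientation': ['1', '1'],
}
print(collapse_duplicate_exif(exif))
# {'Exif.Image.Make': 'Apple', 'Exif.Image.Orientation': '1'}
```

The flattened dict could then be passed to modify_exif(), at the cost of dropping the duplicate entries from the target image.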

LeoHsiao1 commented 5 months ago

Hi! According to the documentation of exiv2, https://exiv2.org/tags.html, there are various data types for image metadata, such as short, ascii, and array. Therefore, when pyexiv2 reads image metadata, it automatically converts it to a list type if there are multiple values. https://github.com/LeoHsiao1/pyexiv2/blob/aa751f0b69b37a4c408777ff11aa4da7d005b568/pyexiv2/convert.py#L54-L61

However, the array data type does not theoretically exist for EXIF metadata. So when pyexiv2 modifies EXIF metadata, it assumes every value is a str and calls encode() on it, which a list does not have. That is why the following error occurs:

>>> img.modify_exif({'Exif.Image.Orientation': ['1', '1']})
Traceback (most recent call last):
    self.img.modify_exif(self._dumps(data), encoding)
AttributeError: 'list' object has no attribute 'encode'

LeoHsiao1 commented 5 months ago

I can modify pyexiv2 so that modify_exif() accepts list values for EXIF keys. But theoretically, an EXIF key is only allowed one value, so it would work like this:

>>> img.modify_exif({'Exif.Image.Orientation': ['1', '2']})
>>> img.read_exif()['Exif.Image.Orientation']
'2'

It's equivalent to executing this:

>>> img.modify_exif({'Exif.Image.Orientation': '1'})
>>> img.modify_exif({'Exif.Image.Orientation': '2'})
>>> img.read_exif()['Exif.Image.Orientation']
'2'

molexx commented 5 months ago

Thank you.

According to the exiv2 docs, duplicate EXIF tags are allowed within the specification: https://exiv2.org/manpage.html#multi_tags.

Could/should modify_exif() create duplicate tags in this scenario?

I would prefer that pyexiv2_imagedata_target.modify_exif(pyexiv2_imagedata.read_exif()) recreated the original data as much as possible.

LeoHsiao1 commented 5 months ago

Thanks for finding this document, I'll try to see if pyexiv2 can write the duplicate EXIF key.

LeoHsiao1 commented 5 months ago

I implemented this feature by writing the key repeatedly in a for loop: https://github.com/LeoHsiao1/pyexiv2/blob/ff8c4c4fbbe9988eb1c738628088f65f3614fec3/pyexiv2/lib/exiv2api.cpp#L261-L270

The effect is as follows:

>>> img.modify_exif({'Exif.Image.Orientation': ['1', '2', '3']})
>>> img.read_exif()['Exif.Image.Orientation']
['1', '2', '3']

But it looks weird: duplicate keys may not make sense and may not be recognized by other software. I've also found that read_exif() converts duplicate keys to Python's list type and may change the order of the list's elements, so it does not achieve an exact copy. So I want to add some new functions for copying metadata from one image to another, based on https://exiv2.org/doc/metacopy_8cpp-example.html
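To illustrate why a round trip through a dict cannot be an exact copy, here is a toy model in plain Python. It does not call pyexiv2 or exiv2: `group_tags` and `flatten_tags` are hypothetical helpers, and the tag sequence is made up for the example. Grouping a flat tag sequence by key keeps the values of each key together, so the position of a duplicate relative to other keys is lost when the dict is written back.

```python
def group_tags(pairs):
    """Group (key, value) pairs into a dict; repeated keys become lists."""
    grouped = {}
    for key, value in pairs:
        if key not in grouped:
            grouped[key] = value
        elif isinstance(grouped[key], list):
            grouped[key].append(value)
        else:
            grouped[key] = [grouped[key], value]
    return grouped


def flatten_tags(grouped):
    """Expand a grouped dict back into (key, value) pairs, one key at a time."""
    pairs = []
    for key, value in grouped.items():
        values = value if isinstance(value, list) else [value]
        pairs.extend((key, v) for v in values)
    return pairs


# A made-up tag sequence in which a duplicated key is interleaved with another key.
original = [
    ('Exif.Image.Orientation', '1'),
    ('Exif.Image.Software', 'QuickTime 7.7.1'),
    ('Exif.Image.Orientation', '2'),
]
round_trip = flatten_tags(group_tags(original))
print(round_trip == original)                  # False: the interleaving is lost
print(sorted(round_trip) == sorted(original))  # True: no value is lost
```

Copying the raw metadata from image to image avoids this intermediate dict entirely, which is the motivation for a dedicated copy function.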

LeoHsiao1 commented 5 months ago

For example, the exif metadata of an image may be in the wrong format and cannot be read and parsed, but it can still be copied to another image.

LeoHsiao1 commented 5 months ago

I implemented a function dedicated to copying metadata. Next I need to add some unit tests. https://github.com/LeoHsiao1/pyexiv2/commit/bb0b72b1bbc292842a795fcefc509bbfb38a230f

LeoHsiao1 commented 5 months ago

I just released v2.12.0. Please see the tutorial.
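For readers without the tutorial at hand, the copy might look like the sketch below. It assumes that the method added in v2.12.0 is `Image.copy_to_another_image()` and that it takes the target `Image` as its first argument; please check the tutorial for the exact signature and options. `copy_all_metadata` is a hypothetical wrapper, its `open_image` parameter exists only so the helper can be tried without real image files, and the file names in the usage comment are placeholders.

```python
def copy_all_metadata(src_path, dst_path, open_image=None):
    """Copy metadata from one image file to another.

    Assumption: pyexiv2 >= 2.12.0 provides Image.copy_to_another_image().
    `open_image` is a hypothetical hook; by default it is pyexiv2.Image.
    """
    if open_image is None:
        import pyexiv2  # imported lazily so the helper can be defined without the library
        open_image = pyexiv2.Image

    src = open_image(src_path)
    try:
        dst = open_image(dst_path)
        try:
            # Copies the metadata directly, without going through read_exif().
            src.copy_to_another_image(dst)
        finally:
            dst.close()
    finally:
        src.close()


# Usage (placeholder file names):
# copy_all_metadata('source.jpg', 'target.jpg')
```

Compared with modify_exif(read_exif()), this avoids converting the metadata to a Python dict and back.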

molexx commented 5 months ago

This is great, thank you! 2.12 is working well for me.