hMatoba / Piexif

Exif manipulation with pure python script.
MIT License

Reading and writing exif values does not preserve values #51

Closed orangelynx closed 6 years ago

orangelynx commented 6 years ago

When using the library to read exif data from an image (using load on a jpg), and rewriting the loaded exif data back to the image (using insert), not all exif values remain unchanged (as they should be).

For example, this example code led (among other things) to exif subsecond information being lost (apparently because it cannot be parsed correctly) and subsequently changed some offset values.

exif = piexif.load("sample.jpg")
piexif.insert(piexif.dump(exif), "sample.jpg")

This should definitely be fixed.

hMatoba commented 6 years ago

There is a round-trip test (dump, load): https://github.com/hMatoba/Piexif/blob/master/tests/s_test.py#L236

Could you give me the "sample.jpg"?

orangelynx commented 6 years ago

My apologies, most of the behaviour was actually my fault, due to a rather concealed bug.

However, some information (the SubSec fields and consequently some offset fields) is not preserved in this image taken with a Samsung Galaxy S8:

sample.zip

I have updated the issue to reflect that.

PS: Uploading the image directly to GitHub also appears to change the exif information.

hMatoba commented 6 years ago

from PIL import Image
import piexif

e = piexif.load("sample.jpg")
e_bytes = piexif.dump(e)
piexif.insert(e_bytes, "insert.jpg")

insert <- output:"insert.jpg"

orangelynx commented 6 years ago

Is the output image supposed to be a small black square?

hMatoba commented 6 years ago

sample.jpg http://exif.regex.info/exif.cgi

Exif Image Size | 4,032 × 3,024
Make | samsung
Camera Model Name | SM-G950F
Software | G950FXXU1AQK7
Modify Date | 2017:12:12 12:14:30 (1 month, 24 days, 13 hours, 7 minutes, 37 seconds ago)
Y Cb Cr Positioning | Centered
Exposure Time | 1/356
F Number | 1.70
Exposure Program | Program AE
ISO | 40
Exif Version | 0220
Date/Time Original | 2017:12:12 12:14:30 (1 month, 24 days, 13 hours, 7 minutes, 37 seconds ago)
Create Date | 2017:12:12 12:14:30 (1 month, 24 days, 13 hours, 7 minutes, 37 seconds ago)
Components Configuration | Y, Cb, Cr, -
Shutter Speed Value | 1/357
Aperture Value | 1.70
Brightness Value | 6.16
Exposure Compensation | 0
Max Aperture Value | 1.7
Metering Mode | Center-weighted average
Flash | No Flash
Focal Length | 4.2 mm
Image Size | 504 × 376
Maker Note | Unknown (98 bytes binary data)
User Comment |
Sub Sec Time | 0,200
Sub Sec Time Original | 0,200
Sub Sec Time Digitized | 0,200
Flashpix Version | 0100
Color Space | sRGB
Interoperability Index | R98 - DCF basic file (sRGB)
Interoperability Version | 0100
Exposure Mode | Auto
White Balance | Auto
Focal Length In 35mm Format | 26 mm
Scene Capture Type | Standard
Image Unique ID | F12LLJA00VM F12LLKG01GM
Compression | JPEG (old-style)
Orientation | Rotate 90 CW
Resolution | 72 pixels/inch
Thumbnail Length | 36,099
Thumbnail Image | (36,099 bytes binary data)

hMatoba commented 6 years ago

insert.jpg http://exif.regex.info/exif.cgi

Make | samsung
Camera Model Name | SM-G950F
Software | G950FXXU1AQK7
Modify Date | 2017:12:12 12:14:30 (1 month, 24 days, 13 hours, 20 minutes, 11 seconds ago)
Y Cb Cr Positioning | Centered
Exposure Time | 1/356
F Number | 1.70
Exposure Program | Program AE
ISO | 40
Exif Version | 0220
Date/Time Original | 2017:12:12 12:14:30 (1 month, 24 days, 13 hours, 20 minutes, 11 seconds ago)
Create Date | 2017:12:12 12:14:30 (1 month, 24 days, 13 hours, 20 minutes, 11 seconds ago)
Components Configuration | Y, Cb, Cr, -
Shutter Speed Value | 1/357
Aperture Value | 1.70
Brightness Value | 6.16
Exif Image Size | 4,032 × 3,024
Exposure Compensation | 0
Max Aperture Value | 1.7
Metering Mode | Center-weighted average
Flash | No Flash
Focal Length | 4.2 mm
Image Size | 504 × 376
Maker Note | Unknown (98 bytes binary data)
User Comment |
Sub Sec Time | 0,200
Sub Sec Time Original | 0,200
Sub Sec Time Digitized | 0,200
Flashpix Version | 0100
Color Space | sRGB
Exposure Mode | Auto
White Balance | Auto
Focal Length In 35mm Format | 26 mm
Scene Capture Type | Standard
Image Unique ID | F12LLJA00VM F12LLKG01GM
Interoperability Index | R98 - DCF basic file (sRGB)
Compression | JPEG (old-style)
Orientation | Rotate 90 CW
Resolution | 72 pixels/inch
Thumbnail Length | 36,099
Thumbnail Image | (36,099 bytes binary data)

hMatoba commented 6 years ago

Makernote

sample.jpg

Unknown 0x0001 | 0,100
Unknown 0x0002 | 73,728
Unknown 0x000c | 0
Unknown 0x0010 | undef
Unknown 0x0040 | 0
Unknown 0x0050 | 1
Unknown 0x0100 | 0
Time Stamp | 2017:12:11 19:14:30-08:00 (1 month, 25 days, 6 hours, 7 minutes, 37 seconds ago)
Samsung Trailer 0x0aa1 Name | MCC_Data
Samsung Trailer 0x0aa1 | (3 bytes binary data)

insert.jpg

Unknown 0x0001 | 0,100
Unknown 0x0002 | 73,728
Unknown 0x000c | 0
Unknown 0x0010 | undef
Unknown 0x0040 | 0
Unknown 0x0050 | 1
Unknown 0x0100 | 0

hMatoba commented 6 years ago

Yes, the black square is the output.

hMatoba commented 6 years ago

sample.jpg piexif.load

Exif
36864: b'0220'
37121: b'\x01\x02\x03\x00'
37378: (153, 100)
36867: b'2017:12:12 12:14:30'
36868: b'2017:12:12 12:14:30'
37381: (153, 100)
37510: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
37383: 2
37385: 0
37386: (420, 100)
40962: 4032
37520: b'0200'
37521: b'0200'
37522: b'0200'
37379: (616, 100)
41986: 0
41989: 26
37380: (0, 10)
33434: (1, 356)
33437: (17, 10)
40965: 840
40963: 3024
34850: 2
40961: 1
40960: b'0100'
34855: 40
41987: 0
41990: 0
42016: b'F12LLJA00VM F12LLKG01GM\n'
37377: (848, 100)
37500: b'\x07\x00\x01\x00\x07\x00\x04\x00\x00\x000100\x02\x00\x04\x00\x01\x00\x00\x00\x00 \x01\x00\x0c\x00\x04\x00\x01\x00\x00\x00\x00\x00\x00\x00\x10\x00\x05\x00\x01\x00\x00\x00Z\x00\x00\x00@\x00\x04\x00\x01\x00\x00\x00\x00\x00\x00\x00P\x00\x04\x00\x01\x00\x00\x00\x01\x00\x00\x00\x00\x01\x03\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

1st
256: 504
257: 376
274: 6
259: 6
513: 1000
296: 2
282: (72, 1)
283: (72, 1)
514: 36099

0th
272: b'SM-G950F'
305: b'G950FXXU1AQK7'
274: 6
531: 1
296: 2
34665: 202
282: (72, 1)
283: (72, 1)
306: b'2017:12:12 12:14:30'
271: b'samsung'

GPS

Interop
1: b'R98'

hMatoba commented 6 years ago

insert.jpg piexif.load

Exif
36864: b'0220'
37121: b'\x01\x02\x03\x00'
37378: (153, 100)
36867: b'2017:12:12 12:14:30'
36868: b'2017:12:12 12:14:30'
37381: (153, 100)
37510: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
37383: 2
37385: 0
37386: (420, 100)
40962: 4032
37520: b'0200'
37521: b'0200'
37522: b'0200'
37379: (616, 100)
41986: 0
37380: (0, 10)
33434: (1, 356)
40965: 830
33437: (17, 10)
41989: 26
40963: 3024
34850: 2
40961: 1
40960: b'0100'
34855: 40
41987: 0
41990: 0
42016: b'F12LLJA00VM F12LLKG01GM\n'
37377: (848, 100)
37500: b'\x07\x00\x01\x00\x07\x00\x04\x00\x00\x000100\x02\x00\x04\x00\x01\x00\x00\x00\x00 \x01\x00\x0c\x00\x04\x00\x01\x00\x00\x00\x00\x00\x00\x00\x10\x00\x05\x00\x01\x00\x00\x00Z\x00\x00\x00@\x00\x04\x00\x01\x00\x00\x00\x00\x00\x00\x00P\x00\x04\x00\x01\x00\x00\x00\x01\x00\x00\x00\x00\x01\x03\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

1st
256: 504
257: 376
274: 6
259: 6
513: 974
296: 2
282: (72, 1)
283: (72, 1)
514: 36099

0th
272: b'SM-G950F'
305: b'G950FXXU1AQK7'
274: 6
531: 1
296: 2
34665: 201
282: (72, 1)
283: (72, 1)
306: b'2017:12:12 12:14:30'
271: b'samsung'

GPS

Interop
1: b'R98'

orangelynx commented 6 years ago

Interesting. I have used IrfanView for exif inspection so far, so this is possibly a bug in IrfanView:

comp

The regex.info exif tool also shows a change in endianness between input and output. I'll do a binary comparison when I have time.

hMatoba commented 6 years ago

The InteroperabilityOffset value has changed, but the Interoperability Index itself has not. Some offset values (ExifOffset as well) are just pointers.
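To separate pointer changes from real metadata changes, the two loaded dictionaries can be compared while skipping the offset tags. The following is only a rough sketch, not part of piexif: the helper name `diff_ignoring_pointers` and the `POINTER_TAGS` table are made up for illustration, and the sample values are the ones that differ in the two dumps above (34665, 40965, 513). It works on plain dicts shaped like the output of `piexif.load`, so it does not need piexif itself.

```python
# Tags whose values are byte offsets into the file rather than metadata.
# This table is an illustrative assumption, not something piexif exports.
POINTER_TAGS = {
    "0th": {34665, 34853},  # ExifOffset, GPS info pointer
    "Exif": {40965},        # InteroperabilityOffset
    "1st": {513},           # JPEGInterchangeFormat (thumbnail offset)
}


def diff_ignoring_pointers(before, after):
    """Return {(ifd, tag): (old, new)} for differing tags, skipping pointer tags."""
    changes = {}
    for ifd in set(before) | set(after):
        old_tags = before.get(ifd) or {}
        new_tags = after.get(ifd) or {}
        if not isinstance(old_tags, dict) or not isinstance(new_tags, dict):
            continue  # e.g. a raw "thumbnail" bytes entry; not compared here
        for tag in set(old_tags) | set(new_tags):
            if tag in POINTER_TAGS.get(ifd, set()):
                continue
            if old_tags.get(tag) != new_tags.get(tag):
                changes[(ifd, tag)] = (old_tags.get(tag), new_tags.get(tag))
    return changes


# Reduced versions of the two dumps above: only the offsets differ.
before = {
    "0th": {271: b"samsung", 34665: 202},
    "Exif": {40965: 840, 41989: 26},
    "1st": {513: 1000, 514: 36099},
    "Interop": {1: b"R98"},
}
after = {
    "0th": {271: b"samsung", 34665: 201},
    "Exif": {40965: 830, 41989: 26},
    "1st": {513: 974, 514: 36099},
    "Interop": {1: b"R98"},
}

print(diff_ignoring_pointers(before, after))  # prints {}
```

Note that a tag dropped before `piexif.load` returns will not show up in such a comparison, because it is missing from both dictionaries.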

orangelynx commented 6 years ago

I know, however it indicates some change in the exif data, and I'd like to know exactly what's happening before running it over my entire photo collection.

The endianness change appears to be correct and stems from a difference between endianness of the camera and the PC.

hMatoba commented 6 years ago

piexif.load: reads both big endian and little endian
piexif.dump: writes only big endian
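For context, the byte order of an Exif block is declared in the first two bytes of its TIFF header: `II` means little endian and `MM` means big endian, followed by the magic number 42 and the offset of the first IFD. The snippet below is a small standalone sketch of reading that header with `struct`; the function names are invented for this example and are not piexif functions.

```python
import struct


def tiff_byte_order(tiff_header):
    """Map the 2-byte TIFF byte-order mark to a struct format prefix."""
    marker = tiff_header[:2]
    if marker == b"II":
        return "<"  # little endian ("Intel")
    if marker == b"MM":
        return ">"  # big endian ("Motorola")
    raise ValueError("not a TIFF header")


def first_ifd_offset(tiff_header):
    """Read the offset of the first IFD, honouring the declared byte order."""
    order = tiff_byte_order(tiff_header)
    magic, offset = struct.unpack(order + "HL", tiff_header[2:8])
    if magic != 42:
        raise ValueError("bad TIFF magic number")
    return offset


# The same logical header written in both byte orders.
little = b"II" + struct.pack("<HL", 42, 8)
big = b"MM" + struct.pack(">HL", 42, 8)

print(tiff_byte_order(little), first_ifd_offset(little))
print(tiff_byte_order(big), first_ifd_offset(big))
```

Both headers describe the same structure, which is why a change of byte order alone does not indicate lost data.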

orangelynx commented 6 years ago

I used the exiv2 utility (afaik one of the most accurate implementations) to analyse the input and the output:

My findings:

PS E:> exiv2 -pt insert.jpg >> insert.txt
Warning: Directory Photo has an unexpected next pointer; ignored.
Warning: Directory Iop has an unexpected next pointer; ignored.

exiv2 output files for use in diff-tool: insert_raw.txt sample_raw.txt

orangelynx commented 6 years ago

Ok, starting to get to the bottom of this.

The whole thing boils down to this: Depending on which InteroperabilityIndex is present (frequently that is R98), additional fields exist in the IOP IFD, as specified here:

http://www.exif.org/dcf.PDF

This includes the field InteroperabilityVersion, which defines the version of the R98 standard (currently 1.0), among other fields.

To avoid having to implement all possible additional fields that may be present given different interoperability standards, piexif should retain unknown fields using bare data and not discard them.
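To illustrate what "retain unknown fields using bare data" could look like, here is a minimal sketch of an IFD reader and writer that carries every entry as a raw `(tag, type, count, bytes)` tuple, whether or not the tag is in a known-tag table. This is not how piexif works internally; `read_ifd`, `write_ifd`, `KNOWN_TAGS` and the tag number 0xF000 are all assumptions made up for the example. Only the 12-byte entry layout and the rule that values longer than 4 bytes are stored at an offset come from the TIFF/Exif format.

```python
import struct

# Bytes per element for the TIFF field types used by Exif.
TYPE_SIZES = {1: 1, 2: 1, 3: 2, 4: 4, 5: 8, 7: 1, 9: 4, 10: 8}

# Hypothetical known-tag table: only InteroperabilityIndex is "understood".
KNOWN_TAGS = {1: "InteroperabilityIndex"}


def read_ifd(buf, offset, order=">"):
    """Return [(tag, type, count, raw_bytes)] for every entry, known or not."""
    (n,) = struct.unpack_from(order + "H", buf, offset)
    entries = []
    for i in range(n):
        pos = offset + 2 + 12 * i
        tag, typ, count = struct.unpack_from(order + "HHL", buf, pos)
        size = TYPE_SIZES[typ] * count
        if size <= 4:
            raw = buf[pos + 8:pos + 8 + size]  # value stored inline
        else:
            (value_offset,) = struct.unpack_from(order + "L", buf, pos + 8)
            raw = buf[value_offset:value_offset + size]
        entries.append((tag, typ, count, raw))
    return entries


def write_ifd(entries, base_offset=0, order=">"):
    """Serialise entries; values over 4 bytes are placed after the entry table."""
    table_size = 2 + 12 * len(entries) + 4
    table = struct.pack(order + "H", len(entries))
    overflow = b""
    for tag, typ, count, raw in entries:
        table += struct.pack(order + "HHL", tag, typ, count)
        if len(raw) <= 4:
            table += raw.ljust(4, b"\x00")
        else:
            table += struct.pack(order + "L", base_offset + table_size + len(overflow))
            overflow += raw + b"\x00" * (len(raw) % 2)  # keep offsets even
    table += struct.pack(order + "L", 0)  # no next IFD
    return table + overflow


entries = [
    (1, 2, 4, b"R98\x00"),       # InteroperabilityIndex (known)
    (2, 7, 4, b"0100"),          # InteroperabilityVersion (not in KNOWN_TAGS)
    (0xF000, 7, 6, b"abcdef"),   # made-up tag, purely for illustration
]
data = write_ifd(entries)
round_tripped = read_ifd(data, 0)
unknown = [e[0] for e in round_tripped if e[0] not in KNOWN_TAGS]
print(round_tripped == entries, unknown)
```

The point of the design is that the writer never needs to understand a value to copy it: only the offsets are recomputed, while the payload bytes pass through untouched.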

hMatoba commented 6 years ago

Keeping unknown fields is costly to implement, and it breaks consistency. I won't do that.

orangelynx commented 6 years ago

Your choice, but I beg to differ. Ensuring that exif data can be modified even if it contains optional or even non-standard fields should be your highest priority. Not only is this the way it is recommended in Appendix E3 in the official Exif standard, but this is one of the basic principles of file modification.

I'm not sure whether you are referring to UNKNOWN type fields or just mean fields that are not documented in the main standard PDF. Either way, piexif should be able to handle this.

hMatoba commented 6 years ago

Unknown tags can be added to _exif.py. If a tag is described in a reliable document but is missing from _exif.py, I'll add it.

hMatoba commented 6 years ago

If a tag is not documented, piexif avoids handling it.

orangelynx commented 6 years ago

piexif doesn't have to support modifying unknown or undocumented fields; however, it should at least remember that they exist and ensure that they are written back to the file unchanged in some way. You don't have to expose them through the piexif API.