Exiv2 / exiv2

Image metadata library and tools
http://www.exiv2.org/

libexiv2 saves a part file when attempting to export data that conforms with the multi-segment xmp standard. #908

Closed 8i8 closed 5 years ago

8i8 commented 5 years ago

Describe the bug: When exporting a JPEG image from a raw file in darktable, the program calls obj->writeMetadata(). A part file is saved in the directory along with the program's output, and the following error is generated:

caught exiv2 exception 'Size of XMP JPEG segment is larger than 65535 bytes'

To reproduce: Create an image in darktable that uses many instances of the spot removal tool, generating many masks, then export that file to JPEG.

Expected behavior: A single JPEG file should be saved.

Additional context: The originating raw file is a Canon TIF from a Canon 1Ds.

clanmills commented 5 years ago

@8i8 Is this in the darktable code? There's no such code in Exiv2:

731 rmills@rmillsmbp:~/gnu/github/exiv2/0.27-maintenance $ grep 'obj.*write' src/*pp
src/datasets.cpp:                N_("Contains name of the creator of the object data, e.g. writer, photographer "
732 rmills@rmillsmbp:~/gnu/github/exiv2/0.27-maintenance $ 

There is a problem with JPEG when the Exif metadata size > 64k. You are prone to this when converting TIFF (which supports > 64k of Exif metadata) to JPEG. According to the Exif specification, all Exif data must reside in a single "chunk". A "chunk" cannot exceed 64k because JPEG is a 1992 file format that stores the length of each chunk in a 16-bit field. Adobe have proposed an extension to JPEG (and implemented it in their products). https://dev.exiv2.org/issues/1232
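To make the 64k figure concrete, here is a minimal sketch of the arithmetic. It assumes the usual JPEG layout in which the two-byte length field counts itself, and the six-byte "Exif\0\0" identifier that opens an Exif APP1 segment. The names `maxPayload` and `fitsInOneSegment` are made up for this illustration and are not part of Exiv2.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Largest value a 16-bit segment length field can hold.
constexpr std::size_t kMaxSegmentLength = std::numeric_limits<std::uint16_t>::max();

// The length field counts its own two bytes, so they come off the budget.
constexpr std::size_t kLengthFieldSize = 2;

// An Exif APP1 segment begins with the identifier "Exif\0\0" (6 bytes).
constexpr std::size_t kExifIdentifierSize = 6;

// Bytes left for the metadata itself once the length field and the
// identifier have been accounted for. (Hypothetical helper.)
constexpr std::size_t maxPayload(std::size_t identifierSize) {
    return kMaxSegmentLength - kLengthFieldSize - identifierSize;
}

// True when dataSize bytes of metadata fit in a single segment.
constexpr bool fitsInOneSegment(std::size_t dataSize, std::size_t identifierSize) {
    return dataSize <= maxPayload(identifierSize);
}

static_assert(kMaxSegmentLength == 65535, "a 16-bit length tops out at 65535");

// Small usage example: metadata that would be fine inside a TIFF
// can be too large for one JPEG segment.
inline void exampleUsage() {
    assert(fitsInOneSegment(1024, kExifIdentifierSize));
    assert(!fitsInOneSegment(70000, kExifIdentifierSize));
}
```

Under these assumptions the usable space for Exif data in one segment comes out slightly below 65535 bytes, which is why metadata carried over from a TIFF can overflow it.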

Phil Harvey (of ExifTool) has supplied me with a test file that uses Adobe's convention. Support for this is on the TODO list for Exiv2, however it's low in our priorities because Phil said "I have never actually encountered one of those files 'in the wild'".

Applications have to delete metadata to ensure it fits within 64k. This can be achieved within a loop that catches the thrown exception and has been documented in another bug report.
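A rough sketch of such a loop follows. It deliberately does not use the Exiv2 API: `FakeImage`, `SegmentTooLarge` and `writeWithTrimming` are hypothetical stand-ins invented for this example, and dropping the largest entry first is only one possible policy. A real application would choose which metadata it can afford to lose.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the exception thrown by the library.
struct SegmentTooLarge : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical stand-in for an image object; this is NOT the Exiv2 API.
struct FakeImage {
    // Each entry is (key, encoded size in bytes).
    std::vector<std::pair<std::string, std::size_t>> entries;
    std::size_t limit = 65535;
    bool written = false;

    std::size_t totalSize() const {
        std::size_t total = 0;
        for (const auto& e : entries) total += e.second;
        return total;
    }

    // Throws when the metadata cannot fit, as the real library does.
    void writeMetadata() {
        if (totalSize() > limit) {
            throw SegmentTooLarge("metadata does not fit in one segment");
        }
        written = true;
    }
};

// Retry the write, removing the largest entry after each failure.
// Returns the keys that had to be dropped.
std::vector<std::string> writeWithTrimming(FakeImage& image) {
    std::vector<std::string> dropped;
    for (;;) {
        try {
            image.writeMetadata();
            return dropped;
        } catch (const SegmentTooLarge&) {
            if (image.entries.empty()) throw;  // defensive: nothing left to remove
            auto largest = std::max_element(
                image.entries.begin(), image.entries.end(),
                [](const auto& a, const auto& b) { return a.second < b.second; });
            dropped.push_back(largest->first);
            image.entries.erase(largest);
        }
    }
}
```

The returned list lets the caller tell the user what was removed.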

I'd like you to discuss this with the darktable engineers. I would like to close this as "working correctly, not a bug" because, according to the specification, Exif metadata in JPEG must not exceed 64k.

8i8 commented 5 years ago

Hello Robin, thank you for your concise reply. It looks as though this is the instance of writeMetadata() that is being called by an image object in darktable: https://github.com/Exiv2/exiv2/blob/620e0a98cb970cc5ff0d76523baf4649518ae42d/src/jpgimage.cpp#L896 I am investigating ways to stop this in darktable; it would not have been noticed as an issue if the file had not been partially written. I see that there is a large section, section 5, in WORK-IN-PROGRESS and that writeMetadata() gets a mention there; it would seem that the TIFF format is not entirely clear. I would love to take a look at some C++ code but am not at all familiar with it yet.

clanmills commented 5 years ago

Where did you find WORK-IN-PROGRESS? It's not a secret file. We don't have secrets in Open Source. However, its purpose is to share knowledge between team members about active projects. About 40% of Exiv2 was written by Andreas, the project founder. In Exiv2 v0.27, we had a major exploration of the TiffVisitor code which isn't easy to understand. I now understand that code, so WORK-IN-PROGRESS was deleted and isn't in the release.

Exif data > 64k is a violation of the Exif/JPEG spec. It's been on the "wish list" for a while to implement Adobe's work-around architecture.

Exiv2 is doing the correct thing by throwing an exception, and darktable must catch the exception and clean up the file system.
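One way an application could do that clean-up is sketched below, under stated assumptions. The writer callback stands in for whatever code produces the file, the ".part" suffix is an arbitrary choice for this example, and `exportSafely` and `fileExists` are hypothetical names. It does not model how Exiv2 or darktable actually name or create their temporary files.

```cpp
#include <cassert>
#include <cstdio>
#include <exception>
#include <fstream>
#include <functional>
#include <stdexcept>
#include <string>

// Runs writer against a temporary path. Only a completely written file is
// renamed to finalPath; if writer throws, the partial file is removed.
// Returns true on success, false if the export failed.
bool exportSafely(const std::string& finalPath,
                  const std::function<void(const std::string&)>& writer) {
    const std::string partPath = finalPath + ".part";  // hypothetical naming
    try {
        writer(partPath);
    } catch (const std::exception&) {
        std::remove(partPath.c_str());  // discard whatever was written so far
        return false;
    }
    return std::rename(partPath.c_str(), finalPath.c_str()) == 0;
}

// Helper for checking the result: true if the file can be opened for reading.
bool fileExists(const std::string& path) {
    return std::ifstream(path).good();
}
```

With this shape, a failed export leaves neither the final file nor a stray part file behind, and the caller can report the failure to the user.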

I am willing to help you to develop your C++ skills. TiffVisitor is tough/ingenious code which you'll find very challenging. If you'd like to join the team, perhaps we could work together on the issue to fix Exif data > 64k.

I'm going to close this issue as I can't see any reason to keep it open. If you and/or the darktable engineers wish to discuss this, I'll reopen this for discussion for v0.27.3. I'm very focused today on releasing Exiv2 v0.27.2-RC1.

8i8 commented 5 years ago

Thank you very much for your kind offer. I am thinking about this very seriously, as I would also like to understand the code and the TIFF format, to my advantage. It would be a real pleasure to help if I am able to. The WORK-IN-PROGRESS file popped up after I cloned the repo and grepped for writeMetadata(), looking for the API entry point that darktable is calling.

clanmills commented 5 years ago

From your name, I think you're a countryman of mine. https://clanmills.com Let's talk on Skype next week. 'clanmills'. Email to agree a time: robin@clanmills.com

We're going to Normandy tomorrow (Friday) to run in the D-Day Half Marathon on Sunday. We get home on Tuesday evening. Exiv2 v0.27.2-RC1 is scheduled for 2019-06-15, however I want it "out the door" this afternoon.