darktable-org / darktable

darktable is an open source photography workflow application and raw developer
https://www.darktable.org
GNU General Public License v3.0

darktable produces strange side files *.jpg13668 at export #3542

Closed BigSerpent closed 3 years ago

BigSerpent commented 4 years ago

Darktable sometimes produces (or leaves) strange small files in the export directory when exporting jpegs. DSC04003.zip

jade-nl commented 4 years ago

Came here to report this issue and found it was already reported. Here are my details:

This is the second time this happened to me and, looking at the exported jpgs, I don't see any issues with the picture itself. The embedded xmp data is missing, however. The ICC and EXIF data is still embedded. Both times a second file was created, for example: slw_5185.jpg8733

This is what I see in the terminal when this happens:

[xmp_attach] /home/jade/Downloads/slw_5185.jpg: caught exiv2 exception 'Size of XMP JPEG segment is larger than 65535 bytes'
[export_job] exported to `/home/jade/Downloads/slw_5185.jpg'
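For context, the 65535-byte figure in that exception comes from the JPEG container itself: XMP is embedded in an APP1 segment whose length is a 2-byte big-endian field. The sketch below is a rough Python illustration of that byte budget, based on the JPEG/XMP specs rather than darktable's or exiv2's actual code, so the exact accounting is an assumption:

```python
# JPEG stores XMP in an APP1 segment whose length is a 2-byte big-endian
# field, so everything in the segment must fit in 65535 bytes.
XMP_SIGNATURE = b"http://ns.adobe.com/xap/1.0/\x00"  # 29-byte namespace header
MAX_SEGMENT = 0xFFFF  # value range of the 2-byte length field

def xmp_fits_in_app1(xmp_packet: bytes) -> bool:
    # the length field counts itself (2 bytes) plus the signature and packet
    return 2 + len(XMP_SIGNATURE) + len(xmp_packet) <= MAX_SEGMENT

print(xmp_fits_in_app1(b"x" * 23000))  # a ~23k sidecar like the 2.6 one fits
print(xmp_fits_in_app1(b"x" * 70000))  # a ~70k sidecar like the 3.0rc1 one does not
```

This is why sidecar files on disk can grow without problems while the same packet fails when attached to a JPEG.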

The number added to the second created file isn't unique; I immediately (re)saved my other edit that has this issue and it shows up as (same dt session, no restart): slw_3999.jpg8733

The first time this happened might have been a shot that was also first edited with 3.0rc0, the second one was a freshly imported, full 3.0rc1 edit.

I did multiple edits on raw files from the same film-roll and these are the only two that ended up having this issue.

The release I'm using is built with CMAKE_BUILD_TYPE="RelWithDebInfo" but no extra info was generated.

dt: 3.0.0rc1+84~g801c58d80 (and a previous rc1 version for the first time)
OS: Linux
Distro: Debian 10.2

A copy of both the extra created and the xmp files: xmp.and.extra.files.tar.gz

If any extra info is wanted, let me know.

jenshannoschwalm commented 4 years ago

Just a guess, it might be your original raw files have very large exif data that could spoil the jpeg writing party. Could you pass the original raw plus xmp file please?

junkyardsparkle commented 4 years ago

Earlier discussion about this here: https://dev.exiv2.org/boards/3/topics/1631

junkyardsparkle commented 4 years ago

If the large data is from masking in darktable, a workaround would be to disable the exporting of the darktable history in the metadata settings of the export module. There's also an option in settings "store XMP tags in compressed format"... if it's been set to "never", then changing the setting might help.

jade-nl commented 4 years ago

Just a guess, it might be your original raw files have very large exif data that could spoil the jpeg writing party. Could you pass the original raw plus xmp file please?

Here is one of the images and the accompanying xmp: slw_5185.tar.gz

jade-nl commented 4 years ago

If the large data is from masking in darktable, a workaround would be to disable the exporting of the darktable history in the metadata settings of the export module. There's also an option in settings "store XMP tags in compressed format"... if it's been set to "never", then changing the setting might help.

I have the Store XMP tags... set to never and if I change it to only large entries no extra files are created but the message in the terminal is still there:

[xmp_attach] /home/jade/Downloads/slw_5185.jpg: caught exiv2 exception 'Size of XMP JPEG segment is larger than 65535 bytes'

Setting the Store XMP tags... to anything other than never seems to be a workaround for a bigger, non-darktable-related problem (if I interpret the exiv2 link you provided correctly).

However: I am not able to reproduce this in dt 2.6.3. I can't use the xmp created by dt v3rc1, so I just made 10 or so "random" edits using (multiple instances of) tone curve, highpass, equaliser and exposure, all using drawn and/or parametric masks. No extra file is created (the Store XMP tags... is set to never, I did check that).

EDIT: Removed a stray word.

jenshannoschwalm commented 4 years ago

We could have large exif tags from the raw file that are not stripped off before writing (special makernotes ...). So we have to check the raw.

junkyardsparkle commented 4 years ago

I have the Store XMP tags... set to never and if I change it to only large entries no extra files are created but the message in the terminal is still there:

That's odd. Maybe @upegelow can comment on how that's supposed to work.

However: I am not able to reproduce this in dt 2.6.3. I can't use the xmp created by dt v3rc1, so I just made 10 or so "random" edits using (multiple instances of) tone curve, highpass, equaliser and exposure, all using drawn and/or parametric masks. No extra file is created (the Store XMP tags... is set to never, I did check that).

Specifically, drawn masks with many nodes will create the large chunks of data. It can be interesting to look at those XMP files in a text editor, if you haven't yet. :-)

jade-nl commented 4 years ago

Specifically, drawn masks with many nodes will create the large chunks of data. It can be interesting to look at those XMP files in a text editor, if you haven't yet. :-)

I just did a more conscientious edit of slw_5185 using 2.6.3 and I cannot get it to trip the XMP 65535 limit. 20 steps (not counting the "hidden" default ones), 18 of those are drawn (very precise=lots of nodes) and parametric (each with different settings from the previous instance). The size of the XMP file I end up with is just a tad above 23000.

The v3.0RC1 edit is 19 steps (including the 7 default ones) and only 3 of those steps use elaborate drawn+parametric masking. That XMP is a tad over 70000.

I did look inside the XMPs but I lack the knowledge to determine whether the entries are (too) big or normal in size, or if anything else is wrong with them. For those that do have that knowledge: the 2 xmps that I have this issue with are included in the xmp.and.extra.files.tar.gz I posted in my first reply.

junkyardsparkle commented 4 years ago

I just did a more conscientious edit of slw_5185 using 2.6.3 and I cannot get it to trip the XMP 65535 limit. 20 steps (not counting the "hidden" default ones), 18 of those are drawn (very precise=lots of nodes) and parametric (each with different settings from the previous instance). The size of the XMP file I end up with is just a tad above 23000.

Interesting. If you import that image/sidecar into 3.0 and then export the resulting XMP, is that enough to "inflate" it? If not, what if you bump a node in each shape after importing? I wonder if it's related to the new "raster masks", but I don't know much about that... maybe @aurelienpierre does.

junkyardsparkle commented 4 years ago

Ok, I just did a quick test myself: created a mask in 2.6.3 consisting of one very long maze-like brush stroke, exported the XMP, then imported into 3.0 and exported XMP again. The sidecar from 3.0 was double the size, and looking inside shows that the mask is stored twice. So that explains the difference in versions. I don't know if that's a bug by itself, but it obviously makes it much easier to trip the size limit as described in this ticket.
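A quick way to check a sidecar for this kind of duplication is to tally the mask ids that appear in it. The sketch below assumes the `darktable:mask_id="..."` attribute form seen in 3.0-era sidecars; the attribute name is taken from a sample file, not from a specification, so treat it as an assumption:

```python
import re
from collections import Counter

def count_mask_ids(xmp_text: str) -> Counter:
    # tally how many entries in the sidecar reference each mask_id
    return Counter(re.findall(r'darktable:mask_id="(\d+)"', xmp_text))

# hypothetical sidecar fragment: the same mask owned by a module and
# by the mask manager, as described in this thread
sample = '''
<rdf:li darktable:mask_num="0" darktable:mask_id="17"/>
<rdf:li darktable:mask_num="1" darktable:mask_id="17"/>
'''
print(count_mask_ids(sample))  # mask 17 counted twice, once per owner
```

Any id with a count above 1 would indicate the double storage described above.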

jenshannoschwalm commented 4 years ago

I would need further advice for proper testing.

I took slw_5185.tar.gz, used that on current master and also checked the exif data in the original - nothing unusual that would need to be removed. Also tested with the xmp from xmp.and.extra.files.tar.gz.

While exporting, the pretty large icc profile (~20k) is included, plus the history stuff with masks, but that's also ok.

What exactly am I supposed to do to reproduce the problem with missing xmp data or corrupted files? (Global export options and edit metadata exportation)

junkyardsparkle commented 4 years ago

@jenshannoschwalm I think this issue reduces to the fact that the masks are now stored twice in the XMP files. This becomes significant for large masking data vs. the size limit.

jade-nl commented 4 years ago

What exactly am I supposed to do to reproduce the problem with missing xmp data or corrupted files? (Global export options and edit metadata exportation)

I just tried using the latest version:

$ .local/darktable/bin/darktable --version
this is darktable 3.0.0rc2+6~gc4c43ae61

I downloaded and extracted the xmp.and.extra.files.tar.gz and slw_5185.tar.gz I uploaded in previous replies.

Start darktable, import slw_5185.nef, go from darkroom view to lighttable view, export slw_5185 to disk. That is all you need to do.

Take a look at the terminal you started dt from, the error will be there. There will also be an extra file at the location you exported to. And there is no xmp data in the exported jpg.

Mind you, none of these files are corrupt. "Not complete" might be the correct phrasing for the jpg file.
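For anyone trying to reproduce this, the stray files are easy to spot with a small directory scan. The `.jpgNNNN` filename pattern below is inferred from the names quoted in this thread (slw_5185.jpg8733 etc.), not from darktable's code:

```python
import re
import tempfile
from pathlib import Path

SIDE_FILE = re.compile(r"\.jpg\d+$")  # e.g. slw_5185.jpg8733

def find_side_files(export_dir: str) -> list[str]:
    # list leftover *.jpgNNNN files sitting next to the real exports
    return sorted(p.name for p in Path(export_dir).iterdir()
                  if SIDE_FILE.search(p.name))

# demo against a throwaway directory with hypothetical file names
with tempfile.TemporaryDirectory() as d:
    for name in ("slw_5185.jpg", "slw_5185.jpg8733", "slw_3999.jpg8733"):
        Path(d, name).touch()
    print(find_side_files(d))  # -> ['slw_3999.jpg8733', 'slw_5185.jpg8733']
```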

jade-nl commented 4 years ago

@junkyardsparkle I haven't been able to confirm the duplicate storage you are seeing using your method.

The xmp file is almost twice as big and a lot of extra data is present in the v3.0 version, but I only see the 'big block' (designated as brush #1 in my xmp) once.

junkyardsparkle commented 4 years ago

Well for whatever it's worth, here's the XMP exported from current master in my test described above; look at lines 25 and 52. That's as much as I'm going to dig into this can of worms right now, sorry if it's not worth much. :-)

f7237773.orf.xmp.3.0.txt

jenshannoschwalm commented 4 years ago

Well, your xmp shows mask_id twice - but for different mask_num (the mask manager owns it too). Also, the size of the xmp file itself is not restricted. So those are both not "bugs" or real issues.

@jade-nl I tried what you described and cannot reproduce. That's why I asked for the export options or exact parameters used while exporting. This is required to understand why we have an extra file...

jade-nl commented 4 years ago

@jenshannoschwalm

I tried what you described and cannot reproduce. That's why I asked for the export options or exact parameters used while exporting. This is required to understand why we have an extra file...

Both of these produce the described problem:

export selected

scenario 1
file format -> JPEG (bit)
quality -> 98
global -> 0x0 + no + yes + sRGB (web-safe) + image settings + none + replace history

scenario 2
file format -> JPEG (bit)
quality -> 93
global -> 2560x0 + no + yes + sRGB (web-safe) + image settings + none + replace history

The metadata exportation options are the same for both scenarios:
exif data: checked
metadata: checked
geo tags: checked
tags: checked
private tags: unchecked
synonyms: unchecked
omit hierarchical tags: unchecked
develop history: checked

I also have changed entries in the darktable preferences, here are the relevant ones and their values:
[screenshot: GUI preferences]
[screenshot: CORE preferences]

I did change other options but I don't think those are relevant, but just in case here are both darktablerc files. darktablerc.tar.gz

junkyardsparkle commented 4 years ago

Might also mention your exiv2 version? FWIW, I have no trouble reproducing the error message and the lack of XMP data in the JPEG (this is the actual issue, right?), but still haven't seen any of those *jpgnnnnn files (these are just a "lucky" symptom that kept the issue from going unnoticed in your case, right?).

I haven't been able to confirm the duplicate storage you are seeing using your method.

Indeed, I was too quick to assume that was also what was happening in your case. Instead, it looks like another thing that can contribute to hitting the size limit. If there's still no way around the size limit, then the only "fix" would be to try not to get there. Maybe exporting history in JPEGs should be off by default, now that it's optional? I think the compression is already enabled by default ("only large entries").

jade-nl commented 4 years ago

Might also mention your exiv2 version?

$ exiv2 --version
exiv2 0.25 001900 (64 bit build)

If there's still no way around the size limit, then the only "fix" would be to try not to get there.

There is a workaround that "fixes" the problem, as mentioned in a previous reply: do not set the Store XMP tags... option to never. Both of the other two options can be used without the problem showing up.

I do believe that the default setting for this option is "always".

To be honest I don't think we need to spend much more time on this, at least not with the current rc's, the buildup to 3.0 and all the "critical" issues that accompany it. Maybe come back to it after things settle down and we have an official release, assuming it is still an issue at that time.

PS: @BigSerpent could you, as initiator of this issue, also post the info that was asked for in the followups?

junkyardsparkle commented 4 years ago

exiv2 0.25 001900 (64 bit build)

That's fairly old, so as a wild guess, it could be the reason you see the extra files but others don't.

To be honest I don't think we need to spend much more time on this, at least not with the current rc's, the buildup to 3.0 and all the "critical" issues that accompany it.

What makes it slightly alarming to me is that it's now far more likely that a user could reach this critical threshold for XMP data size via editing history, and the result is, in my tests at least, to abort the addition of the entire XMP metadata, including keywords, etc. This doesn't happen "silently" if you happen to be watching the console for errors, but for all practical purposes there's no warning to users when this happens (except the lucky ones who get those weird binary files).

jade-nl commented 4 years ago
exiv2 0.25 001900 (64 bit build)

That's fairly old, so as a wild guess, it could be the reason you see the extra files but others don't.

It is part of the latest Debian (Buster, 10.2), so I don't think this is the problem. I'm also assuming that, if exiv2 was the problem, it would have shown up when using v2.6.X (which runs on the exact same machine as my v3.0RCX).

I fully agree with you that the 3.0 version creates bigger xmps and is thus more prone to hitting this 2^16 size limit.

junkyardsparkle commented 4 years ago

Well, exiv2 0.25 was released 4.5 years ago... each version seems to have its own quirks, so I don't know what the maintainers are weighing in that regard... but it could be the reason for seeing the extra files or not. (I only moved to the 0.27 branch fairly recently myself, because I got sick of the extra garbage in the output for *.orf files.)

I'm also assuming that, if exiv2 was the problem it would have shown up when using v2.6.X (which runs on the exact same machine as my v3.0RCX).

But did you ever manage to hit the critical size (and see the console message) with 2.6? I didn't, even with the amount of "bloated" masking I did for my test.

BigSerpent commented 4 years ago

The behavior of git master is strange now. Darktable uses my base curve and input color matrix (applied automatically at import). If I import an image and preview it in lighttable mode (W), the xmp size is 1.1k (DSC04716_03.ARW.xmp). The icon shows the history stack exists and is correct. If I open the same image in the darkroom mode and close it without modifications, the size of the xmp file jumps to 14k (DSC04716_02.ARW.xmp). With some modifications the size grows quickly, as can be seen in DSC04716.ARW.xmp and DSC04716_01.ARW.xmp respectively. Here is a link to the file https://yadi.sk/d/aBGzFgwe-SGBUw

I used to have the problem with side files and broken exif in 2.x too, but it was a rare case.

jade-nl commented 4 years ago

---EDIT--- The info below is incorrect. Have a look here for the correct info.

@junkyardsparkle

To make sure that my "old" exiv2 isn't the culprit I cloned exiv2, built it and installed a local copy. And I rebuilt and installed dt using that copy.

$  .local/exiv2/bin/exiv2 --version
exiv2 0.27.99.0
$ .local/darktable/bin/darktable --version
this is darktable 3.0.0rc2+14~g35d90efcd

I still end up with an extra file, an xmp-less jpg and a warning in the terminal. ---END EDIT---

But did you ever manage to hit the critical size (and see the console message) with 2.6? I didn't, even with the amount of "bloated" masking I did for my test.

Besides the earlier mentioned attempt I haven't tried again. The result of 20 steps was a meager 23k xmp file. I'm pretty sure that if you do enough (bogus?) edits you will eventually reach a t(r)ipping point.

junkyardsparkle commented 4 years ago

I'm pretty sure that if you do enough (bogus?) edits you will eventually reach a t(r)ipping point.

Yep, if darktable doesn't crash first. ;-)

Thanks for testing my theory! If the exiv2 version doesn't explain why some of us aren't seeing the extra files along with the failed XMP, then I don't know. Looking at yours, they appear to be the JFIF headers for the JPEG file that gets aborted when the XMP doesn't fit (and not being cleaned up for some reason). Maybe I'll try to catch one again later, but they aren't really the issue anyway.

jenshannoschwalm commented 4 years ago

It's not that difficult ...

  1. The xmp files can be as long as we "want" them to be. The masks and such are exported for the owning module itself and for the mask manager. Seems to be necessary for shared masks atm.

  2. Export of exif data to a jpeg is restricted in length. If the blob "to be written" is found to be incorrect in a way (in your case too large) the exiv2 library call generates the observed error message and stops writing the blob. The bogus file is an exiv2 problem.

  3. When importing a fresh image into dt and you apply a basecurve that data is stored in the sidecar, so that grows immediately.
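The failure mode in point 2 can be sketched in a few lines: the oversized blob raises inside the library, the caller logs the exception and carries on, so the exported JPEG simply ships without any XMP. This is a hypothetical Python rendering of that control flow, not darktable's actual C code, and the 31-byte overhead figure is an assumption based on the APP1 header size:

```python
MAX_SEGMENT = 0xFFFF  # 2-byte APP1 length field

def attach_xmp(jpeg_path: str, xmp_packet: bytes) -> bool:
    # mimic the observed behaviour: too-large packet -> log and skip
    try:
        if len(xmp_packet) + 31 > MAX_SEGMENT:  # length field + XMP signature
            raise ValueError(
                "Size of XMP JPEG segment is larger than 65535 bytes")
        # ... a real implementation would write the APP1 segment here ...
        return True
    except ValueError as exc:
        print(f"[xmp_attach] {jpeg_path}: caught exiv2 exception '{exc}'")
        return False  # export continues; the JPEG ships without XMP
```

The key point is the return path: nothing aborts the export, which is why the only visible symptoms are the console line and (with some exiv2 versions) the stray side file.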

junkyardsparkle commented 4 years ago

Export of exif data to a jpeg is restricted in length. If the blob "to be written" is found to be incorrect in a way (in your case too large) the exiv2 library call generates the observed error message and stops writing the blob.

Right... and it would be nice to make sure the user is alerted to this failure, now that it's more likely to happen under real-world use cases.

jade-nl commented 4 years ago

I have to retract my previous statement: the extra file is not created with the newest exiv2 version (the missing xmp data and the terminal message are still the same).

I'm not sure why the extra file was created the first time around. The only difference between this current build of 0.27.99.0 and the previous one: I installed and enabled a few optional dependencies (no complaints about those in the first build). This isn't because of the newer dt version either; I immediately built dt with the older exiv2 version and that one does create the extra file.

So, yes, this is an exiv2 issue (as previously stated by both junkyardsparkle and jenshannoschwalm).

.... and it would be nice to make sure the user is alerted to this failure, now that it's more likely to happen under real-world use cases.

I fully agree. I see roughly a doubling in size of the xmp files if I compare 2.6 and 3.0. At a certain point even compressing from within dt will reach its limits and the exiv2 limit behaviour will be triggered regardless. Users need to be alerted to that.
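To illustrate why the compressed-XMP setting buys so much headroom: history and mask parameter blobs are highly repetitive, so they shrink dramatically under zlib. The sketch below is illustrative only; the "gz" prefix and the exact encodings are assumptions for the demo, not darktable's actual on-disk format:

```python
import base64
import zlib

def encode_params(raw: bytes, compress: bool) -> str:
    # illustrative: plain hex vs. zlib + base64, roughly the trade-off
    # behind the "store XMP tags in compressed format" option
    if compress:
        return "gz" + base64.b64encode(zlib.compress(raw)).decode("ascii")
    return raw.hex()

blob = bytes(range(256)) * 40  # a ~10k repetitive parameter blob
plain = encode_params(blob, compress=False)
packed = encode_params(blob, compress=True)
print(len(plain), len(packed))  # the compressed form is far smaller
```

Compression only delays the problem, though: a sufficiently complex history will exceed the 65535-byte JPEG segment even after compression, which is why a user-visible warning matters.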

github-actions[bot] commented 4 years ago

This issue did not get any activity in the past 30 days and will be automatically closed in 7 days if no update occurs. Please check if the master branch has fixed it since then.

rth-dev commented 4 years ago

the user got the 'side file' issue ; ) so I git cloned the exiv2 sources and installed them as described in the readme.md. In addition I had to rebuild darktable so that libexiv2.so.27 is used. The side file doesn't appear anymore, but darktable tags are not written into the exported jpg. xmp compression didn't help here.

might it be possible to exclude e.g. the larger history, just so as not to lose everything?

Nilvus commented 3 years ago

Issue fixed, so I'm closing. For the last questions: tags can be added by checking them in the export module preferences, and the history can be compressed (if I understood the question correctly). But maybe this is no longer needed, as the issue is old.