darktable-org / darktable

darktable is an open source photography workflow application and raw developer
https://www.darktable.org
GNU General Public License v3.0

The option to write an xmp sidecar file "after edit" has unwanted side effects that interfere with file backup and transfer #16676

Open bertwim opened 2 months ago

bertwim commented 2 months ago

Describe the bug

When writing xmp sidecars is set to "after edit", DT writes a new xmp file after an image has been edited in Darkroom. From the code (in image.c) I understand that DT determines from a hash whether an image has changed. This seems to work ok.
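A minimal sketch of that kind of check, assuming a hash of the serialized edit state is compared with the hash recorded at the last sidecar write (illustrative only; the field and function names are hypothetical and this is not darktable's actual image.c code):

```c
/* Illustrative sketch of hash-based change detection. NOT darktable's
 * actual image.c code; the struct fields and names are hypothetical.
 * Idea: hash the edit state, compare with the hash recorded at the last
 * sidecar write, and rewrite only on mismatch. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint64_t fnv1a(const void *data, size_t len)
{
  const uint8_t *p = data;
  uint64_t h = 0xcbf29ce484222325ULL;
  for(size_t i = 0; i < len; i++)
  {
    h ^= p[i];
    h *= 0x100000001b3ULL;
  }
  return h;
}

typedef struct image_state_t
{
  char history[256];     /* serialized edit history (placeholder)  */
  uint64_t written_hash; /* hash at the time of the last XMP write */
} image_state_t;

/* Returns true when a sidecar write would be triggered. */
static bool xmp_needs_write(image_state_t *img)
{
  const uint64_t current = fnv1a(img->history, strlen(img->history));
  if(current == img->written_hash) return false;
  img->written_hash = current; /* remember the state we just wrote */
  return true;
}

int main(void)
{
  image_state_t img = { .history = "exposure=+0.5;crop=1", .written_hash = 0 };
  printf("first check : %d\n", xmp_needs_write(&img)); /* 1: hash differs */
  printf("second check: %d\n", xmp_needs_write(&img)); /* 0: nothing new  */
  return 0;
}
```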

The issue occurs after DT has been closed and reopened. In that case, DT might also unnecessarily rewrite other xmp sidecar files. I say 'might', because the precise behaviour seems a bit erratic. The issue is with the timestamps.

When the 'other files' are rewritten, they have new timestamps, thus appearing to be new files whereas they are only rewrites of their originals. This loses valuable information and pollutes any (incremental) backup that is made.
Also, as I have to work from different workstations that are kept in sync with rsync, there is constant confusion about which files have really changed and which have not.

The problem is at its worst when the DT database is cleared first (and hence automatically rebuilt from the xmp files): in that case all xmp files are rewritten, all with the same timestamp! DT should never change the metadata in this case.
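For illustration only (not darktable's code and not my local fix): one way to keep the timestamps meaningful would be to compare the sidecar content that is about to be written with the file already on disk and skip the write when the bytes are identical, so the mtime that rsync and incremental backups key on stays untouched.

```c
/* Illustrative sketch, not darktable code: write a sidecar only when its
 * content actually differs from what is already on disk, so a content-
 * identical rewrite never touches the mtime. */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static bool file_matches(const char *path, const char *data, size_t len)
{
  FILE *f = fopen(path, "rb");
  if(!f) return false;                  /* no existing file: must write */
  fseek(f, 0, SEEK_END);
  const long size = ftell(f);
  bool same = (size >= 0 && (size_t)size == len);
  if(same)
  {
    rewind(f);
    char *buf = malloc(len);
    same = buf && fread(buf, 1, len, f) == len && memcmp(buf, data, len) == 0;
    free(buf);
  }
  fclose(f);
  return same;
}

/* Returns true only if the file was actually (re)written. */
static bool write_if_changed(const char *path, const char *data, size_t len)
{
  if(file_matches(path, data, len)) return false; /* mtime stays as-is */
  FILE *f = fopen(path, "wb");
  if(!f) return false;
  fwrite(data, 1, len, f);
  fclose(f);
  return true;
}

int main(void)
{
  const char *xmp = "<x:xmpmeta>...</x:xmpmeta>\n"; /* placeholder content */
  printf("wrote: %d\n", write_if_changed("IMG_0001.CR2.xmp", xmp, strlen(xmp))); /* 1 */
  printf("wrote: %d\n", write_if_changed("IMG_0001.CR2.xmp", xmp, strlen(xmp))); /* 0 */
  return 0;
}
```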

The desired behaviour should be, in my opinion:

Regards, Bertwim

Steps to reproduce

  1. Go to any directory that has images and open DT: cd ......; darktable . &
  2. Do an 'ls -l *.xmp' to list all xmp files. Check the timestamps.
  3. Edit one or more images and list the xmp files again
  4. Close DT. Then reopen DT again (darktable . &)
  5. Carefully compare the timestamps of the xmp files.
  6. Close DT
  7. Remove the database (e.g. ~/.config/darktable/library*db) and cache (e.g. ~/.cache/darktable)
  8. Reopen DT and again compare timestamps of the xmp files.

Expected behavior

Logfile | Screenshot | Screencast

No response

Commit

No response

Where did you obtain darktable from?

GitHub nightly

darktable version

darktable 4.7.0+1000~gd05ae79e9b

What OS are you using?

Linux

What is the version of your OS?

OpenSUSE 15.5

Describe your system?

No response

Are you using OpenCL GPU in darktable?

Yes

If yes, what is the GPU card and driver?

Irrelevant for this problem.

Please provide additional context if applicable. You can attach files too, but might need to rename to .txt or .zip

The problem has been in DT for many years, I think. Personally, I have made a local fix that completely avoids the issue.

ralfbrown commented 2 months ago

dt also updates sidecars when image metadata changes (e.g. tags, stars, color labels, title, etc.) so that you don't lose that information if the database is lost.

In the particular case of importing into a fresh library, the "import time" metadata changes, which is triggering the rewrite.
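A minimal sketch of why a fresh import triggers the rewrite under such a scheme (hypothetical field names, not darktable's actual code): if the import timestamp is part of the state that gets compared, a reimport changes that state even when the edit history is identical.

```c
/* Illustrative only (hypothetical fields, not darktable's code): when the
 * import timestamp is part of the compared state, a fresh import -- which
 * assigns a new import time -- makes the stored sidecar look stale even
 * though the edit history is unchanged. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

typedef struct
{
  char history[128]; /* edit history, unchanged by a reimport         */
  time_t import_ts;  /* set at import, changes on a fresh import      */
} xmp_state_t;

static bool state_equal(const xmp_state_t *a, const xmp_state_t *b)
{
  return strcmp(a->history, b->history) == 0 && a->import_ts == b->import_ts;
}

int main(void)
{
  xmp_state_t on_disk = { "exposure=+0.5", 1700000000 }; /* old import time */
  xmp_state_t reimported = on_disk;
  reimported.import_ts = time(NULL);                     /* new import time */
  /* Same edits, different import time: the comparison fails and the
   * sidecar would be rewritten with a new mtime. */
  printf("states equal: %d\n", state_equal(&on_disk, &reimported)); /* 0 */
  return 0;
}
```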

victoryforce commented 2 months ago

In the particular case of importing into a fresh library, the "import time" metadata changes, which is triggering the rewrite.

@bertwim Is this explanation satisfactory for you? Can we close this issue?

bertwim commented 2 months ago

Please do not close this issue! It is a bug. Explaining why the behaviour is what it is does not mean there is no mistake! Changing the timestamp when nothing has changed is a bug. It gives misleading information and it interferes with backup schemes.

kmilos commented 2 months ago

when nothing has changed

It was explained that this assumption may not be true - "edit" here does not mean a literal edit in the darkroom - it could be a different import timestamp, a change of rating/label in the lighttable, etc., basically any aspect of managing your assets.

Do you have another example where really nothing has changed and this still occurs? Your example is basically equivalent to moving the raw file + sidecar to another machine, importing, and expecting the import timestamp to be the original one from the first machine. It just doesn't work that way currently.

bertwim commented 2 months ago

I think the basic flaw is that DT has a certain conception of what it means that a picture has been edited. It compares hash values. When the hash has changed, it interprets this as the picture having changed and writes xmp files accordingly. This conception, evolved as it may seem, is however not always correct. When sufficient information is lacking, DT takes an action that it just should not take, i.e. writing new files with new timestamps. As explained several times, these new timestamps spoil a proper backup.

In my local copy of DT, I have resolved this issue with a very simple fix of just a few lines: with an additional option, an xmp file is written only when it is explicitly asked for. In my case, I found that writing an xmp automatically when I export a jpg file (and only then: no automagic writing of xmp) is exactly what I needed. I can immediately see which files have been edited. Timestamps are always meaningful. No unforeseen or unsuspected writing of XMP. I have tried to get this option into DT but it was denied, and I never understood why. It was alleged that this is not a correct way of doing things, but I do not agree. Maybe this does not fit other people's workflows, but then just don't choose that option. For me it has worked for many years.
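A rough sketch of the kind of option I mean (hypothetical names, not the actual patch and not darktable's settings API): sidecars are written only from the export path, or on explicit request, instead of automatically after every edit.

```c
/* Rough sketch with hypothetical names; not the actual local patch. */
#include <stdbool.h>
#include <stdio.h>

typedef enum
{
  XMP_WRITE_NEVER,
  XMP_WRITE_AFTER_EDIT, /* current "after edit" behaviour                 */
  XMP_WRITE_ON_EXPORT   /* only when an export (e.g. a jpg) is produced   */
} xmp_write_mode_t;

static bool should_write_xmp(xmp_write_mode_t mode, bool edited, bool exporting)
{
  switch(mode)
  {
    case XMP_WRITE_AFTER_EDIT: return edited;
    case XMP_WRITE_ON_EXPORT:  return exporting;
    default:                   return false;
  }
}

int main(void)
{
  const xmp_write_mode_t mode = XMP_WRITE_ON_EXPORT;
  printf("after darkroom edit: %d\n", should_write_xmp(mode, true, false)); /* 0 */
  printf("on jpg export      : %d\n", should_write_xmp(mode, false, true)); /* 1 */
  return 0;
}
```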

Regards, Bertwim

wpferguson commented 2 months ago

The XMP file in darktable is meant to be a backup file for the database, so the information in it is what's required to rebuild the database. In that case the definition of edit is a change to the database, not necessarily a change to the image.

In the rebuild-the-database-from-XMP-files scenario, no database exists, so darktable can't tell whether the data in the XMP files has changed; it therefore rewrites the XMP files to make them match the database.

bertwim commented 2 months ago

Well, everybody is busy explaining how DT works. But that is not the point. I understand how it works. But that doesn't mean that DT is always doing the desired thing! I explained above that the current practice interferes with normal file management procedures: notably, DT throws away useful metadata (meaningful timestamps) and hence disturbs a normal incremental backup process, which is, to put it mildly, undesired.

github-actions[bot] commented 1 week ago

This issue has been marked as stale due to inactivity for the last 60 days. It will be automatically closed in 300 days if no update occurs. Please check if the master branch has fixed it and report again or close the issue.