darktable-org / darktable

darktable is an open source photography workflow application and raw developer
https://www.darktable.org
GNU General Public License v3.0
9.77k stars 1.14k forks source link

All XMP are rewriting when "update database from selected xmp files" #5847

Closed esq4 closed 4 years ago

esq4 commented 4 years ago

I am editing the same shots on Windows or Linux computers. When I migrate my collection from Linux to Windows (or vice vers) all XMPs are overwritten (I use the "check xmp on start" option and click "update database from selected xmp files" in the crawler dialog). But this only changes the values "import_timestamp" and "change_timestamp". And this values for each individual snapshot are always the same. For example: on Linux

   darktable:import_timestamp="1594311247"
   darktable:change_timestamp="1594311413"

on Windows

   darktable:import_timestamp="1594233276"
   darktable:change_timestamp="1594233508"

Darktable Version: master OS: Debian 10 and Windows 7

esq4 commented 4 years ago

I think there are two problems here: 1) overwriting XMPs when click "update database from selected xmp files" 2) difference in calculating values of timestamps in Linux and Windows

esq4 commented 4 years ago

all XMPs are overwritten

"all" means absolutely everything, including unchanged.

phweyland commented 4 years ago
2\. difference in calculating values of timestamps in Linux and Windows

Pinging @jpverrue . He may have an idea about this.

esq4 commented 4 years ago

2. difference in calculating values of timestamps in Linux and Windows

Here it is necessary to clarify what data is written to the database when importing a directory containing XMPs. Perhaps this judgment is wrong.

jpverrue commented 4 years ago

Several remarks and reflections:

phweyland commented 4 years ago

Another comment here, even if maybe not directly related. Windows and linux on a dual boot, usually do not show the same time. To get the same time I had to do something like this on windows side:

Reg add HKLM\SYSTEM\CurrentControlSet\Control\TimeZoneInformation /v RealTimeIsUniversal /t REG_QWORD /d 1

esq4 commented 4 years ago
  • Are the machine clock and timezone set to exactly the same date on both machines?

Timezone is set the same on both machines. Clocks synchronized to same NTP-server.

  • The difference between Windows and Linux timestamps is not exactly the same: 77971 seconds for import_timestamp and 77905 seconds for change_timestamp. That's a little more than 21h30mn. Does this value indicate something?
  • The 66 seconds difference between the two raises a question. If it had been the same between the two, we could have thought that there was a clock offset problem, but it seems that there is something else.
  • There is another timestamp mechanism in darktable, purely internal (it is not displayed). It is used to identify XMP OoD files. It is only used when resuming old databases to value import_timestamp for already imported photos. Could this interfere?

Until I all the same I think that the problem - in overwriting XMPs. But I haven’t figured out the mechanism "update database from selected xmp files" yet.

P.S. And yes, some time ago, back in winter, this problem didn't exist.

Mark-64 commented 4 years ago

This happens not only across Windows and Linux, but also with two different configuration folders with the same OS. It also has collateral effects, as described in #5722 It seems DT always updates XMP timestamps, so they are alway detected as new when launching another DT instance. And DT 3.0.x did not have this issue

esq4 commented 4 years ago

Seems to be a consequence of commit 0d9ccafc1c2d6eb468af404dd8a58ed80763b922 "Add four new timestamps columns in library.db"

phweyland commented 4 years ago

It seems DT always updates XMP timestamps

Do you mean it's enough to open dt or an image in darkroom to update an xmp ? Can you be more explicit ? What are the actions which update xmp ?

Sometime ago I've fixed one case of permanent xmp write (#4448) in order not to save it when moving from an image to the other in darkroom without change. But this PR is not in 3.0 afaik.

esq4 commented 4 years ago

Do you mean it's enough to open dt or an image in darkroom to update an xmp ? Can you be more explicit ? What are the actions which update xmp ?

Yes, enough to open dt and click "update database from selected xmp files" in the crawler dialog on another computer or OS (or with another configuration folders with the same OS).

phweyland commented 4 years ago

It seems DT always updates XMP timestamps, so they are alway detected as new when launching another DT instance.

Yes, enough to open dt and click "update database from selected xmp files" in the crawler dialog on another computer or OS (or with another configuration folders with the same OS).

If the user "update database from selected xmp files" and "write sidecar file for each image" is set, it seems normal to get the xmp written again. It could be considered, in case the only difference is the file timestamp, not to write the file. There is comment for this in image.c:

  // TODO: compute hash and don't write if not needed!

And DT 3.0.x did not have this issue

I don't understand how this could have worked.

esq4 commented 4 years ago

If the user "update database from selected xmp files" and "write sidecar file for each image" is set, it seems normal to get the xmp written again. It could be considered, in case the only difference is the file timestamp, not to write the file.

I don’t think so. In my opinion, these are antagonistic actions in meaning. "update database from selected xmp files" means database synchronization with collection. If the user selects "update database from selected xmp files", then he believes (and must be sure) that all data from the XMP will be transferred to the database. Incl. timestamps. And in this case, "update database from selected xmp files" should take precedence over "write sidecar file for each image".

phweyland commented 4 years ago

Ok, I think I got it wrong. If xmp timestamp is newer than db write timestamp, and the user requests "update database from selected xmp files", db is updated with xmp data (as today) but

  1. db write time stamp should be updated with the xmp file timestamp
  2. the xmp file should not be updated (as now db data = xmp data) even if "write sidecar file for each image" is set.

Does that make more sense ?

esq4 commented 4 years ago

Yes, more sense, but you've made things too complicated with the new "db write time stamp" option. This takes us away from "update database from selected xmp files" and leads to "synchronize database with selected xmp files" (I propose to postpone this idea for later). To "update" it is enough just to copy all parameters from XMP to db.

phweyland commented 4 years ago

To "update" it is enough just to copy all parameters from XMP to db.

This is not true. EDIT just because if you don't do it, crawler won't stop to detect the difference.

esq4 commented 4 years ago

Wel, I'm completely confused. I will ask a question from the middle. Why in the _dt_image_cache_writerelease the value of _img->importtimestamp is different from _darktable:importtimestamp in XMP?

phweyland commented 4 years ago

Well done. There is another issue here in dt_image_cache_write_release():

DT_DEBUG_SQLITE3_PREPARE_V2( dt_database_get(darktable.db), "UPDATE main.images SET width = ?1, height = ?2, filename = ?3, maker = ?4, model = ?5, " "lens = ?6, exposure = ?7, aperture = ?8, iso = ?9, focal_length = ?10, " "focus_distance = ?11, film_id = ?12, datetime_taken = ?13, flags = ?14, " "crop = ?15, orientation = ?16, raw_parameters = ?17, group_id = ?18, longitude = ?19, " "latitude = ?20, altitude = ?21, color_matrix = ?22, colorspace = ?23, raw_black = ?24, " "raw_maximum = ?25, aspect_ratio = ROUND(?26,1), exposure_bias = ?27, " "change_timestamp = ?28, change_timestamp = ?29, export_timestamp = ?30, print_timestamp = ?31 " "WHERE id = ?32", -1, &stmt, NULL);

The column for ?28 should be import_timestamp instead. That may explain other discrepancy you see. I'll add this to the PR.

esq4 commented 4 years ago

Yes it's good. But I wrote about value of _img->importtimestamp (or _img->changetimestamp) and how they differ from the same in XMP.

here

_DT_DEBUG_SQLITE3_BIND_INT(stmt, 28, img->import_timestamp); DT_DEBUG_SQLITE3_BIND_INT(stmt, 29, img->change_timestamp); DT_DEBUG_SQLITE3_BIND_INT(stmt, 30, img->export_timestamp); DT_DEBUG_SQLITE3_BIND_INT(stmt, 31, img->printtimestamp);

phweyland commented 4 years ago

But I wrote about value of _img->importtimestamp (or _img->changetimestamp) and how they differ from the same in XMP.

Not sure to understand the question ... my answers may be inadequate.

Both values (even all img values) should be the same as in xmp and in img as soon as the db and xmp are synchronized. The lines you highlight are writing these values in the database. Few lines later you have this, which writes the same values into the xmp files:

  if(mode == DT_IMAGE_CACHE_SAFE)
    dt_image_write_sidecar_file(img->id);

When synchronization is set ("write sidecar file for each image" set) database, img and xmp all (should) contain the same data.

Then you have the other issue, related to the write time of the xmp file itself (file attribute and not content) and the corresponding write_timestamp which is in the db and img, but not in the xmp. This value is used to trigger changes from xmp. Obviously working with two databases and the same xmp files, brings useless and endless triggering. I believe the PR fixes also this.

Again, sorry if I have still not answered your question.

esq4 commented 4 years ago

I will ask differently.

In the _dt_image_cache_writerelease

img->import_timestamp != {XMP::} darktable:import_timestamp and img->change_timestamp != {XMP::} darktable:change_timestamp

i.e. 1) I open XMP in text editor and see

darktable:import_timestamp = "1594233276" darktable:change_timestamp = "1594233508"

2) start debuging 3) select this XMP in crawler 4) stop debuging at string

DT_DEBUG_SQLITE3_BIND_INT(stmt, 28, img->import_timestamp); DT_DEBUG_SQLITE3_BIND_INT(stmt, 29, img->change_timestamp);

5) see what

img->import_timestamp == 1594311247 img->change_timestamp == 1594311413

Why?

(maybe img gets values from db?)

phweyland commented 4 years ago

Yes, "Img" is the cache of the database. So before using the data, dt fetches the database data to img. After any change, dt_image_cache_write_release() is called to update the database accordingly (the part where you stop the debugger). Just after in the code, if necessary, the same data are written to the xmp. So at the time you have stopped the debugger the data in xmp are not yet updated. Therefore the difference may be normal. To make the real comparison you have to break after

  if(mode == DT_IMAGE_CACHE_SAFE)
    dt_image_write_sidecar_file(img->id);

And there, the values should match.

This said, that does not explain what does this difference mean. For that we would have to go where the change of these values is triggered.

esq4 commented 4 years ago

So before using the data, dt fetches the database data to img. After any change, dt_image_cache_write_release() is called to update the database accordingly (the part where you stop the debugger). Just after in the code, if necessary, the same data are written to the xmp. So at the time you have stopped the debugger the data in xmp are not yet updated. Therefore the difference may be normal.

This is correct, only if we are working with an image, but not in this case. Now this has led to endless overwriting of XMP when transferring a collection between computers (for example) since the timestamps in the unsynchronized databases of different computers are necessarily different.

When I click "update database from selected xmp files" I mean that the timestamps from XML will also be written to the db since these are the same editing steps as the others. And these values must match in XMP and db, not because they were copied from db to XMP, but on the contrary, because they were copied from XMP to db.

upd Or these timestamps should be excluded from the comparison process.

phweyland commented 4 years ago

This is correct, only if we are working with an image, but not in this case.

I don't agree. That is what the code is.

When I click "update database from selected xmp files" I mean that the timestamps from XML will also be written to the db since these are the same editing steps as the others. And these values must match in XMP and db, not because they were copied from db to XMP, but on the contrary, because they were copied from XMP to db.

I agree that after update the xmp data should not have changed. If you see discrepancy between the former and the new xmp (except the file timestamp and import_timestamp value (which is buggy)), there should be another issue. Otherwise I think we are correct. Is the content of xmp changed or unchanged ? If changed, what are the difference ?

esq4 commented 4 years ago

This is like a blind person talking to a deaf person.

Your quotes from the https://github.com/darktable-org/darktable/pull/5869

In that case, the current system sees the xmp timestamps left by the other system and show up the corresponding list of images. If the user click on "update database from selected xmp files" dt updates the current database but also the xmp files, changing again the xmp timestamps. So each time the user switches to the other system and/or database, dt triggers again the change detection.

Exactly!

Workaround ? Not really but the effect can be limited. Before "update database from selected xmp files" only select the modified images. Or re-import manually these modified images or folders. That will limit the issue to those modified folders or images (instead of all images).

That is, user must have to manually do the work that should to be done (and was done before) automatically. And user must remember what he edited and try to ignore the rest.

This is regression.

esq4 commented 4 years ago

Could it be easier to exclude these time stamps from the comparison process?

phweyland commented 4 years ago

Could it be easier to exclude these time stamps from the comparison process?

I think you are still confusing the different timestamps.

This is regression.

I don't think this is regression neither because the timestamp comparison has been written in 2014 ... The other timestamps, if I'm not mistaken, are not involved in that comparison process.

This is like a blind person talking to a deaf person.

Agreed. It seems we don't understand each other.

You have not answered the question:

Is the content of xmp changed or unchanged ? If changed, what are the difference ?

pahervin commented 4 years ago

Hi, That looks a conflict about which timestamp is the best...

I'm a newby in DT issues, but as an experimented software engineer this issue has a very simple fix.

When the user require to update de DB with the XMP file, he indicates that the XMP data must become the reference within the DB INCLUDING the timestamps. DT only needs to update with the XMP datas. Since the datas becomes the same within DB and the XMP, there is no need to update the XMP.

And that all.

This fixes many of the DT usages described in this thread. And very important, it avoids to rewrite thousands of file for nothing useful.

EdgarVerona commented 3 years ago

I don't know if this is necessarily helpful, but on Windows 10 I am experiencing this exact issue still. Every xmp file is ~1 second newer than the database entry, though the only change occurred during their initial import.

image

At first I thought darktable was just very slow to load, as there's no UI response for several minutes until it finally brings up this dialog. But then I noticed the timestamp difference and the fact that I haven't changed those XMP files, and when I found this bug it seems to be a very similar issue.

These timestamp differences are happening on a system where both the database and images/xmp files are hosted on a Samba share on a Linux server, but the application is being run from my local windows machine.

(I suppose since my workflow is simple enough - I only plan to do basic editing in darktable itself - I can just turn off the "check for xmp updates at startup" setting since the only thing that should be editing them is darktable, so I'm personally not blocked by this)

esq4 commented 3 years ago

@EdgarVerona It's probably better to create a new issue with reference to this one.

EdgarVerona commented 3 years ago

Ah, good point! Thanks, will do once this kiddo goes down for a nap.