keepassxreboot / keepassxc

KeePassXC is a cross-platform community-driven port of the Windows application “Keepass Password Safe”.
https://keepassxc.org/
Other
20.72k stars 1.43k forks source link

Fix warning messages in Merger.cpp for conflicts #3340

Open Ovsyanka opened 5 years ago

Ovsyanka commented 5 years ago

I noticed that sometimes when I am merging exactly the same databases there are conflicting changes. But not always and not all the entries. Sometimes I see something like this:

$ keepassxc-cli merge --no-password -k testing-db.key -s keepass.kdbx keepass.kdbx

Merge user1/user1 with local on top/under keepass
Entry of user1[5f30370716b6704483a837215ca9c0be] contains conflicting changes - conflict resolution may lose data!
Successfully merged the database files.

Expected Behavior

There should not be any conflicting changes between an equal databases.

Current Behavior

Sometimes there are conflicting changes between an equal databases.

Steps to Reproduce

I tried to reproduce it on the new database, but I failed. I think it is somehow related to the additional info, like "attributes" or "properties" or something. I removed all data from my actual database except one. Then I edited that to remove sensitive data (including history). And I attach the database keepassxc-merge-bug.tar.gz to this issue. To reproduce the issue I wrote the simple script that continuously runs the merge until the issue appears. Eventually you should see the message about the conflicting changes:

$ tar -xzf keepassxc-merge-bug.tar.gz
$ cd keepassxc-merge-bug
$ bash test-merge-db.sh

...

Merge user1/user1 with local on top/under keepass
Database was not modified by merge operation.

Merge user1/user1 with local on top/under keepass
Database was not modified by merge operation.

Merge user1/user1 with local on top/under keepass
Entry of user1[5f30370716b6704483a837215ca9c0be] contains conflicting changes - conflict resolution may lose data!
Successfully merged the database files.

Context

I don't understand what is going on and I am not sure that my data isn't corrupted somehow.

Debug Info

KeePassXC - Version 2.4.3 Revision: 5d6ef0c

Qt 5.12.4 Debugging mode is disabled.

Operating system: Arch Linux CPU architecture: x86_64 Kernel: linux 5.1.15-arch1-1-ARCH

Enabled extensions:

Cryptographic libraries: libgcrypt 1.8.4

Marwe commented 3 years ago

I experienced a similar bahaviour (using the gui and looking at terminal output). In an export and diff of two databases I can see mostly changes of modified date in any entry. If this is the reason, then it may be excluded from alarms. How to merge different dates is another issue, latest should do, if it changes constantly with every saving anyway...

A message window in the GUI notifies me with Database was not modified by merge operation.

                        <Key>_LAST_MODIFIED</Key>
-                       <Value>Fr. Feb. 19 10:06:08 2021 GMT</Value>
+                       <Value>Fr. Feb. 19 10:06:31 2021 GMT</Value>

Tested on keepassxc v. 2.6.4

utoddl commented 2 years ago

This is still happening on 2.6.6. Is there a way to get it to say what is different? Entry of github[xxxxxxxxxxxxxxxxx] contains conflicting changes - conflict resolution may lose data! is rather frustrating.

droidmonkey commented 2 years ago

Use the cli to see the differences. There is no loss of data if you have history enabled in the database settings. The old version of the entry will be stored in history.

rocketraman commented 2 years ago

I have this issue too. A recent merge using the CLI showed output like this for tens or hundreds of entries:

Entry of x[db8be230b0a04679902c161c797adfbb] contains conflicting changes - conflict resolution may lose data!

as well as many like this:

History entry of x[5d7f36ac6e33b0438d65a7269764c26c] at 2016-03-31 13-12-31-000 contains conflicting changes - conflict resolution may lose data!

Looking at the details of these entries shows nothing unusual. There is nothing new in the history (history is enabled), or anything unusual with the entry to indicate why a conflict was reported, and what, if anything, changed. And in fact, many of the reported entries have not changed in years.

keepassxc-cli 2.6.6

rocketraman commented 2 years ago

Could it be a timing issue with an open database? I did have the database open in the GUI, and I did see a message in the GUI saying "a modified database was modified externally, merge?" or something like that, and I may have clicked "yes" on it absent-mindedly, thinking that was part of the CLI merge process. And even if I had thought about it more, it wouldn't be clear to me what the correct action is: if I clicked no or cancelled the dialog, would I lose the merge being made by the CLI?

kdrobnyh commented 2 years ago

Is there any update? I'm having exactly the same issue when I try to merge two identical databases. Looks like merge operation can't be trusted for now.

keepassxc-cli 2.7.0

droidmonkey commented 2 years ago

This is just a warning and only triggered when an entry has modified times that only differ by milliseconds. This can happen when saved on different systems that have different time precision. This is per the comment in the conflict resolution function: https://github.com/keepassxreboot/keepassxc/blob/develop/src/core/Merger.cpp#L370

Really we shouldn't be showing these messages, we resolve the conflict anyway and store the overwritten data as history in the target entry if there were actual changes.

kdrobnyh commented 2 years ago

@droidmonkey, in my case two files are completely identical, byte-for-byte.

droidmonkey commented 2 years ago

Then it's just the algorithm getting confused and can be ignored. The wording of the warning is really bad

rocketraman commented 2 years ago

Then it's just the algorithm getting confused and can be ignored.

The first part of this sentence does not appear to fit well with the second part :-) If the algorithm is getting confused... why is that ok / not a bug?

Can you explain a bit more about why this confusion is benign?

droidmonkey commented 2 years ago

These messages are output when the modification times are exactly the same (ignoring milliseconds) between the two entries being merged, but for some reason a deep compare found some difference in the two. Where that difference is, tough to tell. This includes last access time. We should probably also add the CompareItemIgnoreStatistics flag to this message.

kdrobnyh commented 2 years ago

But how can this explain merging two identical databases? Because If they are byte-for-byte identical, access/modification/etc. times are also the same.

droidmonkey commented 2 years ago

I did a debug of the test database posted by the OP above and the mismatch occurred while testing the custom data for equality. The custom data just does a simple QHash comparison so something must be minorly different between the two causing the warning message but there is no actual difference. I checked the values, they are they same. This could be caused by a discrepancy in the _LAST_MODIFIED field. I certainly could not reproduce the problem using 2.7.1 and a complex demo database. I can reproduce it using the sample database (which is from 2.4.3 era).

Ovsyanka commented 2 years ago

I certainly could not reproduce the problem using 2.7.1 and a complex demo database. I can reproduce it using the sample database (which is from 2.4.3 era).

@droidmonkey I believe you are right blaming _LAST_MODIFIED. I created the new database with 2.7.1, then added the new Entry through the Firefox addon (current version 1.7.2). The Entry created with _LAST_MODIFIED and KeePassXC-Browser Settings rows on the Properties page and the issue is still there. If I remove the KeePassXC-Browser Settings (_LAST_MODIFIED row goes with it) - then I don't see the issue anymore.

louib commented 1 year ago

I believe I'm also hitting that bug when merging databases that were created by keepass-rs, as described here.

louib commented 1 year ago

I dug a bit deeper, and the problem in my case seems to be related to field protection. The password field is protected by default, and this default is correctly configured in both KeePassXC and keepass-rs. However, keepass-rs will not populate a new entry with the default fields, whereas KeePassXC will create all the 5 default fields when creating a new entry. When creating the entry attributes from scratch, KeePassXC will not set the field protection based on the database meta settings, as seen here. I believe this bug is never triggered when the database is never used by applications other than KeePassXC, because KeePassXC must correctly initialize the field protection at some other location during entry creation. I dumped a database to xml and it seems to be the case.

I'm not 100% sure that this is the same problem as described initially by @Ovsyanka, but it could very well be the case. For example, maybe KeePassXC used to not create all the default fields by default. One way to find out would be to dump the database to xml and see if some entries don't have a password field defined (I don't mean empty, I mean not in the xml at all).

I think the higher-level discussion to have is: How can we make sure that we don't add more fields to an entry that was parsed from a database? This will make interoperability with both older versions of KeePassXC and other KDBX4 parsers more robust. Maybe we could add tests to make sure that we don't add additional fields to an entry after parsing it?

Martin-Buchholz commented 1 year ago

I'm seeing this too. I'm a new user, having only ever used 2.7.4. I see the warnings apparently only on entries that I have used via KeePassXC-Browser in Chrome (on the source database only), which seems to confirm what others are seeing.

The warnings appear benign - nothing is actually broken - so perhaps the warnings need to be understood, then have a regression test written, then silenced.

There should probably be documentation help for users (like me) who want to maintain a master version of the database, then use that database copied to multiple machines with multiple browsers.

Ovsyanka commented 1 year ago

I just noticed that the value of _LAST_MODIFIED property updates whether I open "Edit entry" dialogue. So at the moment I suppose it updates internally in the process of merging for the both sides and from time to time the seconds value mismatch.

Ovsyanka commented 11 months ago

I explored the code and I believe I found the reason of Entry of ... contains conflicting changes - conflict resolution may lose data! error (or at least some of them).

The error appears if two compared records have the same lastModificationTime, but they don't considered equal (by calling entry->equals()). This could be happen because CustomData::LastModified is refreshed for sourceEntry passed in mergeHistory:

sourceEntry->clone -> entry->m_customData->copyDataFrom -> CustomData::updateLastModified

And this is the explanation why _LAST_MODIFIED property updates in the interface. I would say this is different manifestation of the same underlying flaw:

The _LAST_MODIFIED property refreshed each time CustomData::updateLastModified called. And one of the calls is from CustomData::copyDataFrom. This function is used in lot of places including EditEntryWidget::setForms - this is the reason behind it is updated each time the entity is open.

droidmonkey commented 11 months ago

Great!

torypages commented 7 months ago

One really odd thing I just observed is that the results are not consistent. As I run and re-run a merge with a dry run, I get different results!

edit: when running the merge from the cli (everything I'm doing is from the cli the copyDataFrom (mentioned above) method does run. Also, I'm not a c++ developer so I'm way in over my head :P )

edit: So are we cloning the source and the destination and refreshing timestamps in the process? If the one clone happens more than "ignore milliseconds" later, then we'd get this error. Though I question whether the last_modified timestamp should be getting updated during a merge. In a sense it's been modified, but in another it hasn't.

edit: maybe custom data isn't honouring https://github.com/keepassxreboot/keepassxc/blob/884386c924192902dc9500a58f9cbdfe22a0a4fd/src/core/Entry.cpp#L912 ?

Maybe https://github.com/keepassxreboot/keepassxc/blob/884386c924192902dc9500a58f9cbdfe22a0a4fd/src/core/CustomData.cpp#L150 needs to be wrapped with canUpdateTimeinfo?

edit: I added some debug logging to Merger.cpp

    const QDateTime targetModificationTime = Clock::serialized(targetEntry->timeInfo().lastModificationTime());
    const QDateTime sourceModificationTime = Clock::serialized(sourceEntry->timeInfo().lastModificationTime());
    if (targetModificationTime == sourceModificationTime
        && !targetEntry->equals(sourceEntry,
                                CompareItemIgnoreMilliseconds | CompareItemIgnoreHistory | CompareItemIgnoreLocation)) {
        ::qWarning("Entry of %s[%s] contains conflicting changes - conflict resolution may lose data!",
                   qPrintable(sourceEntry->title()),
                   qPrintable(sourceEntry->uuidToHex()));
        qWarning("%lld", sourceEntry->customData()->lastModified().toMSecsSinceEpoch());
        qWarning("%lld", targetEntry->customData()->lastModified().toMSecsSinceEpoch());

and there was a difference in the MSecs!

1706071416000
1706071415000

edit: even while initially opening the database the dates get updated. For example, code paths stemming from parseKeePassFile (parseMeta() and parseRoot()) seem to result in an update.

torypages commented 7 months ago

I've fiddled around with this quite a bit. I'm limited by my lack of C++ knowledge. Nonetheless, I've made some code changes that seem to work for me, but before getting into that, what is the purpose of updating dates during a copy? https://github.com/keepassxreboot/keepassxc/blob/884386c924192902dc9500a58f9cbdfe22a0a4fd/src/core/CustomData.cpp#L150

For me, a copy shouldn't mutate the data. If we get rid of this then things start to work right.

I currently have https://github.com/torypages/keepassxc/pull/1/files, but I think even that is overly complex, all I think that needs to be done is remove the update from the copy method, which to me seems more consistent with a copy anyway.

droidmonkey commented 7 months ago

I do agree the update last modified call doesn't make sense, BUT, that's only if the last modified field already exists in the copied data. We need to add a check to see if last modified exists, if it doesn't then call update last modified.

torypages commented 7 months ago

Awesome! Thanks for the reply! I pushed a new commit https://github.com/torypages/keepassxc/pull/1/commits/6f263f752201b3053b2f970c4fff7870603350da that I think covers what was mentioned. And it still seems to work.

If this is on a good path I can start looking at unit tests (I don't even know how to run them at the moment), and get a polished PR going.

setUpdateTimeinfo/canUpdateTimeinfo is maybe a bit overkill, we maybe check for the _LAST_UPDATED key right inside updateLastModified, however the current implementation keeps the change more segregated to copyDataFrom which is probably a good thing.

I have no personal attachment to the implementation, I just want to make a custom workflow for keeping a DB on my desktop and a subset DB on my phone in sync and didn't want to ignore warnings :)

I am for sure not a c++ dev, or an expert in this codebase. My view on this is super narrow. I'm a Python dev (though an experienced one) who randomly mashed some keys in this project and "seemed" to get a result that worked. That is to say, don't trust me that a future PR is good.

Let me know your thoughts on https://github.com/torypages/keepassxc/pull/1/files and I can iterate a bit, make it nice, look at tests. Thanks!