nextcloud / text

📑 Collaborative document editing using Markdown
GNU Affero General Public License v3.0

Content duplicates #5420

Closed juliushaertl closed 5 months ago

juliushaertl commented 5 months ago

I have received a report about a file's content being duplicated. Checking the file and the time when this happened, I saw the following access log:

Saves are highlighted, along with some possibly suspicious large pushes.

juliushaertl commented 5 months ago

Looks a bit suspicious, maybe this is related to the file opening process in some way.

FYI @max-nextcloud @mejo-

juliushaertl commented 5 months ago

Just an idea, with some questions that might be worth investigating:

mejo- commented 5 months ago

I recently encountered this as well (on 27.1.6). Unfortunately I didn't have time to debug it at that moment. In case it happens again, a backup of the steps in the database would probably be helpful, right?

netzpolitikorg commented 5 months ago

We can confirm, at least with 28.0.2, probably also 28.0.3.

juliushaertl commented 5 months ago

Managed to reproduce by:

The steps still seem to be in the steps file and are then reapplied on top of the new base document.

juliushaertl commented 5 months ago

A fix for one possible case is in #5470. However, I currently cannot see other code paths that could lead to this.

juliushaertl commented 5 months ago

Other code path to check https://github.com/nextcloud/text/blob/9412ae1f2afb6db81c4465a81032e6697bf7cf73/lib/Service/ApiService.php#L139-L159

max-nextcloud commented 5 months ago

Other code path to check

We discussed this today and it looks like this code path is indeed the culprit:

  1. When all sessions are closed, the document table entry is removed, but the yjs file remains in case someone reopens their laptop and wants to continue the session.
  2. Someone creates a new session and the document is loaded from the markdown (first if block).
  3. The next person comes along before the doc has been autosaved. Now a document entry exists again, so fresh session is false and the content of the yjs state file is sent out instead.
  4. The two clients communicate and combine the content they both loaded via different mechanisms.
  5. The content is duplicated.

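The steps above can be sketched as a small simulation. This is only a toy model under stated assumptions, not the actual Text or Yjs code: `Item`, `Server`, `load_from_markdown`, `render` and `simulate` are hypothetical names, a document is modelled as a set of items identified by `(client, clock)`, and syncing two clients is modelled as a plain set union. The point it illustrates is that the same text loaded twice through different mechanisms gets different ids, so a merge keeps both copies.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import count

# Hypothetical stand-in for CRDT client ids: every fresh load gets a new one.
_client_ids = count(1)


@dataclass(frozen=True)
class Item:
    """One piece of content, identified by (client, clock) like a CRDT item."""
    client: int
    clock: int
    text: str


def load_from_markdown(markdown: str) -> set[Item]:
    """Build a new state from the file; its items get ids no other client knows."""
    client = next(_client_ids)
    return {Item(client, clock, line)
            for clock, line in enumerate(markdown.splitlines())}


class Server:
    """Toy model of the session-creation logic described in steps 1 to 3."""

    def __init__(self, markdown: str, leftover_state: set[Item]):
        self.markdown = markdown
        self.state_file = leftover_state  # kept after all sessions were closed
        self.has_document_entry = False   # table entry was removed (step 1)

    def create_session(self) -> tuple[str, object]:
        fresh_session = not self.has_document_entry
        self.has_document_entry = True
        if fresh_session:
            return "markdown", self.markdown
        return "state", self.state_file


def render(state: set[Item]) -> list[str]:
    """Order items deterministically; real Yjs ordering is more involved."""
    return [item.text for item in sorted(state, key=lambda i: (i.client, i.clock))]


def simulate() -> list[str]:
    text = "Hello\nWorld"
    leftover = load_from_markdown(text)      # state from the earlier sessions
    server = Server(text, leftover)

    _, payload_a = server.create_session()   # step 2: loaded from markdown
    state_a = load_from_markdown(payload_a)
    _, state_b = server.create_session()     # step 3: receives the state file

    merged = state_a | state_b               # step 4: clients sync (set union)
    return render(merged)                    # step 5: every line appears twice
```

In this model, calling `simulate()` returns each line of the file twice, because the two copies of the same text carry different ids and neither replaces the other.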
mejo- commented 5 months ago

This should hopefully be fixed now with all the pieces from https://github.com/nextcloud/text/issues/5476#issuecomment-2010155145

bugspencor commented 2 months ago

In which Nextcloud versions is that bug fixed? I still see a similar symptom in Nextcloud 27.1.10 with Text 3.8.0 (I first noticed it after the update to 27.1.9, but it might have been there already).

Are there any manual steps required or helpful to resolve the issue and get into a clean state?

What I see is that sometimes, when I download the file, the content is correct (only once), but the content displayed in the editor (and above the file listing) is duplicated. Sometimes the duplicate content is saved to the file. I experimented with moving appdata_xyz/text/documents/, and after scan-app-data (IIRC) some files showed the content three times.

max-nextcloud commented 2 months ago

@bugspencor thanks for the heads-up. Let me ask you some questions first before I reply to yours...

Questions

Answering your questions

In which Nextcloud versions is that bug fixed? I still see a similar symptom in Nextcloud 27.1.10 with Text 3.8.0 (I first noticed it after the update to 27.1.9, but it might have been there already).

The fixes are included in https://github.com/nextcloud/text/pull/5543 and were merged in 27.1.9.

Are there any manual steps required or helpful to resolve the issue and get into a clean state?

You can clear all the current editing sessions with

occ text:reset [file-id] 

Editing sessions are long-running so that users can reconnect when they reopen their laptop after a while.

What I see is that sometimes, when I download the file, the content is correct (only once), but the content displayed in the editor (and above the file listing) is duplicated.

That state will be fixed by running the above command for the file in question.

Sometimes the duplicate content is saved to the file.

In this case the command won't make any difference.

bugspencor commented 2 months ago

@max-nextcloud thanks for the comment. To answer your questions:

# mariadb nextcloud -B -e "SELECT * FROM oc_text_steps WHERE document_id=433709"
id  document_id session_id  data    version
4526    433709  1110    ["AAKUAgEKzKaluAUABwEHZGVmYXVsdAMHaGVhZGluZwcAzKaluAUABgQAzKaluAUBCURvY3VtZW50cygAzKaluAUABWxldmVsAX0BKADMpqW4BQACaWQBfygAzKaluAUABHV1aWQBf4fMpqW4BQADCXBhcmFncmFwaAcAzKaluAUOBgQAzKaluAUPek5leHRjbG91ZCB3b3JrcyB3ZWxsIHdpdGggYWxsIHRoZSBjb21tb24gZG9jdW1lbnQgZm9ybWF0cy4gWW91IGNhbiBldmVuIGNvbGxhYm9yYXRlIHdpdGggb3RoZXJzIG9uIE9EVCBhbmQgTWFya2Rvd24gZmlsZXMhh8ympbgFDgMJcGFyYWdyYXBoAA==","AAJVAQLMpqW4BYsBqMympbgFDAF3C2gtZG9jdW1lbnRzqMympbgFDQF3JDU1NDFlMTFiLTEzYjEtNGJlNy1hMzc2LTFjOWZhOWJiZGJmMQHMpqW4BQEMAg=="] 2147483647
8047    433709  2626    ["AALsAQEKAAAHAQdkZWZhdWx0AwdoZWFkaW5nBwAAAAYEAAABCURvY3VtZW50cygAAAAFbGV2ZWwBfQEoAAAAAmlkAX8oAAAABHV1aWQBf4cAAAMJcGFyYWdyYXBoBwAADgYEAAAPek5leHRjbG91ZCB3b3JrcyB3ZWxsIHdpdGggYWxsIHRoZSBjb21tb24gZG9jdW1lbnQgZm9ybWF0cy4gWW91IGNhbiBldmVuIGNvbGxhYm9yYXRlIHdpdGggb3RoZXJzIG9uIE9EVCBhbmQgTWFya2Rvd24gZmlsZXMhhwAOAwlwYXJhZ3JhcGgA","AAJIAQLM3ZGPDgCoAAwBdwtoLWRvY3VtZW50c6gADQF3JGE2OTMzYWI3LTY0NjktNDRhYS04ZjM0LTBlNGU4YjgzYjdkMwEAAQwC"] 2147483647
8048    433709  2626    ["AAIrAQHM3ZGPDgKozKaluAWLAQF3DmgtZG9jdW1lbnRzLS0xAcympbgFAYsBAQ=="]    2147483647
8051    433709  2638    ["AALsAQEKAAAHAQdkZWZhdWx0AwdoZWFkaW5nBwAAAAYEAAABCURvY3VtZW50cygAAAAFbGV2ZWwBfQEoAAAAAmlkAX8oAAAABHV1aWQBf4cAAAMJcGFyYWdyYXBoBwAADgYEAAAPek5leHRjbG91ZCB3b3JrcyB3ZWxsIHdpdGggYWxsIHRoZSBjb21tb24gZG9jdW1lbnQgZm9ybWF0cy4gWW91IGNhbiBldmVuIGNvbGxhYm9yYXRlIHdpdGggb3RoZXJzIG9uIE9EVCBhbmQgTWFya2Rvd24gZmlsZXMhhwAOAwlwYXJhZ3JhcGgA","AAJIAQLd8JrOCwCoAAwBdwtoLWRvY3VtZW50c6gADQF3JGM5NzljMmRhLWUyNGUtNGZjYi04YmRlLWYzYmUxZmIwZmE1ZAEAAQwC"] 2147483647
8052    433709  2641    ["AALsAQEKAAAHAQdkZWZhdWx0AwdoZWFkaW5nBwAAAAYEAAABCURvY3VtZW50cygAAAAFbGV2ZWwBfQEoAAAAAmlkAX8oAAAABHV1aWQBf4cAAAMJcGFyYWdyYXBoBwAADgYEAAAPek5leHRjbG91ZCB3b3JrcyB3ZWxsIHdpdGggYWxsIHRoZSBjb21tb24gZG9jdW1lbnQgZm9ybWF0cy4gWW91IGNhbiBldmVuIGNvbGxhYm9yYXRlIHdpdGggb3RoZXJzIG9uIE9EVCBhbmQgTWFya2Rvd24gZmlsZXMhhwAOAwlwYXJhZ3JhcGgA","AAJHAQK15YMNAKgADAF3C2gtZG9jdW1lbnRzqAANAXckNWM0NWI4MjEtYTc1MC00OGIyLWEzZjAtMzI3M2M1MmJhYTI1AQABDAI="] 2147483647
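A dump like the one above can be inspected without a Yjs decoder. The following is a minimal sketch, assuming only what the output shows: that the `data` column holds a JSON array of base64 strings. The helper names `printable_runs` and `summarize_steps` are made up for illustration. The sketch does not parse the Yjs update format; it only pulls out runs of printable ASCII so that steps carrying the same text can be spotted by eye (in the dump, the rows for sessions 2626, 2638 and 2641 start with an identical base64 prefix, which suggests the same initial content was pushed several times).

```python
import base64
import json
import re


def printable_runs(step_b64: str, min_len: int = 4) -> list[str]:
    """Decode one base64 step and return its runs of printable ASCII."""
    raw = base64.b64decode(step_b64)
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, raw)]


def summarize_steps(data_column: str, min_len: int = 4) -> list[list[str]]:
    """Apply printable_runs to every step in one `data` cell (a JSON array)."""
    return [printable_runs(step, min_len) for step in json.loads(data_column)]
```

Passing the `data` value of a row to `summarize_steps` gives one list of readable fragments per step, which is usually enough to compare rows and see whether several sessions submitted the same document body.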

Some other information, maybe there's something relevant:

I have not yet issued any occ text:reset [file-id] commands.

Please let me know if you need any more info.

max-nextcloud commented 2 months ago

  • It seems the problem can be easily reproduced

Could you try to reproduce this problem from scratch - i.e. creating a new file or uploading one and then creating the problem from there?

I suspect this is due to editing sessions still being around from before the update. However, if you can still reproduce the problem with a new file, that cannot be the cause.

If you can reproduce it with new files, could you create a screen recording of the steps so I can try to reproduce it myself?

Thanks a lot for your support in tracking this down.

bugspencor commented 2 months ago

I could not reproduce the problem with a new file with my account and another user's account.

I can probably reproduce the problem with other users and existing files. Can I check the session state and their files before logging in and reproducing the issue?

max-nextcloud commented 2 months ago

juliushaertl commented 2 months ago

In addition, maybe we could add a timestamp or version identifier to the steps table so that we can identify old versions more easily in the future.

mejo- commented 2 months ago

Related bug report in Collectives: https://github.com/nextcloud/collectives/issues/1270