Closed mgriego closed 3 years ago
@mgriego that is some excellent analysis. I'll try to address this soon and experiment with your suggestions. Thanks a lot, highly appreciated :pray:
@mgriego would you be so kind as to test the new version from master? There is an underlying query change that has (at least for me) made the bug disappear.
Thank you!
This issue has been automatically closed because there has been no response to our request for more information from the original author. With only the information that is currently in the issue, we don't have enough information to take action. Please reach out if you have or find the answers we need so that we can investigate further.
I've started looking into the best way to fix this. I personally believe that the best way to fix this is to use an outer transaction in
ImapToDbSynchronizer::sync
. This way, if any portion of the sync fails, then the entire DB transaction will be rolled back, assuming the DBMS supports transactions properly. As it stands, transactions are only used in theMessageMapper
during the bulk insert/update methods.
I was finally able to wrap my head around this as I had a setup that allowed me to step through this locally in a totally reproducible way.
So, it turns out the logic to unlock a mailbox after said exception is run. However, it has no effect. The mailbox is left with its locks, despite the UPDATE queries run to remove them. With the help of https://stackoverflow.com/questions/16627942/php-pdo-transaction-automatic-rollback I realized that you actually do not get an automatic rollback with PDO. This was my fundamental misunderstanding of transactions in PDO. So I think whenever \OCP\IDBConnection::beginTransaction
is used there should be a try-catch afterwards to roll back explicitly when an error occurs.
@mgriego would appreciate your feedback on https://github.com/nextcloud/mail/pull/5440 as you seem to have a solid understanding of this problem :v:
Expected behavior
Mailbox sync should be able to recover from prior failures.
Actual behavior
If a previous mailbox sync fails (ie out of memory, execution time runs out, etc) in between the time that the messages begin being inserted in into the database and the time that the sync token is saved to the mailbox, then future syncs may fail resulting in a mailbox appearing stuck in a "Loading Messages" state. This can happen because the sync token stores the most recently-retrieved UID to pass to the server on the next fetch. If it is not updated due to the failure, then the Mail app will try to re-insert previously-downloaded messages, resulting in an SQL primary key constraint violation. This will throw an exception that is not accounted for, resulting in a 500 response. Note that, when this happens, subsequent sync calls from the browser will result in 409 responses as the sync locks do not get cleaned up when the exception is thrown.
Here is a real life example of this happening to me:
VTI1NDg3OSxWMSxINjM3OTAx
. When youbase64_decode
this, you see that the previous UID is254879
.oc_mail_messages
table, the three most recent UID values for the mailbox with the sync token shown above are254878
,254880
, and254881
.You can see from here that the UID referenced in the sync token has been removed from the mailbox, but two more messages (254880 and 254881) have been previously synced. Since the token thinks the most recently-synced message is 254879, it uses that when retrieving messages, resulting in Mail attempting to re-insert messages 254880 and 254881. Of course, the constraint violation happens as soon as Mail attempts to insert 254880, resulting in this error:
The above was also mentioned in https://github.com/nextcloud/mail/issues/4095#issuecomment-757390230.
I've started looking into the best way to fix this. I personally believe that the best way to fix this is to use an outer transaction in
ImapToDbSynchronizer::sync
. This way, if any portion of the sync fails, then the entire DB transaction will be rolled back, assuming the DBMS supports transactions properly. As it stands, transactions are only used in theMessageMapper
during the bulk insert/update methods.Another possible fix is to update the mailbox after each (new messages/update messages/vanished messages), that way there's less of a possibility of a failure in later syncs (ie update/delete) causing earlier sync tokens (ie create) to not get updated.
There's also the option of removing messages that were previously synced that have a higher UID than what's in the "new" sync token, however that seems really problematic. The
ImapToDbSynchronizer
is not exected to understand the internals of the sync token, and deleting messages from the database between the new and changed sync operations could cause further regressions.Mail app
Mail app version: 1.7.2
Mailserver or service: Locally-hosted Zimbra
Server configuration
Operating system: Using Nextcloud Docker image:
nextcloud:20-apache
Web server: Apache via Docker image behind a Nginx reverse proxy running on the Docker host
Database: MariaDB via
latest
Docker image (as of about a week ago)PHP version: PHP 7.4 (via Docker image)
Nextcloud Version: 20.0.4
Client configuration
Browser: Chrome 87.0.4280.141
Operating system: Xubuntu 20.04.1