nextcloud / server

☁️ Nextcloud server, a safe home for all your data
https://nextcloud.com

Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction #6899

Closed: mxschmitt closed this issue 1 year ago

mxschmitt commented 6 years ago

Steps to reproduce

  1. Did a fresh reinstall of Nextcloud.
  2. Not sure exactly what triggered it; my WebDAV client was syncing files in parallel with the official Nextcloud client.

Expected behaviour

No error

Actual behaviour

Doctrine\DBAL\Exception\DriverException: An exception occurred while executing 'INSERT INTO `oc_file_locks` (`key`,`lock`,`ttl`) SELECT ?,?,? FROM `oc_file_locks` WHERE `key` = ? HAVING COUNT(*) = 0' with params ["files\/b7aec8354372d9ccd750400d8d03800f", 1, 1508666706, "files\/b7aec8354372d9ccd750400d8d03800f"]: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction
/var/www/cloud/3rdparty/doctrine/dbal/lib/Doctrine/DBAL/DBALException.php - line 128: Doctrine\DBAL\Driver\AbstractMySQLDriver->convertException('An exception oc...', Object(Doctrine\DBAL\Driver\PDOException))
/var/www/cloud/3rdparty/doctrine/dbal/lib/Doctrine/DBAL/Connection.php - line 1015: Doctrine\DBAL\DBALException::driverExceptionDuringQuery(Object(Doctrine\DBAL\Driver\PDOMySql\Driver), Object(Doctrine\DBAL\Driver\PDOException), 'INSERT INTO `oc...', Array)
/var/www/cloud/lib/private/DB/Connection.php - line 213: Doctrine\DBAL\Connection->executeUpdate('INSERT INTO `oc...', Array, Array)
/var/www/cloud/lib/private/DB/Adapter.php - line 114: OC\DB\Connection->executeUpdate('INSERT INTO `oc...', Array)
/var/www/cloud/lib/private/DB/Connection.php - line 251: OC\DB\Adapter->insertIfNotExist('*PREFIX*file_lo...', Array, Array)
/var/www/cloud/lib/private/Lock/DBLockingProvider.php - line 118: OC\DB\Connection->insertIfNotExist('*PREFIX*file_lo...', Array, Array)
/var/www/cloud/lib/private/Lock/DBLockingProvider.php - line 163: OC\Lock\DBLockingProvider->initLockField('files/b7aec8354...', 1)
/var/www/cloud/lib/private/Files/Storage/Common.php - line 704: OC\Lock\DBLockingProvider->acquireLock('files/b7aec8354...', 1)
/var/www/cloud/lib/private/Files/Storage/Wrapper/Wrapper.php - line 588: OC\Files\Storage\Common->acquireLock('files/01-Arbeit...', 1, Object(OC\Lock\DBLockingProvider))
/var/www/cloud/lib/private/Files/View.php - line 1908: OC\Files\Storage\Wrapper\Wrapper->acquireLock('files/01-Arbeit...', 1, Object(OC\Lock\DBLockingProvider))
/var/www/cloud/lib/private/Files/View.php - line 2018: OC\Files\View->lockPath('01-Arbeit/01-Be...', 1, false)
/var/www/cloud/lib/private/Files/View.php - line 1309: OC\Files\View->lockFile('01-Arbeit/01-Be...', 1)
/var/www/cloud/lib/private/Files/View.php - line 1356: OC\Files\View->getCacheEntry(Object(OCA\Files_Trashbin\Storage), 'files/01-Arbeit...', '01-Arbeit/01-Be...')
/var/www/cloud/apps/dav/lib/Connector/Sabre/ObjectTree.php - line 160: OC\Files\View->getFileInfo('/maxibanki/file...')
/var/www/cloud/apps/dav/lib/Connector/Sabre/LockPlugin.php - line 59: OCA\DAV\Connector\Sabre\ObjectTree->getNodeForPath('01-Arbeit/01-Be...')
[internal function] OCA\DAV\Connector\Sabre\LockPlugin->getLock(Object(Sabre\HTTP\Request), Object(Sabre\HTTP\Response))
/var/www/cloud/3rdparty/sabre/event/lib/EventEmitterTrait.php - line 105: call_user_func_array(Array, Array)
/var/www/cloud/3rdparty/sabre/dav/lib/DAV/Server.php - line 466: Sabre\Event\EventEmitter->emit('beforeMethod', Array)
/var/www/cloud/3rdparty/sabre/dav/lib/DAV/Server.php - line 254: Sabre\DAV\Server->invokeMethod(Object(Sabre\HTTP\Request), Object(Sabre\HTTP\Response))
/var/www/cloud/apps/dav/appinfo/v1/webdav.php - line 76: Sabre\DAV\Server->exec()
/var/www/cloud/remote.php - line 162: require_once('/var/www/cloud....')
{main}

Server configuration

Operating system: Debian 9.1 (PXE)

Web server: nginx/1.10.3

Database: MariaDB

PHP version: 7.1.9

Nextcloud version: 12.0.3

Updated from an older Nextcloud/ownCloud or fresh install: Fresh Install

Where did you install Nextcloud from: https://download.nextcloud.com/server/releases/nextcloud-12.0.3.zip

Signing status: `No errors have been found.`

List of activated apps:

App list:
```
Enabled:
  - activity: 2.5.2
  - bruteforcesettings: 1.0.2
  - comments: 1.2.0
  - dav: 1.3.0
  - federatedfilesharing: 1.2.0
  - federation: 1.2.0
  - files: 1.7.2
  - files_pdfviewer: 1.1.1
  - files_sharing: 1.4.0
  - files_texteditor: 2.4.1
  - files_trashbin: 1.2.0
  - files_versions: 1.5.0
  - files_videoplayer: 1.1.0
  - firstrunwizard: 2.1
  - gallery: 17.0.0
  - logreader: 2.0.0
  - lookup_server_connector: 1.0.0
  - nextcloud_announcements: 1.1
  - notifications: 2.0.0
  - oauth2: 1.0.5
  - password_policy: 1.2.2
  - provisioning_api: 1.2.0
  - serverinfo: 1.2.0
  - sharebymail: 1.2.0
  - survey_client: 1.0.0
  - systemtags: 1.2.0
  - theming: 1.3.0
  - twofactor_backupcodes: 1.1.1
  - updatenotification: 1.2.0
  - workflowengine: 1.2.0
Disabled:
  - admin_audit
  - encryption
  - files_external
  - user_external
  - user_ldap
```

Nextcloud configuration:

Config report:
```
{
    "system": {
        "instanceid": "ocqw4m1v6kk0",
        "passwordsalt": "***REMOVED SENSITIVE VALUE***",
        "secret": "***REMOVED SENSITIVE VALUE***",
        "trusted_domains": [
            "cloud"
        ],
        "datadirectory": "\/var\/www\/cloud\/data",
        "overwrite.cli.url": "https:\/\/cloud",
        "dbtype": "mysql",
        "version": "12.0.3.3",
        "dbname": "nextcloud",
        "dbhost": "10.10.10.6",
        "dbport": "",
        "dbtableprefix": "oc_",
        "mysql.utf8mb4": true,
        "dbuser": "***REMOVED SENSITIVE VALUE***",
        "dbpassword": "***REMOVED SENSITIVE VALUE***",
        "installed": true,
        "mail_from_address": "cloud",
        "mail_smtpmode": "php",
        "mail_smtpauthtype": "LOGIN",
        "mail_domain": "schmitt.mx"
    }
}
```

Are you using external storage, if yes which one (local/smb/sftp/...): No.

Are you using encryption: No

Are you using an external user-backend, if yes which one (LDAP/ActiveDirectory/WebDAV/...): Only WebDAV.

Client configuration

Browser: Chrome 62

Operating system: Windows 7 Enterprise

Logs

Web server error log:

```
No relevant logs
```

Nextcloud log (data/nextcloud.log)

Nextcloud log ``` { "reqId": "aGDjWevMFpuYXpWX3mH7", "level": 4, "time": "2017-10-22T08:38:10+00:00", "remoteAddr": "", "user": "maxibanki", "app": "webdav", "method": "PUT", "url": "\/remote.php\/webdav\/01-Arbeit\/01-Berufsschule\/00-Archiv\/Steinam-XSLTalt\/.git\/objects\/f1\/08a244c01ab24b81e8a17bf0576734776a3e81", "message": "Exception: {\"Exception\":\"Doctrine\\\\DBAL\\\\Exception\\\\DriverException\",\"Message\":\"An exception occurred while executing 'INSERT INTO `oc_file_locks` (`key`,`lock`,`ttl`) SELECT ?,?,? FROM `oc_file_locks` WHERE `key` = ? HAVING COUNT(*) = 0' with params [\\\"files\\\\\\\/3d31700630071b7bfc95805f8a422ab1\\\", 1, 1508665090, \\\"files\\\\\\\/3d31700630071b7bfc95805f8a422ab1\\\"]:\\n\\nSQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction\",\"Code\":0,\"Trace\":\"#0 \\\/var\\\/www\\\/cloud\\\/3rdparty\\\/doctrine\\\/dbal\\\/lib\\\/Doctrine\\\/DBAL\\\/DBALException.php(128): Doctrine\\\\DBAL\\\\Driver\\\\AbstractMySQLDriver->convertException('An exception oc...', Object(Doctrine\\\\DBAL\\\\Driver\\\\PDOException))\\n#1 \\\/var\\\/www\\\/cloud\\\/3rdparty\\\/doctrine\\\/dbal\\\/lib\\\/Doctrine\\\/DBAL\\\/Connection.php(1015): Doctrine\\\\DBAL\\\\DBALException::driverExceptionDuringQuery(Object(Doctrine\\\\DBAL\\\\Driver\\\\PDOMySql\\\\Driver), Object(Doctrine\\\\DBAL\\\\Driver\\\\PDOException), 'INSERT INTO `oc...', Array)\\n#2 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/DB\\\/Connection.php(213): Doctrine\\\\DBAL\\\\Connection->executeUpdate('INSERT INTO `oc...', Array, Array)\\n#3 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/DB\\\/Adapter.php(114): OC\\\\DB\\\\Connection->executeUpdate('INSERT INTO `oc...', Array)\\n#4 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/DB\\\/Connection.php(251): OC\\\\DB\\\\Adapter->insertIfNotExist('*PREFIX*file_lo...', Array, Array)\\n#5 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/Lock\\\/DBLockingProvider.php(118): OC\\\\DB\\\\Connection->insertIfNotExist('*PREFIX*file_lo...', Array, Array)\\n#6 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/Lock\\\/DBLockingProvider.php(163): OC\\\\Lock\\\\DBLockingProvider->initLockField('files\\\/3d3170063...', 1)\\n#7 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/Files\\\/Storage\\\/Common.php(704): OC\\\\Lock\\\\DBLockingProvider->acquireLock('files\\\/3d3170063...', 1)\\n#8 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/Files\\\/Storage\\\/Wrapper\\\/Wrapper.php(588): OC\\\\Files\\\\Storage\\\\Common->acquireLock('files\\\/01-Arbeit...', 1, Object(OC\\\\Lock\\\\DBLockingProvider))\\n#9 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/Files\\\/View.php(1908): OC\\\\Files\\\\Storage\\\\Wrapper\\\\Wrapper->acquireLock('files\\\/01-Arbeit...', 1, Object(OC\\\\Lock\\\\DBLockingProvider))\\n#10 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/Files\\\/View.php(2022): OC\\\\Files\\\\View->lockPath('\\\/01-Arbeit\\\/01-B...', 1)\\n#11 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/Files\\\/View.php(1309): OC\\\\Files\\\\View->lockFile('01-Arbeit\\\/01-Be...', 1)\\n#12 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/Files\\\/View.php(1356): OC\\\\Files\\\\View->getCacheEntry(Object(OCA\\\\Files_Trashbin\\\\Storage), 'files\\\/01-Arbeit...', '01-Arbeit\\\/01-Be...')\\n#13 \\\/var\\\/www\\\/cloud\\\/apps\\\/dav\\\/lib\\\/Connector\\\/Sabre\\\/ObjectTree.php(160): OC\\\\Files\\\\View->getFileInfo('\\\/maxibanki\\\/file...')\\n#14 \\\/var\\\/www\\\/cloud\\\/apps\\\/dav\\\/lib\\\/Connector\\\/Sabre\\\/LockPlugin.php(59): 
OCA\\\\DAV\\\\Connector\\\\Sabre\\\\ObjectTree->getNodeForPath('01-Arbeit\\\/01-Be...')\\n#15 [internal function]: OCA\\\\DAV\\\\Connector\\\\Sabre\\\\LockPlugin->getLock(Object(Sabre\\\\HTTP\\\\Request), Object(Sabre\\\\HTTP\\\\Response))\\n#16 \\\/var\\\/www\\\/cloud\\\/3rdparty\\\/sabre\\\/event\\\/lib\\\/EventEmitterTrait.php(105): call_user_func_array(Array, Array)\\n#17 \\\/var\\\/www\\\/cloud\\\/3rdparty\\\/sabre\\\/dav\\\/lib\\\/DAV\\\/Server.php(466): Sabre\\\\Event\\\\EventEmitter->emit('beforeMethod', Array)\\n#18 \\\/var\\\/www\\\/cloud\\\/3rdparty\\\/sabre\\\/dav\\\/lib\\\/DAV\\\/Server.php(254): Sabre\\\\DAV\\\\Server->invokeMethod(Object(Sabre\\\\HTTP\\\\Request), Object(Sabre\\\\HTTP\\\\Response))\\n#19 \\\/var\\\/www\\\/cloud\\\/apps\\\/dav\\\/appinfo\\\/v1\\\/webdav.php(76): Sabre\\\\DAV\\\\Server->exec()\\n#20 \\\/var\\\/www\\\/cloud\\\/remote.php(162): require_once('\\\/var\\\/www\\\/cloud....')\\n#21 {main}\",\"File\":\"\\\/var\\\/www\\\/cloud\\\/3rdparty\\\/doctrine\\\/dbal\\\/lib\\\/Doctrine\\\/DBAL\\\/Driver\\\/AbstractMySQLDriver.php\",\"Line\":116}", "userAgent": "--", "version": "12.0.3.3" } ```

Browser log

Not relevant

obel1x commented 1 year ago

@lemarakis

Could you try my proposal from #6899 (comment) and leave a thumbs up if it works? IMO it would greatly reduce the risk of running into that error.

I checked this workaround. It seems to solve the locking problem, but the portal becomes sluggish. I am experimenting with the indices, e.g. I deleted fs_storage_path_prefix from oc_filecache and it seems way better.

Maybe a misunderstanding. My proposal was: "Remove all indices but primary" (from oc_filecache, but only those!) and then add the given ones, so fs_storage_path_prefix should be deleted anyway, like every other default index of that table.

That way,

  1. the index size should shrink dramatically
  2. the speed should improve
  3. the locking problem should be greatly reduced

If the speed of the portal is still a problem after that, activate the slow query log with EXPLAIN and share the output. I have not seen any slow-log entries since then (apart from a single statement run by the cron job, taking 2 seconds).
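
For reference, the slow query log can be switched on at runtime; a minimal sketch for MariaDB (the thresholds are only examples and should be persisted in my.cnf if you want to keep them):

```
-- enable the slow query log and include query plan / EXPLAIN output (MariaDB)
SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 1;   -- log statements running longer than 1 second
SET GLOBAL log_slow_verbosity = 'query_plan,explain';
```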

lemarakis commented 1 year ago

@obel1x

Maybe a misunderstanding. My proposal was: "Remove all indices but primary" (from oc_filecache, but only those!) and then add the given ones, so fs_storage_path_prefix should be deleted anyway, like every other default index of that table.

That way,

  1. the index size should shrink dramatically
  2. the speed should improve
  3. the locking problem should be greatly reduced

If the speed of the portal is still a problem after that, activate the slow query log with EXPLAIN and share the output. I have not seen any slow-log entries since then (apart from a single statement run by the cron job, taking 2 seconds).

I am testing with the following index schema (as you suggested):

`PRIMARY KEY USING BTREE (fileid), KEY path_hash USING BTREE (path_hash), KEY fs_parent USING BTREE (parent)`

No DB deadlocks for the time being, and the site responds fast enough. I will stay with this setup until the devs solve it (maybe they need to redesign the oc_filecache table and/or its indices).
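
A quick way to double-check which indexes are actually left on the table (a sketch, assuming the default oc_ table prefix):

```
-- list the remaining indexes on oc_filecache
SHOW INDEX FROM oc_filecache;
```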

0x4A-0x41-0x4B commented 1 year ago

This is just making Nextcloud completely useless as a file sync tool. Does anyone know the specific version where this bug was (re)introduced? Or is there at least an official workaround for now?

obel1x commented 1 year ago

@0x4A-0x41-0x4B Well, it is nasty and annoying, and it also slows down the machine and consumes disk space, CPU and memory, but it's not fatal, as the transactions are restarted and will eventually succeed. I think the basic problem is a badly designed filecache table dating from the beginning of the software (no normalisation for directories). Then someone added those indices, which I consider useless. With the growing complexity of Nextcloud the problem got bigger. Can you try my solution https://github.com/nextcloud/server/issues/6899#issuecomment-1377954159 and leave a thumbs up to vote for a change?

lemarakis commented 1 year ago

Same problems with the latest 25.0.4

(screenshot attached: Snag_127c2189)

HackLotus commented 1 year ago

@obel1x

Maybe a misunderstanding. My proposal was: "Remove all indices but primary" (from oc_filecache, but only those!) and then add the given ones, so fs_storage_path_prefix should be deleted anyway, like every other default index of that table. That way,

  1. the index size should shrink dramatically
  2. the speed should improve
  3. the locking problem should be greatly reduced

If the speed of the portal is still a problem after that, activate the slow query log with EXPLAIN and share the output. I have not seen any slow-log entries since then (apart from a single statement run by the cron job, taking 2 seconds).

I am testing with the following index schema (as you suggested):

`PRIMARY KEY USING BTREE (fileid), KEY path_hash USING BTREE (path_hash), KEY fs_parent USING BTREE (parent)`

No DB deadlocks for the time being, and the site responds fast enough. I will stay with this setup until the devs solve it (maybe they need to redesign the oc_filecache table and/or its indices).

Hello, is it possible for you to give the exact command you typed? I'm not very comfortable with DB manipulations...

lemarakis commented 1 year ago

Hello, is it possible for you to give the exact command you typed? I'm not very comfortable with DB manipulations...

Connect to the DB with your favourite DB manager (e.g. phpMyAdmin) and execute the following:

ALTER TABLE oc_filecache DROP INDEX fs_id_storage_size, DROP INDEX fs_mtime, DROP INDEX fs_size, DROP INDEX fs_storage_path_prefix, ADD INDEX path_hash (path_hash), ADD INDEX fs_parent (parent);

You have to DROP all indexes except PRIMARY, and ADD path_hash and fs_parent.

P.S. Of course, take a full DB backup first!
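
Note that the exact set of secondary indexes can differ between installs (depending on version and which migrations have run), so the DROP INDEX list above may need adjusting. A sketch for listing what is actually there first, assuming MySQL/MariaDB and the default oc_ prefix:

```
-- list every secondary index on oc_filecache together with its columns
SELECT INDEX_NAME,
       GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX) AS index_columns
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'oc_filecache'
  AND INDEX_NAME <> 'PRIMARY'
GROUP BY INDEX_NAME;
```

If the stock indexes are needed again later, `occ db:add-missing-indices` should be able to recreate the ones Nextcloud expects.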

HackLotus commented 1 year ago

ALTER TABLE oc_filecache DROP INDEX fs_id_storage_size, DROP INDEX fs_mtime, DROP INDEX fs_size, DROP INDEX fs_storage_path_prefix, ADD INDEX path_hash (path_hash), ADD INDEX fs_parent (parent);

Thank you! I will see if that solves the problem.

Lokutos commented 1 year ago

Hello, is it possible for you to give the exact command you typed? I'm not very comfortable with DB manipulations...

Connect to the DB with your favourite DB manager (e.g. phpMyAdmin) and execute the following:

ALTER TABLE oc_filecache DROP INDEX fs_id_storage_size, DROP INDEX fs_mtime, DROP INDEX fs_size, DROP INDEX fs_storage_path_prefix, ADD INDEX path_hash (path_hash), ADD INDEX fs_parent (parent);

You have to DROP all indexes except PRIMARY, and ADD path_hash and fs_parent.

P.S. Of course, take a full DB backup first!

Works on server version 10.5.18-MariaDB-0+deb11u1 (Debian 11), but only in combination with 'bulkupload.enabled' => false in config.php.

fenris commented 1 year ago

Hello, is it possible for you to give the exact command you typed? I'm not very comfortable with DB manipulations...

Connect to the DB with your favourite DB manager (e.g. phpMyAdmin) and execute the following:

ALTER TABLE oc_filecache DROP INDEX fs_id_storage_size, DROP INDEX fs_mtime, DROP INDEX fs_size, DROP INDEX fs_storage_path_prefix, ADD INDEX path_hash (path_hash), ADD INDEX fs_parent (parent);

You have to DROP all indexes except PRIMARY, and ADD path_hash and fs_parent.

P.S. Of course, take a full DB backup first!

Is this the temporary fix? Did anyone try this?

dvaerum commented 1 year ago

Based on the thread, the issue is 5 years old, so calling it a "temporary fix" would be a stretch, I think :cry: I ended up switching to Postgres, since it was reported to behave a little better. I hope I will not encounter the problem again with Postgres.

Jason-Linux commented 1 year ago

Hello, I have the same problem on version 26.0.6; PHP 8.0.28, MySQL 8.0.33.

Error: "An exception occurred while executing a query: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction"

The server was upgraded step by step from 24.0.2 to 26.0.2.

It causes a lot of problems with RAM and CPU usage. Can someone help?

obel1x commented 1 year ago

@Jason-Linux Currently this is the best approach, I think: https://github.com/nextcloud/server/issues/6899#issuecomment-1489886264. Try it and get back if it fails.

Jason-Linux commented 1 year ago

@obel1x I tried that before, but got the same result.

keithf4 commented 1 year ago

Still seeing these deadlock issues on 25.0.7, PHP 8.1.20, PostgreSQL 15.3. I'm due to update to version 26 in the near future, but I see others above still reporting it on that version.

While removing indexes could help solve speed issues, I don't see how that would affect the deadlock problem that was originally reported. That's due entirely to the order in which writes are being done to the database by concurrent transactions. I run a very small server, so I'm seeing almost zero performance issues even with the locks; it just creates a mess in the logs. And for PostgreSQL at least, it's a rather serious problem: because of the conflicting transactions, there's no guarantee which one of them actually succeeds.

[no app in context] Warning: OC\DB\Exceptions\DbalException: An exception occurred while executing a query: SQLSTATE[40P01]: Deadlock detected: 7 ERROR:  deadlock detected
DETAIL:  Process 42820 waits for ShareLock on transaction 7552219; blocked by process 42818.
Process 42818 waits for ShareLock on transaction 7552210; blocked by process 42820.
HINT:  See server log for query details.
CONTEXT:  while rechecking updated tuple (813,6) in relation "oc_filecache" at <<closure>>

 0. /usr/local/www/apache24/nextcloud_2507/lib/private/DB/QueryBuilder/QueryBuilder.php line 329
    OC\DB\Exceptions\DbalException::wrap(["Doctrine\\DBAL ... "])
 1. /usr/local/www/apache24/nextcloud_2507/lib/private/Files/Cache/Propagator.php line 137
    OC\DB\QueryBuilder\QueryBuilder->executeStatement()
 2. /usr/local/www/apache24/nextcloud_2507/lib/private/Files/Cache/HomePropagator.php line 48
    OC\Files\Cache\Propagator->propagateChange("*** sensitive parameters replaced ***", 1686969556, 540)
 3. /usr/local/www/apache24/nextcloud_2507/lib/private/Files/Cache/Updater.php line 144
    OC\Files\Cache\HomePropagator->propagateChange("*** sensitive parameters replaced ***", 1686969556, 540)
 4. /usr/local/www/apache24/nextcloud_2507/apps/dav/lib/Connector/Sabre/File.php line 368
    OC\Files\Cache\Updater->update("*** sensitive parameters replaced ***")
 5. /usr/local/www/apache24/nextcloud_2507/apps/dav/lib/Connector/Sabre/Directory.php line 151
    OCA\DAV\Connector\Sabre\File->put(null)
 6. /usr/local/www/apache24/nextcloud_2507/3rdparty/sabre/dav/lib/DAV/Server.php line 1098
    OCA\DAV\Connector\Sabre\Directory->createFile("digits.c", null)
 7. /usr/local/www/apache24/nextcloud_2507/3rdparty/sabre/dav/lib/DAV/CorePlugin.php line 504
    Sabre\DAV\Server->createFile("files/tyriaan/m ... c", null, null)
 8. /usr/local/www/apache24/nextcloud_2507/3rdparty/sabre/event/lib/WildcardEmitterTrait.php line 89
    Sabre\DAV\CorePlugin->httpPut(["Sabre\\HTTP\\Request"], ["Sabre\\HTTP\\Response"])
 9. /usr/local/www/apache24/nextcloud_2507/3rdparty/sabre/dav/lib/DAV/Server.php line 472
    Sabre\DAV\Server->emit("method:PUT", [["Sabre\\HTTP\\ ... ]])
10. /usr/local/www/apache24/nextcloud_2507/3rdparty/sabre/dav/lib/DAV/Server.php line 253
    Sabre\DAV\Server->invokeMethod(["Sabre\\HTTP\\Request"], ["Sabre\\HTTP\\Response"])
11. /usr/local/www/apache24/nextcloud_2507/3rdparty/sabre/dav/lib/DAV/Server.php line 321
    Sabre\DAV\Server->start()
12. /usr/local/www/apache24/nextcloud_2507/apps/dav/lib/Server.php line 360
    Sabre\DAV\Server->exec()
13. /usr/local/www/apache24/nextcloud_2507/apps/dav/appinfo/v2/remote.php line 35
    OCA\DAV\Server->exec()
14. /usr/local/www/apache24/nextcloud_2507/remote.php line 172
    require_once("/usr/local/www/ ... p")
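
For anyone trying to see which statements are actually colliding, the database itself can be told to log the details; a diagnostic sketch using standard server settings (run the half that matches your database):

```
-- MariaDB/MySQL: show the most recent deadlock and log every future deadlock to the error log
SHOW ENGINE INNODB STATUS;
SET GLOBAL innodb_print_all_deadlocks = ON;

-- PostgreSQL: log lock waits (the deadlock detail itself is already logged as an ERROR)
ALTER SYSTEM SET log_lock_waits = on;
SELECT pg_reload_conf();
```
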
szaimen commented 1 year ago

Hi, there have been patches to address this issue that will land in 26.0.3. Can you please verify if 26.0.3-rc1 improves the situation around this issue?

keithf4 commented 1 year ago

Just curious, do you mean these patches?

https://github.com/nextcloud/server/pull/38479
https://github.com/nextcloud/server/pull/37820

I'm not that familiar with this DB library in PHP, but I'm wondering whether this just ignores the deadlock report from the database or actually prevents it. If it's not the latter, this might remove the error from the Nextcloud/PHP logs, but you're not actually preventing the deadlock: the error will still spam the database logs and cause unpredictable behavior.

joshtrichards commented 1 year ago

@keithf4 I'm not certain which PRs @szaimen was referring to specifically, but, yes, those would be two relevant ones from the looks of them.

The referenced PRs look to be for a specific case where mtime (modify time), for the same folder, is being updated by multiple concurrent requests. It makes little sense to hard fail (or to wait or retry further) on a deadlock in this scenario, since the updates effectively cancel each other out (that's my interpretation anyway; I was not involved in the implementation):

// with failures concurrent updates, someone else would have already done it.
// in the worst case the storage_mtime isn't updated, which should at most only trigger an extra rescan

Also, one of those PRs adds explicit logging (at the WARN level) when this happens. The logs for this particular case will be cleaner (i.e. no repeated unhandled exceptions, and now only at a specific log level, so there's some flexibility/control).

Progress is often incremental. As this thread has shown, there are multiple contributing factors and therefore multiple non-mutually exclusive angles to attack this from. Reducing the impact of scenarios where deadlocks still can arise (where possible) is one such angle of attack. Obviously in addition to the other angle: reducing their frequency.

The referenced PRs are an example of the former by my read.

keithf4 commented 1 year ago

I appreciate that that is what it's doing, and it may help mitigate the repeated retries, but on an initial read it simply looked like it was reclassifying the error and passing on it, which was very concerning. While I hope it improves things, the source of the problem seems much more complex, as noted below, and part of me is a little concerned that the Nextcloud logs no longer make clear that an actual deadlock event is happening.

https://github.com/nextcloud/server/issues/29985#issuecomment-1531535714

Badb0yBadb0y commented 1 year ago

@szaimen

Hi, there have been patches to address this issue that will land in 26.0.3. Can you please verify if 26.0.3-rc1 improves the situation around this issue?

Has this patch been implemented in version 27? I have a clean new install and I still have this issue.

joshtrichards commented 1 year ago

@Badb0yBadb0y Yes, the recently referenced PRs in this thread are in NC27. Keep in mind https://github.com/nextcloud/server/issues/6899#issuecomment-1601020184: there are various circumstances under which deadlocks can occur, and these PRs are incremental improvements, so you may still encounter deadlocks in some situations.

P.S. Also, which PRs appear in which releases can be found at: https://nextcloud.com/changelog/

Badb0yBadb0y commented 1 year ago

Seems like my issue was a Galera cluster behind HAProxy; I had to mark one of the two nodes as a backup so that double writes cannot happen, as stated in the ownCloud documentation.

nextcloud-command commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity and seems to be missing some essential information. It will be closed if no further activity occurs. Thank you for your contributions.

Tom5421 commented 4 months ago

Are there any fixes to this issue?

My instance now has a deadlock and I don't know how to fix the issue. I'm unable to delete files from a folder.

lemarakis commented 1 month ago

After 1 year without issues and now having reached version 29.0.5, I decided to recreate the original structure of the oc_filecache table and test using the same sample with 20,000 small ico files.

Immediately, DeadlockException / Serialization failure messages started again, as well as corrupted uploads.

This happens on all three of our production servers (Debian 11, Apache 2.4.56, PHP 8.2, MariaDB 10.5.21).

The only workaround that works for me is deleting the indexes as shown in post https://github.com/nextcloud/server/issues/6899#issuecomment-1450857714

Specifically, the only three indexes I now have in the oc_filecache table are the following:

`PRIMARY KEY USING BTREE (fileid), UNIQUE KEY fs_storage_path_hash USING BTREE (storage, path_hash), KEY fs_parent USING BTREE (parent)`
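
For clarity, a sketch of how that reduced index set would be created on a table that only has the PRIMARY KEY left (default oc_ prefix assumed; fs_storage_path_hash is the stock unique index on storage + path_hash):

```
-- recreate the reduced index set described above; PRIMARY KEY (fileid) stays untouched
ALTER TABLE oc_filecache
    ADD UNIQUE INDEX fs_storage_path_hash (storage, path_hash),
    ADD INDEX fs_parent (parent);
```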