Closed mxschmitt closed 1 year ago
@lemarakis
could you try my proposal from #6899 (comment) and leave a thumbs-up if it works? IMO it would greatly reduce the risk of running into that error.
I checked this workaround. It seems to solve the locking problem, but the portal becomes sluggish. I am experimenting with the indices, e.g. I deleted fs_storage_path_prefix from oc_filecache and it seems much better.
Maybe a misunderstanding. My proposal was: "Remove all indices but primary" (from oc_filecache, but only those!) and then add the given ones, so fs_storage_path_prefix should be deleted like every other default index of that table anyway. That way:
- the index size should shrink dramatically
- the speed should improve
- the locking problem should be greatly reduced
If, after that, the speed of the portal is a problem, activate the slow log with EXPLAIN and give the output. I have not seen any slow logs since then (apart from one single statement run by the cron job, taking 2 seconds).
@obel1x
> Maybe a misunderstanding. My proposal was: "Remove all indices but primary" (from oc_filecache, but only those!) and then add the given ones, so fs_storage_path_prefix should be deleted like every other default index of that table anyway. That way:
> - the index size should shrink dramatically
> - the speed should improve
> - the locking problem should be greatly reduced
>
> If, after that, the speed of the portal is a problem, activate the slow log with EXPLAIN and give the output. I have not seen any slow logs since then (apart from one single statement run by the cron job, taking 2 seconds).
I am testing with the following index schema (as you suggested):
`PRIMARY KEY USING BTREE (fileid), KEY path_hash USING BTREE (path_hash), KEY fs_parent USING BTREE (parent)`
No DB deadlocks for the time being, and the site responds fast enough.
I will stay with this setup until the devs solve it (maybe they must redesign the oc_filecache table and/or its indices).
This is just making nc completely useless as a file sync tool. Does anyone know the specific version where this bug was (re)introduced? Or is there at least an official workaround for now?
@0x4A-0x41-0x4B Well, it is nasty and annoying, and it also slows down the machine and consumes disk space, CPU, and memory, but it's not fatal, as restarts are done and will eventually succeed. I think the basic problem is a badly designed filecache table from the beginning of this software (no normalisation for directories). Then someone added those indices, which I consider useless. With the growing complexity of Nextcloud the problem got bigger. Well, can you try my solution https://github.com/nextcloud/server/issues/6899#issuecomment-1377954159 and leave a thumbs up to vote for a change?
Same problems with the latest 25.0.4
Hello, is it possible for you to give the command line you typed? I'm not very comfortable with DB manipulations...
> Hello, is it possible for you to give the command line you typed? I'm not very comfortable with DB manipulations...
Connect to the DB with your favourite DB manager (e.g. phpMyAdmin) and execute the following:
ALTER TABLE oc_filecache DROP INDEX fs_id_storage_size, DROP INDEX fs_mtime, DROP INDEX fs_size, DROP INDEX fs_storage_path_prefix, ADD INDEX path_hash (path_hash), ADD INDEX fs_parent (parent);
You have to DROP all indexes but PRIMARY, and ADD path_hash and fs_parent.
PS: Of course, take a full DB backup first!
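For anyone who wants to dry-run the idea before touching a production database, here is a small sketch using Python's built-in sqlite3 module. This is an assumption for illustration only: SQLite stands in for MariaDB, and the syntax differs (MariaDB drops indexes via ALTER TABLE as above, SQLite uses bare DROP INDEX), but the before/after index set is the same.

```python
# Toy model of the workaround: drop the extra default-style indexes on a
# stand-in oc_filecache table, keeping only the primary key plus the
# path_hash and parent indexes. SQLite is used here purely for illustration.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE oc_filecache (
    fileid INTEGER PRIMARY KEY,
    storage INTEGER, path TEXT, path_hash TEXT, parent INTEGER,
    size INTEGER, mtime INTEGER)""")

# create extra indexes resembling the defaults the workaround removes
for name, cols in [("fs_mtime", "mtime"), ("fs_size", "size"),
                   ("fs_storage_path_prefix", "storage, path")]:
    con.execute(f"CREATE INDEX {name} ON oc_filecache ({cols})")

# the workaround: remove them again...
for name in ("fs_mtime", "fs_size", "fs_storage_path_prefix"):
    con.execute(f"DROP INDEX {name}")

# ...and keep only these two alongside the primary key
con.execute("CREATE INDEX path_hash ON oc_filecache (path_hash)")
con.execute("CREATE INDEX fs_parent ON oc_filecache (parent)")

rows = con.execute(
    "SELECT name FROM sqlite_master WHERE type='index' AND tbl_name='oc_filecache'"
).fetchall()
print(sorted(r[0] for r in rows))  # ['fs_parent', 'path_hash']
```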
> ALTER TABLE oc_filecache DROP INDEX fs_id_storage_size, DROP INDEX fs_mtime, DROP INDEX fs_size, DROP INDEX fs_storage_path_prefix, ADD INDEX path_hash (path_hash), ADD INDEX fs_parent (parent);

Thank you! I will see if that solves the problem.
> Hello, is it possible for you to give the command line you typed? I'm not very comfortable with DB manipulations...

> Connect to the DB with your favourite DB manager (e.g. phpMyAdmin) and execute the following:
> ALTER TABLE oc_filecache DROP INDEX fs_id_storage_size, DROP INDEX fs_mtime, DROP INDEX fs_size, DROP INDEX fs_storage_path_prefix, ADD INDEX path_hash (path_hash), ADD INDEX fs_parent (parent);
> You have to DROP all indexes but PRIMARY, and ADD path_hash and fs_parent.
> PS: Of course, take a full DB backup first!
Works on Server version: 10.5.18-MariaDB-0+deb11u1 (Debian 11),
but only in combination with
`'bulkupload.enabled' => false,`
> Hello, is it possible for you to give the command line you typed? I'm not very comfortable with DB manipulations...

> Connect to the DB with your favourite DB manager (e.g. phpMyAdmin) and execute the following:
> ALTER TABLE oc_filecache DROP INDEX fs_id_storage_size, DROP INDEX fs_mtime, DROP INDEX fs_size, DROP INDEX fs_storage_path_prefix, ADD INDEX path_hash (path_hash), ADD INDEX fs_parent (parent);
> You have to DROP all indexes but PRIMARY, and ADD path_hash and fs_parent.
> PS: Of course, take a full DB backup first!
Is this the temporary fix? Did anyone try this?
Based on the thread, the issue is 5 years old, so calling it a "temporary fix" would be a stretch, I think :cry: I ended up switching to Postgres, since it was reported to behave a little better. I hope I will not encounter the problem again with Postgres.
Hello, I have the same problem. Version 26.0.6; PHP 8.0.28; MySQL 8.0.33.
Error: "An exception occurred while executing a query: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction"
The server was upgraded from 24.0.2 to 26.0.2 step by step.
It causes a lot of problems with RAM and CPU usage. Can someone help?
@Jason-Linux Currently this is the best approach, I think: https://github.com/nextcloud/server/issues/6899#issuecomment-1489886264. Try it and report back if it fails.
@obel1x I tried it before, but got the same result.
Still seeing these deadlock issues on 25.0.7, PHP 8.1.20, PostgreSQL 15.3. I'm due to update to version 26 in the near future, but I see others above still reporting it on that version.
While removing indexes could help solve speed issues, I don't see how that would affect the deadlock problem that was originally reported. That's due entirely to the order in which writes are being done to the database by concurrent transactions. I run a very small server, so I'm seeing almost zero performance issues even with the locks; it just creates a mess in the logs. And for PostgreSQL at least, it's a rather serious problem: because of the conflicting transactions, there's no guarantee which one of them actually succeeds.
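A deadlock needs a cycle: transaction 1 holds row A and wants row B while transaction 2 holds B and wants A. The usual fix is to acquire locks in one globally consistent order no matter which order the caller names them. A minimal, Nextcloud-independent Python sketch of that idea (plain threading locks standing in for row locks):

```python
# Two workers touch the same pair of locks in opposite caller order.
# Sorting the locks before acquiring them imposes a single global order,
# so no lock-wait cycle (and hence no deadlock) can form.
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

def update_rows(first, second, results, label):
    # Always take the locks in the same global order, regardless of the
    # order the caller passed them in.
    ordered = sorted((first, second), key=id)
    for lock in ordered:
        lock.acquire()
    try:
        results.append(label)  # simulate the write under both locks
    finally:
        for lock in reversed(ordered):
            lock.release()

results = []
t1 = threading.Thread(target=update_rows, args=(lock_a, lock_b, results, "t1"))
t2 = threading.Thread(target=update_rows, args=(lock_b, lock_a, results, "t2"))
t1.start(); t2.start()
t1.join(); t2.join()
print(sorted(results))  # both "transactions" complete: ['t1', 't2']
```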
[no app in context] Warning: OC\DB\Exceptions\DbalException: An exception occurred while executing a query: SQLSTATE[40P01]: Deadlock detected: 7 ERROR: deadlock detected
DETAIL: Process 42820 waits for ShareLock on transaction 7552219; blocked by process 42818.
Process 42818 waits for ShareLock on transaction 7552210; blocked by process 42820.
HINT: See server log for query details.
CONTEXT: while rechecking updated tuple (813,6) in relation "oc_filecache" at <<closure>>
0. /usr/local/www/apache24/nextcloud_2507/lib/private/DB/QueryBuilder/QueryBuilder.php line 329
OC\DB\Exceptions\DbalException::wrap(["Doctrine\\DBAL ... "])
1. /usr/local/www/apache24/nextcloud_2507/lib/private/Files/Cache/Propagator.php line 137
OC\DB\QueryBuilder\QueryBuilder->executeStatement()
2. /usr/local/www/apache24/nextcloud_2507/lib/private/Files/Cache/HomePropagator.php line 48
OC\Files\Cache\Propagator->propagateChange("*** sensitive parameters replaced ***", 1686969556, 540)
3. /usr/local/www/apache24/nextcloud_2507/lib/private/Files/Cache/Updater.php line 144
OC\Files\Cache\HomePropagator->propagateChange("*** sensitive parameters replaced ***", 1686969556, 540)
4. /usr/local/www/apache24/nextcloud_2507/apps/dav/lib/Connector/Sabre/File.php line 368
OC\Files\Cache\Updater->update("*** sensitive parameters replaced ***")
5. /usr/local/www/apache24/nextcloud_2507/apps/dav/lib/Connector/Sabre/Directory.php line 151
OCA\DAV\Connector\Sabre\File->put(null)
6. /usr/local/www/apache24/nextcloud_2507/3rdparty/sabre/dav/lib/DAV/Server.php line 1098
OCA\DAV\Connector\Sabre\Directory->createFile("digits.c", null)
7. /usr/local/www/apache24/nextcloud_2507/3rdparty/sabre/dav/lib/DAV/CorePlugin.php line 504
Sabre\DAV\Server->createFile("files/tyriaan/m ... c", null, null)
8. /usr/local/www/apache24/nextcloud_2507/3rdparty/sabre/event/lib/WildcardEmitterTrait.php line 89
Sabre\DAV\CorePlugin->httpPut(["Sabre\\HTTP\\Request"], ["Sabre\\HTTP\\Response"])
9. /usr/local/www/apache24/nextcloud_2507/3rdparty/sabre/dav/lib/DAV/Server.php line 472
Sabre\DAV\Server->emit("method:PUT", [["Sabre\\HTTP\\ ... ]])
10. /usr/local/www/apache24/nextcloud_2507/3rdparty/sabre/dav/lib/DAV/Server.php line 253
Sabre\DAV\Server->invokeMethod(["Sabre\\HTTP\\Request"], ["Sabre\\HTTP\\Response"])
11. /usr/local/www/apache24/nextcloud_2507/3rdparty/sabre/dav/lib/DAV/Server.php line 321
Sabre\DAV\Server->start()
12. /usr/local/www/apache24/nextcloud_2507/apps/dav/lib/Server.php line 360
Sabre\DAV\Server->exec()
13. /usr/local/www/apache24/nextcloud_2507/apps/dav/appinfo/v2/remote.php line 35
OCA\DAV\Server->exec()
14. /usr/local/www/apache24/nextcloud_2507/remote.php line 172
require_once("/usr/local/www/ ... p")
Hi, there have been patches to address this issue that will land in 26.0.3. Can you please verify if 26.0.3-rc1 improves the situation around this issue?
Just curious if you mean this patch?
https://github.com/nextcloud/server/pull/38479 https://github.com/nextcloud/server/pull/37820
I'm not that familiar with this DB library in PHP, but I'm wondering whether this just ignores the deadlock report from the database or actually prevents it. If not the latter, this might remove the error from the Nextcloud/PHP logs, but you're not actually preventing the deadlock, and the error will still spam the database logs and cause unpredictable behavior.
@keithf4 I'm not certain which PRs @szaimen was referring to specifically, but, yes, those would be two relevant ones from the looks of them.
The referenced PRs look to be for a specific case where mtime (modify time), for the same folder, is being updated by multiple concurrent requests. It makes little sense to hard fail (or wait or retry further) on a deadlock in this scenario, since the updates effectively cancel each other out (that's my interpretation anyway; I was not involved in the implementation):
> `// with failures on concurrent updates, someone else would have already done it.`
> `// in the worst case the storage_mtime isn't updated, which should at most only trigger an extra rescan`
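The ignore-and-log approach those quoted comments describe can be sketched as follows. To be clear, this is a hedged illustration, not the actual Nextcloud code: the class, function, and SQL names here are invented.

```python
# Sketch of the pattern: if a deadlock/serialization failure hits a
# concurrent mtime update, log a warning and move on, because a concurrent
# writer has already applied an equivalent update. At worst the stored
# mtime stays stale and triggers an extra rescan later.
import logging

class DeadlockError(Exception):
    """Stand-in for SQLSTATE 40001 / 40P01 serialization failures."""

def propagate_mtime(execute, path, mtime, log=logging.getLogger("filecache")):
    try:
        execute("UPDATE filecache SET mtime = ? WHERE path = ?", (mtime, path))
        return True
    except DeadlockError:
        # Deliberately swallowed: the concurrent update is equivalent,
        # but we still surface it at WARN level instead of silently.
        log.warning("deadlock while propagating mtime for %s; ignoring", path)
        return False

# usage with a fake executor that always deadlocks:
def always_deadlocks(sql, params):
    raise DeadlockError

ok = propagate_mtime(always_deadlocks, "files/a", 1700000000)
print(ok)  # False: the update was skipped, the request was not failed
```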
Also, one of those PRs adds explicit logging (at the WARN level) when this happens. The logs for this particular case will be cleaner (i.e. no repeated unhandled exceptions and now only at a specific loglevel so there's some flexibility/control).
Progress is often incremental. As this thread has shown, there are multiple contributing factors and therefore multiple non-mutually exclusive angles to attack this from. Reducing the impact of scenarios where deadlocks still can arise (where possible) is one such angle of attack. Obviously in addition to the other angle: reducing their frequency.
The referenced PRs are an example of the former by my read.
I appreciate that that is what it's doing, and it may help mitigate things by retrying more often, but on an initial read it simply looked like it was reclassifying the error and passing on it, which was very concerning. While I hope it improves things, the source of the problem seems much more complex, as noted below, and part of me is a little concerned that the Nextcloud logs no longer make clear that an actual deadlock event is happening.
https://github.com/nextcloud/server/issues/29985#issuecomment-1531535714
@szaimen
> Hi, there have been patches to address this issue that will land in 26.0.3. Can you please verify if 26.0.3-rc1 improves the situation around this issue?
Has this patch been implemented in version 27? I have a clean new install and I still have this issue.
@Badb0yBadb0y Yes, the recently referenced PRs in this thread are in NC27. Keep in mind https://github.com/nextcloud/server/issues/6899#issuecomment-1601020184 - there are various circumstances deadlocks can occur under. These PRs are incremental improvements. You may still encounter deadlocks under various circumstances.
P.S. Also, which PRs appear in which releases can be found at: https://nextcloud.com/changelog/
Seems like my issue was a Galera cluster behind HAProxy: I had to set 1 of the 2 nodes as backup so that double writes can't happen, as stated in the ownCloud documentation.
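For reference, a hypothetical haproxy.cfg fragment (listener, server names, and addresses are invented here) showing HAProxy's `backup` keyword: the backup server only receives traffic when the primary is down, so the two Galera nodes are never written to concurrently:

```
listen galera
    bind *:3306
    mode tcp
    option tcpka
    server node1 192.168.1.11:3306 check
    server node2 192.168.1.12:3306 check backup
```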
This issue has been automatically marked as stale because it has not had recent activity and seems to be missing some essential information. It will be closed if no further activity occurs. Thank you for your contributions.
Are there any fixes to this issue?
My instance now has a deadlock and I don't know how to fix the issue. I'm unable to delete files from a folder.
After 1 year without issues and now having reached version 29.0.5, I decided to recreate the original structure of the oc_filecache table and test using the same sample with 20,000 small ico files.
Immediately, DeadlockException and Serialization failure messages started again, as well as corrupted uploads.
This happens on all three of our production servers (Debian 11, Apache 2.4.56, PHP 8.2, MariaDB 10.5.21)
The only workaround that works for me is deleting the indexes as shown in post https://github.com/nextcloud/server/issues/6899#issuecomment-1450857714
Specifically, the only three indexes I have now in table oc_filecache are the following:
PRIMARY KEY USING BTREE (fileid),
UNIQUE KEY fs_storage_path_hash USING BTREE (storage, path_hash),
KEY fs_parent USING BTREE (parent)
Steps to reproduce
Expected behaviour
No error
Actual behaviour
Server configuration
Operating system: Debian 9.1 (PXE)
Web server: nginx/1.10.3
Database: MariaDB
PHP version: 7.1.9
Nextcloud version: 12.0.3
Updated from an older Nextcloud/ownCloud or fresh install: Fresh Install
Where did you install Nextcloud from: https://download.nextcloud.com/server/releases/nextcloud-12.0.3.zip
Signing status:
`No errors have been found.`

List of activated apps:
App list
```
Enabled:
 - activity: 2.5.2
 - bruteforcesettings: 1.0.2
 - comments: 1.2.0
 - dav: 1.3.0
 - federatedfilesharing: 1.2.0
 - federation: 1.2.0
 - files: 1.7.2
 - files_pdfviewer: 1.1.1
 - files_sharing: 1.4.0
 - files_texteditor: 2.4.1
 - files_trashbin: 1.2.0
 - files_versions: 1.5.0
 - files_videoplayer: 1.1.0
 - firstrunwizard: 2.1
 - gallery: 17.0.0
 - logreader: 2.0.0
 - lookup_server_connector: 1.0.0
 - nextcloud_announcements: 1.1
 - notifications: 2.0.0
 - oauth2: 1.0.5
 - password_policy: 1.2.2
 - provisioning_api: 1.2.0
 - serverinfo: 1.2.0
 - sharebymail: 1.2.0
 - survey_client: 1.0.0
 - systemtags: 1.2.0
 - theming: 1.3.0
 - twofactor_backupcodes: 1.1.1
 - updatenotification: 1.2.0
 - workflowengine: 1.2.0
Disabled:
 - admin_audit
 - encryption
 - files_external
 - user_external
 - user_ldap
```

Nextcloud configuration:
Config report
```
{
    "system": {
        "instanceid": "ocqw4m1v6kk0",
        "passwordsalt": "***REMOVED SENSITIVE VALUE***",
        "secret": "***REMOVED SENSITIVE VALUE***",
        "trusted_domains": [
            "cloud"
        ],
        "datadirectory": "\/var\/www\/cloud\/data",
        "overwrite.cli.url": "https:\/\/cloud",
        "dbtype": "mysql",
        "version": "12.0.3.3",
        "dbname": "nextcloud",
        "dbhost": "10.10.10.6",
        "dbport": "",
        "dbtableprefix": "oc_",
        "mysql.utf8mb4": true,
        "dbuser": "***REMOVED SENSITIVE VALUE***",
        "dbpassword": "***REMOVED SENSITIVE VALUE***",
        "installed": true,
        "mail_from_address": "cloud",
        "mail_smtpmode": "php",
        "mail_smtpauthtype": "LOGIN",
        "mail_domain": "schmitt.mx"
    }
}
```

Are you using external storage, if yes which one: local/smb/sftp/.. No.
Are you using encryption: No
Are you using an external user-backend, if yes which one: LDAP/ActiveDirectory/Webdav/... Only webdav
Client configuration
Browser: Chrome 62
Operating system: Windows 7 Enterprise
Logs
Web server error log
```
No relevant logs
```

Nextcloud log (data/nextcloud.log)
Nextcloud log
``` { "reqId": "aGDjWevMFpuYXpWX3mH7", "level": 4, "time": "2017-10-22T08:38:10+00:00", "remoteAddr": "", "user": "maxibanki", "app": "webdav", "method": "PUT", "url": "\/remote.php\/webdav\/01-Arbeit\/01-Berufsschule\/00-Archiv\/Steinam-XSLTalt\/.git\/objects\/f1\/08a244c01ab24b81e8a17bf0576734776a3e81", "message": "Exception: {\"Exception\":\"Doctrine\\\\DBAL\\\\Exception\\\\DriverException\",\"Message\":\"An exception occurred while executing 'INSERT INTO `oc_file_locks` (`key`,`lock`,`ttl`) SELECT ?,?,? FROM `oc_file_locks` WHERE `key` = ? HAVING COUNT(*) = 0' with params [\\\"files\\\\\\\/3d31700630071b7bfc95805f8a422ab1\\\", 1, 1508665090, \\\"files\\\\\\\/3d31700630071b7bfc95805f8a422ab1\\\"]:\\n\\nSQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction\",\"Code\":0,\"Trace\":\"#0 \\\/var\\\/www\\\/cloud\\\/3rdparty\\\/doctrine\\\/dbal\\\/lib\\\/Doctrine\\\/DBAL\\\/DBALException.php(128): Doctrine\\\\DBAL\\\\Driver\\\\AbstractMySQLDriver->convertException('An exception oc...', Object(Doctrine\\\\DBAL\\\\Driver\\\\PDOException))\\n#1 \\\/var\\\/www\\\/cloud\\\/3rdparty\\\/doctrine\\\/dbal\\\/lib\\\/Doctrine\\\/DBAL\\\/Connection.php(1015): Doctrine\\\\DBAL\\\\DBALException::driverExceptionDuringQuery(Object(Doctrine\\\\DBAL\\\\Driver\\\\PDOMySql\\\\Driver), Object(Doctrine\\\\DBAL\\\\Driver\\\\PDOException), 'INSERT INTO `oc...', Array)\\n#2 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/DB\\\/Connection.php(213): Doctrine\\\\DBAL\\\\Connection->executeUpdate('INSERT INTO `oc...', Array, Array)\\n#3 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/DB\\\/Adapter.php(114): OC\\\\DB\\\\Connection->executeUpdate('INSERT INTO `oc...', Array)\\n#4 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/DB\\\/Connection.php(251): OC\\\\DB\\\\Adapter->insertIfNotExist('*PREFIX*file_lo...', Array, Array)\\n#5 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/Lock\\\/DBLockingProvider.php(118): 
OC\\\\DB\\\\Connection->insertIfNotExist('*PREFIX*file_lo...', Array, Array)\\n#6 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/Lock\\\/DBLockingProvider.php(163): OC\\\\Lock\\\\DBLockingProvider->initLockField('files\\\/3d3170063...', 1)\\n#7 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/Files\\\/Storage\\\/Common.php(704): OC\\\\Lock\\\\DBLockingProvider->acquireLock('files\\\/3d3170063...', 1)\\n#8 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/Files\\\/Storage\\\/Wrapper\\\/Wrapper.php(588): OC\\\\Files\\\\Storage\\\\Common->acquireLock('files\\\/01-Arbeit...', 1, Object(OC\\\\Lock\\\\DBLockingProvider))\\n#9 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/Files\\\/View.php(1908): OC\\\\Files\\\\Storage\\\\Wrapper\\\\Wrapper->acquireLock('files\\\/01-Arbeit...', 1, Object(OC\\\\Lock\\\\DBLockingProvider))\\n#10 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/Files\\\/View.php(2022): OC\\\\Files\\\\View->lockPath('\\\/01-Arbeit\\\/01-B...', 1)\\n#11 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/Files\\\/View.php(1309): OC\\\\Files\\\\View->lockFile('01-Arbeit\\\/01-Be...', 1)\\n#12 \\\/var\\\/www\\\/cloud\\\/lib\\\/private\\\/Files\\\/View.php(1356): OC\\\\Files\\\\View->getCacheEntry(Object(OCA\\\\Files_Trashbin\\\\Storage), 'files\\\/01-Arbeit...', '01-Arbeit\\\/01-Be...')\\n#13 \\\/var\\\/www\\\/cloud\\\/apps\\\/dav\\\/lib\\\/Connector\\\/Sabre\\\/ObjectTree.php(160): OC\\\\Files\\\\View->getFileInfo('\\\/maxibanki\\\/file...')\\n#14 \\\/var\\\/www\\\/cloud\\\/apps\\\/dav\\\/lib\\\/Connector\\\/Sabre\\\/LockPlugin.php(59): OCA\\\\DAV\\\\Connector\\\\Sabre\\\\ObjectTree->getNodeForPath('01-Arbeit\\\/01-Be...')\\n#15 [internal function]: OCA\\\\DAV\\\\Connector\\\\Sabre\\\\LockPlugin->getLock(Object(Sabre\\\\HTTP\\\\Request), Object(Sabre\\\\HTTP\\\\Response))\\n#16 \\\/var\\\/www\\\/cloud\\\/3rdparty\\\/sabre\\\/event\\\/lib\\\/EventEmitterTrait.php(105): call_user_func_array(Array, Array)\\n#17 
\\\/var\\\/www\\\/cloud\\\/3rdparty\\\/sabre\\\/dav\\\/lib\\\/DAV\\\/Server.php(466): Sabre\\\\Event\\\\EventEmitter->emit('beforeMethod', Array)\\n#18 \\\/var\\\/www\\\/cloud\\\/3rdparty\\\/sabre\\\/dav\\\/lib\\\/DAV\\\/Server.php(254): Sabre\\\\DAV\\\\Server->invokeMethod(Object(Sabre\\\\HTTP\\\\Request), Object(Sabre\\\\HTTP\\\\Response))\\n#19 \\\/var\\\/www\\\/cloud\\\/apps\\\/dav\\\/appinfo\\\/v1\\\/webdav.php(76): Sabre\\\\DAV\\\\Server->exec()\\n#20 \\\/var\\\/www\\\/cloud\\\/remote.php(162): require_once('\\\/var\\\/www\\\/cloud....')\\n#21 {main}\",\"File\":\"\\\/var\\\/www\\\/cloud\\\/3rdparty\\\/doctrine\\\/dbal\\\/lib\\\/Doctrine\\\/DBAL\\\/Driver\\\/AbstractMySQLDriver.php\",\"Line\":116}", "userAgent": "--", "version": "12.0.3.3" }
```

Browser log
Not relevant