Closed ChristophWurst closed 8 months ago
Idea to locate code that relies on causal reads
1) Set up a local, containerized cluster with read, write split and enable async replication 2) Force the container network to have a few seconds of delay from the write to the read nodes 3) Use the application or run tests against it
Thanks to https://github.com/nextcloud/server/pull/42345 Nextcloud can now detect dirty reads natively and will log a debug message when this happens.
A quick clicking through active apps on my dev env:
42 "dirty table reads: SELECT DISTINCT `cp`.`cardid` FROM `*PREFIX*cards_properties` `cp` WHERE (`cp`.`addressbookid` IN (:dcValue1)) AND (`cp`.`name` IN (:dcValue2)) LIMIT 24: /nextcloud/apps/dav/lib/CardDAV/CardDavBackend.php:1082"
42 "dirty table reads: SELECT DISTINCT `cp`.`cardid` FROM `*PREFIX*cards_properties` `cp` WHERE (`cp`.`addressbookid` IN (:dcValue1)) AND (`cp`.`name` IN (:dcValue2)) AND (`cp`.`value` COLLATE utf8mb4_general_ci LIKE :dcValue3) LIMIT 1: /nextcloud/apps/dav/lib/CardDAV/CardDavBackend.php:1082"
31 "dirty table reads: SELECT * FROM `*PREFIX*dav_cal_proxy` WHERE `proxy_id` = :dcValue1: /nextcloud/lib/public/AppFramework/Db/QBMapper.php:335"
25 "dirty table reads: SELECT `displayname`, `description`, `timezone`, `calendarorder`, `calendarcolor`, `deleted_at`, `id`, `uri`, `synctoken`, `components`, `principaluri`, `transparent` FROM `*PREFIX*calendars` WHERE `principaluri` = :dcValue1 ORDER BY `calendarorder` ASC: /nextcloud/apps/dav/lib/CalDAV/CalDavBackend.php:330"
25 "dirty table reads: SELECT `displayname`, `description`, `timezone`, `calendarorder`, `calendarcolor`, `deleted_at`, `a`.`id`, `a`.`uri`, `a`.`synctoken`, `a`.`components`, `a`.`principaluri`, `a`.`transparent`, `s`.`access` FROM `*PREFIX*dav_shares` `s` INNER JOIN `*PREFIX*calendars` `a` ON `s`.`resourceid` = `a`.`id` WHERE (`s`.`principaluri` IN (:principaluri)) AND (`s`.`type` = :type): /nextcloud/apps/dav/lib/CalDAV/CalDavBackend.php:384"
24 "dirty table reads: SELECT `data` FROM `*PREFIX*accounts` WHERE `uid` = :uid: /nextcloud/lib/private/Accounts/AccountManager.php:297"
24 "dirty table reads: SELECT `c`.`addressbookid`, `c`.`carddata`, `c`.`uri` FROM `*PREFIX*cards` `c` WHERE `c`.`id` IN (:matches): /nextcloud/apps/dav/lib/CardDAV/CardDavBackend.php:1220"
16 "dirty table reads: SELECT `appid`, `configkey`, `configvalue` FROM `*PREFIX*preferences` WHERE `userid` = :dcValue1: /nextcloud/lib/private/AllConfig.php:434"
13 "dirty table reads: SELECT * FROM `*PREFIX*user_status` WHERE `user_id` = :dcValue1: /nextcloud/lib/public/AppFramework/Db/QBMapper.php:275"
12 "dirty table reads: SELECT `id`, `uri`, `displayname`, `principaluri`, `description`, `synctoken` FROM `*PREFIX*addressbooks` WHERE `principaluri` = :dcValue1: /nextcloud/lib/public/AppFramework/Db/TTransactional.php:63"
cat data/nextcloud.log | grep dirty | jq '.message + ": " + .exception.Trace[4].file + ":" + (.exception.Trace[4].line|tostring)' | sort | uniq -c | sort -nr | head
Edit: info is inaccurate because transacted reads were logged too.
tail -f data/nextcloud.log | grep dirty
If I create a calendar in Calendar, I get these:
{"reqId":"PCu8z0SjXduKf3xyJhU2","level":0,"time":"2024-01-29T16:49:07+00:00","remoteAddr":"127.0.0.1","user":"admin","app":"no app in context","method":"MKCOL","url":"/remote.php/dav/calendars/admin/aaa","message":"dirty table reads: SELECT `displayname`, `description`, `timezone`, `calendarorder`, `calendarcolor`, `deleted_at`, `id`, `uri`, `synctoken`, `components`, `principaluri`, `transparent` FROM `*PREFIX*calendars` WHERE (`uri` = :dcValue1) AND (`principaluri` = :dcValue2) LIMIT 1","userAgent":"Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0","version":"29.0.0.5","exception":{"Exception":"Exception","Message":"","Code":0,"Trace":[{"file":"/home/christoph/workspace/nextcloud/3rdparty/doctrine/dbal/src/Query/QueryBuilder.php","line":337,"function":"executeQuery","class":"OC\\DB\\Connection","type":"->","args":["SELECT `displayname`, `description`, `timezone`, `calendarorder`, `calendarcolor`, `deleted_at`, `id`, `uri`, `synctoken`, `components`, `principaluri`, `transparent` FROM `*PREFIX*calendars` WHERE (`uri` = :dcValue1) AND (`principaluri` = :dcValue2) LIMIT 1",["aaa","principals/users/admin"],[2,2],null]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/doctrine/dbal/src/Query/QueryBuilder.php","line":377,"function":"executeQuery","class":"Doctrine\\DBAL\\Query\\QueryBuilder","type":"->","args":[]},{"file":"/home/christoph/workspace/nextcloud/lib/private/DB/QueryBuilder/QueryBuilder.php","line":280,"function":"execute","class":"Doctrine\\DBAL\\Query\\QueryBuilder","type":"->","args":[]},{"file":"/home/christoph/workspace/nextcloud/lib/private/DB/QueryBuilder/QueryBuilder.php","line":293,"function":"execute","class":"OC\\DB\\QueryBuilder\\QueryBuilder","type":"->","args":[]},{"file":"/home/christoph/workspace/nextcloud/apps/dav/lib/CalDAV/CalDavBackend.php","line":620,"function":"executeQuery","class":"OC\\DB\\QueryBuilder\\QueryBuilder","type":"->","args":[]},{"file":"/home/christoph/workspace/nextcloud/apps/dav/lib/CalDAV/CalendarHome.php","line":170,"function":"getCalendarByUri","class":"OCA\\DAV\\CalDAV\\CalDavBackend","type":"->","args":["principals/users/admin","aaa"]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/sabre/dav/lib/CalDAV/CalendarHome.php","line":171,"function":"getChild","class":"OCA\\DAV\\CalDAV\\CalendarHome","type":"->","args":["aaa"]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/sabre/dav/lib/DAV/Tree.php","line":117,"function":"childExists","class":"Sabre\\CalDAV\\CalendarHome","type":"->","args":["aaa"]},{"file":"/home/christoph/workspace/nextcloud/apps/dav/lib/Connector/Sabre/FilesPlugin.php","line":674,"function":"nodeExists","class":"Sabre\\DAV\\Tree","type":"->","args":["*** sensitive parameters replaced ***"]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/sabre/event/lib/WildcardEmitterTrait.php","line":89,"function":"sendFileIdHeader","class":"OCA\\DAV\\Connector\\Sabre\\FilesPlugin","type":"->","args":["*** sensitive parameters replaced ***"]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php","line":1239,"function":"emit","class":"Sabre\\DAV\\Server","type":"->","args":["afterBind",["*** sensitive parameters replaced ***"]]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/sabre/dav/lib/DAV/CorePlugin.php","line":560,"function":"createCollection","class":"Sabre\\DAV\\Server","type":"->","args":["*** sensitive parameters replaced ***"]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/sabre/event/lib/WildcardEmitterTrait.php","line":89,"function":"httpMkcol","class":"Sabre\\DAV\\CorePlugin","type":"->","args":[["Sabre\\HTTP\\Request"],["Sabre\\HTTP\\Response"]]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php","line":472,"function":"emit","class":"Sabre\\DAV\\Server","type":"->","args":["method:MKCOL",[["Sabre\\HTTP\\Request"],["Sabre\\HTTP\\Response"]]]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php","line":253,"function":"invokeMethod","class":"Sabre\\DAV\\Server","type":"->","args":[["Sabre\\HTTP\\Request"],["Sabre\\HTTP\\Response"]]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php","line":321,"function":"start","class":"Sabre\\DAV\\Server","type":"->","args":[]},{"file":"/home/christoph/workspace/nextcloud/apps/dav/lib/Server.php","line":370,"function":"exec","class":"Sabre\\DAV\\Server","type":"->","args":[]},{"file":"/home/christoph/workspace/nextcloud/apps/dav/appinfo/v2/remote.php","line":35,"function":"exec","class":"OCA\\DAV\\Server","type":"->","args":[]},{"file":"/home/christoph/workspace/nextcloud/remote.php","line":172,"args":["/home/christoph/workspace/nextcloud/apps/dav/appinfo/v2/remote.php"],"function":"require_once"}],"File":"/home/christoph/workspace/nextcloud/lib/private/DB/Connection.php","Line":279,"message":"dirty table reads: SELECT `displayname`, `description`, `timezone`, `calendarorder`, `calendarcolor`, `deleted_at`, `id`, `uri`, `synctoken`, `components`, `principaluri`, `transparent` FROM `*PREFIX*calendars` WHERE (`uri` = :dcValue1) AND (`principaluri` = :dcValue2) LIMIT 1","tables":["oc_calendars"],"reads":["oc_calendars"],"exception":{},"CustomMessage":"dirty table reads: SELECT `displayname`, `description`, `timezone`, `calendarorder`, `calendarcolor`, `deleted_at`, `id`, `uri`, `synctoken`, `components`, `principaluri`, `transparent` FROM `*PREFIX*calendars` WHERE (`uri` = :dcValue1) AND (`principaluri` = :dcValue2) LIMIT 1"}}
{"reqId":"PCu8z0SjXduKf3xyJhU2","level":0,"time":"2024-01-29T16:49:07+00:00","remoteAddr":"127.0.0.1","user":"admin","app":"no app in context","method":"MKCOL","url":"/remote.php/dav/calendars/admin/aaa","message":"dirty table reads: SELECT `displayname`, `description`, `timezone`, `calendarorder`, `calendarcolor`, `deleted_at`, `id`, `uri`, `synctoken`, `components`, `principaluri`, `transparent` FROM `*PREFIX*calendars` WHERE (`uri` = :dcValue1) AND (`principaluri` = :dcValue2) LIMIT 1","userAgent":"Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0","version":"29.0.0.5","exception":{"Exception":"Exception","Message":"","Code":0,"Trace":[{"file":"/home/christoph/workspace/nextcloud/3rdparty/doctrine/dbal/src/Query/QueryBuilder.php","line":337,"function":"executeQuery","class":"OC\\DB\\Connection","type":"->","args":["SELECT `displayname`, `description`, `timezone`, `calendarorder`, `calendarcolor`, `deleted_at`, `id`, `uri`, `synctoken`, `components`, `principaluri`, `transparent` FROM `*PREFIX*calendars` WHERE (`uri` = :dcValue1) AND (`principaluri` = :dcValue2) LIMIT 1",["aaa","principals/users/admin"],[2,2],null]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/doctrine/dbal/src/Query/QueryBuilder.php","line":377,"function":"executeQuery","class":"Doctrine\\DBAL\\Query\\QueryBuilder","type":"->","args":[]},{"file":"/home/christoph/workspace/nextcloud/lib/private/DB/QueryBuilder/QueryBuilder.php","line":280,"function":"execute","class":"Doctrine\\DBAL\\Query\\QueryBuilder","type":"->","args":[]},{"file":"/home/christoph/workspace/nextcloud/lib/private/DB/QueryBuilder/QueryBuilder.php","line":293,"function":"execute","class":"OC\\DB\\QueryBuilder\\QueryBuilder","type":"->","args":[]},{"file":"/home/christoph/workspace/nextcloud/apps/dav/lib/CalDAV/CalDavBackend.php","line":620,"function":"executeQuery","class":"OC\\DB\\QueryBuilder\\QueryBuilder","type":"->","args":[]},{"file":"/home/christoph/workspace/nextcloud/apps/dav/lib/CalDAV/CalendarHome.php","line":170,"function":"getCalendarByUri","class":"OCA\\DAV\\CalDAV\\CalDavBackend","type":"->","args":["principals/users/admin","aaa"]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/sabre/dav/lib/DAV/Tree.php","line":83,"function":"getChild","class":"OCA\\DAV\\CalDAV\\CalendarHome","type":"->","args":["aaa"]},{"file":"/home/christoph/workspace/nextcloud/apps/dav/lib/Connector/Sabre/FilesPlugin.php","line":677,"function":"getNodeForPath","class":"Sabre\\DAV\\Tree","type":"->","args":["*** sensitive parameters replaced ***"]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/sabre/event/lib/WildcardEmitterTrait.php","line":89,"function":"sendFileIdHeader","class":"OCA\\DAV\\Connector\\Sabre\\FilesPlugin","type":"->","args":["*** sensitive parameters replaced ***"]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php","line":1239,"function":"emit","class":"Sabre\\DAV\\Server","type":"->","args":["afterBind",["*** sensitive parameters replaced ***"]]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/sabre/dav/lib/DAV/CorePlugin.php","line":560,"function":"createCollection","class":"Sabre\\DAV\\Server","type":"->","args":["*** sensitive parameters replaced ***"]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/sabre/event/lib/WildcardEmitterTrait.php","line":89,"function":"httpMkcol","class":"Sabre\\DAV\\CorePlugin","type":"->","args":[["Sabre\\HTTP\\Request"],["Sabre\\HTTP\\Response"]]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php","line":472,"function":"emit","class":"Sabre\\DAV\\Server","type":"->","args":["method:MKCOL",[["Sabre\\HTTP\\Request"],["Sabre\\HTTP\\Response"]]]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php","line":253,"function":"invokeMethod","class":"Sabre\\DAV\\Server","type":"->","args":[["Sabre\\HTTP\\Request"],["Sabre\\HTTP\\Response"]]},{"file":"/home/christoph/workspace/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php","line":321,"function":"start","class":"Sabre\\DAV\\Server","type":"->","args":[]},{"file":"/home/christoph/workspace/nextcloud/apps/dav/lib/Server.php","line":370,"function":"exec","class":"Sabre\\DAV\\Server","type":"->","args":[]},{"file":"/home/christoph/workspace/nextcloud/apps/dav/appinfo/v2/remote.php","line":35,"function":"exec","class":"OCA\\DAV\\Server","type":"->","args":[]},{"file":"/home/christoph/workspace/nextcloud/remote.php","line":172,"args":["/home/christoph/workspace/nextcloud/apps/dav/appinfo/v2/remote.php"],"function":"require_once"}],"File":"/home/christoph/workspace/nextcloud/lib/private/DB/Connection.php","Line":279,"message":"dirty table reads: SELECT `displayname`, `description`, `timezone`, `calendarorder`, `calendarcolor`, `deleted_at`, `id`, `uri`, `synctoken`, `components`, `principaluri`, `transparent` FROM `*PREFIX*calendars` WHERE (`uri` = :dcValue1) AND (`principaluri` = :dcValue2) LIMIT 1","tables":["oc_calendars"],"reads":["oc_calendars"],"exception":{},"CustomMessage":"dirty table reads: SELECT `displayname`, `description`, `timezone`, `calendarorder`, `calendarcolor`, `deleted_at`, `id`, `uri`, `synctoken`, `components`, `principaluri`, `transparent` FROM `*PREFIX*calendars` WHERE (`uri` = :dcValue1) AND (`principaluri` = :dcValue2) LIMIT 1"}}
We do this for all of our features and make a list of the dirty reads logged.
How to
1. Check out latest server master 2. Disable user_status to avoid running into [[Bug]: Dirty user_status table read server#43109](https://github.com/nextcloud/server/issues/43109) 3. `tail -f data/nextcloud.log | grep dirty` 4. Click yourself through a feature
To improve the readability of the output:
tail -f data/nextcloud.log | jq 'select(.message | startswith("dirty table reads:"))'
\OCA\DAV\CalDAV\CalDavBackend::updateCalendar
: Dispatches a partial UPDATE
to modify all mutated properties of a calendar and then proceeds to SELECT
the full calendar to emit an event. -> Could be fixed by SELECT
ing first, mutating the in-memory calendar data array and emitting it directly. [EASY]
\OCA\DAV\Connector\Sabre\FilesPlugin::sendFileIdHeader
: Is called trough an event emitter after creating a new calendar or event. The dirty read is caused by $this->server->tree->getNodeForPath($filePath)
which is unnecessary as the following condition $node instanceof \OCA\DAV\Connector\Sabre\Node
is never true and the whole SELECT
is obsolete. -> Could be fixed by checking if $filePath
starts with well known CalDAV prefixes (/dav/calendars/...
) and returning early. [EASY]\OCA\DAV\Connector\Sabre\FilesPlugin::sendFileIdHeader
: Is called for new vcards, too. Same fixes as above.\OCA\DAV\CardDAV\CardDavBackend::updateAddressBook
: Wrong transaction span
After some clicking I found the following. I'm not sure how exactly the dirty reads are caused in each case.
If the jobs are executed separately one by one with occ, then there is no dirty read output.
In Mail not a big issue. But could be a in other areas. Server Ticket: https://github.com/nextcloud/server/issues/43892
The best way is to refactor the sync process. To not write imap to db and then read from db. It's a huge effort.
Sort fix: \OCA\Mail\Service\AutoCompletion\AddressCollector::saveAddress
: Put exists()
and insert
in a Transaction: https://github.com/nextcloud/mail/pull/9422
Fix: Don't collect addresses in a extra table anymore. Use mail_messages or imap requests.
Fix: SELECT as subquery of DELETE in \OCA\Mail\Db\MessageMapper::deleteByUid
: https://github.com/nextcloud/mail/pull/9423
Short fix: \OCA\Mail\Db\LocalMessageMapper::updateWithRecipients
: move findByLocalMessageId()
in transaction above: https://github.com/nextcloud/mail/pull/9424
Fix: Edit \OCA\Mail\Db\RecipientMapper::updateRecipients
to return recipients and remove extra query findByLocalMessageId()
\OCA\DAV\CalDAV\CalDavBackend::updateCalendar
That method suffers from the same bug as https://github.com/nextcloud/server/pull/43903. There is a transaction but it's effectively unused. Sabre executes the callback when the transaction is already closed.
Everything we decided to fix has been addressed. The rest is in server.
Description
High performance database backends might split their reads and writes to different nodes to balance load and improve throughput. For an application to work with these setups, some assumptions about data consistencies have to be softened. One of them is to not assume written data is readable instantly.
https://mariadb.com/docs/multi-node/maxscale/routers/readwritesplit/ensure-causal-consistency-maxscale-read-write-split-router/ covers this topic. The technical term is "causal reads". To achieve it, our apps must avoid reading and relying on data that just had been written in another transaction, because it might not be available.
As an example, the Mail initial synchronization is split into incremental chunks and each chunk relies on data of the previous run. With simple, one-node databases and in the lucky case, the chunked process will already read correct data based on the previous inserts. In the not so lucky case we read outdated data. This will result in insertion conflicts of unique natural keys.
It is not always possible to avoid the read. In that case we have to put the related code into one database transaction. The database cluster will make sure that the main node processes everything in one go and no read-only node is involved.
Affected components
To do
From reading db codeFrom reading database cluster logs