matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0
11.83k stars 2.13k forks source link

Fix bug where a new writer advances their token too quickly #16473

Closed erikjohnston closed 1 year ago

erikjohnston commented 1 year ago

When starting a new writer (for e.g. persisting events), the MultiWriterIdGenerator doesn't have a minimum token for it as there are no rows matching that new writer in the DB.

This results in the the first stream ID it acquired being announced as persisted before it actually finishes persisting, if another writer gets and persists a subsequent stream ID. This is due to the logic of setting the minimum persisted position to the minimum known position of across all writers, and the new writer starts off not being considered.

We fix this by setting the new writer to the minimum persisted position on startup.

erikjohnston commented 1 year ago

TestDeletingDeviceRemovesDeviceLocalNotificationSettings is failing on complement, but it looks like its failing for other PRs too?

DMRobertson commented 1 year ago

TestDeletingDeviceRemovesDeviceLocalNotificationSettings is failing on complement, but it looks like its failing for other PRs too?

Should be fixed already by https://github.com/matrix-org/complement/pull/670

erikjohnston commented 1 year ago

@DMRobertson turns out that fixing this then broke a complement test, as it hit the bug where we sometimes event stream can get stuck if a worker doesn't write anything. Have added a fix.

erikjohnston commented 1 year ago

@DMRobertson Whoops, looks like I messed up my rebase. The first three commits should be the same as you reviewed, the last two commits are new.

erikjohnston commented 1 year ago

@DMRobertson SORRY! I've had to change things a bit again. While writing a test I realised that the implementation allowed the stream IDs returned by get_curret_token_for_writer to go backwards. That seemed bad. So I've fixed it so that we track the max local stream position as a separate var, where its much more obvious that it can only advance.

erikjohnston commented 1 year ago

@DMRobertson I've added some words to point out that "current token" is non-unique, and that we use the minimum and maximum at times.