matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0
11.83k stars 2.13k forks source link

Bail during start-up if there are old background updates. #16397

Open clokep opened 1 year ago

clokep commented 1 year ago

This is a basic implementation to enforce that some old background updates have run.

My hope is that this could be used to avoid future situations like #16192. This would fix #16047 in a simplistic manner and doesn't attempt to differentiate between different types of background updates.

This still would require us to do thinking about what old updates are now required, but I think it would fix the issue where we have to continually create background jobs due to dependencies.

TODO

clokep commented 1 year ago

Putting up for review as I'd be interested in feedback on this. Is this approach worth pursuing?

erikjohnston commented 1 year ago

One thought I had was that we need to think a bit about how this interacts with security releases, i.e. if we put out a high severity fix would this potentially block large servers from upgrading (if they were still churning through a large BG update)?

DMRobertson commented 1 year ago

Similarly: suppose we write a bad background update that takes aaaaaaaaaaages to run. Months later we discover this, after we've bumped the schema version. Do we have the ability to patch the background update itself?

(See e.g. the rebuild user directory background update, for example)

clokep commented 1 year ago

I think the response to both those is the same -- you want the minimum background update version to trail ages behind your current release. So it really becomes a process thing.

You could still have an issue where a deploy of Synapse is also very far behind and then needs to jump due to security releases though, I don't have a good answer to that.

Similarly: suppose we write a bad background update that takes aaaaaaaaaaages to run. Months later we discover this, after we've bumped the schema version. Do we have the ability to patch the background update itself?

I think the (annoying) solution you do is a point release before you bump the required version. People can then update to that version with the improved performance.

clokep commented 1 year ago

Enforce that ordering is filled in (non-nullable).

ordering is NOT NULL, but has a default of 0. 🤷

clokep commented 1 year ago

I think this is ready for an initial review for real.