matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0

[Feature request] Room state database integrity check #12489

Open ShadowJonathan opened 2 years ago

ShadowJonathan commented 2 years ago

Description:

A background task (and/or one an admin can submit) that walks the state groups, outliers, and other calculated information for the relevant rooms and validates it.

The task would then rewrite, fetch, deny, or invalidate any incorrect state, and tolerantly rebuild the state group table for a particular room.
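To make the "walk and validate" step more concrete, here is a minimal sketch of what such a check could look like. It assumes a simplified in-memory model rather than Synapse's actual storage layer: `StateGroup`, `check_room_state_groups`, and the `known_event_ids` set are hypothetical names made up for illustration. The sketch only reports problems; deciding how to repair them is the hard part and is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

StateKey = Tuple[str, str]  # (event type, state_key)


@dataclass
class StateGroup:
    """Hypothetical stand-in for one state group and its delta (not Synapse's schema)."""

    group_id: int
    room_id: str
    prev_group: Optional[int] = None  # predecessor this group's delta applies to
    delta: Dict[StateKey, str] = field(default_factory=dict)  # state key -> event_id


def check_room_state_groups(
    room_id: str,
    groups: Dict[int, StateGroup],
    known_event_ids: Set[str],
) -> List[str]:
    """Return a list of problems found for one room; an empty list means clean."""
    problems: List[str] = []
    for group in groups.values():
        if group.room_id != room_id:
            continue

        # 1. The direct predecessor must exist and belong to the same room.
        if group.prev_group is not None:
            prev = groups.get(group.prev_group)
            if prev is None:
                problems.append(
                    f"group {group.group_id}: missing prev group {group.prev_group}"
                )
            elif prev.room_id != room_id:
                problems.append(
                    f"group {group.group_id}: prev group {prev.group_id} "
                    f"belongs to room {prev.room_id}"
                )

        # 2. Following predecessors must terminate, i.e. the chain has no cycle.
        seen = {group.group_id}
        cursor = group.prev_group
        while cursor is not None and cursor in groups:
            if cursor in seen:
                problems.append(f"group {group.group_id}: cycle in prev-group chain")
                break
            seen.add(cursor)
            cursor = groups[cursor].prev_group

        # 3. Every event referenced by the delta must be one we actually have.
        for (event_type, state_key), event_id in group.delta.items():
            if event_id not in known_event_ids:
                problems.append(
                    f"group {group.group_id}: unknown event {event_id} "
                    f"for ({event_type!r}, {state_key!r})"
                )
    return problems
```

An admin-facing task could run something like this per room and only attempt a rebuild when the returned list is non-empty, which keeps the read-only diagnosis separate from any riskier repair step.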

Reasoning:

This would be a “last resort” button for administrators to push, to fix any inconsistencies in their database.

(Hypothesis) It could possibly also work with state resets, “revalidating” a room, or provide insightful information about some internal fault or bug that would lead to state resets.

reivilibre commented 2 years ago

It seems like a fun idea, but I'm not sure that it makes much sense, or at least not when weighed against other priorities.

If we think we have a problem causing this state to be corrupted, we should try and fix the root cause rather than paper over it with a repair tool.

If we think the cause is database (e.g. Postgres) corruption, then we could certainly run some sanity checks... but I doubt it is possible to fix arbitrary inconsistencies in the database; we'd be limited to the few that we can repair, and maybe even some of those would be pretty sketchy (since, after all, the repair process would be ingesting the same corrupt database to rebuild its state).

richvdh commented 2 years ago

This sort of thing has come up in the past, e.g. as a result of https://github.com/matrix-org/synapse/issues/9595, where we've fixed the underlying bug but still have databases in a mess. There is certainly merit to such a "rebuild a room" tool.