matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0

Consider de/prioritising certain events when processing the incoming federation staging area #12578

Open Gnuxie opened 2 years ago

Gnuxie commented 2 years ago

Description:

Context

It's no secret that "mass join spam" has recently been targeting rooms and putting a heavy load on federation. Two problems come from these attacks. The first is that the servers involved in the room struggle to process the number of join events and can fail. The second is that any reactive moderation action (to try to alleviate the problem) requires a large number of the join events to be processed before it has any effect. Not only that, but the server taking the moderation action will then refer to all of the join events it has already processed in the auth chain, which means every other server has to process the same spam events before the action is perceived to take effect. The result is a delay before the reactive moderation takes effect, and a window in which the attack continues while we wait.

Prioritisation

Why do we think that Prioritisation could help with this?

If servers were able to detect that another server is misbehaving and sending an abnormal number of spam joins, we could simply refuse to process events from that server (temporarily). When events are then created for moderation action, they would not have to refer to the problematic events in the auth chain. If this prioritisation were standard across several servers, each of them would benefit by processing the reactive moderation events without needing to process so much of the spam. This requires the prioritisation system to be aware of, and able to avoid processing, incoming events that stack up on the "dirty", spammy auth chain.
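A minimal sketch of what such a staging area could look like, assuming a simple sliding-window join counter per origin server. The names `StagedEvent` and `StagingArea` and the `max_joins` / `window_secs` thresholds are hypothetical and chosen for illustration; this is not how Synapse's federation staging area is actually implemented.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class StagedEvent:
    event_id: str
    origin: str            # server the event was received from
    event_type: str        # e.g. "m.room.member", "m.room.server_acl"
    membership: str = ""   # "join", "ban", ... (membership events only)
    received_at: float = 0.0


class StagingArea:
    """Hypothetical staging queue that deprioritises join-flooding origins."""

    def __init__(self, max_joins: int = 3, window_secs: float = 10.0) -> None:
        self.max_joins = max_joins
        self.window_secs = window_secs
        self._recent_joins = defaultdict(deque)  # origin -> join timestamps
        self._flagged = set()                    # origins currently deprioritised
        self._staged = []                        # events in arrival order

    def add(self, ev: StagedEvent) -> None:
        if ev.event_type == "m.room.member" and ev.membership == "join":
            joins = self._recent_joins[ev.origin]
            joins.append(ev.received_at)
            # Forget joins that have fallen out of the sliding window.
            while joins and ev.received_at - joins[0] > self.window_secs:
                joins.popleft()
            if len(joins) > self.max_joins:
                # Assumption: the flag is sticky; expiry is omitted here.
                self._flagged.add(ev.origin)
        self._staged.append(ev)

    def _priority(self, ev: StagedEvent) -> int:
        if ev.origin in self._flagged:
            return 2  # anything from a flooding origin goes last
        if ev.event_type == "m.room.server_acl" or ev.membership == "ban":
            return 0  # reactive moderation goes first
        return 1

    def pop(self) -> StagedEvent:
        # Priority is computed at pop time, so events staged before their
        # origin was flagged are deprioritised too. Ties keep arrival order.
        best = min(
            range(len(self._staged)),
            key=lambda i: (self._priority(self._staged[i]), i),
        )
        return self._staged.pop(best)
```

With this ordering, a server ACL event that arrives after a burst of joins is still processed before them. A real implementation would also need to expire flags and decide whether deprioritised events are eventually processed or dropped.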

What does "moderation action" mean, and which scenarios does this help with?

Server ACL

This change would have the most effect with server ACLs. If all of the room's servers are prioritising events in a similar way, then ACL events could be propagated between them without needing to process the spammy events.

We then believe that if the spammy join events were federated to each server in the room directly from the spam server, the other servers in the room can safely "drop" the deprioritised dirty auth-chain spam events as though they were never federated to them (as long as no other server has referred to them in prev_events etc., which they will not have done if they observe the same prioritisation rules). Even if the prioritisation rules are observed slightly differently, or it takes one group of servers longer to deprioritise the incoming spam events, there is still the potential to greatly reduce the number of events that actually make it into the room.
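The "safe to drop" condition can be sketched as a reachability check: a deprioritised event has to be kept only if an accepted event refers to it, directly or through other deprioritised events. The function name `droppable_events` and the plain dict/set data model below are assumptions for illustration, not Synapse APIs.

```python
from typing import Dict, Set


def droppable_events(
    deprioritised: Dict[str, Set[str]],
    referenced_by_accepted: Set[str],
) -> Set[str]:
    """Return the deprioritised event IDs that nothing accepted depends on.

    deprioritised maps each deprioritised event ID to the IDs it refers to
    (its prev_events and auth_events). referenced_by_accepted is the set of
    IDs referred to by events the server has already accepted.
    """
    needed: Set[str] = set()
    # Start from deprioritised events that an accepted event points at.
    stack = [eid for eid in referenced_by_accepted if eid in deprioritised]
    while stack:
        eid = stack.pop()
        if eid in needed:
            continue
        needed.add(eid)
        # Anything a needed event refers to is needed as well.
        stack.extend(ref for ref in deprioritised[eid] if ref in deprioritised)
    return set(deprioritised) - needed
```

For example, if an accepted event lists `$s2` in its prev_events and `$s2` builds on `$s1`, both must be kept, while unreferenced spam events can be discarded.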

Member bans

Would this make a difference for member bans in the situation where the spammy server hasn't modified Synapse to deliberately inject events at specific places in the DAG? Given that the moderating server(s) have been unresponsive due to the number of events, yes. If they were able to keep up with the spam, though, there doesn't seem to be a clear advantage.

Has something like this been tried before?

Apparently, but I am not sure whether the solution needs to be that complicated: https://github.com/matrix-org/synapse/blob/hawkowl/fsb/docs/federation_side_bus.md

richvdh commented 2 years ago

> Apparently, but I am not sure whether the solution needs to be that complicated https://github.com/matrix-org/synapse/blob/hawkowl/fsb/docs/federation_side_bus.md

FSB was mostly about outgoing federation, rather than incoming.