matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0
11.83k stars 2.12k forks source link

Presence logic for handling newly joined rooms is very inefficient and does duplicate work #8956

Open erikjohnston opened 3 years ago

erikjohnston commented 3 years ago

When a user joins a room Synapse needs to decide what presence updates to send out. For local users joining a room we need to send a presence update to all users in the room, for remote users joining we need to send out all local users presence to the remote user.

This is implemented by streaming the current state delta stream, however when the server joins a federated room for the first time the state delta will include join events for all members. This leads to a situation where we correctly process the fact that a local user has joined a room and send out presence updates, and then incorrectly process each remote join event as if they're a new user join, leading to a lot of unnecessary work and superfluous presence events being generated and sent over federation.

https://github.com/matrix-org/synapse/blob/6d02eb22dfde9551c515acaf73503e2500e00eaf/synapse/handlers/presence.py#L860-L953

On top of that, we should also only send presence updates when this is the first time a remote server shares a room with the local user.

anoadragon453 commented 3 years ago

This leads to a situation where we correctly process the fact that a local user has joined a room and send out presence updates, and then incorrectly process each remote join event as if they're a new user join, leading to a lot of unnecessary work and superfluous presence events being generated and sent over federation.

For the latter, what happens is that we get a large bundle of state deltas to process when we join a room. However we indeed end up processing all the existing remote joins as new user joins.

Save checking whether the joins are older than our local user join (which there may not be a reliable way to do so? stream_ordering, depth?) we can just drop the remote joins in this instance, as a local user join means we'll be sending presence to all current hosts in the room regardless.

anoadragon453 commented 3 years ago

The first half of this (not processing each remote join as a new server joining the room) was solved in https://github.com/matrix-org/synapse/pull/9402.

The second half (only sending presence updates when this is the first time a remote server shares a room with the local user), will be tackled in a separate PR.