matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0

Handling state_ids responses loads huge amounts of events into memory at once, causing OOMs #6597

Open richvdh opened 4 years ago

richvdh commented 4 years ago

For example: the response to matrix://matrix.org/_matrix/federation/v1/state_ids/%21jWRVQAlVGjigKCRGwS%3Amatrix.org?event_id=%241577162865947686RYsuB%3Amatrix.org includes an auth chain with 22327 events. Surely there can't be that many events in $1577162865947686RYsuB:matrix.org's auth chain?

richvdh commented 4 years ago

I wonder if this has been caused (or exacerbated) by us deploying #6556

richvdh commented 4 years ago

Right, so the problem here is that Synapse is returning the auth_events for each of the events in the state of the room at the given event_id. This is consistent with the spec; however, it seems to be utterly pointless. For a given event in the state, either:

richvdh commented 4 years ago

we don't have it, so will have to request it via /event which will tell us its auth events.

unfortunately, as of #7817, synapse won't recursively fetch auth events in this way :/
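
For illustration only, here is a rough sketch of what "recursively fetching auth events" could look like on the receiving side. It is not Synapse's code: `walk_auth_chain`, `known`, and `fetch_auth_event_ids` are hypothetical names, and the fetch callback stands in for whatever would actually call /event and read the returned event's auth events.

```python
from collections import deque
from typing import Callable, Iterable, List, Set


def walk_auth_chain(
    start_ids: Iterable[str],
    known: Set[str],
    fetch_auth_event_ids: Callable[[str], List[str]],
) -> Set[str]:
    """Return IDs of auth-chain events we do not already have.

    `known` is the set of event IDs we already hold locally (hypothetical).
    `fetch_auth_event_ids` is a hypothetical callback returning the auth
    event IDs of one event, e.g. after requesting that event from a remote.
    """
    start = list(start_ids)
    seen: Set[str] = set(start)
    missing: Set[str] = set()
    queue = deque(start)

    while queue:
        event_id = queue.popleft()
        for auth_id in fetch_auth_event_ids(event_id):
            if auth_id in seen:
                continue
            seen.add(auth_id)
            if auth_id in known:
                # We already have this event, so we assume its own auth
                # events are available locally and do not recurse into it.
                continue
            missing.add(auth_id)
            queue.append(auth_id)

    return missing
```

The walk only recurses into events that are missing locally, which is the case where a request to /event would be needed at all.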

richvdh commented 4 years ago

On reflection, it's correct that we return all these auth events. The problem is that we load them all into memory at once.
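
As a minimal sketch of the general idea (not a patch against Synapse), the events could be handled in bounded batches so that only one batch is resident in memory at a time. Everything named here is an assumption for illustration: `fetch_events` stands in for a database or federation lookup, `handle_event` for whatever per-event work is needed, and `batch_size` is an arbitrary tuning knob.

```python
from itertools import islice
from typing import Any, Callable, Dict, Iterable, Iterator, List


def batched(ids: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` IDs from `ids`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(ids)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def process_in_batches(
    event_ids: Iterable[str],
    fetch_events: Callable[[List[str]], Dict[str, Any]],
    handle_event: Callable[[Any], None],
    batch_size: int = 100,
) -> int:
    """Fetch and handle events one batch at a time.

    `fetch_events` and `handle_event` are hypothetical callbacks. Only the
    events of the current batch are held here at any moment, instead of
    the whole auth chain.
    """
    processed = 0
    for chunk in batched(event_ids, batch_size):
        events = fetch_events(chunk)
        for event_id in chunk:
            event = events.get(event_id)
            if event is None:
                # The lookup did not return this event; skip it.
                continue
            handle_event(event)
            processed += 1
        # `events` is rebound on the next iteration, so the previous
        # batch becomes eligible for garbage collection.
    return processed
```

With a 22327-event auth chain and `batch_size=100`, this would hold at most 100 events from the chain at once, at the cost of more round trips to the backing store.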

richvdh commented 3 years ago

this might be a bit better since #9601, but it's still problematic

DMRobertson commented 1 year ago

To summarise:

Relevant bits of source:

https://github.com/matrix-org/synapse/blob/da2c93d4b69200c1ea9fb94ec3c951fd4b424864/synapse/federation/transport/server/federation.py#L171-L185

https://github.com/matrix-org/synapse/blob/1eed795fc56d95df3968e37f3a4db92f24513e15/synapse/federation/federation_server.py#L562-L581

https://github.com/matrix-org/synapse/blob/1eed795fc56d95df3968e37f3a4db92f24513e15/synapse/federation/federation_server.py#L583-L590