matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0

AS scheduler may drop events during a restart #11637

Open Fizzadar opened 2 years ago

Fizzadar commented 2 years ago

Description

The AS scheduler keeps track of events to send to each AS in memory as they come in, only pushing these into the database (as an AS transaction) once any in-flight requests have been completed:

https://github.com/matrix-org/synapse/blob/c500bf37d660b08efb48501b7690dc4448b39eca/synapse/appservice/scheduler.py#L149-L173

Steps to reproduce

This means AS events could be lost during the following series of events:

Possible solution

Would it be possible to set up some kind of exit handling (atexit from stdlib?) that dumps any in-memory events into a new txn in the database before the process exits? This would prevent any loss of AS events on a clean shutdown.
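
To make the idea concrete, here is a minimal sketch of what such an exit hook could look like. It is not Synapse code: `InMemoryQueue`, the `as_txns` table, and the use of `sqlite3` are all hypothetical stand-ins for the scheduler's in-memory queue and the real AS transaction store.

```python
import atexit
import json
import sqlite3


class InMemoryQueue:
    """Hypothetical stand-in for the scheduler's per-AS in-memory event queue."""

    def __init__(self, db):
        self.db = db
        self.pending = {}  # service_id -> list of queued event dicts
        # Assumed table layout, for illustration only.
        db.execute(
            "CREATE TABLE IF NOT EXISTS as_txns (service_id TEXT, events TEXT)"
        )

    def enqueue(self, service_id, event):
        self.pending.setdefault(service_id, []).append(event)

    def flush_to_db(self):
        # Write each non-empty queue as one new transaction row, then clear.
        for service_id, events in self.pending.items():
            if events:
                self.db.execute(
                    "INSERT INTO as_txns VALUES (?, ?)",
                    (service_id, json.dumps(events)),
                )
        self.db.commit()
        self.pending.clear()


db = sqlite3.connect(":memory:")
queue = InMemoryQueue(db)

# Runs on normal interpreter exit; it does not run on SIGKILL or os._exit().
atexit.register(queue.flush_to_db)
```

Note that `atexit` handlers only fire on a normal interpreter shutdown, so this covers a clean restart but not a hard kill.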

reivilibre commented 2 years ago

> Would it be possible to set up some kind of exit handling (atexit from stdlib?) that dumps any in-memory events into a new txn in the database before the process exits? This would prevent any loss of AS events.

This sounds like it won't really solve the problem if it's caused by a crash / power cut / etc.

The way I'd expect this to work is that the AS scheduler doesn't advance its stream position (in whatever source the events come from in the first place) until it has actually handled the events.
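
As a rough sketch of that idea (not how Synapse is actually structured; `StreamConsumer` and its fields are hypothetical names), the position is only moved forward after an event has been handled, so a restart resumes from the last handled event and re-sends anything that was in flight:

```python
class StreamConsumer:
    """Hypothetical consumer that advances its position only after handling."""

    def __init__(self, events, send, position=0):
        self.events = events      # ordered list of (stream_id, event)
        self.send = send          # callable that raises if delivery fails
        self.position = position  # last handled stream_id (would be persisted)

    def poll(self):
        for stream_id, event in self.events:
            if stream_id <= self.position:
                continue  # already handled before the restart
            self.send(event)  # if this raises, position stays where it was
            self.position = stream_id
```

The trade-off is at-least-once delivery: an event that was sent but not yet recorded as handled would be sent again after a restart, so the receiving side needs to tolerate duplicates.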

I still need to dig through the code and see what's going on — it's not an area I'm very familiar with but I can always use an excuse to get more familiar with it. :)