matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0

synapse gets stuck in a loop with `Invalid prev_events for <event_id>` #11802

Closed richvdh closed 2 years ago

richvdh commented 2 years ago

If inbound federation traffic backs up in a v1 room, synapse starts spamming the logs with messages like:

2022-01-23 00:42:23,593 - synapse.storage.databases.main.event_federation - 1436 - INFO - _process_incoming_pdus_in_room_inner-1197913 - Invalid prev_events for <event_id>

This means that, rather than dropping the excess traffic as it should, synapse ends up in a tight loop and fails to process incoming traffic from other sources.

richvdh commented 2 years ago

See 2b9f741f3, which patches this on the matrix-org-hotfixes branch.

richvdh commented 2 years ago

(dates back to https://github.com/matrix-org/synapse/pull/10390)

richvdh commented 2 years ago

A workaround is for all users on the server to leave the room containing the events (or to use the delete room admin API to shut it down).

Cknight70 commented 2 years ago

Just applied this patch. Previously my homeserver.log would reach 2 GB in 30 minutes; now it is only a few megabytes. RAM usage is cut in half and messages are sending reliably again. Thanks.

Redmauss commented 2 years ago

I have the same issue. The log message is:

synapse.storage.databases.main.event_federation - 1427 - INFO - _process_incoming_pdus_in_room_inner-4792 - Invalid prev_events for $164289566819600CTQgK:matrix.kiwifarms.net

I have also been getting connection interruptions and extreme resource usage (100% CPU and maxing out 4 GB of memory). The logging was so constant that a 153 GB log file filled up my server's storage. I worked around that by changing the log level in /etc/matrix-synapse/log.yaml from INFO to ERROR.
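As a narrower alternative to raising the level for everything, the noisy logger alone can be turned down. Synapse's log config follows the Python `logging` dictionary schema, so the sketch below shows the idea with a plain `dictConfig` call rather than an edited log.yaml. The logger name is taken from the log line quoted above; the handler name, format string and second logger are illustrative assumptions. This only reduces log volume and does nothing about the loop itself.

```python
import io
import logging
import logging.config

# Logger name taken from the log line quoted in this thread.
NOISY_LOGGER = "synapse.storage.databases.main.event_federation"


def build_config(stream):
    """Build a dictConfig that keeps INFO globally but mutes one logger.

    The handler name "out" and the format string are illustrative only.
    """
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "plain": {"format": "%(name)s - %(levelname)s - %(message)s"}
        },
        "handlers": {
            "out": {
                "class": "logging.StreamHandler",
                "formatter": "plain",
                "stream": stream,
            }
        },
        # Only this logger is raised to WARNING; everything else stays at INFO.
        "loggers": {NOISY_LOGGER: {"level": "WARNING"}},
        "root": {"level": "INFO", "handlers": ["out"]},
    }


def demo() -> str:
    """Emit one message on the noisy logger and one elsewhere; return the output."""
    stream = io.StringIO()
    logging.config.dictConfig(build_config(stream))
    logging.getLogger(NOISY_LOGGER).info("Invalid prev_events for <event_id>")
    # "some.other.module" is a made-up logger name for the demonstration.
    logging.getLogger("some.other.module").info("still visible")
    return stream.getvalue()
```

In a YAML log config the equivalent change would presumably be a `level: WARNING` entry for that logger name under the `loggers:` key, leaving the root level untouched.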

How do I apply the patch mentioned by @richvdh? Sorry I haven't made any changes to synapse besides the installation and setting up email and turnserver. Any help would be greatly appreciated.

Cknight70 commented 2 years ago

@Redmauss There is probably a more graceful way to do this with `patch`, but I did the following:

  1. Find the location of synapse from the systemd service file, /lib/systemd/system/matrix-synapse.service. For me the directory containing the file to replace was /opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/databases/main/
  2. cd into that directory and create a backup: `cp event_federation.py ~`
  3. Stop synapse: `sudo systemctl stop matrix-synapse`
  4. Remove the old version: `sudo rm event_federation.py`
  5. Download the patched version: `sudo wget https://raw.githubusercontent.com/matrix-org/synapse/2b9f741f3a9dee93e9744774b342c35ce60062c4/synapse/storage/databases/main/event_federation.py`
  6. You can now start synapse.

I did this on Debian 11 with the official Debian Synapse package, version 1.50.1.
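The backup-and-replace steps above can also be sketched in Python. This is a minimal illustration rather than a supported upgrade path: the URL is the one quoted in the steps, while the function names and the `.bak` suffix are assumptions made for the example. Synapse should be stopped first, and writing into the package directory will normally need root.

```python
import shutil
import urllib.request
from pathlib import Path

# URL quoted in the steps above (commit 2b9f741f3 on matrix-org-hotfixes).
PATCHED_URL = (
    "https://raw.githubusercontent.com/matrix-org/synapse/"
    "2b9f741f3a9dee93e9744774b342c35ce60062c4/"
    "synapse/storage/databases/main/event_federation.py"
)


def fetch(url: str) -> bytes:
    """Download the raw contents of the patched file."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def replace_with_backup(target: Path, new_content: bytes, backup_dir: Path) -> Path:
    """Copy `target` into `backup_dir`, then overwrite `target` in place.

    Returns the path of the backup copy. The ".bak" suffix is an
    arbitrary choice for this sketch.
    """
    backup = backup_dir / (target.name + ".bak")
    shutil.copy2(target, backup)
    target.write_bytes(new_content)
    return backup


# Example usage (not run here). The directory is the Debian 11 path from the
# steps above and will differ on other systems:
#
#   target = Path(
#       "/opt/venvs/matrix-synapse/lib/python3.9/site-packages/"
#       "synapse/storage/databases/main/event_federation.py"
#   )
#   replace_with_backup(target, fetch(PATCHED_URL), Path.home())
```

Downloading the new contents before touching the installed file means a failed download leaves the original in place, which the remove-then-wget order does not guarantee.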

Redmauss commented 2 years ago

@Cknight70 worked like a charm! I am now using 245 MB of RAM instead of over 3 GB! My messages send reliably now and there are no connection dropouts. Thanks man, your writeup was super helpful.

I am using Ubuntu Server 20.04, and I installed the official matrix-synapse-py3 package. The steps were exactly the same, except that the Python version was 3.8 instead of 3.9 for me.

anoadragon453 commented 2 years ago

Fixed in https://github.com/matrix-org/synapse/pull/11806. The fix will be released in Synapse v1.50.2 and v1.51.0rc2.

Thank you all for testing the patch :slightly_smiling_face: