Closed: n-peugnet closed this issue 11 months ago
Since the upgrade to Synapse 1.95 and 1.95.1
What version of Synapse did you upgrade from?
That particular error comes while backfilling events and comes from:
It looks like all those events are from the room !VXimQkuRxeRUFhboju:anontier.nl. I'm slightly suspicious that you have the retention policies feature enabled as that has been known to be buggy in the past. Have you had it enabled for a long time?
Since the upgrade to Synapse 1.95 and 1.95.1
What version of Synapse did you upgrade from?
I upgraded from 1.92.3 to 1.95.0 then later to 1.95.1
It looks like all those events are from the room !VXimQkuRxeRUFhboju:anontier.nl.
How did you know which room these events are related to? After a quick look, I won't be sad if I have to purge this room from the server.
I'm slightly suspicious that you have the retention policies feature enabled as that has been known to be buggy in the past. Have you had it enabled for a long time?
I have had it enabled for a long time, yes, since Tue Aug 10 18:00:55 2021 exactly. And I was also afraid that it might be related.
It looks like all those events are from the room !VXimQkuRxeRUFhboju:anontier.nl.
How did you know which room these events are related to? After a quick look, I won't be sad if I have to purge this room from the server.
Event IDs are globally unique, so I did:
SELECT room_id FROM events WHERE event_id = '...';
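If there are thousands of event IDs in the logs, the same lookup can be done in one grouped query rather than one ID at a time. The following is only a rough sketch of that idea: the helper name `rooms_for_events` is made up for illustration, and it runs against a toy in-memory SQLite table that has just the two columns used in the query above (`event_id`, `room_id`). A real Synapse database is PostgreSQL and has many more columns, so treat this as an illustration rather than something to run as-is against a homeserver.

```python
import sqlite3
from collections import Counter


def rooms_for_events(conn, event_ids):
    """Count how many of the given event IDs belong to each room.

    Hypothetical helper: event IDs that are not present in the table
    simply do not show up in the result.
    """
    event_ids = list(event_ids)
    if not event_ids:
        return Counter()
    # One "?" placeholder per event ID, so the values are passed as
    # parameters and never interpolated into the SQL string.
    placeholders = ",".join("?" for _ in event_ids)
    rows = conn.execute(
        "SELECT room_id, COUNT(*) FROM events "
        f"WHERE event_id IN ({placeholders}) GROUP BY room_id",
        event_ids,
    ).fetchall()
    return Counter(dict(rows))


# Toy stand-in for the events table (assumption: only these two columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (event_id TEXT PRIMARY KEY, room_id TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("$a", "!room1:example.org"),
        ("$b", "!room1:example.org"),
        ("$c", "!room2:example.org"),
    ],
)

counts = rooms_for_events(conn, ["$a", "$b", "$c", "$missing"])
for room_id, n in counts.most_common():
    print(room_id, n)
```

A room that dominates the counts is the likely candidate to look at first.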
I'm slightly suspicious that you have the retention policies feature enabled as that has been known to be buggy in the past. Have you had it enabled for a long time?
I have had it enabled for a long time, yes, since Tue Aug 10 18:00:55 2021 exactly. And I was also afraid that it might be related.
There were known bugs in the feature until Synapse ~1.94, so I think this might be the cause. You could purge the room entirely and then rejoin it. I know that's not an ideal solution though.
How did you know which room these events are related to? After a quick look, I won't be sad if I have to purge this room from the server.
Event IDs are globally unique, so I did:
SELECT room_id FROM events WHERE event_id = '...';
But if the event has been dropped as the log suggests, can I still try this query on my own homeserver?
I have had it enabled for a long time, yes, since Tue Aug 10 18:00:55 2021 exactly. And I was also afraid that it might be related.
There were known bugs in the feature until Synapse ~1.94, so I think this might be the cause.
That's what I gathered from following the project a little on GitHub.
You could purge the room entirely and then rejoin it. I know that's not an ideal solution though.
Thank you for your advice, I just purged the room. We only had a single user from club1.fr in this room, so that's not a big loss. I will keep monitoring the server and the logs to see if this problem arises again, but until then I will close this issue.
Feel free to shout if you see it come back!
Description
Since the upgrade to Synapse 1.95 and 1.95.1 we sometimes have around 20 minutes of unresponsiveness, due to the CPU usage going up to 100%. We have a small instance with no workers and have enabled the retention policy.
In the logs we have a LOT of these messages when this happens (thousands):
Not only from matrix.org, but from a lot of different servers.
Steps to reproduce
Sorry I really don't know what to add in my case.
Homeserver
club1.fr
Synapse Version
1.95.1
Installation Method
Debian packages from packages.matrix.org
Database
PostgreSQL single server
Workers
Single process
Platform
Configuration
I just disabled presence but it was enabled until now.
Retention policy is ON:
Relevant log output
Anything else that would be useful to know?
(the purple one is synapse- _process_new_pulled_events_with_failed_pull_attempts)