element-hq / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://element-hq.github.io/synapse
GNU Affero General Public License v3.0
1.3k stars 157 forks source link

After updating the Synapse server to version 1.109.0~rc1, new messages do not leave / arrive. #17274

Closed ELForcer closed 2 months ago

ELForcer commented 3 months ago

Description

I updated the package using apt, after which after a while I noticed high CPU consumption for Synapse processes, and then that messages were not actually sent to the recipients. Restarting the service didn't help. At the moment, rolling back to package version 1.108.0 helped.

Steps to reproduce

sudo apt update sudo apt upgrade

Homeserver

homeserver

Synapse Version

1.109.0rc1

Installation Method

Debian packages from packages.matrix.org

Database

PostgreSQL 16

Workers

I don't know

Platform

System:
  Host: userver Kernel: 6.1.0-21-amd64 arch: x86_64 bits: 64 Console: pty pts/8 Distro: Debian
    GNU/Linux 12 (bookworm)
CPU:
  Info: quad core model: Intel Xeon E5620 bits: 64 type: MT MCP cache: L2: 1024 KiB
  Speed (MHz): avg: 1600 min/max: 1600/2400 cores: 1: 1600 2: 1600 3: 1600 4: 1600 5: 1600
    6: 1600 7: 1600 8: 1600
RAID:
  Hardware-1: Hewlett-Packard Smart Array G6 controllers driver: hpsa
Drives:
  Local Storage: total: 476.89 GiB used: 287.85 GiB (60.4%)
  ID-1: /dev/sda model: LOGICAL VOLUME size: 476.89 GiB
Partition:
  ID-1: / size: 72.78 GiB used: 24.99 GiB (34.3%) fs: ext4 dev: /dev/sda1
  ID-2: /home size: 365.61 GiB used: 258.44 GiB (70.7%) fs: ext4 dev: /dev/sda5
  ID-3: /var/log size: 29.21 GiB used: 4.42 GiB (15.1%) fs: ext4 dev: /dev/sda6
Swap:
  Alert: No swap data was found.
Info:
  Processes: 406 Uptime: 27d 3h 41m Memory: 39.16 GiB used: 6.16 GiB (15.7%) Init: systemd
  target: graphical (5) Shell: Bash inxi: 3.3.26

Configuration

No response

Relevant log output

2024-06-06 15:38:43,354 - root - 355 - WARNING - main - ***** STARTING SERVER *****
2024-06-06 15:38:43,354 - root - 356 - WARNING - main - Server /opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/app/homeserver.py version 1.109.0rc1
2024-06-06 15:38:43,354 - root - 361 - WARNING - main - Copyright (c) 2023 New Vector, Inc
2024-06-06 15:38:43,354 - root - 362 - WARNING - main - Licensed under the AGPL 3.0 license. Website: https://github.com/element-hq/synapse
2024-06-06 15:38:43,873 - synapse.metrics.background_process_metrics - 253 - ERROR - _handle_new_device_update_async-0 - Background process '_handle_new_device_update_async' threw an exception
Traceback (most recent call last):
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/metrics/background_process_metrics.py", line 251, in run
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/handlers/device.py", line 873, in _handle_new_device_update_async
    assert max_stream_id == stream_id
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
2024-06-06 15:38:54,016 - synapse.handlers.sync - 523 - WARNING - GET-2 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:54,375 - synapse.handlers.sync - 523 - WARNING - GET-7 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:54,378 - synapse.handlers.sync - 523 - WARNING - GET-8 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:54,628 - synapse.handlers.sync - 523 - WARNING - GET-11 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:54,773 - synapse.handlers.sync - 523 - WARNING - GET-15 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:54,806 - synapse.handlers.sync - 523 - WARNING - GET-17 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:55,049 - synapse.handlers.sync - 523 - WARNING - GET-20 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:55,341 - synapse.handlers.sync - 523 - WARNING - GET-26 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:55,348 - synapse.handlers.sync - 523 - WARNING - GET-27 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:55,621 - synapse.handlers.sync - 523 - WARNING - GET-30 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:55,755 - synapse.handlers.sync - 523 - WARNING - GET-35 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:55,781 - synapse.handlers.sync - 523 - WARNING - GET-36 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:55,893 - synapse.handlers.sync - 523 - WARNING - GET-38 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:56,025 - synapse.handlers.sync - 523 - WARNING - GET-41 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:57,191 - synapse.handlers.sync - 523 - WARNING - GET-44 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:57,535 - synapse.handlers.sync - 523 - WARNING - GET-57 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:57,637 - synapse.handlers.sync - 523 - WARNING - GET-59 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:58,015 - synapse.handlers.sync - 523 - WARNING - GET-62 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:58,651 - synapse.handlers.sync - 523 - WARNING - GET-67 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:58,820 - synapse.handlers.sync - 523 - WARNING - GET-70 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:59,159 - synapse.handlers.sync - 523 - WARNING - GET-73 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:59,321 - synapse.handlers.sync - 523 - WARNING - GET-76 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:59,816 - synapse.handlers.sync - 523 - WARNING - GET-81 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:59,885 - synapse.handlers.sync - 523 - WARNING - GET-83 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:38:59,953 - synapse.handlers.sync - 523 - WARNING - GET-85 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:00,113 - synapse.handlers.sync - 523 - WARNING - GET-88 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:00,190 - synapse.handlers.sync - 523 - WARNING - GET-92 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:00,314 - synapse.handlers.sync - 523 - WARNING - GET-94 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:00,570 - synapse.handlers.sync - 523 - WARNING - GET-97 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:01,126 - synapse.handlers.sync - 523 - WARNING - GET-100 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:01,308 - synapse.handlers.sync - 523 - WARNING - GET-103 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:02,060 - synapse.handlers.sync - 523 - WARNING - GET-111 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:02,786 - synapse.handlers.sync - 523 - WARNING - GET-118 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:02,971 - synapse.handlers.sync - 523 - WARNING - GET-121 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:04,157 - synapse.handlers.sync - 523 - WARNING - GET-124 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:04,496 - synapse.handlers.sync - 523 - WARNING - GET-129 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:04,509 - synapse.handlers.sync - 523 - WARNING - GET-131 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:04,836 - synapse.handlers.sync - 523 - WARNING - GET-138 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:04,844 - synapse.handlers.sync - 523 - WARNING - GET-139 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:04,855 - synapse.handlers.sync - 523 - WARNING - GET-143 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:05,172 - synapse.handlers.sync - 523 - WARNING - GET-150 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:05,233 - synapse.handlers.sync - 523 - WARNING - GET-134 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:05,498 - synapse.handlers.sync - 523 - WARNING - GET-156 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:05,499 - synapse.handlers.sync - 523 - WARNING - GET-158 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:05,815 - synapse.handlers.sync - 523 - WARNING - GET-163 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:05,851 - synapse.handlers.sync - 523 - WARNING - GET-166 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:06,045 - synapse.handlers.sync - 523 - WARNING - GET-173 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:06,061 - synapse.handlers.sync - 523 - WARNING - GET-174 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:06,156 - synapse.handlers.sync - 523 - WARNING - GET-177 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:07,239 - synapse.handlers.sync - 523 - WARNING - GET-180 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:07,603 - synapse.handlers.sync - 523 - WARNING - GET-184 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:07,892 - synapse.handlers.sync - 523 - WARNING - GET-188 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:08,179 - synapse.handlers.sync - 523 - WARNING - GET-191 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:08,695 - synapse.handlers.sync - 523 - WARNING - GET-194 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:08,860 - synapse.handlers.sync - 523 - WARNING - GET-198 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:09,287 - synapse.handlers.sync - 523 - WARNING - GET-203 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:09,462 - synapse.handlers.sync - 523 - WARNING - GET-207 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:09,931 - synapse.handlers.sync - 523 - WARNING - GET-210 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:10,025 - synapse.handlers.sync - 523 - WARNING - GET-212 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:10,127 - synapse.handlers.sync - 523 - WARNING - GET-214 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:10,233 - synapse.handlers.sync - 523 - WARNING - GET-218 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:10,256 - synapse.handlers.sync - 523 - WARNING - GET-221 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:10,517 - synapse.handlers.sync - 523 - WARNING - GET-224 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:10,705 - synapse.handlers.sync - 523 - WARNING - GET-227 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:11,311 - synapse.handlers.sync - 523 - WARNING - GET-233 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:11,413 - synapse.handlers.sync - 523 - WARNING - GET-236 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:12,225 - synapse.handlers.sync - 523 - WARNING - GET-238 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:12,846 - synapse.handlers.sync - 523 - WARNING - GET-240 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:13,089 - synapse.handlers.sync - 523 - WARNING - GET-242 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:39:15,318 - synapse.handlers.sync - 523 - WARNING - GET-399 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:14,381 - synapse.handlers.sync - 523 - WARNING - GET-78123 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:14,613 - synapse.handlers.sync - 523 - WARNING - GET-78271 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:14,616 - synapse.handlers.sync - 523 - WARNING - GET-78272 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:14,927 - synapse.handlers.sync - 523 - WARNING - GET-78455 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:14,981 - synapse.handlers.sync - 523 - WARNING - GET-78493 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:15,052 - synapse.handlers.sync - 523 - WARNING - GET-78526 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:15,310 - synapse.handlers.sync - 523 - WARNING - GET-78681 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:15,652 - synapse.handlers.sync - 523 - WARNING - GET-78883 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:15,691 - synapse.handlers.sync - 523 - WARNING - GET-78912 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:15,928 - synapse.handlers.sync - 523 - WARNING - GET-79049 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:15,966 - synapse.handlers.sync - 523 - WARNING - GET-79072 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:16,157 - synapse.handlers.sync - 523 - WARNING - GET-79185 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:16,238 - synapse.handlers.sync - 523 - WARNING - GET-79225 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:16,302 - synapse.handlers.sync - 523 - WARNING - GET-79262 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:17,284 - synapse.handlers.sync - 523 - WARNING - GET-79868 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:17,643 - synapse.handlers.sync - 523 - WARNING - GET-80090 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:18,043 - synapse.handlers.sync - 523 - WARNING - GET-80326 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:18,225 - synapse.handlers.sync - 523 - WARNING - GET-80440 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:18,752 - synapse.handlers.sync - 523 - WARNING - GET-80750 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:18,936 - synapse.handlers.sync - 523 - WARNING - GET-80855 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:19,484 - synapse.handlers.sync - 523 - WARNING - GET-81182 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:19,583 - synapse.handlers.sync - 523 - WARNING - GET-81242 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:19,989 - synapse.handlers.sync - 523 - WARNING - GET-81497 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:20,193 - synapse.handlers.sync - 523 - WARNING - GET-81603 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:20,193 - synapse.handlers.sync - 523 - WARNING - GET-81617 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:20,274 - synapse.handlers.sync - 523 - WARNING - GET-81666 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:20,385 - synapse.handlers.sync - 523 - WARNING - GET-81705 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:20,714 - synapse.handlers.sync - 523 - WARNING - GET-81843 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:20,837 - synapse.handlers.sync - 523 - WARNING - GET-81885 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:21,393 - synapse.handlers.sync - 523 - WARNING - GET-82096 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:21,518 - synapse.handlers.sync - 523 - WARNING - GET-82131 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:22,338 - synapse.handlers.sync - 523 - WARNING - GET-82375 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:22,979 - synapse.handlers.sync - 523 - WARNING - GET-82557 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:23,224 - synapse.handlers.sync - 523 - WARNING - GET-82584 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:41:25,447 - synapse.handlers.sync - 523 - WARNING - GET-82894 - Timed out waiting for worker to catch up. Returning empty response
2024-06-06 15:43:24,558 - synapse.handlers.sync - 523 - WARNING - GET-153624 - Timed out waiting for worker to catch up. Returning empty response
...

Anything else that would be useful to know?

No response

anoadragon453 commented 3 months ago

Hi @ELForcer, thanks for your report.

Do you have any workers configured? If you don't know, it's likely you don't.

For developers, here's the relevant bit of the code:

https://github.com/element-hq/synapse/blob/b71d2774388c90a68d71dd8d805556c8f62c92a1/synapse/handlers/device.py#L863-L874

https://github.com/element-hq/synapse/pull/17211 may be a possible contender for the regression, as it's only been introduced in v1.109.0rc1 as touches nearby code (outbound device pokes).

anoadragon453 commented 3 months ago

@wrjlewis mentioned that a possible workaround is to run:

SELECT setval('device_lists_sequence', (
    SELECT GREATEST(MAX(stream_id), 0) FROM device_lists_stream
));

And then start the server on v1.109.0rc1. However the code should still be fixed.

Edit: See the below comment for warnings on running this query.

erikjohnston commented 3 months ago

@anoadragon453 That SQL should only be run if Synapse prompts you to do so, and while Synapse is offline. Moving streams can be quite dangerous if you get it wrong.

I think this an edge case with #17229, where we didn't include device_lists_changes_converted_stream_position table in the new stream ID generator. I think the easiest fix is to just skip the assert.

erikjohnston commented 3 months ago

Actually, I've changed tack, but PR up at https://github.com/element-hq/synapse/pull/17292

sandhose commented 3 months ago

@ELForcer This should have been fixed in v1.109.0-rc2. Could you try upgrading and report back?

ELForcer commented 3 months ago

Hello. It's gotten worse at the moment. The matrix-synapse service no longer starts.

user@userver:/var/log/matrix-synapse$ systemctl status matrix-synapse
● matrix-synapse.service - Synapse Matrix homeserver
     Loaded: loaded (/lib/systemd/system/matrix-synapse.service; enabled; preset: enabled)
     Active: activating (auto-restart) (Result: exit-code) since Tue 2024-06-11 21:44:12 +04; 2s ago
    Process: 428001 ExecStartPre=/opt/venvs/matrix-synapse/bin/python -m synapse.app.homeserver --config-path=/etc/matrix-synapse/homeserver.yaml --config-path=/etc/matrix-synapse/conf.d/ --generate-keys (code=exited, status=0/SUCCESS)
    Process: 428006 ExecStart=/opt/venvs/matrix-synapse/bin/python -m synapse.app.homeserver --config-path=/etc/matrix-synapse/homeserver.yaml --config-path=/etc/matrix-synapse/conf.d/ (code=exited, status=1/FAILURE)
   Main PID: 428006 (code=exited, status=1/FAILURE)
        CPU: 4.496s

Log:

2024-06-11 21:43:09,382 - root - 355 - WARNING - main - ***** STARTING SERVER *****
2024-06-11 21:43:09,382 - root - 356 - WARNING - main - Server /opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/app/homeserver.py version 1.109.0rc2
2024-06-11 21:43:09,383 - root - 361 - WARNING - main - Copyright (c) 2023 New Vector, Inc
2024-06-11 21:43:09,383 - root - 362 - WARNING - main - Licensed under the AGPL 3.0 license. Website: https://github.com/element-hq/synapse
2024-06-11 21:43:09,638 - synapse.storage.util.sequence - 180 - WARNING - main - Postgres sequence push_rules_stream_sequence is behind table push_rules_stream: 73 < 77
2024-06-11 21:43:09,639 - synapse.app._base - 220 - ERROR - main - Exception during startup
Traceback (most recent call last):
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/app/homeserver.py", line 359, in setup
    hs.setup()
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/server.py", line 348, in setup
    self.datastores = Databases(self.DATASTORE_CLASS, self)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/__init__.py", line 100, in __init__
    main = main_store_class(database, db_conn, hs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/__init__.py", line 170, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/events_bg_updates.py", line 107, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/experimental_features.py", line 40, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/devices.py", line 1747, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/devices.py", line 95, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/roommember.py", line 1549, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/room.py", line 2207, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/room.py", line 1750, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/room.py", line 150, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/registration.py", line 2327, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/profile.py", line 44, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/presence.py", line 83, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/presence.py", line 64, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/transactions.py", line 81, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/state.py", line 751, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/state.py", line 109, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/state.py", line 549, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/event_federation.py", line 2064, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/media_repository.py", line 199, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/media_repository.py", line 112, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/filtering.py", line 49, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/event_push_actions.py", line 1896, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/metrics.py", line 75, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/event_push_actions.py", line 263, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/end_to_end_keys.py", line 1449, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/e2e_room_keys.py", line 60, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/search.py", line 435, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/search.py", line 144, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/account_data.py", line 68, in __init__
    super().__init__(database, db_conn, hs)
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/databases/main/push_rule.py", line 143, in __init__
    self._push_rules_stream_id_gen = MultiWriterIdGenerator(
                                     ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/util/id_generators.py", line 298, in __init__
    self._sequence_gen.check_consistency(
  File "/opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/storage/util/sequence.py", line 187, in check_consistency
    raise IncorrectDatabaseSetup(
synapse.storage.engines._base.IncorrectDatabaseSetup: 
Postgres sequence 'push_rules_stream_sequence' is inconsistent with associated
table 'push_rules_stream'. This can happen if Synapse has been downgraded and
then upgraded again, or due to a bad migration.

To fix this error, shut down Synapse (including any and all workers)
and run the following SQL:

    SELECT setval('push_rules_stream_sequence', (
        SELECT GREATEST(MAX(stream_id), 0) FROM push_rules_stream
    ));

See docs/postgres.md for more information.

My understanding is that it cannot automatically perform the operation:

 SELECT setval('push_rules_stream_sequence', (
 SELECT GREATEST(MAX(stream_id), 0) FROM push_rules_stream
 ));

and this SQL query needs to be executed independently?

ELForcer commented 3 months ago

At the moment I have rolled back to package version 1.108.0.

frebib commented 3 months ago

+1, I am also hitting this Postgres sequence 'push_rules_stream_sequence' is inconsistent with associated table 'push_rules_stream'. This can happen if Synapse has been downgraded and then upgraded again, or due to a bad migration. error now after rolling between 1.108.0, 1.109.0rc1 and 1.109.0rc2 when I noticed that sync'ing stopped working after upgrading to 1.109.0rc2. Also have rolled back to 1.108.0 for now without running the query until we hear anything about potential next steps

sandhose commented 3 months ago

I was able to reproduce locally. Steps to reproduce:

Database looks like this:

synapse=# select stream_id, event_stream_ordering, instance_name from push_rules_stream;
 stream_id | event_stream_ordering | instance_name
-----------+-----------------------+---------------
         2 |                    10 |
         3 |                    12 |
         4 |                    13 | master
         5 |                    14 |

synapse=# select instance_name, stream_id from stream_positions where stream_name = 'push_rules_stream';
 instance_name | stream_id
---------------+-----------
 master        |         4

Even simpler to trigger:

sandhose commented 3 months ago

I believe we fixed this, debs for v1.109.0rc3 with the fix should be available in a few hours. If it indeed fixes your issue, we should be able to release v1.109.0 tomorrow

frebib commented 3 months ago

rc3 works fine for me. Sync seems to work fine, too. Not sure what the weirdness I saw with rc1/rc2 was about. Thanks :)

ELForcer commented 3 months ago

I confirm. Upgrading to 1.109.0rc3 no longer crashes the homeserver. Messages reach the recipient. There are no more errors in the Log.

2024-06-18 10:12:28,298 - root - 355 - WARNING - main - ***** STARTING SERVER *****
2024-06-18 10:12:28,298 - root - 356 - WARNING - main - Server /opt/venvs/matrix-synapse/lib/python3.11/site-packages/synapse/app/homeserver.py version 1.109.0rc3
2024-06-18 10:12:28,298 - root - 361 - WARNING - main - Copyright (c) 2023 New Vector, Inc
2024-06-18 10:12:28,299 - root - 362 - WARNING - main - Licensed under the AGPL 3.0 license. Website: https://github.com/element-hq/synapse
2024-06-18 10:12:28,587 - synapse.storage.util.sequence - 167 - WARNING - main - Postgres sequence push_rules_stream_sequence is behind table push_rules_stream: 73 < 89. Updating sequence.
2024-06-18 10:12:28,602 - synapse.storage.util.sequence - 167 - WARNING - main - Postgres sequence e2e_cross_signing_keys_sequence is behind table e2e_cross_signing_keys: 911 < 915. Updating sequence.
2024-06-18 10:12:28,716 - synapse.storage.util.sequence - 167 - WARNING - main - Postgres sequence device_lists_sequence is behind table device_lists_stream: 3598 < 3621. Updating sequence.
ELForcer commented 3 months ago

The task can be considered closed. Thank you all.

djmaze commented 2 months ago

Just upgraded from 104.0 to 109.0 via the docker image (or rather a derived image with a simple change which installs the synapse-s3-storage-provider package as well). No workers are configured. Shortly after starting, the log fills up with:

synapse.handlers.sync - 527 - WARNING - GET-1863- Timed out waiting for worker to catch up. Returning empty response

It seems there are no other related errors in the logs.

Afterwards, Element-X (on Android) does not sync any changes anymore, same with the nheko desktop client. Element web does seem to work correctly though.

For now, I downgraded to 108.0, which seems to fix all these problems, and synapse does not report the error anymore.

ananace commented 2 months ago

Ran into a very similar issue on updating to 109.0 via docker image as well, with messages delivery freezing entirely. This with only a federation-sender worker and one worker that handles the summary MSC endpoint.

Log contained;

synapse.handlers.sync - 527 - WARNING - GET-24096- Timed out waiting for worker to catch up. Returning empty response
...
synapse.notifier - 780 - INFO - GET-24113- Waiting for current token to reach StreamToken(room_key=RoomStreamToken(stream=1314143, instance_map=immutabledict({}), topological=None), presence_key=61707880, typing_key=8188, receipt_key=MultiWriterStreamToken(stream=3056248, instance_map=immutabledict({})), account_data_key=252093, push_rules_key=267, to_device_key=65220, device_list_key=7269565, groups_key=0, un_partial_stated_rooms_key=620); currently at StreamToken(room_key=RoomStreamToken(stream=1314372, instance_map=immutabledict({}), topological=None), presence_key=61725417, typing_key=165, receipt_key=MultiWriterStreamToken(stream=3056824, instance_map=immutabledict({})), account_data_key=252242, push_rules_key=267, to_device_key=65277, device_list_key=7272052, groups_key=0, un_partial_stated_rooms_key=588)

With the "Waiting for current token to reach StreamToken(...)" message being spammed rapidly.

In digging through it and the database, I found that the un_partial_stated_room_stream_sequence was at 588 - and also only had data up until 588 in the stream, while the stream tokens that were trying to be reached wanted it to be at 620. Stopping Synapse, updating the sequence manually, and then starting it up again seems to have resolved the pressing issue.

E.g.

SELECT setval('un_partial_stated_room_stream_sequence', 620);
baconsalad commented 2 months ago

I have the same issue. I only upgrade between full releases (no rc) and have been doing so for a while.

2024-06-25 13:51:16,302 - synapse.notifier - 780 - INFO - GET-147 - Waiting for current token to reach StreamToken(room_key=RoomStreamToken(stream=6450, instance_map=immutabledict({}), topological=None), presence_key=143808, typing_key=4, receipt_key=MultiWriterStreamToken(stream=7986, instance_map=immutabledict({})), account_data_key=2248, push_rules_key=27, to_device_key=19021, device_list_key=1122, groups_key=0, un_partial_stated_rooms_key=1); currently at StreamToken(room_key=RoomStreamToken(stream=6708, instance_map=immutabledict({}), topological=None), presence_key=144401, typing_key=0, receipt_key=MultiWriterStreamToken(stream=8026, instance_map=immutabledict({})), account_data_key=2274, push_rules_key=27, to_device_key=19012, device_list_key=1136, groups_key=0, un_partial_stated_rooms_key=1)

2024-06-25 13:51:16,804 - synapse.handlers.sync - 527 - WARNING - GET-147 - Timed out waiting for worker to catch up. Returning empty response

The server is not federated and un_partial_stated_rooms_key=1 is the same. And don't know if this matters but I don't use systemd.

ananace commented 2 months ago

@baconsalad Your log seems to list two keys out of phase; waiting for .. typing_key=4, to_device_key=19021, .. currently at .. typing_key=0, to_device_key=19012,

djmaze commented 2 months ago

I might have a reason & solution for my problem. After a recent reconfiguration, synapse was unable to reach my unified push server, leaving messages like this in the log:

synapse.push.httppusher - 416 - WARNING - httppush.process-793- Failed to push data to @REDACTED/im.vector.app.android/https://REDACTED/REDACTED?up=1: <class 'synapse.api.errors.HttpResponseException'> 504: Gateway Timeout

With versions < 1.109.0, this did not block messages from being forwarded to the client (apart from the push not working reliably). With 1.109.0, this probably does not work anymore, maybe because of https://github.com/element-hq/synapse/pull/17240? (Update: No, that's probably not the right PR.)

Just a theory for now. I fixed my configuration and will try the update again soon.

erikjohnston commented 2 months ago

I think https://github.com/element-hq/synapse/pull/17386 should fix this :crossed_fingers:

samcday commented 1 month ago

I originally encountered this when upgrading from v1.108.0 to v1.109.0, forcing me to rollback to v1.108.0. I then upgraded to v1.110.0 when it was released, and the issue seemed to be resolved. However, I've just now discovered my homeserver is stuck with this issue again on v1.111.0:

Details

``` Starting synapse with args -m synapse.app.homeserver --config-path /config/homeserver.yaml 2024-07-24 16:51:23,537 - root - 355 - WARNING - main - ***** STARTING SERVER ***** 2024-07-24 16:51:23,537 - root - 356 - WARNING - main - Server /usr/local/lib/python3.11/site-packages/synapse/app/homeserver.py version 1.111.0 2024-07-24 16:51:23,537 - root - 361 - WARNING - main - Copyright (c) 2023 New Vector, Inc 2024-07-24 16:51:23,537 - root - 362 - WARNING - main - Licensed under the AGPL 3.0 license. Website: https://github.com/element-hq/synapse 2024-07-24 16:51:23,537 - root - 365 - INFO - main - Server hostname: samcday.com 2024-07-24 16:51:23,537 - root - 366 - INFO - main - Instance name: master 2024-07-24 16:51:23,537 - root - 367 - INFO - main - Twisted reactor: EPollReactor 2024-07-24 16:51:23,537 - synapse.app.homeserver - 356 - INFO - main - Setting up server 2024-07-24 16:51:23,537 - synapse.server - 351 - INFO - main - Setting up. 2024-07-24 16:51:23,598 - synapse.storage.databases - 73 - INFO - main - [database config 'master']: Checking database server 2024-07-24 16:51:23,611 - synapse.storage.databases - 76 - INFO - main - [database config 'master']: Preparing for databases ['main', 'state'] 2024-07-24 16:51:23,612 - synapse.storage.prepare_database - 135 - INFO - main - ['main', 'state']: Checking existing schema version 2024-07-24 16:51:23,624 - synapse.storage.prepare_database - 139 - INFO - main - ['main', 'state']: Existing schema is 85 (+6 deltas) 2024-07-24 16:51:23,624 - synapse.storage.databases.main - 391 - INFO - main - Checking database for consistency with configuration... 2024-07-24 16:51:23,626 - synapse.storage.prepare_database - 433 - INFO - main - Applying schema deltas for v85 2024-07-24 16:51:23,631 - synapse.storage.prepare_database - 568 - INFO - main - Schema now up to date 2024-07-24 16:51:23,645 - synapse.storage.databases - 91 - INFO - main - [database config 'master']: Starting 'main' database -snip- 2024-07-24 16:51:23,930 - synapse.app._base - 220 - ERROR - main - Exception during startup Traceback (most recent call last): File "/usr/local/lib/python3.11/site-packages/synapse/app/homeserver.py", line 359, in setup hs.setup() File "/usr/local/lib/python3.11/site-packages/synapse/server.py", line 353, in setup self.datastores = Databases(self.DATASTORE_CLASS, self) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/__init__.py", line 100, in __init__ main = main_store_class(database, db_conn, hs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/__init__.py", line 170, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/events_bg_updates.py", line 107, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/experimental_features.py", line 44, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/devices.py", line 1772, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/devices.py", line 95, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/roommember.py", line 1504, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/room.py", line 2229, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/room.py", line 1772, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/room.py", line 146, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/registration.py", line 2327, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/profile.py", line 44, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/presence.py", line 80, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/presence.py", line 61, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/transactions.py", line 81, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/state.py", line 751, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/state.py", line 109, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/state.py", line 549, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/event_federation.py", line 2072, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/media_repository.py", line 199, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/media_repository.py", line 112, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/filtering.py", line 49, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/event_push_actions.py", line 1896, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/metrics.py", line 75, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/event_push_actions.py", line 263, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/end_to_end_keys.py", line 1449, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/e2e_room_keys.py", line 60, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/search.py", line 435, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/search.py", line 144, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/account_data.py", line 65, in __init__ super().__init__(database, db_conn, hs) File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/push_rule.py", line 143, in __init__ self._push_rules_stream_id_gen = MultiWriterIdGenerator( ^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/synapse/storage/util/id_generators.py", line 295, in __init__ self._sequence_gen.check_consistency( File "/usr/local/lib/python3.11/site-packages/synapse/storage/util/sequence.py", line 192, in check_consistency raise IncorrectDatabaseSetup( synapse.storage.engines._base.IncorrectDatabaseSetup: Postgres sequence 'push_rules_stream_sequence' is inconsistent with associated stream position of 'push_rules_stream' in the 'stream_positions' table. This is likely a programming error and should be reported at https://github.com/matrix-org/synapse. A temporary workaround to fix this error is to shut down Synapse (including any and all workers) and run the following SQL: DELETE FROM stream_positions WHERE stream_name = 'push_rules_stream'; This will need to be done every time the server is restarted. ```