matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0
11.83k stars 2.13k forks source link

Synapse loop restart #16464

Closed aukfood closed 1 year ago

aukfood commented 1 year ago

Description

Hello, my docker container of synapse loop to restart with thoses errors

2023-10-11 12:56:41,686 - synapse.handlers.presence - 878 - INFO - presence.on_shutdown-0 - Performing _on_shutdown. Persisting 0 unpersisted changes
2023-10-11 12:56:41,686 - synapse.handlers.presence - 890 - INFO - presence.on_shutdown-0 - Finished _on_shutdown
2023-10-11 12:56:41,686 - synapse.app._base - 496 - INFO - sentinel - Shutting down...
2023-10-11 12:56:41,687 - synapse.replication.tcp.redis - 299 - INFO - sentinel - Connection to redis server redis:6379 lost: Connection to the other side was lost in a non-clean fashion: Connection lost.
2023-10-11 12:56:41,687 - synapse.replication.tcp.redis - 197 - INFO - sentinel - Lost connection to redis
2023-10-11 12:56:41,687 - synapse.replication.tcp.redis - 299 - INFO - sentinel - Connection to redis server redis:6379 lost: Connection to the other side was lost in a non-clean fashion: Connection lost.
2023-10-11 12:56:41,698 - twisted - 275 - INFO - sentinel - Main loop terminated.

Steps to reproduce

Start docker-compose : docker stack deploy rc01 --compose-file docker-compose.yml

Homeserver

homeserver

Synapse Version

1.94.0

Installation Method

Docker (matrixdotorg/synapse)

Database

Postgresql

Workers

Single process

Platform

Linux Debian 12

Configuration

homeserver.yaml

iapp_service_config_files:
  - /registration.yaml

retention:
  enabled: true

enable_registration: true
enable_registration_without_verification: true

server_name: url_name
report_stats: true
pid_file: /data/homeserver.pid
log_config: "/data/url_name.log.config"
media_store_path: /data/media_store
macaroon_secret_key: *********

trusted_key_servers:
  - server_name: "matrix.org"
suppress_key_server_warning: true

database:
  name: psycopg2
  args:
    user: matrix
    password: ******
    database: ********
    host: postgres
    cp_min: 5
    cp_max: 10

allow_public_rooms_without_auth: true
allow_public_rooms_over_federation: true

redis:
  enabled: true
  host: redis
  port: 6379

federation_custom_ca_list:

Relevant log output

2023-10-11 12:56:41,685 - twisted - 275 - INFO - sentinel - Received SIGTERM, shutting down.
2023-10-11 12:56:41,686 - synapse.storage.databases.main.lock - 106 - INFO - LockStore._on_shutdown-0 - Dropping held locks due to shutdown
2023-10-11 12:56:41,686 - synapse.storage.databases.main.lock - 117 - INFO - LockStore._on_shutdown-0 - Dropped locks due to shutdown
2023-10-11 12:56:41,686 - synapse.handlers.presence - 878 - INFO - presence.on_shutdown-0 - Performing _on_shutdown. Persisting 0 unpersisted changes
2023-10-11 12:56:41,686 - synapse.handlers.presence - 890 - INFO - presence.on_shutdown-0 - Finished _on_shutdown
2023-10-11 12:56:41,686 - synapse.app._base - 496 - INFO - sentinel - Shutting down...
2023-10-11 12:56:41,687 - synapse.replication.tcp.redis - 299 - INFO - sentinel - Connection to redis server redis:6379 lost: Connection to the other side was lost in a non-clean fashion: Connection lost.
2023-10-11 12:56:41,687 - synapse.replication.tcp.redis - 197 - INFO - sentinel - Lost connection to redis
2023-10-11 12:56:41,687 - synapse.replication.tcp.redis - 299 - INFO - sentinel - Connection to redis server redis:6379 lost: Connection to the other side was lost in a non-clean fashion: Connection lost.
2023-10-11 12:56:41,698 - twisted - 275 - INFO - sentinel - Main loop terminated.

Anything else that would be useful to know?

docker-compose part

  matrix:
    image: matrixdotorg/synapse:v1.94.0
    environment:
      SYANPSE_SERVER_NAME: matrix_url
    deploy:
      labels:
        - traefik.enable=true
        - traefik.docker.network=traefik-public
        - traefik.constraint-label=traefik-public
        - traefik.http.routers.rc01-matrix-http.rule=Host(`matrix_url`)
        - traefik.http.routers.rc01-matrix-http.entrypoints=http
        - traefik.http.routers.rc01-matrix-http.middlewares=https-redirect
        - traefik.http.routers.rc01-matrix-https.rule=Host(`matrix_url`)
        - traefik.http.routers.rc01-matrix-https.entrypoints=https
        - traefik.http.routers.rc01-matrix-https.tls=true
        - traefik.http.routers.rc01-matrix-https.tls.certresolver=le
        - traefik.http.services.rc01-matrix.loadbalancer.server.port=8008
        - traefik.http.services.rc01-matrix.loadbalancer.server.scheme=http
    depends_on:
      - postgres
    volumes:
      - matrix_data_rc01:/data
    networks:
      - traefik-public
      - matrix-rc01
      - redis-rc01

  postgres:
    image: postgres:14
    deploy:
      labels:
        - traefik.enable=false
    environment:
      POSTGRES_USER: matrix
      POSTGRES_PASSWORD: PoiuytrezA
      POSTGRES_DB: rocketchat
      POSTGRES_INITDB_ARGS: "--encoding='UTF8' --lc-collate='C' --lc-ctype='C'"
    volumes:
      - postgres_data_rc01:/var/lib/postgresql/data
    networks:
      - matrix-rc01

  redis:
    image: redis
    deploy:
      labels:
        - traefik.enable=false
    networks:
      - redis-rc01
DMRobertson commented 1 year ago

2023-10-11 12:56:41,685 - twisted - 275 - INFO - sentinel - Received SIGTERM, shutting down.

Something told Synapse to shut down, so it did.

You should inspect your docker-compose logs, and maybe even the docker service logs to see if there any clues. Can you see what is shutting Synapse down and why?

clokep commented 1 year ago

(Closing as this is external to Synapse.)

clokep commented 1 year ago

@DMRobertson notes that this could be due to #14401. What was Synapse logging before this?

aukfood commented 1 year ago

@DMRobertson I have 5 services : rocketchat / mongod / postgres / redis / synpase

So if i do docker service logs -f xxx_servicename I have only those logs

synapse :

rc01_matrix.1.vnflimqgal07@swarm01    | Starting synapse with args -m synapse.app.homeserver --config-path /data/homeserver.yaml

mongod :

rc01_mongodb.1.6m9lgw20du5f@swarm02    | {"t":{"$date":"2023-10-11T14:58:44.083+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"Checkpointer","msg":"WiredTiger message","attr":{"message":"[1697036324:83214][1:0x7f7d133bf700], WT_SESSION.checkpoint: [WT_VERB_CHECKPOINT_PROGRESS] saving checkpoint snapshot min: 3017, snapshot max: 3017 snapshot count: 0, oldest timestamp: (1697036018, 1) , meta checkpoint timestamp: (1697036318, 1) base write gen: 42009"}}

@clokep all homeserver.log

2023-10-11 14:58:27,866 - root - 348 - WARNING - main - ***** STARTING SERVER *****
2023-10-11 14:58:27,867 - root - 349 - WARNING - main - Server /usr/local/lib/python3.11/site-packages/synapse/app/homeserver.py version 1.94.0
2023-10-11 14:58:27,868 - root - 354 - INFO - main - Server hostname: server_name
2023-10-11 14:58:27,868 - root - 355 - INFO - main - Instance name: master
2023-10-11 14:58:27,868 - root - 356 - INFO - main - Twisted reactor: EPollReactor
2023-10-11 14:58:27,868 - synapse.app.homeserver - 350 - INFO - main - Setting up server
2023-10-11 14:58:27,869 - synapse.server - 338 - INFO - main - Setting up.
2023-10-11 14:58:27,891 - synapse.storage.databases - 66 - INFO - main - [database config 'master']: Checking database server
2023-10-11 14:58:27,894 - synapse.storage.databases - 69 - INFO - main - [database config 'master']: Preparing for databases ['main', 'state']
2023-10-11 14:58:27,894 - synapse.storage.prepare_database - 128 - INFO - main - ['main', 'state']: Checking existing schema version
2023-10-11 14:58:27,900 - synapse.storage.prepare_database - 132 - INFO - main - ['main', 'state']: Existing schema is 82 (+1 deltas)
2023-10-11 14:58:27,901 - synapse.storage.databases.main - 344 - INFO - main - Checking database for consistency with configuration...
2023-10-11 14:58:27,903 - synapse.storage.prepare_database - 426 - INFO - main - Applying schema deltas for v82
2023-10-11 14:58:27,906 - synapse.storage.prepare_database - 561 - INFO - main - Schema now up to date
2023-10-11 14:58:27,912 - synapse.storage.databases - 84 - INFO - main - [database config 'master']: Starting 'main' database
2023-10-11 14:58:27,960 - synapse.storage.util.id_generators - 92 - INFO - main - Initialising stream generator for pushers(id): 1
2023-10-11 14:58:27,962 - synapse.storage.util.id_generators - 92 - INFO - main - Initialising stream generator for deleted_pushers(stream_id): 1
2023-10-11 14:58:27,965 - synapse.storage.util.id_generators - 92 - INFO - main - Initialising stream generator for push_rules_stream(stream_id): 1
2023-10-11 14:58:27,978 - synapse.storage.util.id_generators - 92 - INFO - main - Initialising stream generator for e2e_cross_signing_keys(stream_id): 1
2023-10-11 14:58:27,978 - synapse.storage.databases.main.event_push_actions - 1153 - INFO - main - Searching for stream ordering 1 month ago
2023-10-11 14:58:27,979 - synapse.storage.databases.main.event_push_actions - 1157 - INFO - main - Found stream ordering 1 month ago: it's 0
2023-10-11 14:58:27,979 - synapse.storage.databases.main.event_push_actions - 1160 - INFO - main - Searching for stream ordering 1 day ago
2023-10-11 14:58:27,980 - synapse.storage.databases.main.event_push_actions - 1164 - INFO - main - Found stream ordering 1 day ago: it's 0
2023-10-11 14:58:27,983 - synapse.storage.util.id_generators - 92 - INFO - main - Initialising stream generator for push_rules(id): 1
2023-10-11 14:58:27,987 - synapse.storage.util.id_generators - 92 - INFO - main - Initialising stream generator for push_rules_enable(id): 1
2023-10-11 14:58:27,995 - synapse.storage.util.id_generators - 92 - INFO - main - Initialising stream generator for access_tokens(id): 1
2023-10-11 14:58:27,999 - synapse.storage.util.id_generators - 92 - INFO - main - Initialising stream generator for refresh_tokens(id): 1
2023-10-11 14:58:28,006 - synapse.storage.util.id_generators - 92 - INFO - main - Initialising stream generator for event_reports(id): 1
2023-10-11 14:58:28,008 - synapse.storage.util.id_generators - 92 - INFO - main - Initialising stream generator for device_lists_stream(stream_id): 1
2023-10-11 14:58:28,010 - synapse.storage.util.id_generators - 92 - INFO - main - Initialising stream generator for user_signature_stream(stream_id): 1
2023-10-11 14:58:28,014 - synapse.storage.util.id_generators - 92 - INFO - main - Initialising stream generator for device_lists_outbound_pokes(stream_id): 1
2023-10-11 14:58:28,017 - synapse.storage.util.id_generators - 92 - INFO - main - Initialising stream generator for device_lists_changes_in_room(stream_id): 1
2023-10-11 14:58:28,020 - synapse.storage.util.id_generators - 92 - INFO - main - Initialising stream generator for device_lists_remote_pending(stream_id): 1
2023-10-11 14:58:28,022 - synapse.storage.util.id_generators - 92 - INFO - main - Initialising stream generator for device_lists_changes_converted_stream_position(stream_id): 1
2023-10-11 14:58:28,023 - synapse.storage.databases - 101 - INFO - main - [database config 'master']: Starting 'state' database
2023-10-11 14:58:28,027 - synapse.storage.databases - 116 - INFO - main - [database config 'master']: prepared
2023-10-11 14:58:28,027 - synapse.server - 341 - INFO - main - Finished setting up.
2023-10-11 14:58:28,066 - synapse.push.pusher - 38 - INFO - main - email enable notifs: False
2023-10-11 14:58:28,071 - synapse.server - 870 - INFO - main - Connecting to redis (host='redis' port=6379) for external cache
2023-10-11 14:58:28,072 - synapse.replication.tcp.redis - 285 - INFO - main - Connecting to redis server redis:6379
2023-10-11 14:58:28,073 - synapse.federation.federation_server - 1374 - INFO - main - Registering federation EDU handler for 'm.device_list_update'
2023-10-11 14:58:28,074 - synapse.federation.federation_server - 1394 - INFO - main - Registering federation query handler for 'profile'
2023-10-11 14:58:28,075 - synapse.federation.federation_server - 1374 - INFO - main - Registering federation EDU handler for 'm.presence'
2023-10-11 14:58:28,075 - synapse.federation.federation_server - 1374 - INFO - main - Registering federation EDU handler for 'm.typing'
2023-10-11 14:58:28,076 - synapse.federation.federation_server - 1394 - INFO - main - Registering federation query handler for 'directory'
2023-10-11 14:58:28,077 - synapse.handlers.pagination - 110 - INFO - main - Setting up purge job with config: RetentionPurgeJob(interval=86400000, shortest_max_lifetime=None, longest_max_lifetime=None)
2023-10-11 14:58:28,077 - twisted - 275 - INFO - main - Redirected stdout/stderr to logs
2023-10-11 14:58:28,077 - synapse.app.homeserver - 173 - INFO - sentinel - Running
2023-10-11 14:58:28,081 - synapse.app.homeserver - 29 - INFO - sentinel - Set file limit to: 1048576
2023-10-11 14:58:28,090 - synapse.handlers.deactivate_account - 234 - INFO - user_parter_loop-0 - Starting user parter
2023-10-11 14:58:28,108 - synapse.util.caches.lrucache - 223 - INFO - sentinel - Expiring LRU caches after 1800 seconds
2023-10-11 14:58:28,109 - synapse.replication.tcp.redis - 285 - INFO - sentinel - Connecting to redis server redis:6379
2023-10-11 14:58:28,113 - synapse.app.homeserver - 216 - INFO - sentinel - Scheduling stats reporting for 3 hour intervals
2023-10-11 14:58:28,180 - synapse.storage.background_updates - 403 - INFO - background_updates-0 - Starting background schema updates for database master
2023-10-11 14:58:28,208 - synapse.push.pusherpool - 373 - INFO - start_pushers-0 - Started pushers
2023-10-11 14:58:28,209 - synapse.handlers.deactivate_account - 244 - INFO - user_parter_loop-0 - User parter finished: stopping
2023-10-11 14:58:28,212 - synapse.replication.tcp.redis - 119 - INFO - sentinel - Connected to redis
2023-10-11 14:58:28,212 - synapse.replication.tcp.redis - 131 - INFO - subscribe-replication-0 - Sending redis SUBSCRIBE for ['rc01-matrix.swarm.aukfood.ovh/USER_IP', 'rc01-matrix.swarm.aukfood.ovh']
2023-10-11 14:58:28,214 - synapse.replication.tcp.redis - 134 - INFO - subscribe-replication-0 - Successfully subscribed to redis stream, sending REPLICATE command
2023-10-11 14:58:28,217 - synapse.replication.tcp.redis - 139 - INFO - subscribe-replication-0 - REPLICATE successfully sent
2023-10-11 14:58:28,226 - synapse.storage.background_updates - 421 - INFO - background_updates-0 - No more background updates to do. Unscheduling background update task.
2023-10-11 14:58:58,006 - synapse.storage.databases.main.event_push_actions - 1307 - INFO - rotate_notifs-0 - Rotating notifications
2023-10-11 14:58:58,016 - synapse.storage.databases.main.event_push_actions - 1511 - INFO - rotate_notifs-0 - Rotating notifications up to: 1
2023-10-11 14:58:58,020 - synapse.storage.databases.main.event_push_actions - 1597 - INFO - rotate_notifs-0 - Rotating notifications, handling 0 rows
2023-10-11 14:58:58,040 - synapse.storage.databases.main.event_push_actions - 1682 - INFO - rotate_notifs-0 - Rotating notifications, deleted 0 push actions
2023-10-11 14:58:58,109 - synapse.util.caches.lrucache - 210 - INFO - LruCache._expire_old_entries-0 - Dropped 0 items from caches
2023-10-11 14:59:12,854 - twisted - 275 - INFO - sentinel - Received SIGTERM, shutting down.
2023-10-11 14:59:12,855 - synapse.storage.databases.main.lock - 106 - INFO - LockStore._on_shutdown-0 - Dropping held locks due to shutdown
2023-10-11 14:59:12,855 - synapse.storage.databases.main.lock - 117 - INFO - LockStore._on_shutdown-0 - Dropped locks due to shutdown
2023-10-11 14:59:12,855 - synapse.handlers.presence - 878 - INFO - presence.on_shutdown-0 - Performing _on_shutdown. Persisting 0 unpersisted changes
2023-10-11 14:59:12,855 - synapse.handlers.presence - 890 - INFO - presence.on_shutdown-0 - Finished _on_shutdown
2023-10-11 14:59:12,855 - synapse.app._base - 496 - INFO - sentinel - Shutting down...
2023-10-11 14:59:12,855 - synapse.replication.tcp.redis - 299 - INFO - sentinel - Connection to redis server redis:6379 lost: Connection to the other side was lost in a non-clean fashion: Connection lost.
2023-10-11 14:59:12,856 - synapse.replication.tcp.redis - 197 - INFO - sentinel - Lost connection to redis
2023-10-11 14:59:12,856 - synapse.replication.tcp.redis - 299 - INFO - sentinel - Connection to redis server redis:6379 lost: Connection to the other side was lost in a non-clean fashion: Connection lost.
2023-10-11 14:59:12,865 - twisted - 275 - INFO - sentinel - Main loop terminated.
DMRobertson commented 1 year ago

2023-10-11 14:58:27,906 - synapse.storage.prepare_database - 561 - INFO - main - Schema now up to date

Thanks for the full Synapse logs. This rules out a situation like #14401.

You should inspect your docker-compose logs, and maybe even the docker service logs to see if there any clues. Can you see what is shutting Synapse down and why?

So if i do docker service logs -f xxx_servicename I have only those logs

"By docker service logs" I mean docker's own logs. On many Linux distributions I would expect there to be a systemd service for docker whose logs you can see with e.g. journalctl -u docker. If there are no clues there, you could look at the wider system logs in the journal. Since Synapse is told to shutdown here:

2023-10-11 14:59:12,854 - twisted - 275 - INFO - sentinel - Received SIGTERM, shutting down.

I would look for log lines around that time. Try journalctl --since "2023-10-11 14:59:00".

Something told Synapse to shut down, so it did.

Ultimately though, this is the problem and it is outside of Synapse, so I don't think we can help you much. For this kind of support request, please ask in the Synapse Admins room. (If your homeserver is inoperable at the moment, you can sign up for an account on the Matrix.org homeserver in order to reach the support room. )