Closed IeuanK closed 2 years ago
Is this a new 'feature' in 1.65?
Do you see his behaviour when running 1.64?
Synaps just shuts down. See log; it literally just receives a SIGTERM and decides to shut itself down.
This is not a bug. Synapse gracefully shuts itself down when requested by a SIGTERM. (We could choose to ignore the signal, but that would make Synapse a bad citizen.)
You need to work out what's sending the SIGTERM and why it's doing so. I would guess that there is some kind of resource contention (RAM?) and a monitoring process terminates Synapse to reclaim those resources.
I would recommend setting up metrics monitoring with prometheus/grafana. That might offer clues as to why Synapse is asked to terminate.
I am running a docker stack with the official Synapse image in it.
What does "docker stack" mean---docker-compose? Are there any relevant logs from docker itself rather than the container which might be pertinent?
@DMRobertson
Do you see his behaviour when running 1.64?
I upgraded from v1.50.2 according to the docker images I have pulled, the issue did not appear there.
You need to work out what's sending the SIGTERM and why it's doing so. I would guess that there is some kind of resource contention (RAM?) and a monitoring process terminates Synapse to reclaim those resources.
The system has 48 cores and 173GB memory, I have never seen more than ~300% CPU usage at any given time, and it usually has over 100GB memory free. There is nothing actively pruning busy containers, only Nginx Proxy Manager, hif1/heisenbridge and mautrix/whatsapp have any way to connect to Synapse. No other containers, even ones that have much more resource usage, are getting shut down.
Whatever it is has to be either inside the matrixdotorg/synapse image, or one of the two bridges, but those show no logging besides the ordinary events either.
What does "docker stack" mean---docker-compose? Are there any relevant logs from docker itself rather than the container which might be pertinent?
I use docker swarm stacks, so yes docker compose. Below is my docker-compose file for this stack. Note the nginxpm
network that is used solely for the reverse proxy to the Synapse server name.
Docker itself keeps metrics that all look perfectly fine, Synapse uses so few resources I can't imagine that it's bottlenecked or being cut off in any way.
version: '3.7'
services:
synapse:
image: matrixdotorg/synapse:v1.65.0
deploy:
replicas: 1
environment:
- SYNAPSE_SERVER_NAME=<snip>
- SYNAPSE_REPORT_STATS=yes
volumes:
- /vmdata/services/matrix_synapse/data:/data
- /vmdata/services/matrix_synapse:/synapse
- /vmdata/services/matrix_synapse/data/py/shared_secret_authenticator.py:/usr/local/lib/python3.9/site-packages/shared_secret_authenticator.py
networks:
- default
- nginxpm
whatsapp:
image: dock.mau.dev/mautrix/whatsapp:latest
deploy:
replicas: 1
volumes:
- /vmdata/services/matrix_synapse:/synapse
- /vmdata/services/matrix_synapse/mautrix-whatsapp:/data
networks:
- default
heisenbridge:
image: hif1/heisenbridge:1.14
deploy:
replicas: 1
volumes:
- /vmdata/services/matrix_synapse:/synapse
- /vmdata/services/matrix_synapse/heisenbridge:/data
entrypoint: /data/heisenbridge-init.sh
networks:
- default
networks:
nginxpm:
external: true
common:
external: true
Do you see his behaviour when running 1.64?
I upgraded from v1.50.2 according to the docker images I have pulled, the issue did not appear there.
That's quite a wide version range; it'd be great if you could narrow that down by trying to reproduce the issue with intermediate releases.
I note that the image has a periodic HEALTHCHECK. I wonder if that is failing to respond all of a sudden? I can't find a solid reference for what docker will do if a HEALTHCHECK fails, but it sounds reasonable that docker (or some layer above it) would to stop and restart the container. Can you see anything in Synapse's logs which indicate requests to /heatlh
?
Additionally, the Dockerfile docs say
To help debug failing probes, any output text (UTF-8 encoded) that the command writes on stdout or stderr will be stored in the health status and can be queried with
docker inspect
. Such output should be kept short (only the first 4096 bytes are stored currently).When the health status of a container changes, a
health_status
event is generated with the new status.
I suggest looking for evidence of failing healthchecks. In particular docker events --since TIMESTAMP
looks like it'll be useful, if you know roughly when synapse received one of these SIGTERMs.
Can you see anything in Synapse's logs which indicate requests to
/heatlh
?
Actually, you probably won't find anything here unless you have debug logging enabled:
So it's probably best to concentrate on docker inspect
or docker events
here.
The Synapse log does mention 'health':
matrix_synapse.1.cwo8bra2rt2u@beast | 2022-08-25 10:07:49,350 - synapse.util.httpresourcetree - 47 - INFO - sentinel - Attaching <synapse.rest.health.HealthResource object at 0x7f350a5c9f10> to path b'/health'
I ran docker events | grep synapse
to monitor for any events from this service, which does show a continuous cycle of the events below. Seems like this is just the health check, which is working fine. I added some extra line breaks and indents for readability:
2022-08-25T13:38:25.224044162+02:00
container exec_create: /bin/sh -c curl -fSs http://localhost:8008/health || exit 1 f2e8851dbbe5eac67f322f738f45e03ea392a626cc2e3235def5053065832023 (
com.docker.stack.namespace=matrix,
com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0,
com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0,
com.docker.swarm.service.name=matrix_synapse,
com.docker.swarm.task=,
com.docker.swarm.task.id=cwo8bra2rt2usa85btcjiw28v,
com.docker.swarm.task.name=matrix_synapse.1.cwo8bra2rt2usa85btcjiw28v,
execID=35811821f96fda310e55e8eff3d30547f4b3ea7eef7bcd79de1275a91e6cc7cf,
gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07,
image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889,
name=matrix_synapse.1.cwo8bra2rt2usa85btcjiw28v,
org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md,
org.opencontainers.image.licenses=Apache-2.0,
org.opencontainers.image.source=https://github.com/matrix-org/synapse.git,
org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse
)
2022-08-25T13:38:25.224381317+02:00
container exec_start: /bin/sh -c curl -fSs http://localhost:8008/health || exit 1 f2e8851dbbe5eac67f322f738f45e03ea392a626cc2e3235def5053065832023 (
com.docker.stack.namespace=matrix,
com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0,
com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0,
com.docker.swarm.service.name=matrix_synapse,
com.docker.swarm.task=,
com.docker.swarm.task.id=cwo8bra2rt2usa85btcjiw28v,
com.docker.swarm.task.name=matrix_synapse.1.cwo8bra2rt2usa85btcjiw28v,
execID=35811821f96fda310e55e8eff3d30547f4b3ea7eef7bcd79de1275a91e6cc7cf,
gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07,
image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889,
name=matrix_synapse.1.cwo8bra2rt2usa85btcjiw28v,
org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md,
org.opencontainers.image.licenses=Apache-2.0,
org.opencontainers.image.source=https://github.com/matrix-org/synapse.git,
org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse
)
2022-08-25T13:38:25.378195984+02:00
container exec_die f2e8851dbbe5eac67f322f738f45e03ea392a626cc2e3235def5053065832023 (
com.docker.stack.namespace=matrix,
com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0,
com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0,
com.docker.swarm.service.name=matrix_synapse,
com.docker.swarm.task=,
com.docker.swarm.task.id=cwo8bra2rt2usa85btcjiw28v,
com.docker.swarm.task.name=matrix_synapse.1.cwo8bra2rt2usa85btcjiw28v,
execID=35811821f96fda310e55e8eff3d30547f4b3ea7eef7bcd79de1275a91e6cc7cf,
exitCode=0,
gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07,
image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889,
name=matrix_synapse.1.cwo8bra2rt2usa85btcjiw28v,
org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md,
org.opencontainers.image.licenses=Apache-2.0,
org.opencontainers.image.source=https://github.com/matrix-org/synapse.git,
org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse
)
Important to note in the above that the container itself does not reboot, the create, start and die are just for the health check process. The task ID (container ID) stays the same.
If I exec into the synapse container and manually run curl http://localhost:8008/health
it just returns OK
, so that is also working from the inside
You've demonstrated that the healthcheck is currently succeeding. What we want to know is if the healthcheck failed just before Synapse received a SIGTERM.
From your logs:
matrix_synapse.1.mybcsb5py8xx@beast | 2022-08-25 05:31:39,819 - twisted - 274 - INFO - sentinel - Received SIGTERM, shutting down.
I suggest using docker events --since '2022-08-25T05:31:00'
.
I sadly don't seem to be getting any events for the task that was shut down, only the current one (cwo8bra2rt2u)
cwo8bra2rt2u matrix_synapse.1 matrixdotorg/synapse:v1.65.0 beast Running Running 2 hours ago
9k6elyuqfm12 \_ matrix_synapse.1 matrixdotorg/synapse:v1.65.0 beast Shutdown Complete 2 hours ago
css4r2i9s8he \_ matrix_synapse.1 matrixdotorg/synapse:v1.65.0 beast Shutdown Complete 2 hours ago
p4owu5h29kh2 \_ matrix_synapse.1 matrixdotorg/synapse:v1.65.0 beast Shutdown Complete 5 hours ago
y26oljfjpyin \_ matrix_synapse.1 matrixdotorg/synapse:v1.65.0 beast Shutdown Complete 5 hours ago
These are the current and shut down tasks, I've tried various timestamps and checked for the four IDs that were shut down, but no dice. I'll start up a manual log of the events to a file, then check that file when it next crashes.
I've now confirmed that it does go 'unhealthy' and then kills itself. Also interesting is that (as shown above as well) it creates a new task, that task sticks on 'preparing' for a long time, then dies, then a second task is created that just immediately goes to 'running'.
Now to figure out what could possibly be causing it to go unhealthy, and why it happens every few hours. I guess the next step is turning on debug logging for Synapse? It looks like it hits a 10-second timeout, but I saw no spike in resource usage on the host or in docker stats
.
2022-08-25T14:33:05.800870614+02:00 container exec_create: /bin/sh -c curl -fSs http://localhost:8008/health || exit 1 f2e8851dbbe5eac67f322f738f45e03ea392a626cc2e3235def5053065832023 (com.docker.stack.namespace=matrix, com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0, com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0, com.docker.swarm.service.name=matrix_synapse, com.docker.swarm.task=, com.docker.swarm.task.id=cwo8bra2rt2usa85btcjiw28v, com.docker.swarm.task.name=matrix_synapse.1.cwo8bra2rt2usa85btcjiw28v, execID=d5042bc424362b673a67a6025e8baf16a0bf3e8c4049a8e1f6f075cd5c974b7e, gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07, image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889, name=matrix_synapse.1.cwo8bra2rt2usa85btcjiw28v, org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.source=https://github.com/matrix-org/synapse.git, org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse)
2022-08-25T14:33:05.801184313+02:00 container exec_start: /bin/sh -c curl -fSs http://localhost:8008/health || exit 1 f2e8851dbbe5eac67f322f738f45e03ea392a626cc2e3235def5053065832023 (com.docker.stack.namespace=matrix, com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0, com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0, com.docker.swarm.service.name=matrix_synapse, com.docker.swarm.task=, com.docker.swarm.task.id=cwo8bra2rt2usa85btcjiw28v, com.docker.swarm.task.name=matrix_synapse.1.cwo8bra2rt2usa85btcjiw28v, execID=d5042bc424362b673a67a6025e8baf16a0bf3e8c4049a8e1f6f075cd5c974b7e, gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07, image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889, name=matrix_synapse.1.cwo8bra2rt2usa85btcjiw28v, org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.source=https://github.com/matrix-org/synapse.git, org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse)
2022-08-25T14:33:15.977885702+02:00 container health_status: unhealthy f2e8851dbbe5eac67f322f738f45e03ea392a626cc2e3235def5053065832023 (com.docker.stack.namespace=matrix, com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0, com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0, com.docker.swarm.service.name=matrix_synapse, com.docker.swarm.task=, com.docker.swarm.task.id=cwo8bra2rt2usa85btcjiw28v, com.docker.swarm.task.name=matrix_synapse.1.cwo8bra2rt2usa85btcjiw28v, gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07, image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889, name=matrix_synapse.1.cwo8bra2rt2usa85btcjiw28v, org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.source=https://github.com/matrix-org/synapse.git, org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse)
2022-08-25T14:33:18.035644396+02:00 container kill f2e8851dbbe5eac67f322f738f45e03ea392a626cc2e3235def5053065832023 (com.docker.stack.namespace=matrix, com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0, com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0, com.docker.swarm.service.name=matrix_synapse, com.docker.swarm.task=, com.docker.swarm.task.id=cwo8bra2rt2usa85btcjiw28v, com.docker.swarm.task.name=matrix_synapse.1.cwo8bra2rt2usa85btcjiw28v, gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07, image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889, name=matrix_synapse.1.cwo8bra2rt2usa85btcjiw28v, org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.source=https://github.com/matrix-org/synapse.git, org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse, signal=15)
2022-08-25T14:33:18.484391657+02:00 container die f2e8851dbbe5eac67f322f738f45e03ea392a626cc2e3235def5053065832023 (com.docker.stack.namespace=matrix, com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0, com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0, com.docker.swarm.service.name=matrix_synapse, com.docker.swarm.task=, com.docker.swarm.task.id=cwo8bra2rt2usa85btcjiw28v, com.docker.swarm.task.name=matrix_synapse.1.cwo8bra2rt2usa85btcjiw28v, exitCode=0, gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07, image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889, name=matrix_synapse.1.cwo8bra2rt2usa85btcjiw28v, org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.source=https://github.com/matrix-org/synapse.git, org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse)
2022-08-25T14:33:47.167616606+02:00 container stop f2e8851dbbe5eac67f322f738f45e03ea392a626cc2e3235def5053065832023 (com.docker.stack.namespace=matrix, com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0, com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0, com.docker.swarm.service.name=matrix_synapse, com.docker.swarm.task=, com.docker.swarm.task.id=cwo8bra2rt2usa85btcjiw28v, com.docker.swarm.task.name=matrix_synapse.1.cwo8bra2rt2usa85btcjiw28v, gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07, image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889, name=matrix_synapse.1.cwo8bra2rt2usa85btcjiw28v, org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.source=https://github.com/matrix-org/synapse.git, org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse)
2022-08-25T14:34:33.290960396+02:00 container destroy 5f35d2d2ef4bcc7d8fb1e9d7ec8540782371a5160e66c70e395b60ea3644f88c (com.docker.stack.namespace=matrix, com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0, com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0, com.docker.swarm.service.name=matrix_synapse, com.docker.swarm.task=, com.docker.swarm.task.id=y26oljfjpyinht0ceboggb4mi, com.docker.swarm.task.name=matrix_synapse.1.y26oljfjpyinht0ceboggb4mi, gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07, image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889, name=matrix_synapse.1.y26oljfjpyinht0ceboggb4mi, org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.source=https://github.com/matrix-org/synapse.git, org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse)
2022-08-25T14:35:47.513400678+02:00 container create 1e0de4fd9e02684a6689833d96039652a5439ba47b1385914b42307eda2e316f (com.docker.stack.namespace=matrix, com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0, com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0, com.docker.swarm.service.name=matrix_synapse, com.docker.swarm.task=, com.docker.swarm.task.id=gcsav6el4nr0vac71zfrxx81y, com.docker.swarm.task.name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07, image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889, name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.source=https://github.com/matrix-org/synapse.git, org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse)
2022-08-25T14:35:48.499041604+02:00 container start 1e0de4fd9e02684a6689833d96039652a5439ba47b1385914b42307eda2e316f (com.docker.stack.namespace=matrix, com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0, com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0, com.docker.swarm.service.name=matrix_synapse, com.docker.swarm.task=, com.docker.swarm.task.id=gcsav6el4nr0vac71zfrxx81y, com.docker.swarm.task.name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07, image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889, name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.source=https://github.com/matrix-org/synapse.git, org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse)
2022-08-25T14:36:03.497398070+02:00 container exec_create: /bin/sh -c curl -fSs http://localhost:8008/health || exit 1 1e0de4fd9e02684a6689833d96039652a5439ba47b1385914b42307eda2e316f (com.docker.stack.namespace=matrix, com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0, com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0, com.docker.swarm.service.name=matrix_synapse, com.docker.swarm.task=, com.docker.swarm.task.id=gcsav6el4nr0vac71zfrxx81y, com.docker.swarm.task.name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, execID=6294e44a13d2b49d92da7acd05e1d1ff9913811f8d1f6d61d025841fc425c579, gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07, image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889, name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.source=https://github.com/matrix-org/synapse.git, org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse)
2022-08-25T14:36:03.497622599+02:00 container exec_start: /bin/sh -c curl -fSs http://localhost:8008/health || exit 1 1e0de4fd9e02684a6689833d96039652a5439ba47b1385914b42307eda2e316f (com.docker.stack.namespace=matrix, com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0, com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0, com.docker.swarm.service.name=matrix_synapse, com.docker.swarm.task=, com.docker.swarm.task.id=gcsav6el4nr0vac71zfrxx81y, com.docker.swarm.task.name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, execID=6294e44a13d2b49d92da7acd05e1d1ff9913811f8d1f6d61d025841fc425c579, gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07, image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889, name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.source=https://github.com/matrix-org/synapse.git, org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse)
2022-08-25T14:36:29.096652766+02:00 container exec_create: /bin/sh -c curl -fSs http://localhost:8008/health || exit 1 1e0de4fd9e02684a6689833d96039652a5439ba47b1385914b42307eda2e316f (com.docker.stack.namespace=matrix, com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0, com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0, com.docker.swarm.service.name=matrix_synapse, com.docker.swarm.task=, com.docker.swarm.task.id=gcsav6el4nr0vac71zfrxx81y, com.docker.swarm.task.name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, execID=136a98975bea8924e37996960086ec8d4ef9118b4ab1e8a40cfa0efcf529f31d, gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07, image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889, name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.source=https://github.com/matrix-org/synapse.git, org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse)
2022-08-25T14:36:29.096977432+02:00 container exec_start: /bin/sh -c curl -fSs http://localhost:8008/health || exit 1 1e0de4fd9e02684a6689833d96039652a5439ba47b1385914b42307eda2e316f (com.docker.stack.namespace=matrix, com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0, com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0, com.docker.swarm.service.name=matrix_synapse, com.docker.swarm.task=, com.docker.swarm.task.id=gcsav6el4nr0vac71zfrxx81y, com.docker.swarm.task.name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, execID=136a98975bea8924e37996960086ec8d4ef9118b4ab1e8a40cfa0efcf529f31d, gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07, image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889, name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.source=https://github.com/matrix-org/synapse.git, org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse)
2022-08-25T14:36:29.412375220+02:00 container exec_die 1e0de4fd9e02684a6689833d96039652a5439ba47b1385914b42307eda2e316f (com.docker.stack.namespace=matrix, com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0, com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0, com.docker.swarm.service.name=matrix_synapse, com.docker.swarm.task=, com.docker.swarm.task.id=gcsav6el4nr0vac71zfrxx81y, com.docker.swarm.task.name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, execID=136a98975bea8924e37996960086ec8d4ef9118b4ab1e8a40cfa0efcf529f31d, exitCode=0, gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07, image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889, name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.source=https://github.com/matrix-org/synapse.git, org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse)
2022-08-25T14:36:29.426259367+02:00 container health_status: healthy 1e0de4fd9e02684a6689833d96039652a5439ba47b1385914b42307eda2e316f (com.docker.stack.namespace=matrix, com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0, com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0, com.docker.swarm.service.name=matrix_synapse, com.docker.swarm.task=, com.docker.swarm.task.id=gcsav6el4nr0vac71zfrxx81y, com.docker.swarm.task.name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07, image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889, name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.source=https://github.com/matrix-org/synapse.git, org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse)
2022-08-25T14:36:44.427902100+02:00 container exec_create: /bin/sh -c curl -fSs http://localhost:8008/health || exit 1 1e0de4fd9e02684a6689833d96039652a5439ba47b1385914b42307eda2e316f (com.docker.stack.namespace=matrix, com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0, com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0, com.docker.swarm.service.name=matrix_synapse, com.docker.swarm.task=, com.docker.swarm.task.id=gcsav6el4nr0vac71zfrxx81y, com.docker.swarm.task.name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, execID=46d4b83173fdb4226fc154652ec1bbf204a58b61c9f07bc4617cae492208bc11, gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07, image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889, name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.source=https://github.com/matrix-org/synapse.git, org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse)
2022-08-25T14:36:44.428162624+02:00 container exec_start: /bin/sh -c curl -fSs http://localhost:8008/health || exit 1 1e0de4fd9e02684a6689833d96039652a5439ba47b1385914b42307eda2e316f (com.docker.stack.namespace=matrix, com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0, com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0, com.docker.swarm.service.name=matrix_synapse, com.docker.swarm.task=, com.docker.swarm.task.id=gcsav6el4nr0vac71zfrxx81y, com.docker.swarm.task.name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, execID=46d4b83173fdb4226fc154652ec1bbf204a58b61c9f07bc4617cae492208bc11, gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07, image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889, name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.source=https://github.com/matrix-org/synapse.git, org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse)
2022-08-25T14:36:44.530425801+02:00 container exec_die 1e0de4fd9e02684a6689833d96039652a5439ba47b1385914b42307eda2e316f (com.docker.stack.namespace=matrix, com.docker.swarm.node.id=jgp56j490qx9agdu1bjb4vfk0, com.docker.swarm.service.id=0yl3gflz4k58clxpcfd8n73h0, com.docker.swarm.service.name=matrix_synapse, com.docker.swarm.task=, com.docker.swarm.task.id=gcsav6el4nr0vac71zfrxx81y, com.docker.swarm.task.name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, execID=46d4b83173fdb4226fc154652ec1bbf204a58b61c9f07bc4617cae492208bc11, exitCode=0, gitsha1=738c11729afcb978b7d9551af3c4cf7990163d07, image=matrixdotorg/synapse:v1.65.0@sha256:edea7f61a110c2f0eb9aa7821d27f47d20f62c0f59364aea90c5d3381625b889, name=matrix_synapse.1.gcsav6el4nr0vac71zfrxx81y, org.opencontainers.image.documentation=https://github.com/matrix-org/synapse/blob/master/docker/README.md, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.source=https://github.com/matrix-org/synapse.git, org.opencontainers.image.url=https://matrix.org/docs/projects/server/synapse)
I've now confirmed that it does go 'unhealthy' and then kills itself.
To be explicit: Synapse doesn't send the SIGTERM: Synapse receives it, presumably from some part of your docker stack.
I guess the next step is turning on debug logging for Synapse?
That won't hurt, but I think it'll be easier to see what's going on from prometheus metrics because that gives you graphs over time.
Database contention might be a problem. I assume you're using SQLite rather than postgres as your database? (Some servers are bottlenecked by SQLite's write throughput and we strongly recommend using postgres for any "serious" deployments.)
I will set up Prometheus to see if that shows anything relevant, been meaning to do so anyway for monitoring my other containers so now's as good a time as any.
Database contention might be a problem. I assume you're using SQLite rather than postgres as your database? (Some servers are bottlenecked by SQLite's write throughput and we strongly recommend using postgres for any "serious" deployments.)
I am in fact using postgres, per the docs' recommendation. The official postgres docker image runs in a separate swarm stack. I seem to have accidentally cut the network out when redacting the docker-compose I posted, but there is a network connection to that database that's working as expected. No significant resource usage for that container either, so can't imagine that'd be causing any issues here.
To be honest I think your infrastructure is being a bit aggressive by killing Synapse on a single failed health check. Alternatively, maybe you should override the timeout for the healthcheck.
It's not unknown for Synapse to stop responding for 5 seconds, particularly when python garbage-collects.
I have to admit that I'm not sure how to do that in this case.
The Dockerfile has HEALTHCHECK --start-period=5s --interval=15s --timeout=5s
in it, with no retries configured. According to the Dockerfile reference it defaults to three retries.
I don't have anything set up to change this behavior, other than enabling the zfs storage driver my docker itself is all default.
I'll see if I can override the healthcheck in the docker-compose, maybe I just need to increase the timeout further?
I have added the following to the Synapse service:
healthcheck:
test: curl -fSs http://localhost:8008/health || exit 1
interval: 30s
retries: 5
start_period: 10s
timeout: 30s
Will see if this makes a difference
Did that sort you out?
I think so, all tasks in my Matrix stack have been running since I rebooted the docker daemon for something 5 days ago.
So I guess this issue can be closed, with increasing the health check timeout as the solution. Thanks @DMRobertson and @richvdh for helping me figure this out!
Description
I am running a docker stack with the official Synapse image in it.
Absolutely randomly, Synaps just shuts down. See log; it literally just receives a SIGTERM and decides to shut itself down. Everything up to where the copied log starts is just normal Synapse logging
It's driving me absolutely nuts, why would it possibly do this? Is this a new 'feature' in 1.65?
Steps to reproduce
Homeserver
My own homeserver
Synapse Version
v1.65.0
Installation Method
Docker (matrixdotorg/synapse)
Platform
Docker on debian
Relevant log output
Anything else that would be useful to know?
No response