Closed sanderegg closed 1 month ago
Related to #2140
The swarm is already configured to have zero downtime per service (i.e. a given service gets turned off ONLY when the new one is started). The problem might be that even if each service is ready, the state between services is not. For example, the new webserver is updated correctly but the Traefik proxy has not yet detected it. That would cause a bad gateway (502) failure on a front-end request.
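One way to check for that window could be to poll an endpoint while the service is being updated and record every gateway error. The snippet below is only a sketch of that idea, not the test used in this issue: `probe_during_update`, `fetch_status`, and the default set of status codes are hypothetical names and assumptions made for illustration. `fetch_status` stands for any callable that performs one request through the proxy and returns the HTTP status code.

```python
import time
from typing import Callable, FrozenSet, List, Tuple

# Assumption: these are the statuses a reverse proxy typically returns
# when it cannot reach a backend. Pass another set to also count e.g. 500.
GATEWAY_ERRORS: FrozenSet[int] = frozenset({502, 503, 504})


def probe_during_update(
    fetch_status: Callable[[], int],
    attempts: int,
    interval_s: float = 0.0,
    bad_statuses: FrozenSet[int] = GATEWAY_ERRORS,
) -> List[Tuple[int, int]]:
    """Call fetch_status `attempts` times and return (attempt index, status)
    for every response whose status is in bad_statuses."""
    failures: List[Tuple[int, int]] = []
    for attempt in range(attempts):
        status = fetch_status()
        if status in bad_statuses:
            failures.append((attempt, status))
        if interval_s > 0:
            # Pause between requests so the probe spans the whole update.
            time.sleep(interval_s)
    return failures


# Usage with a fake client that simulates a short gap during the update.
fake_responses = iter([200, 200, 502, 502, 200])
gaps = probe_during_update(lambda: next(fake_responses), attempts=5)
print(gaps)
```

An empty result would suggest that no request hit the gap between the new service becoming healthy and the proxy picking it up, at least at the chosen polling interval.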
Testing like so:
duplicate of #5614
Use-case:
references:
graylog entries related to failed e2e
The e2e test of isolve-mpi failed with the webserver returning a 500 when listing projects. The logs show that the webserver was restarting at that moment.
Docker reference:
Docker swarm starts a new service (the replacement webserver) and waits until it is healthy. Once it is healthy, swarm stops the replaced service, waits 10 seconds, and then kills it if it is still around.
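The sequence described above can be written down as a small timeline model, which makes it easier to reason about how long both webservers coexist. This is a simplified sketch of the description only, not of Docker's implementation; `start_first_timeline` and its parameters are hypothetical names, and the 10 second default is taken from the text above.

```python
from typing import List, Optional, Tuple


def start_first_timeline(
    health_delay_s: float,
    grace_s: float = 10.0,
    old_exits_after_s: Optional[float] = None,
) -> List[Tuple[float, str]]:
    """Return (time in seconds, event) pairs for one rolling update.

    health_delay_s: time the new task needs to become healthy.
    grace_s: how long the old task may keep running after the stop request.
    old_exits_after_s: time the old task needs to shut down by itself,
        or None if it never exits on its own.
    """
    events: List[Tuple[float, str]] = [
        (0.0, "new task started"),
        (health_delay_s, "new task healthy"),
        (health_delay_s, "old task asked to stop"),
    ]
    if old_exits_after_s is not None and old_exits_after_s <= grace_s:
        # The old task shut down within the grace period.
        events.append((health_delay_s + old_exits_after_s, "old task exited"))
    else:
        # The old task was still around after the grace period.
        events.append((health_delay_s + grace_s, "old task killed"))
    return events


for when, event in start_first_timeline(health_delay_s=5.0):
    print(f"{when:5.1f}s  {event}")
```

In this model both tasks are up between "new task healthy" and the last event, so a request can reach either one during that interval, depending on what the proxy currently knows.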
[x] : restarting webserver works
[ ] : restarting database
[ ] : restarting other subsystems