Open jaroslawjanas opened 3 years ago
If your container has dependencies and requires service A to be functional, is it best to broaden its health check to include the other containers?
e.g.
php-apache-application: healthcheck => check-can-access 127.0.0.1:80 && check-can-access mysql-db:3306
mysql-db: healthcheck => check-can-access 127.0.0.1:3306
where check-can-access is something like wait-for-it or a simple telnet/netcat check
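A minimal sketch of such a `check-can-access` helper, assuming bash and coreutils `timeout` are available in the image (the helper name and the `/dev/tcp` approach are illustrative, not part of wait-for-it):

```shell
# Hypothetical check-can-access helper: succeeds only when the given
# host:port accepts a TCP connection. Uses bash's /dev/tcp pseudo-device
# so no netcat/telnet needs to be installed in the container.
check_can_access() {
  host="${1%%:*}"
  port="${1##*:}"
  timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Example usage, in the spirit of the healthcheck lines above:
check_can_access 127.0.0.1:3306 && echo "mysql-db reachable"
```

A compose healthcheck could then call this in a `CMD-SHELL` test, chaining several checks with `&&` as sketched above.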
@jaroslawjanas have you managed to do it?
No, instead I added a health check of my own in the docker-compose file.
```yaml
[REDACTED]:
  image: [REDACTED]
  container_name: [REDACTED]
  restart: unless-stopped
  labels:
    - autoheal=true
  healthcheck:
    test: ["CMD-SHELL", "curl --silent --output /dev/null --show-error --fail https://[REDACTED].com && exit 0 || exit 1"]
    interval: 60s
    timeout: 30s
    retries: 5
    start_period: 100s
```
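As an aside on that test command: `curl --fail` already exits nonzero on HTTP errors, and Docker treats any nonzero status as unhealthy, so the trailing `&& exit 0 || exit 1` only normalizes the exit code to exactly 1. A quick shell demonstration, using `false` as a stand-in for a failing curl:

```shell
# "cmd && exit 0 || exit 1" collapses every nonzero exit code to exactly 1;
# here `false` stands in for a curl invocation that failed.
sh -c 'false && exit 0 || exit 1'
echo "normalized exit: $?"
# prints: normalized exit: 1
```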
This fixed the problem I was struggling with.
I thought about it too; I need to find the right health check for my case (I want to restart the Selenium container if the container(s) that use it fail to start).
I overwrote the entrypoint script with this one, adding support for two new "autoheal" labels: "master" and "slave".
When one of the containers is unhealthy, ALL of them are restarted. First the container with the "master" label (there should be only one) is restarted, and then all the others with the "slave" label.
I use it for an openvpn container (master) and the transmission and soulseek containers tunneled through it (slaves).
The "true" label keeps the same functionality.
Although my post got no thanks or likes, people seem to be using this modified script, so here is a working version of it. It isn't merged with the latest autoheal version, but it works with Alpine 3.18 and Docker 20.10.23.
Your Docker compose needs to include:

```yaml
entrypoint: /entry.sh  # Adds feature: restart all containers (master first) on unhealthy one (master or slave)
command: "autoheal"
```
I finally merged the script with the latest version.
My docker compose looks like:

```yaml
autoheal:
  container_name: autoheal
  image: willfarrell/autoheal:latest
  restart: unless-stopped
  networks:
    - socket_proxy
  security_opt:
    - no-new-privileges:true
  volumes:
    - /etc/localtime:/etc/localtime:ro
    - $DOCKERDIR/autoheal/entry.sh:/entry.sh:ro
  entrypoint: /entry.sh  # Adds feature: restart all containers (master first) on unhealthy one (master or slave)
  command: "autoheal"
  environment:
    - AUTOHEAL_INTERVAL=15
    - AUTOHEAL_RETRIES=40
    - AUTOHEAL_START_PERIOD=300
    - AUTOHEAL_DEFAULT_STOP_TIMEOUT=15
    - WEBHOOK_URL=https://api.telegram.org/bot$TELEGRAM_NOTIFIER_BOT_TOKEN/sendMessage
    - WEBHOOK_JSON_KEY=chat_id":"$TELEGRAM_NOTIFIER_CHAT_ID","text
    - DOCKER_SOCK=tcp://socket_proxy:2375
```
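The odd-looking `WEBHOOK_JSON_KEY` value makes sense if the entrypoint splices it between braces and the message text when building the JSON body. That is an assumption about the script's internals; a sketch of it in plain shell, with an example chat id and message:

```shell
# Presumed payload assembly: the key string carries its own embedded
# quotes so that the final result is valid JSON with two fields.
WEBHOOK_JSON_KEY='chat_id":"12345","text'
msg="Container xyz restarted"
payload="{\"${WEBHOOK_JSON_KEY}\":\"${msg}\"}"
echo "$payload"
# prints: {"chat_id":"12345","text":"Container xyz restarted"}
```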
@baroka Thanks for your hard work.
Without wishing to detract from the good work on this new entrypoint, which adds features such as Telegram notifications, I would like to point out that recent versions of docker-compose (I believe from 2.20 onwards) seem to offer a similar feature. I am still testing it, so it is too early to give an opinion.
It seems possible to restart the whole stack, or at least it appears so. My setup for the stack is:
```yaml
services:
  my-master-service:
    image: ...
    container_name: my_1st_container_name
    ...
    healthcheck:
      test: "ping -c 1 www.google.com || exit 1"
      interval: 60s
      timeout: 5s
      retries: 3
    restart: unless-stopped
  my-1st-slave-service:
    image: ...
    container_name: my_2nd_container_name
    ...
    network_mode: "service:my-master-service"
    depends_on:
      my-master-service:
        condition: service_started
        restart: true
    healthcheck:
      test: "curl --fail http://localhost:my_2nd_container_service_port || exit 1"
      interval: 30s
      timeout: 10s
      retries: 5
    restart: unless-stopped
  my-2nd-slave-service:
    image: ...
    container_name: my_3rd_container_name
    ...
    network_mode: "service:my-master-service"
    depends_on:
      my-master-service:
        condition: service_started
        restart: true
    healthcheck:
      test: "curl --fail http://localhost:my_3rd_container_service_port || exit 1"
      interval: 30s
      timeout: 10s
      retries: 5
    restart: unless-stopped
```
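One detail worth noting about those `depends_on` blocks: Compose also supports `condition: service_healthy`, which would gate the dependents on the master's health check passing rather than on mere startup. A hedged variant of one dependent (untested here):

```yaml
my-1st-slave-service:
  depends_on:
    my-master-service:
      condition: service_healthy  # wait for the master's healthcheck to pass
      restart: true               # restart this service whenever the master is restarted
```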
Like I said, I need to do more testing, so for now it is better to use the modified entrypoint!
Is it possible to restart containers that are dependent on the container that failed the health check? Say I have containers A and B that have C as a dependency. In simple terms, A and B need a healthy C to function properly. Is it possible to restart A and B if C's health check is negative? If not, please consider it as a potential enhancement.