willfarrell / docker-autoheal

Monitor and restart unhealthy docker containers.
MIT License

Optional Label not working #119

Open tamimology opened 10 months ago

tamimology commented 10 months ago

I have never been able to get the autoheal.stop.timeout label to work. The container is restarted within 10 seconds of becoming unhealthy, regardless of the value I assign to this label.

For example, my homeassistant service in the compose file looks like:

  homeassistant:
    container_name: homeassistant
    restart: always
    privileged: true
    environment:
      - DOCKER_HOST=$SOCKET
      - PUID=$PUID
      - PGID=$PGID
      - TZ=$TZ
    volumes:
      - $PERSIST/homeassistant:/config:rw
    ports:
      - 8123:8123
    labels: 
      autoheal: true
      autoheal.stop.timeout: 240 # 4min
    healthcheck:
      test: curl -fSs http://127.0.0.1:8123 || exit 1
      start_period: 90s
      timeout: 10s
      interval: 5s
      retries: 3
    network_mode: host
    working_dir: /config
    depends_on:
       - mariadb
       - influxdb

The autoheal logs look like:

2023-11-22T06:09:36.876051179Z Monitoring containers for unhealthy status in 600 second(s)
2023-11-22T06:32:25.025755719Z 22-11-2023 17:32:25 Container /homeassistant (a24ff1edcbef) found to be unhealthy - Restarting container now with 240s timeout
2023-11-22T06:32:55.044963574Z 22-11-2023 17:32:25 Restarting container a24ff1edcbef failed
2023-11-22T06:32:56.861499489Z 2023-11-22 00:32:55,185 [INFO] apprise: Loaded 1 entries from memory://
2023-11-22T06:32:56.862506190Z 2023-11-22 00:32:56,511 [INFO] apprise: Sent Pushover notification to ALL_DEVICES.
2023-11-22T06:33:00.265430581Z 22-11-2023 17:33:00 Container /homeassistant (a24ff1edcbef) found to be unhealthy - Restarting container now with 240s timeout
2023-11-22T06:33:30.283927711Z 22-11-2023 17:33:00 Restarting container a24ff1edcbef failed
2023-11-22T06:33:31.717332678Z 2023-11-22 00:33:30,375 [INFO] apprise: Loaded 1 entries from memory://
2023-11-22T06:33:31.717737031Z 2023-11-22 00:33:31,680 [INFO] apprise: Sent Pushover notification to ALL_DEVICES.
2023-11-22T06:33:54.291123258Z 22-11-2023 17:33:54 Container /homeassistant (a24ff1edcbef) found to be unhealthy - Restarting container now with 240s timeout
2023-11-22T06:34:18.103011982Z 2023-11-22 00:34:16,986 [INFO] apprise: Loaded 1 entries from memory://
2023-11-22T06:34:18.103912563Z 2023-11-22 00:34:18,094 [INFO] apprise: Sent Pushover notification to ALL_DEVICES.

In practice, autoheal sent the restart signal without honouring the configured wait time, and it restarted the container even after the container had become healthy again. I expected it to wait for the configured time (240s here), recheck the container, and restart it only if it is still unhealthy.
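
To make the expected behaviour concrete, here is a minimal sketch in Python. It is not autoheal's actual code and says nothing about how the label is implemented today; `restart_if_still_unhealthy`, `is_unhealthy`, `restart` and `grace_seconds` are hypothetical names I made up for illustration, and the health check and restart are passed in as plain callables so the logic can be read (and run) without Docker.

```python
import time
from typing import Callable


def restart_if_still_unhealthy(
    is_unhealthy: Callable[[], bool],
    restart: Callable[[], None],
    grace_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Wait, recheck, and restart only if the container is still unhealthy.

    All names here are hypothetical. Returns True if a restart was issued.
    """
    if not is_unhealthy():
        return False  # container is healthy, nothing to do

    # Give the container the configured time (e.g. 240s) to recover by itself.
    sleep(grace_seconds)

    if not is_unhealthy():
        return False  # recovered during the wait, so no restart

    restart()
    return True
```

With the label set to 240, a container that reports unhealthy and then recovers within four minutes would be left alone, and only one that is still unhealthy at the recheck would be restarted.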