Addon should auto restart

I got these errors last night after a brief power outage:

websockets.exceptions.ConnectionClosedOK: sent 1000 (OK); then received 1000 (OK)
2024-11-16 23:54:17,127 - asyncio              - ERROR   - Future exception was never retrieved
future: <Future finished exception=ConnectionClosedOK(Close(code=1000, reason=''), Close(code=<CloseCode.NORMAL_CLOSURE: 1000>, reason=''), False)>
websockets.exceptions.ConnectionClosedOK: sent 1000 (OK); then received 1000 (OK)
2024-11-16 23:54:17,127 - asyncio              - ERROR   - Future exception was never retrieved
future: <Future finished exception=ConnectionClosedOK(Close(code=1000, reason=''), Close(code=<CloseCode.NORMAL_CLOSURE: 1000>, reason=''), False)>
websockets.exceptions.ConnectionClosedOK: sent 1000 (OK); then received 1000 (OK)
2024-11-16 23:54:17,145 - asyncio              - ERROR   - Future exception was never retrieved
future: <Future finished exception=ConnectionClosedOK(Close(code=1000, reason=''), Close(code=<CloseCode.NORMAL_CLOSURE: 1000>, reason=''), False)>
websockets.exceptions.ConnectionClosedOK: sent 1000 (OK); then received 1000 (OK)
2024-11-16 23:54:18,280 - Starting tydom2mqtt
2024-11-16 23:54:18,281 - Hassio environment detected: loading configuration from /data/options.json
2024-11-16 23:54:18,281 - Validating configuration ({

The "Starting tydom2mqtt" line at 2024-11-16 23:54:18,280 is where I restarted the add-on manually; there have been no issues since.

The add-on should either crash on error, so that HA can restart it, or restart itself; otherwise it isn't robust. In the original mrwiwi version, the add-on restarted itself (forever.py relaunched the main script after a crash). IMHO that was a lot more resilient (but that version no longer works).
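To illustrate the forever.py idea, here is a minimal sketch of such a wrapper. It is not the original mrwiwi code: the function name `run_forever`, its parameters, and the `main.py` entry point mentioned below are hypothetical.

```python
import subprocess
import sys
import time


def run_forever(command, restart_delay=5.0, max_restarts=None, sleep=time.sleep):
    """Run `command` and relaunch it each time it exits with a non-zero code.

    Returns the number of restarts performed once the command exits cleanly
    or once `max_restarts` (if given) has been reached.
    """
    restarts = 0
    while True:
        # Blocks until the child process exits and returns its exit code.
        returncode = subprocess.call(command)
        if returncode == 0:
            # Clean exit: nothing to recover from.
            return restarts
        if max_restarts is not None and restarts >= max_restarts:
            # Give up so that an outer supervisor can take over.
            return restarts
        restarts += 1
        print(
            f"{command!r} exited with code {returncode}, restart #{restarts}",
            file=sys.stderr,
        )
        # Short pause to avoid a tight crash loop.
        sleep(restart_delay)
```

It could be called as `run_forever([sys.executable, "main.py"])`, assuming the main script exits with a non-zero code when it hits an unrecoverable error.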

Could you please allow the add-on to restart itself after, for example, 2 errors?
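A rough sketch of what "exit after 2 errors" could look like, using only the standard `logging` module. This is an assumption about how it might be wired in, not how tydom2mqtt is structured: the class name `ExitAfterErrors` and its parameters are hypothetical. It relies on the errors being emitted through `logging` (as the `asyncio - ERROR` lines above are) and on something outside the process restarting it once it has exited.

```python
import logging
import os


class ExitAfterErrors(logging.Handler):
    """Count ERROR-level log records and stop the process at a threshold."""

    def __init__(self, max_errors=2, on_limit=None):
        # Only records at ERROR level or above reach emit().
        super().__init__(level=logging.ERROR)
        self.max_errors = max_errors
        self.error_count = 0
        # Default action: leave the process immediately with exit code 1.
        # os._exit skips cleanup, but it also works when the record is logged
        # from a finalizer, where a raised SystemExit would be ignored.
        self.on_limit = on_limit if on_limit is not None else (lambda: os._exit(1))

    def emit(self, record):
        self.error_count += 1
        if self.error_count >= self.max_errors:
            self.on_limit()
```

Attaching it would be a single call, for example `logging.getLogger().addHandler(ExitAfterErrors(max_errors=2))`, early in the main script.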

Thanks in advance, and thanks for the good work!

For the time being, on a supervised installation, I created a systemd service running a Python script that monitors the add-on's Docker logs and restarts the container when an error appears.

You can set it up with this bash script (thanks, ChatGPT):

#!/bin/bash

# Variables
SERVICE_NAME="monitor_tydom"
SCRIPT_PATH="/usr/local/bin/monitor_tydom.py"
SERVICE_FILE="/etc/systemd/system/${SERVICE_NAME}.service"

# Step 1: Create the Python script
cat << 'EOF' > $SCRIPT_PATH
#!/usr/bin/env python3

import subprocess
import time

# Configuration
SEARCH_TERM = "tydom2mqtt"
ERROR_KEYWORD = "ERROR"

def get_container_name(search_term):
    try:
        # Find the container whose name contains the search term
        result = subprocess.run(
            ["docker", "ps", "--format", "{{.Names}}"],
            stdout=subprocess.PIPE,
            text=True,
            check=True
        )
        containers = result.stdout.splitlines()
        for container in containers:
            if search_term in container:
                return container
    except subprocess.CalledProcessError as e:
        print(f"Error retrieving container list: {e}")
    return None

def monitor_logs(container_name):
    try:
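        # Follow the Docker logs as a continuous stream. "--tail 0" skips
        # lines logged before we attached, so old errors are not replayed.
        # stderr is merged into stdout so both streams are read (and so an
        # unread stderr pipe cannot fill up and block).
        process = subprocess.Popen(
            ["docker", "logs", "-f", "--tail", "0", container_name],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True
        )

        for line in process.stdout:
            # Check whether "ERROR" appears in the log line
            if ERROR_KEYWORD in line:
                print(f"Error detected: {line.strip()}")
                restart_container(container_name)
                # Stop reading; the caller reattaches to the fresh logs
                break

        # Make sure the "docker logs" process is gone before returning
        process.terminate()
        process.wait()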

    except Exception as e:
        print(f"Exception occurred: {e}")

def restart_container(container_name):
    try:
        print(f"Restarting container {container_name}...")
        subprocess.run(["docker", "restart", container_name], check=True)
        print(f"Container {container_name} restarted successfully.")
    except subprocess.CalledProcessError as e:
        print(f"Failed to restart container: {e}")

if __name__ == "__main__":
    container_name = get_container_name(SEARCH_TERM)
    if container_name:
        print(f"Monitoring logs for container: {container_name}")
        while True:
            monitor_logs(container_name)
            # Pause to avoid a tight loop if something goes wrong
            time.sleep(5)
    else:
        print(f"No container found with search term: {SEARCH_TERM}")
EOF

chmod +x $SCRIPT_PATH

# Step 2: Create the systemd unit file
cat << EOF > $SERVICE_FILE
[Unit]
Description=Monitor Tydom2MQTT Docker logs and restart on errors
After=docker.service
Requires=docker.service

[Service]
ExecStart=$SCRIPT_PATH
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal
User=root

[Install]
WantedBy=multi-user.target
EOF

# Step 3: Enable the service
systemctl daemon-reload
systemctl enable ${SERVICE_NAME}.service
systemctl start ${SERVICE_NAME}.service

echo "Service ${SERVICE_NAME} created and started successfully!"
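One caveat with this watcher: if the add-on logs an ERROR on every start (for example while the Tydom box is still offline), it would be restarted in a loop. A possible guard is a cooldown between restarts. The sketch below is only an idea, not part of the script above; the class name `RestartCooldown` and the 60-second default are hypothetical.

```python
import time


class RestartCooldown:
    """Allow at most one restart per `min_interval` seconds."""

    def __init__(self, min_interval=60.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self.last_restart = None

    def allow(self):
        """Return True (and record the time) if a restart is allowed now."""
        now = self.clock()
        if self.last_restart is not None and now - self.last_restart < self.min_interval:
            # Too soon after the previous restart: skip this one.
            return False
        self.last_restart = now
        return True
```

In `monitor_logs`, the call to `restart_container(container_name)` would then be wrapped in `if cooldown.allow():`, with a single `cooldown = RestartCooldown()` created at startup.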

fmartinou / tydom2mqtt

Addon should auto restart #224