mrlt8 / docker-wyze-bridge

WebRTC/RTSP/RTMP/LL-HLS bridge for Wyze cams in a docker container
GNU Affero General Public License v3.0

Retry MQTT Connection #1047

Open rmaes4 opened 7 months ago

rmaes4 commented 7 months ago

Problem

I am using MQTT to send motion events to Scrypted (which is also acting as my MQTT broker). When I reboot my Raspberry Pi, Docker launches the docker-wyze-bridge container and the scrypted container at the same time. This creates a race condition where docker-wyze-bridge attempts to connect to the scrypted MQTT broker before scrypted has finished initializing, so the MQTT connection fails. The problem is that docker-wyze-bridge does not re-attempt the connection; it just gives up. A retry mechanism is needed to handle this scenario and to cover cases where there is a short loss of connection.

Potential Solution

https://github.com/mrlt8/docker-wyze-bridge/blob/0b7de5997ad90de5bb8bf47be89c9110e342ac54/app/wyzebridge/mqtt.py#L77-L92

I took a quick look at the source code and, from what I can tell, this is where the MQTT connection is created. I also looked at the documentation for the paho-mqtt library and found the following function:

RECONNECT_DELAY_SET

reconnect_delay_set(min_delay=1, max_delay=120)

The client will automatically retry connection. Between each attempt it will wait a number of seconds between min_delay and max_delay.

When the connection is lost, initially the reconnection attempt is delayed of min_delay seconds. It’s doubled between subsequent attempt up to max_delay.

The delay is reset to min_delay when the connection complete (e.g. the CONNACK is received, not just the TCP connection is established).
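The schedule described in the quoted documentation can be illustrated with a short sketch. This is not paho's code: `backoff_delays` is a hypothetical helper that only reproduces the "start at `min_delay`, double up to `max_delay`" behaviour described above, using the default values from the signature.

```python
from itertools import islice
from typing import Iterator


def backoff_delays(min_delay: int = 1, max_delay: int = 120) -> Iterator[int]:
    """Yield waits that start at min_delay and double until capped at max_delay."""
    delay = min_delay
    while True:
        yield delay
        delay = min(delay * 2, max_delay)


# First nine waits (in seconds) with the documented defaults.
first_nine = list(islice(backoff_delays(), 9))
print(first_nine)  # [1, 2, 4, 8, 16, 32, 64, 120, 120]
```

With the defaults, the wait reaches the 120-second cap after seven failed attempts and stays there until a connection succeeds.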

I believe that this issue could be easily solved by modifying the mqtt_sub_topic function to the following:

    @mqtt_enabled
    def mqtt_sub_topic(m_topics: list, callback) -> Optional[paho.mqtt.client.Client]:
        """Connect to mqtt and return the client."""
        client = paho.mqtt.client.Client()

        client.username_pw_set(MQTT_USER, MQTT_PASS or None)
        client.user_data_set(callback)
        client.on_connect = lambda mq_client, *_: (
            mq_client.publish(f"{MQTT_TOPIC}/state", "online"),
            [mq_client.subscribe(f"{MQTT_TOPIC}/{m_topic}") for m_topic in m_topics],
        )
        client.will_set(f"{MQTT_TOPIC}/state", payload="offline", qos=1, retain=True)

        # Proposed addition: back off between automatic reconnect attempts.
        client.reconnect_delay_set(min_delay=1, max_delay=120)

        client.connect(MQTT_HOST, int(MQTT_PORT or 1883), 30)
        client.loop_start()

        return client

I would test this myself and create a PR, but I don't have this project set up for local development. I am hoping this simple change can resolve the issue.

mrlt8 commented 7 months ago

I believe that would only work if an established connection is lost; paho throws an exception if the broker is not up yet, probably an [Errno 111] Connection refused...?

I've added a retry option to the wrapper that defaults to 3 attempts before disabling MQTT, and it should be configurable via MQTT_RETRIES if you need more attempts.
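For readers following along, a bounded retry of this kind might look roughly like the sketch below. It is not the bridge's actual wrapper: `connect_with_retries`, its parameters, and the reading of "3 attempts" as one initial try plus three retries are all assumptions made for illustration, and the printed messages only imitate the log style seen in this thread.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def connect_with_retries(
    connect: Callable[[], T],
    retries: int = 3,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call connect(); on OSError retry up to `retries` times, then give up."""
    for attempt in range(retries + 1):
        try:
            return connect()
        except OSError as ex:  # includes ConnectionRefusedError and TimeoutError
            if attempt == retries:
                print(f"[MQTT] {retries}/{retries} retries failed. Disabling MQTT.")
                return None
            print(f"[MQTT] {ex}. Retrying {attempt + 1}/{retries}...")
            sleep(delay)
    return None


# Hypothetical broker that refuses the first two connection attempts.
attempts = []


def flaky_connect() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionRefusedError(111, "Connection refused")
    return "connected"


result = connect_with_retries(flaky_connect, retries=3, sleep=lambda _: None)
```

The `sleep` parameter is injected only so the example can run without real waiting; a real wrapper would more likely read the retry count from the environment.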

teixeluis commented 6 months ago

@mrlt8 is there any specific setting for not limiting the mqtt connection retries, or should I just put a stupidly large value in the MQTT_RETRIES variable?

Because of nightly router restarts I run into the retries expiring:

[WyzeBridge] ⏰ Timed out connecting to ovalesublime-west-cam.
[WyzeBridge] [MQTT] [Errno 101] Network is unreachable
[ovalesublime-south-cam] [-13] IOTC_ER_TIMEOUT
[ovalesublime-west-cam] [-13] IOTC_ER_TIMEOUT
[ovalesublime-south-cam] [MQTT] [Errno 101] Network is unreachable
[ovalesublime-west-cam] [MQTT] timed out. Retrying 1/3...
[WyzeBridge] [MQTT] timed out. Retrying 2/3...
[ovalesublime-south-cam] [MQTT] timed out. Retrying 2/3...
[ovalesublime-west-cam] [MQTT] [Errno 101] Network is unreachable
[WyzeBridge] [MQTT] timed out. Retrying 3/3...
[ovalesublime-west-cam] [MQTT] timed out. Retrying 3/3...
[ovalesublime-south-cam] [MQTT] timed out. Retrying 3/3...
[WyzeBridge] [MQTT] 3/3 retries failed. Disabling MQTT.
[WyzeBridge] ⏰ Timed out connecting to ovalesublime-south-cam.
[WyzeBridge] 🎉 Connecting to WyzeCam V3 - ovalesublime-west-cam on 192.168.1.88

and for something that is an add-on / service, I believe it is more useful to never give up.

mrlt8 commented 6 months ago

Good point. Will see if we can keep retrying for certain exceptions.
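One possible shape for "keep retrying for certain exceptions" is sketched below. Everything here is an assumption rather than the project's code: `connect_until_up`, `is_transient`, and the choice of which errno values count as temporary (connection refused, network or host unreachable, timed out) are illustrative only.

```python
import errno
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")

# Assumption: these errno values indicate a temporary outage worth waiting out.
TRANSIENT_ERRNOS = {
    errno.ECONNREFUSED,
    errno.ENETUNREACH,
    errno.EHOSTUNREACH,
    errno.ETIMEDOUT,
}


def is_transient(ex: OSError) -> bool:
    """Return True for timeouts and for errors whose errno is in the set above."""
    return isinstance(ex, TimeoutError) or ex.errno in TRANSIENT_ERRNOS


def connect_until_up(
    connect: Callable[[], T],
    min_delay: float = 1.0,
    max_delay: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry connect() without limit on transient errors; re-raise anything else."""
    delay = min_delay
    while True:
        try:
            return connect()
        except OSError as ex:
            if not is_transient(ex):
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)  # capped exponential backoff


# Simulated router restart: two failures, then the network comes back.
failures: List[OSError] = [
    OSError(errno.ENETUNREACH, "Network is unreachable"),
    TimeoutError("timed out"),
]
waits: List[float] = []


def simulated_connect() -> str:
    if failures:
        raise failures.pop(0)
    return "connected"


outcome = connect_until_up(simulated_connect, sleep=waits.append)
```

Errors outside the transient set, such as a permission error, still propagate immediately, so a misconfiguration is not hidden behind endless retries.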

cfelicio commented 4 months ago

I tried even with an extremely large number of retries, and no luck. Every time I restart Home Assistant, I have to restart the docker bridge (I'm running them on separate platforms) to get MQTT to work again. It happens with cameras that are turned off; if I turn them on again via the Wyze app, they start working.