mrlt8 / docker-wyze-bridge

WebRTC/RTSP/RTMP/LL-HLS bridge for Wyze cams in a docker container
GNU Affero General Public License v3.0

Lost connection to the bridge... #989

Open lviciedo opened 1 year ago

lviciedo commented 1 year ago

Anyone else seeing this after yesterday's update?

asinla commented 1 year ago

Same thing here. Loses connection all the time.

mrlt8 commented 1 year ago

Can you post some of the logs?

c0f commented 1 year ago

I get the "Lost connection to the bridge..." error when connecting via Nginx Proxy Manager (NPM) but I don't get the error if I connect directly to the WebUI port of my Docker container. So in my case this feels like an NPM configuration issue.

The WebUI starts working again after refreshing the page, but the 'Lost connection' error comes back after 90 seconds.

Only the WebUI is affected, wyze-bridge continues to record and take snapshots while the 'Lost connection' error is displayed.

There are no error messages in my wyze-bridge logs, just the 'client stopped reading' and 'new client reading' messages.

    🚀 DOCKER-WYZE-BRIDGE v2.3.17

    [WyzeBridge] 🔍 Could not find local cache for 'auth'
    [WyzeBridge] ☁ Fetching 'auth' from the Wyze API...

maclarel commented 1 year ago

My version of this issue is solved. No idea if it will work for others. TL;DR below.

This, at least in my case, is caused by Nginx not handling event-stream content without further configuration, and is further aggravated by the default 60s read timeout for proxied traffic.

In short, you need to add the following config options to whichever server is proxying your WebUI/API traffic:

proxy_buffering off;
proxy_cache off;
proxy_read_timeout <very_high_value>; # e.g. 3600
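
Applied to the location block that proxies the WebUI, that ends up looking something like this (a minimal sketch; the upstream address is whatever your bridge's WebUI listens on, TLS and auth omitted; the full config I actually use is further down):

    location / {
        proxy_buffering     off;   # don't buffer the event-stream response
        proxy_cache         off;   # and don't cache it
        proxy_read_timeout  3600;  # well above the 60s default
        proxy_pass          http://127.0.0.1:5000/;
    }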

I'm also seeing similar behaviour, but only through Nginx when using it as a reverse proxy (direct access is fine). I've tried setting an upstream with keepalives, but no luck. I'll consistently get "Lost connection to the bridge..." popping up in the UI after a minute, and all my streams freeze until I do a manual refresh.

This seems to be rooted in requests that go to <host>:5000/api/sse_status. This works fine without the proxy, but will consistently fail with it in place.

Adding an explicit proxy_pass for /api in the nginx config to attempt to handle this results in a timeout only for the sse_status endpoint. Everything else works fine :shrug:

> GET /api/sse_status HTTP/1.1
> Host: <host>:5000
> User-Agent: curl/8.4.0
> Accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
< HTTP/1.1 200 OK
< Server: nginx/1.25.2
< Date: Sun, 15 Oct 2023 16:13:20 GMT
< Content-Type: text/event-stream; charset=utf-8
< Transfer-Encoding: chunked
< Connection: keep-alive
<
* TLSv1.3 (IN), TLS alert, close notify (256):
* transfer closed with outstanding read data remaining
* Closing connection
* TLSv1.3 (OUT), TLS alert, close notify (256):
curl: (18) transfer closed with outstanding read data remaining

Initial suspicion is that this is just spinning forever on the infinite sleep loop here if the status can't be retrieved:

import json
from time import sleep
from typing import Callable, Generator

def sse_generator(sse_status: Callable) -> Generator[str, str, str]:
    """Generator to return the status for enabled cameras."""
    cameras = {}
    while True:
        if cameras != (cameras := sse_status()):
            yield f"data: {json.dumps(cameras)}\n\n"
        sleep(1)
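
For context, an endpoint like this is typically served by handing the generator to a Flask Response with the text/event-stream mimetype, which is why proxy buffering matters here. A minimal sketch (not necessarily the bridge's exact wiring; the payload and names below are placeholders):

    import json
    from time import sleep

    from flask import Flask, Response

    app = Flask(__name__)

    @app.route("/api/sse_status")
    def sse_status():
        def generate():
            # Placeholder for the real camera-status callable.
            while True:
                yield f"data: {json.dumps({'status': 'ok'})}\n\n"
                sleep(1)

        # The response streams indefinitely, which is exactly what a buffering
        # proxy with a 60s read timeout will eventually cut off.
        return Response(generate(), mimetype="text/event-stream")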

I'll keep poking at this and update if I figure anything out...

Update: This issue seems to have something to do with how nginx handles event-stream data. Setting this endpoint to return a hardcoded string (example below) avoids the scenario seen here, since that's sufficient to satisfy the comparison at https://github.com/mrlt8/docker-wyze-bridge/blob/33fe98e1b01a822223b4df5c378f327d05a2c7ab/app/wyzebridge/web_ui.py#L16

@app.route("/api/sse_status")
def sse_status():
    return Response("foo")

Update 2: Got it, kinda. After tracking this down to event-stream data being the problem, the solution for me appears to be adding the following directives to the location block used for /api in my nginx config:

proxy_buffering off;
proxy_cache off;

Source: https://stackoverflow.com/questions/13672743/eventsource-server-sent-events-through-nginx, though the rest of the config options mentioned there don't appear to be needed in my case.

I still get intermittent "Lost connection to the bridge..." errors, but they seem to recover automatically when the UI does its next applyPreferences or update_img, which happens every few seconds, so it's good enough for me until someone can do a more in-depth investigation here.

Here's the full config (with some redactions) that should allow this all to work with basic auth, TLS, and some access restrictions based on the network it runs on (note this is trivial to spoof). docker-compose.yml is mostly stock other than setting WB_HLS_URL to https://<HOST>:8888/ and setting all port bindings to 127.0.0.1:<PORT>:<PORT>.

    server {
        listen <HOST_IP>:5000 ssl;
        ssl_certificate /etc/letsencrypt/live/<HOST>/fullchain.pem;
        ssl_certificate_key /etc/letsencrypt/live/<HOST>/privkey.pem;
        allow 192.168.1.0/24;
        deny all;

        location /api {
            proxy_buffering off;
            proxy_cache off;
            proxy_pass http://127.0.0.1:5000/api;
        }

        location / {
            auth_basic "You didn't say the magic word...";
            auth_basic_user_file /etc/nginx/.htpasswd;
            proxy_pass http://127.0.0.1:5000/;
        }
    }

    server {
        listen <HOST_IP>:8888 ssl;
        ssl_certificate /etc/letsencrypt/live/<HOST>/fullchain.pem;
        ssl_certificate_key /etc/letsencrypt/live/<HOST>/privkey.pem;
        allow 192.168.1.0/24;
        deny all;

        location / {
            proxy_pass http://127.0.0.1:8888/;
        }
    }

But yeah, not sure what's causing the "Lost connection" issues at this point, as there are no errors in the browser console or wyze-bridge container logs :shrug: At least it's just an annoyance at this point :neutral_face:

AFAICT this is still due to intermittent failures to retrieve sse_status, which show up as NS_ERROR_NET_PARTIAL_TRANSFER in the network tab of the browser after waiting 60 seconds for a response. When the response finally arrives (a minute later) the payload shows up in the Response tab; this correlates directly with the "Lost connection to the bridge..." error, which then resolves itself as soon as the next request is opened.

Update 3: Looks like the intermittent drops (with everything above still in place and still required) are related to Nginx proxy timeouts. Notably, since adding proxy_read_timeout 3600; (or another arbitrarily high value) to the server directive handling docker-wyze-bridge, this seems to be "resolved". Quite possibly this also affects the default config of Nginx Proxy Manager if a suitably high read timeout isn't being set.
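
In practice that just means one more directive in the server block from the full config above, roughly (a sketch; placeholders as before):

    server {
        listen <HOST_IP>:5000 ssl;
        proxy_read_timeout 3600;  # arbitrarily high so the SSE stream isn't cut off at 60s
        # ... certs, allow/deny, and the /api and / locations exactly as shown above
    }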

waltershome commented 11 months ago

I'm seeing this same issue. Do you know where I would go to set the parameters you specified when running the wyze bridge as a Home Assistant Add-On?