mrlt8 / docker-wyze-bridge

WebRTC/RTSP/RTMP/LL-HLS bridge for Wyze cams in a docker container
GNU Affero General Public License v3.0
2.45k stars 151 forks source link

2.9.4 is back to disconnections after some long runtimes #1240

Open vipergts450 opened 3 weeks ago

vipergts450 commented 3 weeks ago

Hello again @mrlt8 ,

There seems to be a regression that has appeared in 2.9.4, as my connections to each of my cameras (Cam v3 - WYZE_CAKP2JFUS and Cam Pan - WYZECP1_JEF) once again is unstable after some time. The streams do not recover once they fail and remain down until I log into the web software and do a "Restart > Connection to Cameras." Both cameras had their streams stay up since 21:05 ET Friday May 31 until 16:42 and 17:41 ET today Sunday June 2. 2.9.2-2.9.3 were both very stable.

mrlt8 commented 3 weeks ago

Anything in the logs when the disconnects happen? Rebooting the cameras will sometimes help when they start having connection issues.

vipergts450 commented 3 weeks ago

I've never had to reboot the cameras, they continue to work in the Wyze app itself the whole time. Restarting the container brings the feed back for the services using your bridge. Log does not even show a disconnect, only some "super slow" messages throughout the time it's on but nothing about any losses of connection. It's odd and I'll keep watching the logs and update here.

On Sun, Jun 2, 2024, 9:22 PM mrlt8 @.***> wrote:

Anything in the logs when the disconnects happen? Rebooting the cameras will sometimes help when they start having connection issues.

— Reply to this email directly, view it on GitHub https://github.com/mrlt8/docker-wyze-bridge/issues/1240#issuecomment-2144115191, or unsubscribe https://github.com/notifications/unsubscribe-auth/AOKNMVLFWGZLUS7Y4HHMBRTZFPAOPAVCNFSM6AAAAABIVMY3LWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNBUGEYTKMJZGE . You are receiving this because you authored the thread.Message ID: @.***>

vipergts450 commented 3 weeks ago

The NVR that uses the RTSP feeds for my 2 cameras lost cam2 (Cam Pan) this morning and it did not recover/restart on its own. Log shows that cam1 also had some issues overnight.

2024-06-03T08:52:15.250447977Z [WyzeBridge] 📖 New client reading from cam1
2024-06-03T08:52:18.565974237Z [cam1] WARNING: Audio pipe closed
2024-06-03T08:52:19.066458332Z [cam1] [CONTROL] ERROR - error='[-20010] AV_ER_INVALID_SID', cmd='_bitrate'
2024-06-03T08:52:19.549802604Z [cam1] [Exception] Did not receive a frame for 20s
2024-06-03T08:52:19.581171027Z [WyzeBridge] ❌ '/cam1' stream is down
2024-06-03T08:52:19.581382595Z [WyzeBridge] 📕 Client stopped reading from cam1
2024-06-03T08:52:19.581487292Z [WyzeBridge] 🎉 Connecting to WyzeCam V3 - cam1 on 192.168.2.52
2024-06-03T08:52:19.590883163Z [WyzeBridge] ✅ '/cam1 stream is UP! (3/3)
2024-06-03T08:52:19.593586761Z [WyzeBridge] ❌ '/cam1' stream is down
2024-06-03T08:52:19.593818183Z [WyzeBridge] 🎉 Connecting to WyzeCam V3 - cam1 on 192.168.2.52
2024-06-03T08:52:39.611164947Z [WyzeBridge] ⏰ Timed out connecting to cam1.
2024-06-03T08:52:40.616948119Z [WyzeBridge] 🎉 Connecting to WyzeCam V3 - cam1 on 192.168.2.52
2024-06-03T08:53:00.099522018Z [cam1] [-13] IOTC_ER_TIMEOUT
2024-06-03T08:53:00.639021953Z [WyzeBridge] ☁️ Fetching 'cameras' from the Wyze API...
2024-06-03T08:53:01.067391113Z [WyzeBridge] [API] Fetched [3] cameras
2024-06-03T08:53:01.070408702Z [WyzeBridge] 💾 Saving 'cameras' to local cache...
2024-06-03T08:53:02.076510557Z [WyzeBridge] 🎉 Connecting to WyzeCam V3 - cam1 on 192.168.2.52
2024-06-03T08:53:22.094724454Z [WyzeBridge] ⏰ Timed out connecting to cam1.
2024-06-03T08:53:23.100011935Z [WyzeBridge] 🎉 Connecting to WyzeCam V3 - cam1 on 192.168.2.52
2024-06-03T08:53:25.332461743Z [cam1] 📡 Getting 180kb/s HD stream (H264/15fps) via LAN mode (WiFi: 67%) FW: 4.36.11.8391 🔒
2024-06-03T08:53:25.340755273Z [cam1] 🔊 Audio Enabled - ALAW > AAC/16,000Hz
2024-06-03T08:53:25.521739343Z [cam1] WARNING: Skipping wrong frame_size at start of stream [frame_size=1]
2024-06-03T08:53:25.774910342Z [WyzeBridge] ✅ '/cam1 stream is UP! (3/3)
2024-06-03T08:53:25.885505220Z [WyzeBridge] 📖 New client reading from cam1
2024-06-03T08:53:26.543657108Z [WyzeBridge] 📖 New client reading from cam1
2024-06-03T08:53:27.193627040Z [WyzeBridge] 📕 Client stopped reading from cam1
2024-06-03T09:39:57.749640845Z [cam2] WARNING: clear buffer
2024-06-03T09:40:24.005531263Z [WyzeBridge] 📕 Client stopped reading from cam2
2024-06-03T09:40:24.006792268Z [WyzeBridge] 📖 New client reading from cam2
2024-06-03T09:40:25.872958085Z [WyzeBridge] 📕 Client stopped reading from cam2
2024-06-03T10:04:20.975127231Z [cam1] [video] super slow
2024-06-03T10:04:20.993370285Z [cam1] WARNING: clear buffer
2024-06-03T10:55:35.883678487Z [cam1] [video] super slow

The stream for cam2 remained down at the NVR since the "clear buffer" at 09:39:57 Z. cam1 is still up despite the same error at 10:04:20 Z

What is interesting is that I opened the webUI and the stream recovered then as if it helped the container notice that it wasn't working. After logging into the WebUI, this message came across:

2024-06-03T11:44:40.555076118Z [WyzeBridge] ❌ '/cam2' stream is down

at this point the container did a reconnection as if it only just figured out the stream was down when I logged into the Web UI.

Stream came back up and NVR noticed and restarted its recording duties as well.

mrlt8 commented 3 weeks ago

I believe this is related to #1234.

Thanks for the clue @vipergts450! We were using avClientCleanBuf when the video falls behind, however, the SDK seems to indicate that it's a blocking function:

 \attention (1) This API is a blocking function.

So that could be why it was getting stuck/blocked.

I've updated the edge branch to use avClientCleanLocalBuf instead which doesn't seem to block.

Would appreciate testing and feedback!

vipergts450 commented 3 weeks ago

@mrlt8 thank you for the notice on the latest dev. Pulled it just now and started up. Will watch for you!

mrlt8 commented 3 weeks ago

@vipergts450 you'll need to pull the mrlt8/wyze-bridge:edge image as the dev branch has the mtx related changes.

vipergts450 commented 3 weeks ago

Ah cool, rerunning now on edge branch, my mistake on that.

g13092 commented 3 weeks ago

@mrlt8 , thanks for pointing me here. This behavior sounds very consistent with what I have seen.

Here's a snip of my logs today where v3-49 and pasture cams were both "hung" in Frigate. This was right before I called up the Wyzebridge web interface. That v3-49 cam error repeats back quite a while.

[v3-49] [-90] IOTC_ER_DEVICE_OFFLINE [WyzeBridge] 👻 V3-49 is offline. [WyzeBridge] V3-49 will cooldown for 10s. [pasture] [video] super slow [pasture] WARNING: clear buffer [pasture] [-20021] AV_ER_SENDIOCTRL_ALREADY_CALLED

Also, I caught the Pasture cam at -1 like you saw in the other thread.

vipergts450 commented 3 weeks ago

We're still solid after 24 hours on the edge branch!

mrlt8 commented 3 weeks ago

I've actually run into a bug stopping the child process which I'm looking into...

g13092 commented 3 weeks ago

4 of 5 cams crashed overnight with edge

On Tue, Jun 4, 2024, 9:54 AM mrlt8 @.***> wrote:

I've actually run into a bug stopping the child process which I'm looking into...

— Reply to this email directly, view it on GitHub https://github.com/mrlt8/docker-wyze-bridge/issues/1240#issuecomment-2147744116, or unsubscribe https://github.com/notifications/unsubscribe-auth/APR3TCPHUOKPFWWSMY2VCWDZFXIJPAVCNFSM6AAAAABIVMY3LWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNBXG42DIMJRGY . You are receiving this because you commented.Message ID: @.***>

mrlt8 commented 3 weeks ago

@g13092 was it the child process error in the logs?

vipergts450 commented 3 weeks ago

Edge has still been working ok for me. I noticed that watchtower pulled a newer one yesterday mid-day and that one was stable as of this writing.

g13092 commented 3 weeks ago

Sorry, I lost that log. I can try it again. Can you help me trap it? I'm running this in a docker and I don't have any bind mounts created. I've been using portainer to view the active log, but I can terminal into the active container instead. I looked around to the "latest" I have running now and I don't see where the logs are being stored. I could also bind mount the logs to a local drive...

On Wed, Jun 5, 2024 at 9:16 AM mrlt8 @.***> wrote:

@g13092 https://github.com/g13092 was it the child process error in the logs?

— Reply to this email directly, view it on GitHub https://github.com/mrlt8/docker-wyze-bridge/issues/1240#issuecomment-2150124431, or unsubscribe https://github.com/notifications/unsubscribe-auth/APR3TCPFANMZDYPQGJTTXX3ZF4MS3AVCNFSM6AAAAABIVMY3LWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNJQGEZDINBTGE . You are receiving this because you were mentioned.Message ID: @.***>

vipergts450 commented 3 weeks ago

2.9.6 started doing the following this morning after about 18 hours of container runtime:

2024-06-07T13:29:01.029406511Z Connection closed by remote. Closing connection.
2024-06-07T13:29:21.060418388Z Connection closed by remote. Closing connection.
2024-06-07T13:29:41.092377070Z Connection closed by remote. Closing connection.
2024-06-07T13:30:01.124238092Z Connection closed by remote. Closing connection.
2024-06-07T13:30:21.155633956Z Connection closed by remote. Closing connection.
2024-06-07T13:30:41.191788893Z Connection closed by remote. Closing connection.
2024-06-07T13:31:01.223657063Z Connection closed by remote. Closing connection.
2024-06-07T13:31:21.256266583Z Connection closed by remote. Closing connection.
2024-06-07T13:31:41.286905424Z Connection closed by remote. Closing connection.
2024-06-07T13:32:01.318423022Z Connection closed by remote. Closing connection.
2024-06-07T13:32:21.351024823Z Connection closed by remote. Closing connection.
2024-06-07T13:32:41.382450887Z Connection closed by remote. Closing connection.
2024-06-07T13:33:00.528575866Z [cam1] [video] super slow
2024-06-07T13:33:00.529768173Z [cam1] WARNING: clear buffer
2024-06-07T13:33:01.414335827Z Connection closed by remote. Closing connection.
2024-06-07T13:33:10.226457381Z [cam1] [video] super slow
2024-06-07T13:33:21.445629766Z Connection closed by remote. Closing connection.
2024-06-07T13:33:41.476269316Z Connection closed by remote. Closing connection.
2024-06-07T13:34:01.508596932Z Connection closed by remote. Closing connection.
2024-06-07T13:34:21.538166661Z Connection closed by remote. Closing connection.

Getting constant NVR drop outs and reconnections on cam1 (Cam v3)

cam2 (Cam Pan) dropped out about 30 min ago and never recovered:

2024-06-07T13:06:36.978180024Z [cam2] [-22] IOTC_ER_SESSION_CLOSE_BY_REMOTE
2024-06-07T13:06:37.997912820Z [cam2] [AssertionError] Stream Stopped
2024-06-07T13:06:38.558633439Z [WyzeBridge] 📖 New client reading from cam1
2024-06-07T13:06:38.558831170Z [WyzeBridge] 🎉 Connecting to WyzeCam Pan - cam2 on 192.168.2.50
2024-06-07T13:06:38.891916272Z Connection closed by remote. Closing connection.
2024-06-07T13:06:40.754653780Z [cam2] 📡 Getting 180kb/s HD stream (H264/15fps) via LAN mode (WiFi: 85%) FW: 4.10.9.3006 🔒
2024-06-07T13:06:40.755809250Z [cam2] 🔊 Audio Enabled [Source=MULAW/8,000Hz]
2024-06-07T13:06:40.755997051Z [cam2] [AUDIO] Re-Encode Enabled [AUDIO_CODEC=aac]
2024-06-07T13:06:40.770035284Z [cam2] WARNING: clear buffer
2024-06-07T13:06:58.921927220Z Connection closed by remote. Closing connection.
2024-06-07T13:07:18.952970407Z Connection closed by remote. Closing connection.
2024-06-07T13:07:38.986168206Z Connection closed by remote. Closing connection.
mrlt8 commented 3 weeks ago

Those are all related to AV_ER_SESSION_CLOSE_BY_REMOTE.

I've noticed those errors in the past with the 2024 firmware if multiple clients attempt to view the stream.

Does opening the web ui get it unstuck?

vipergts450 commented 3 weeks ago

Opening the WebUI does not kickstart it again as it used to when we were testing edge builds. Cam1 has been just periodically connecting and reconnecting, which is not ideal but it's relatively acceptable. cam2 has a frozen image from last night, so it's been "down" for a lot longer than the logs indicate.

mrlt8 commented 2 weeks ago

hmm, do you have ON_DEMAND=False? I'm able to replicate the error by running the latest firmware and starting multiple streams with the same resolution, but they seem to re-connect on their own.

It seems like wyze changed something on the 2024 firmware that will only allow one stream per resolution so streaming the main and sub resolution is still allowed, but streaming the same resolution will kick out the other user.

vipergts450 commented 2 weeks ago

Yes I have been running On_Demand=False for a while to eliminate that variable and try to keep the stream steady, probably since 2.9.1 or whenever I started reporting issues to you.

On Sat, Jun 8, 2024, 9:47 AM mrlt8 @.***> wrote:

hmm, do you have ON_DEMAND=False? I'm able to replicate the error by running the latest firmware and starting multiple streams with the same resolution, but they seem to re-connect on their own.

It seems like wyze changed something on the 2024 firmware that will only allow one stream per resolution so streaming the main and sub resolution is still allowed, but streaming the same resolution will kick out the other user.

— Reply to this email directly, view it on GitHub https://github.com/mrlt8/docker-wyze-bridge/issues/1240#issuecomment-2156043680, or unsubscribe https://github.com/notifications/unsubscribe-auth/AOKNMVK6PEF5DNB47KDZSTDZGMDQHAVCNFSM6AAAAABIVMY3LWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNJWGA2DGNRYGA . You are receiving this because you were mentioned.Message ID: @.***>

mrlt8 commented 2 weeks ago

Are you still getting AV_ER_SESSION_CLOSE_BY_REMOTE after updating to 2.9.6?

Does downgrading back to one of the edge builds help?

mrlt8/wyze-bridge@sha256:9a200e3f3d49ecbd1b1ef8d46d035b77023961f4beed7278d7160846d54f5755
vipergts450 commented 2 weeks ago

2.9.7 has been much better behaved - any big change there from 2.9.6? I didn't see much in the commits that pertained to the encoding/stream receiving side.

mrlt8 commented 2 weeks ago

No, the 297 changes were mostly related to PTZ commands.

vipergts450 commented 2 weeks ago

Hard to explain, but 2.9.7 has been running great for 3 days...

vipergts450 commented 2 weeks ago

I still get a few periodic reconnections but have not had a permanent failure yet

On Tue, Jun 11, 2024, 9:55 AM mrlt8 @.***> wrote:

No, the 297 changes were mostly related to PTZ commands.

— Reply to this email directly, view it on GitHub https://github.com/mrlt8/docker-wyze-bridge/issues/1240#issuecomment-2160830673, or unsubscribe https://github.com/notifications/unsubscribe-auth/AOKNMVKRVDOHQKTRHY5UQ4DZG36T3AVCNFSM6AAAAABIVMY3LWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNRQHAZTANRXGM . You are receiving this because you were mentioned.Message ID: @.***>

g13092 commented 2 weeks ago

I just moved over to 2.9.7 and am having issues with the doorbell only...

[WyzeBridge] 🎉 Connecting to WyzeCam Doorbell - Back Door on 192.168.1.6 [WyzeBridge] 🎉 Connecting to WyzeCam Doorbell - Back Door on 192.168.1.6 [WyzeBridge] 192.168.1.17 - - [14/Jun/2024 01:05:55] "GET /snapshot/back-door.jpg?1718327155555 HTTP/1.1" 200 - [WyzeBridge] 192.168.1.17 - - [14/Jun/2024 01:05:56] "GET /snapshot/new-south-cam.jpg?1718327155554 HTTP/1.1" 200 - Connection closed by remote. Closing connection. [WyzeBridge] 192.168.1.17 - - [14/Jun/2024 01:05:56] "GET /snapshot/v3-49.jpg?1718327155554 HTTP/1.1" 200 - [back-door] [-20015] AV_ER_SESSION_CLOSE_BY_REMOTE [WyzeBridge] 192.168.1.17 - - [14/Jun/2024 01:05:57] "GET /snapshot/new-north-cam.jpg?1718327155556 HTTP/1.1" 200 - [WyzeBridge] 🎉 Connecting to WyzeCam Doorbell - Back Door on 192.168.1.6 [WyzeBridge] 192.168.1.17 - - [14/Jun/2024 01:05:57] "GET /snapshot/back-door.jpg?1718327155555 HTTP/1.1" 200 - [WyzeBridge] 192.168.1.17 - - [14/Jun/2024 01:05:57] "GET /snapshot/driveway.jpg?1718327155556 HTTP/1.1" 200 - [back-door] 📡 Getting 120kb/s SD stream (H264/20fps) via LAN mode (WiFi: 88%) FW: 4.25.1.333 🔒 [back-door] Re-encoding using libx264 [transpose='clock'] [WyzeBridge] [CONTROL] SET new-south-cam state=start [WyzeBridge] [MQTT] {'status': 'error', 'response': 'connected', 'value': 'start'} [WyzeBridge] [CONTROL] SET driveway state=start [WyzeBridge] [MQTT] {'status': 'error', 'response': 'connected', 'value': 'start'} [WyzeBridge] 192.168.1.17 - - [14/Jun/2024 01:05:58] "GET /snapshot/new-south-cam.jpg?1718327155554 HTTP/1.1" 200 - [WyzeBridge] [CONTROL] SET back-door state=disable [WyzeBridge] Disabling back-door Error marking filters as finished [back-door] Stream stopped [WyzeBridge] 192.168.1.17 - - [14/Jun/2024 01:05:59] "GET /snapshot/new-north-cam.jpg?1718327155556 HTTP/1.1" 200 - [WyzeBridge] 192.168.1.17 - - [14/Jun/2024 01:05:59] "GET /snapshot/driveway.jpg?1718327155556 HTTP/1.1" 200 - [back-door] [AssertionError] Stream Stopped [WyzeBridge] [CONTROL] SET back-door state=enable [WyzeBridge] Enabling back-door [WyzeBridge] [CONTROL] SET pasture state=enable [WyzeBridge] 192.168.1.17 - - [14/Jun/2024 01:06:00] "GET /snapshot/v3-49.jpg?1718327155554 HTTP/1.1" 200 - [WyzeBridge] 192.168.1.17 - - [14/Jun/2024 01:06:17] "GET /snapshot/pasture.jpg?1718327155555 HTTP/1.1" 200 - [WyzeBridge] 192.168.1.17 - - [14/Jun/2024 01:06:20] "GET /snapshot/pasture.jpg?1718327155555 HTTP/1.1" 200 -

192.168.1.17 is where the webui is open.

vipergts450 commented 1 week ago

Ok after several days of consistent running, we are now once again receiving [-13] IOTC_ER_TIMEOUT on the Cam v3 every so often. Both the Cam v3 and Cam Pan are also throwing WARNING: clear buffer and [video] super slow as well pretty often now, every couple minutes.

Here's how the log goes:

[cam1] WARNING: Audio pipe closed
[cam1] [CONTROL] ERROR - error='[-20018] AV_ER_CLIENT_EXIT', cmd='_bitrate'
[cam1] Stream stopped

This is followed by a series of reconnection attempts which come back as [-13] IOTC_ER_TIMEOUT and then eventually it reconnects for a few minutes before the pattern repeats.