ayufan / camera-streamer

High-performance low-latency camera streamer for Raspberry PI's
327 stars 50 forks source link

Ghost webrtc streams #55

Open philippebourcier opened 1 year ago

philippebourcier commented 1 year ago

@ayufan OK, this time a real bug. πŸ˜…

I noticed that when I watch a WebRTC stream and I relaunch the browser (or just go back to the index and click again on /webrtc), most of the time, I get a ghost UDP stream...

Here's a log view of the phenomenon : util/http/http.c: HTTP8080/6: Client connected 192.168.1.36 (fd=10). util/http/http.c: HTTP8080/6: Request 'GET' '/webrtc' '' util/http/http.c: HTTP8080/6: Client disconnected 192.168.1.36. util/http/http.c: HTTP8080/7: Client connected 192.168.1.36 (fd=11). util/http/http.c: HTTP8080/7: Request 'POST' '/webrtc' '' output/webrtc/webrtc.cc: rtc-tonwouvavrrzosxdjudo: Stream requested. util/http/http.c: HTTP8080/7: Client disconnected 192.168.1.36. util/http/http.c: HTTP8080/8: Client connected 192.168.1.36 (fd=12). util/http/http.c: HTTP8080/8: Request 'POST' '/webrtc' '' output/webrtc/webrtc.cc: rtc-tonwouvavrrzosxdjudo: Answer received. util/http/http.c: HTTP8080/8: Client disconnected 192.168.1.36.

util/http/http.c: HTTP8080/9: Client connected 192.168.1.36 (fd=13). util/http/http.c: HTTP8080/9: Request 'GET' '/webrtc' '' util/http/http.c: HTTP8080/9: Client disconnected 192.168.1.36. util/http/http.c: HTTP8080/0: Client connected 192.168.1.36 (fd=4). util/http/http.c: HTTP8080/0: Request 'POST' '/webrtc' '' output/webrtc/webrtc.cc: rtc-wpesdkknzakphjitiknw: Stream requested. util/http/http.c: HTTP8080/0: Client disconnected 192.168.1.36. util/http/http.c: HTTP8080/1: Client connected 192.168.1.36 (fd=5). util/http/http.c: HTTP8080/1: Request 'POST' '/webrtc' '' output/webrtc/webrtc.cc: rtc-wpesdkknzakphjitiknw: Answer received. util/http/http.c: HTTP8080/1: Client disconnected 192.168.1.36.

When things go well, you get this line : output/webrtc/webrtc.cc: rtc-vohzdpemelhsnkocaalk: Client removed: stream closed.

But most of the time you don't... At some point I had like 10 ghosted streams in parallel and my WiFi didn't like it. πŸ˜…

Here's a graph with 3 streams, the non-ghosted one is in red, the ghosted ones in green and blue : image

So it looks like a kind of stream timeout handling or detection is missing on the C side (or the JS needs to send a signal to tell the server to stop streaming)...

philippebourcier commented 1 year ago

I added : window.addEventListener("beforeunload", function() { pc.close(); }); at the end of the startWebRTC() javascript function and it seems to do the trick but there might be more elegant ways of doing this. πŸ€·β€β™‚οΈ

JoelBecker1998 commented 1 year ago

Can confirm the issue.

ayufan commented 1 year ago

@philippebourcier This is not bad workaround. Can you open PR?

RemiNV commented 1 year ago

I also noticed that after opening the webrtc stream in a Chrome tab, then closing it, the camera-streamer daemon is still using a fair bit of CPU time; more than before opening the stream.

Having the client close the stream does seem necessary, but given that the server may keep using resources indefinitely (for example if the client crashes or doesn't execute the JS unload), I agree some stream timeout / detection that the remote is gone is necessary.

jpiccari commented 1 year ago

@ayufan This is still an issue in the latest release. I noticed libdatachannel does not call pc->onStateChange in all cases (see repro steps below). Also, at least in Firefox and iOS Safari, the onunload event workaround isn't necessary as the clients seem to already send this in many (all?) cases.

I'm new to WebRTC so forgive my complete ignorance here... I added some hacky code to the connection initiation to iterate the clients and check their peer connection state for disconnected/failed/closed and close any that are found. However, the status never got updated by libdatachannel and so I end up with a massive list of clients. My understanding is that ICE servers should be used to signal between client and server but I'm not clear if there is a heartbeat in the spec (or built into libdatachannel) or if that is something that we need to implement. My guess is that no such heartbeat exists and there is no built-in provision for detecting client failures/networking issues which is why we get ghost streams.

Finally, here are some repro steps that reliably produce orphaned streams.

  1. Open /webrtc stream
  2. Close browser tab/app

This works on Chrome, Edge, iOS Safari. But strangely doesn't work on Firefox (Firefox behaves nicely and closes the stream). I didn't try the unload event but I'm sure it would be buggy since it doesn't work for network connectivity issues or for browser crashes, dead batteries, etc. Would be much preferable to come up with a robust solution for identifying and cleaning up orphaned stream.

jpiccari commented 1 year ago

I've got a fix that I've been testing out for a couple hours now with many dirty "client" disconnects. Before we dive into the theory of the change so we can debate if it is correct, I think it would be helpful to clarify some terms I'm about to use which make the conversation easier but aren't necessarily correct for a discussion on webrtc.

  1. client - I'm using this to refer to the viewer of the stream. They send a request to /webrtc (the signalling server) which initiates the peer connection setup on both ends.
  2. server - Refers to the side of the rtc connection sending video data.
  3. signalling server - This is the main server hosting the endpoints for /webrtc, /status, etc

Alright, so the change I have been running includes client-side changes (update to the javascript) and server-side changes. On the server, when webrtc_server() is called, I create a new thread that loops every 10s (planning to make this configurable via cli args) and acquires a lock for to iterate webrtc_clients and removes each client where their last heartbeat is older than some staleness threshold.

To track heartbeats, I have added a last_heartbeat_s field to the Client class which stores the server timestamp of the most recent heartbeat message from the client. Finally, I have updated the javascript in webrtc.html to include periodic heartbeat messages being sent every 1s.

One interesting disadvantage with this change is that backwards compatibility is a trade-off. This change will break users that use their own javascript logic to communicate with the signalling server. Because it relies on heartbeats, existing services will need to update to include the heartbeat logic in their client-side code or their streams will die after the timeout period (just a few seconds). One solution would be to bump version numbers or otherwise note that it is not backwards compatible. Another would be to include a default last heartbeat threshold that is very long (perhaps even infinite) and provide an option to configure it down as a sort of opt-in to automatic session cleanup. I don't have any preference here so just let me know the direction and I can put out a PR.

AndyHazz commented 1 year ago

Just to add a bit of emphasis to this issue, I was away from home recently, and using camera-streamer through crowsnest/mainsail and wireguard VPN - the webrtc stream ate through my 20gb of mobile data allowance in a day even though I was probably only viewing the stream for a few minutes :(

It seemed to keep the stream open even after restarting the phone, without viewing the camera stream again ... although that might be an edge case from the way the wireguard VPN works.

ayufan commented 1 year ago

Sorry to hear that. We will fix it soon.

On Mon, Sep 4, 2023 at 5:12β€―PM AndyHazz @.***> wrote:

Just to add a bit of emphasis to this issue, I was away from home recently, and using camera-streamer through crowsnest/mainsail and wireguard VPN - the webrtc stream ate through my 20gb of mobile data allowance in a day even though I was probably only viewing the stream for a few minutes :(

It seemed to keep the stream open even after restarting the phone, without viewing the camera stream again ... although that might be an edge case from the way the wireguard VPN works.

β€” Reply to this email directly, view it on GitHub https://github.com/ayufan/camera-streamer/issues/55#issuecomment-1705428865, or unsubscribe https://github.com/notifications/unsubscribe-auth/AASOSQNTHVKOTE42C63P54TXYXVVJANCNFSM6AAAAAAXMVXAQU . You are receiving this because you were mentioned.Message ID: @.***>

jasonmacdonald commented 11 months ago

I'm often running into these ghost streams; when I check the bandwidth used by my Pi 4, it's often up in the 20 Mb/s range in only a few days. In comparison, a single stream should only be using around 1Mb/s. I must continue to shut down crowsnest and restart it to clear them out.

Is there a short term workaround?

jpiccari commented 11 months ago

I fixed this back in August with the addition of the liveness checks. The PR is linked above if you want to try it out. There is some discussion about how webrtc works in that PR which might be why it never got merged.

ayufan commented 11 months ago

I plan to merge this with some tweaks.

On Sat, 9 Dec 2023 at 09:29, Joshua Piccari @.***> wrote:

I fixed this back in August with the addition of the liveness checks. The PR is linked above if you want to try it out. There is some discussion about how webrtc works in that PR which might be why it never got merged.

β€” Reply to this email directly, view it on GitHub https://github.com/ayufan/camera-streamer/issues/55#issuecomment-1848325643, or unsubscribe https://github.com/notifications/unsubscribe-auth/AASOSQKFGPR24KZPBZL2J3TYIQOM7AVCNFSM6AAAAAAXMVXAQWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQNBYGMZDKNRUGM . You are receiving this because you were mentioned.Message ID: @.***>

Samconboy commented 4 months ago

Has this been fixed?

Just got stuck for 25gb of data via octoanywhere by switching to webrtc and I believe it could be related

jdobes commented 1 day ago

I'd like to see this fixed too. The solution from PR #91 works but problem is that it still requires adding support to clients (Mainsail, Fluidd, Mobileraker...) otherwise the stream will be constantly resetting. For Fluidd there is a workaround where you can set source of the camera stream to be an iframe and the /webrtc endpoint HTML already contains the javascript with ping-pong logic.