mpromonet / webrtc-streamer

WebRTC streamer for V4L2 capture devices, RTSP sources and Screen Capture
https://webrtcstreamer.agreeabletree-365b9a90.canadacentral.azurecontainerapps.io/?layout=2x2
The Unlicense

100% CPU across eight cores #377

Open gdhgdhgdh opened 3 years ago

gdhgdhgdh commented 3 years ago

Hello,

Host OS: Ubuntu 20.04 LTS, machine: AWS c5a.2xlarge

I'd like to report a behaviour on 0.3.3 and the new 0.3.5 binaries on x86_64 - with the launch command /usr/bin/webrtc-streamer -H 0.0.0.0:8083 -N 1 -n faraway -u preview I am able to cause CPU usage to jump to 100% across all cores, making the system slow to respond.

I recorded perf reports in two scenarios:

good

A single client loads an embedded HTML video and presses Play once. This can run for hours without any problem, and the CPU usage is very modest (perhaps 10% of overall system capacity).

good.txt

bad

Two clients on different tabs of the same client browser reload the page and press 'Play' a couple of times. It never takes more than three attempts to cause the CPU usage to quickly rocket to 100% over the period of about 5 seconds.

Perf output was recorded using perf record -F 99 -a -g -- sleep 40 and it seems like there are enough symbols to provide useful output \o/

bad.txt

The attached outputs were generated from perf report -n --stdio

The bad output seems to be spending a huge amount of time in vp8 encoder code - is it possible to force H.264, purely to see if it reacts better?

The HTML fragment is pretty much a copy + paste from the README.md of this project:

[....]
        <script src="preview/adapter.min.js"></script>
        <script src="preview/webrtcstreamer.js"></script>
[....]

                    <div class="stripSection">
                        <script>
                            window.onload         = function() {
                                this.webRtcServer = new WebRtcStreamer("video", "https://hostname.of.my.service");
                                webRtcServer.connect('faraway','','');
                            }
                            window.onbeforeunload = function() { this.webRtcServer.disconnect() }
                        </script>
                               <video id="video"></video>
                    </div>

mpromonet commented 3 years ago

Hi,

I tried to reproduce the problem you described, without success. I am not sure I understand what you mean by pressing 'Play' a couple of times. Does it mean using the video controls to press pause, then play, then pause, then play...? Monitoring with top -p $(pidof webrtc-streamer) -H, I see 3 EncoderQueue threads per session. After closing all sessions, once the peer connection close is detected, all EncoderQueue threads stop. You may look at the admin.html page to see how many peer connections are active.

Best Regards, Michel.
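
For anyone who wants to track the per-session EncoderQueue thread count mentioned above without watching top, here is a minimal Node.js sketch. It assumes a Linux host where /proc/<pid>/task/<tid>/comm holds each thread's name. The helper names threadNames and countThreads are hypothetical and not part of webrtc-streamer.

```javascript
const fs = require("fs");

// List the thread names of a Linux process by reading /proc/<pid>/task/*/comm.
// Threads that exit while we are reading are skipped silently.
function threadNames(pid) {
  const taskDir = `/proc/${pid}/task`;
  return fs.readdirSync(taskDir).flatMap((tid) => {
    try {
      return [fs.readFileSync(`${taskDir}/${tid}/comm`, "utf8").trim()];
    } catch (err) {
      return [];
    }
  });
}

// Count the threads whose name starts with `prefix`, e.g. "EncoderQueue".
function countThreads(names, prefix) {
  return names.filter((name) => name.startsWith(prefix)).length;
}
```

A call such as countThreads(threadNames(pid), "EncoderQueue"), with pid taken from pidof webrtc-streamer, could be logged once per second while reloading the page to see whether the count keeps growing.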

gdhgdhgdh commented 3 years ago

Rather than me trying (and likely failing) to construct a standalone test case, would it be useful if I provisioned a machine which exhibits the behaviour and simply provided login creds? The /dev/video0 input in this case is the v4l2loopback device, which I have simply named preview and am feeding 1704 x 720 video.

I've been able to reproduce the problem simply by refreshing /webrtcstreamer.html?video=faraway at least four times. The /admin.html will show the correct number of Peers, but as the CPU consumption goes through the roof, the webrtc-streamer process will be culled by the OS and restarted; this is how I've had to fudge around the problem.

mpromonet commented 3 years ago

Hi,

I added some Prometheus metrics to see if I can reproduce your problem. Maybe I don't understand what "refreshing at least four times" means: using F5, or pushing the play/pause control?

Best Regards, Michel.

gdhgdhgdh commented 3 years ago

Excellent, thank you - yes simply by opening one tab to /webrtcstreamer.html and then pressing F5, let the page reload, F5 again... and so on until the /admin.html shows there are five PeerConnections then I will see the CPU consumption race up.

I have been able to reproduce this problem locally on my own laptop and mercifully it's a simple setup:

$ sudo modprobe v4l2loopback
$ v4l2loopback-ctl set-caps "video/x-raw, format=UYVY, width=1280, height=720, framerate=(fraction)30/1" /dev/video10
$ gst-launch-1.0 videotestsrc ! v4l2sink device=/dev/video10

Now I launch cpulimit -f -l 300 -k -- ./webrtc-streamer in another shell - the cpulimit stops my machine from becoming unresponsive

Finally, load http://localhost:8000/webrtcstreamer.html?video=OBS%20Video%20Source and then F5 a few times - cpulimit should kill the webrtc-streamer process before it runs away with your CPU.

mpromonet commented 3 years ago

Hi,

This is strange, because hitting F5 calls window.onbeforeunload, which closes the current WebRTC streams. When I look at admin.html while hitting F5 on webrtcstreamer.html, I always see 1 connection. However, removing the onbeforeunload handler allows several connections to exist until ICE disconnection.

This may depend on the browser you are using?

Best Regards, Michel.
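
If the browser in use does not reliably run the onbeforeunload handler on refresh (an assumption, not something confirmed in this thread), one way to hedge is to hang up on whichever unload-style event fires first. The sketch below is hypothetical: hangUpOnUnload is not part of webrtcstreamer.js, and it only relies on the disconnect() method already used in the HTML fragment above.

```javascript
// Sketch: call streamer.disconnect() once, on the first of "pagehide" or
// "beforeunload" to fire. `target` is normally `window`; `streamer` is any
// object exposing disconnect(), such as a WebRtcStreamer instance.
function hangUpOnUnload(target, streamer) {
  let done = false;
  const hangUp = () => {
    if (done) return; // both events may fire; disconnect only once
    done = true;
    streamer.disconnect();
  };
  target.addEventListener("pagehide", hangUp);
  target.addEventListener("beforeunload", hangUp);
  return hangUp;
}
```

In the page this would be called as hangUpOnUnload(window, webRtcServer) after connect(), in place of assigning window.onbeforeunload directly.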

gdhgdhgdh commented 3 years ago

OK, I'm pleased to say that I've been able to replicate the problem using Firefox 85 either on my own machine, or from a Mac on the same LAN.

The test is just as above - once webrtc-streamer is running, I open a tab for /admin.html, and now I move to the Mac to open

http://10.0.0.123:8000/webrtcstreamer.html?video=OBS%20Video%20Source

As soon as the test-card video appears, I hit COMMAND+R to refresh the page, and I see an additional Peer listed. After 4 or 5 refreshes webrtc-streamer CPU will drag the machine down (if the cpulimit wrapper didn't intervene)

The behaviour is indeed slightly different with both Safari 14.0.3 and Chrome 88 on the Mac; refreshing the page works OK - i.e. NO additional Peers listed - it does successfully hang up each time - I can refresh the screen maybe up to ten times but beyond that, important API calls fail and no more video sessions will be served, e.g.

gdh@gdh-x260:~$ curl http://localhost:8000/api/version
"v0.3.3/Linux-x86_64 civetweb@v1.11 webrtc@ae93edf243-dirty live555helper@b4d86f2"

gdh@gdh-x260:~$ curl http://localhost:8000/api/getPeerConnectionList
[... timeout ...]
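
The same check can be scripted so that a hung server is reported as a timeout instead of blocking the shell. This is a minimal Node.js sketch; it assumes that /api/getPeerConnectionList answers with a JSON array holding one entry per peer, which this thread does not show. The function names are hypothetical.

```javascript
// Parse the body of /api/getPeerConnectionList and return the peer count.
// Assumption: the body is a JSON array with one entry per peer connection.
// Returns null when the body is not a JSON array.
function parsePeerCount(bodyText) {
  try {
    const peers = JSON.parse(bodyText);
    return Array.isArray(peers) ? peers.length : null;
  } catch (err) {
    return null;
  }
}

// Fetch the peer list, aborting after `timeoutMs` milliseconds.
// Resolves to a peer count, or to null on timeout, network error or bad body.
async function countPeers(baseUrl, timeoutMs = 2000, fetchImpl = globalThis.fetch) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(`${baseUrl}/api/getPeerConnectionList`, {
      signal: controller.signal,
    });
    return parsePeerCount(await res.text());
  } catch (err) {
    return null;
  } finally {
    clearTimeout(timer);
  }
}
```

Polling countPeers("http://localhost:8000") while refreshing the page would show both the growing peer count seen with Firefox and the point at which the API stops answering.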

If I CTRL-C the process and try to restart, then port 8000 is still in use:

[...]
answer:"v0.3.3/Linux-x86_64 civetweb@v1.11 webrtc@ae93edf243-dirty live555helper@b4d86f2"
uri:/api/getPeerConnectionList
uri:/api/version
answer:"v0.3.3/Linux-x86_64 civetweb@v1.11 webrtc@ae93edf243-dirty live555helper@b4d86f2"
uri:/api/getPeerConnectionList
^CExiting...
SIGINT
gdh@gdh-x260:~/webrtc-streamer-v0.3.3-Linux-x86_64-Release$ cpulimit -f -l 360 -k -- ./webrtc-streamer
Version:v0.3.3/Linux-x86_64 civetweb@v1.11 webrtc@ae93edf243-dirty live555helper@b4d86f2
nullLogger level:3
[000:000][6980] (audio_device_generic.cc:18): BuiltInAECIsAvailable: Not supported on this platform
[000:000][6980] (audio_device_generic.cc:28): BuiltInAGCIsAvailable: Not supported on this platform
[000:000][6980] (audio_device_generic.cc:38): BuiltInNSIsAvailable: Not supported on this platform
HTTP Listen at 0.0.0.0:8000
cannot bind to 0.0.0.0:8000: 98 (Address already in use)
Cannot Initialize start HTTP server exception:null context when constructing CivetServer. Possible problem binding to port.
Exit
No process found

gdh@gdh-x260:~/webrtc-streamer-v0.3.3-Linux-x86_64-Release$ netstat -plant | grep 8000.*LISTEN
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 0.0.0.0:8000            0.0.0.0:*               LISTEN      6637/./webrtc-strea 

Somehow a child process is being spawned but not reaped successfully?