ossrs / srs

SRS is a simple, high-efficiency, real-time video server supporting RTMP, WebRTC, HLS, HTTP-FLV, SRT, MPEG-DASH, and GB28181.
https://ossrs.io
MIT License

WebRTC: Frequently crash when using WebRTC (TCP) #3926

Closed DancingPanda666 closed 3 months ago

DancingPanda666 commented 5 months ago

Describe the bug: SRS is very likely to crash when WebRTC over TCP is used, whereas WebRTC over UDP works without problems.

Version: SRS 5

To Reproduce

  1. Use the SRS built-in web player to play a WebRTC (TCP) stream.
  2. Press F5 to reload the web player several times.
  3. SRS crashes.

Expected behavior: SRS keeps working normally when WebRTC (TCP) is used.

Screenshots image

Additional context

# main config for srs.
# @see full.conf for detail config.
listen              1935;
max_connections     7500;
srs_log_tank        file;
srs_log_file        ./objs/logs/srs.log;
daemon              off;

inotify_auto_reload  on;
auto_reload_for_docker on;

http_api {
    enabled         on;
    listen          1985;
}

http_server {
    enabled         on;
    listen          8080;
    dir             ./objs/nginx/html;
}

stats {
    network         0;
    disk            sda sdb xvda xvdb;
}

rtc_server {
    enabled on;
    listen 8000;
    tcp {
        enabled on;
        listen 8000;
    }
    protocol  tcp;
    candidate $SRS_RTC_SERVER_CANDIDATE;
}

vhost __defaultVhost__ {
    min_latency on;
    tcp_nodelay on;

    publish {
        mr      off;
    }

    play{
        gop_cache on;
        queue_length 5;
        gop_cache_max_frames 2500;
        mw_latency 100;
    }

    rtc {
        enabled     on;
        rtmp_to_rtc on;
        rtc_to_rtmp on;
    }
}

Also, here are some related issues that do not appear to have been resolved yet

winlinvip commented 3 months ago

I believe this bug could be caused by issues between the signaling connection and the media connection, as they use separate TCP connections. The interaction between these two components on the server could lead to crashes.

Identifying a simpler method to reproduce the issue is crucial; if we can reliably replicate the problem, we can address it directly. Conversely, without a stable way to reproduce it, finding a solution becomes much more challenging because we would have to guess.

We recognize the risks associated with using TCP for WebRTC. We are exploring enhanced solutions, similar to the smart pointer concept in C++11. However, at present, there is no definitive solution or 'silver bullet' for this issue. Thus, the most effective strategy remains to reproduce the issue so we can address it directly.

DancingPanda666 commented 3 months ago

Hi, based on previous testing, this is the most efficient way to reproduce the issue:

  1. Start playing 20 WebRTC streams (they can have different stream names) in one page.
  2. Refresh the page 2 to 3 times; SRS is certain to crash.

If a page plays only one WebRTC stream, it has to be refreshed many times, and the bug is not guaranteed to trigger.
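
To make the 20-stream setup repeatable, the stream URLs for the test page can be generated rather than typed by hand. This is only a minimal sketch: the `webrtc://host/app/stream` URL form is the one the SRS player accepts, but the `live` app name, the `stream-NN` names, and the helper `build_stream_urls` are hypothetical placeholders, not something taken from this report.

```python
from typing import List


def build_stream_urls(host: str, count: int = 20, app: str = "live",
                      prefix: str = "stream") -> List[str]:
    """Return `count` distinct webrtc:// URLs, one per player on the test page.

    The app name and stream-name prefix are hypothetical placeholders.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    # Zero-padded suffixes keep the names distinct and sortable.
    return [f"webrtc://{host}/{app}/{prefix}-{i:02d}" for i in range(1, count + 1)]


if __name__ == "__main__":
    # Print the 20 URLs to load into a single page, then refresh that page.
    for url in build_stream_urls("localhost"):
        print(url)
```

Each URL needs a matching published stream before the page is loaded; the page is then refreshed 2 to 3 times as described above.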

winlinvip commented 3 months ago

It would be beneficial if you could simplify the reproduction steps. Starting 20 players to reproduce the problem is quite complex. Using network conditioning tools to introduce loss or delay in the network might allow the issue to be reproduced with just one or two players. Please also try to reproduce it with a publisher: with network conditioning tools, a single publisher, or possibly two, might be enough.

I prefer using the publisher to reproduce the issue because using a player always requires a publisher as well, introducing complexity with two components involved. By reproducing the issue directly through a publisher, there's no need for a player, making the process simpler.

winlinvip commented 3 months ago

Duplicate of #3784