ossrs / srs

SRS is a simple, high-efficiency, real-time media server supporting RTMP, WebRTC, HLS, HTTP-FLV, HTTP-TS, SRT, MPEG-DASH, and GB28181.
https://ossrs.io
MIT License

Origin server cluster configuration error, SRS core dump #3114

Closed WenzhiShao closed 2 years ago

WenzhiShao commented 2 years ago

Note: Before filing an issue, please read the FAQ; see #2716.

**Description**

> Please describe your issue here

  1. SRS Version: 5.0

  2. SRS Log:

    [2022-07-15 12:27:38.295][Trace][8959][8m199219] client finished.
    [2022-07-15 12:27:38.295][Trace][8959][98p183s4] TCP: clear zombies=1 resources, conns=1, removing=0, unsubs=0
    [2022-07-15 12:27:38.295][Trace][8959][8m199219] TCP: disposing #0 resource(HttpStream)(0x556fc0f2f6b0), conns=1, disposing=1, zombies=0
    [2022-07-15 12:27:42.637][Trace][8959][622au47p] Hybrid cpu=0.00%,11MB
    [2022-07-15 12:27:44.798][Trace][8959][it129s36] RTMP client ip=xxx, fd=12
    [2022-07-15 12:27:44.835][Trace][8959][it129s36] complex handshake success
    [2022-07-15 12:27:44.861][Trace][8959][it129s36] connect app, tcUrl=rtmp://localhost/live, pageUrl=, swfUrl=, schema=rtmp, vhost=localhost, port=1935, app=live, args=(obj)
    [2022-07-15 12:27:44.861][Trace][8959][it129s36] edge-srs ip=xxx, version=4.0.253, pid=2496, id=0
    [2022-07-15 12:27:44.861][Trace][8959][it129s36] protocol in.buffer=0, in.ack=0, out.ack=0, in.chunk=128, out.chunk=128
    [2022-07-15 12:27:45.008][Trace][8959][it129s36] client identified, type=flash-publish, vhost=localhost, app=live, stream=123, param=?, duration=0ms
    [2022-07-15 12:27:45.008][Trace][8959][it129s36] connected stream, tcUrl=rtmp://localhost/live, pageUrl=, swfUrl=, schema=rtmp, vhost=__defaultVhost__, port=1935, app=live, stream=123, param=?, args=(obj)
    [2022-07-15 12:27:45.008][Trace][8959][it129s36] new source, stream_url=/live/123
    [2022-07-15 12:27:45.008][Trace][8959][it129s36] source url=/live/123, ip=xxx, cache=1, is_edge=0, source_id=/
    [2022-07-15 12:27:45.008][Trace][8959][it129s36] new source, stream_url=/live/123
    [2022-07-15 12:27:45.008][Trace][8959][it129s36] RTC bridge from RTMP, rtmp2rtc=0, keep_bframe=0, merge_nalus=0
    [2022-07-15 12:27:45.008][Trace][8959][it129s36] hls: win=60000ms, frag=10000ms, prefix=, path=./objs/nginx/html, m3u8=[app]/[stream].m3u8, ts=[app]/[stream]-[seq].ts, aof=2.00, floor=0, clean=1, waitk=1, dispose=0ms, dts_directly=1
    [2022-07-15 12:27:45.008][Trace][8959][it129s36] ignore disabled exec for vhost=__defaultVhost__
    [2022-07-15 12:27:45.008][Trace][8959][it129s36] http: mount flv stream for sid=/live/123, mount=/live/123.flv
    [2022-07-15 12:27:45.008][Trace][8959][it129s36] start publish mr=0/350, p1stpt=20000, pnt=5000, tcp_nodelay=0
    [2022-07-15 12:27:45.035][Trace][8959][it129s36] got metadata, width=1920, height=1080, vcodec=7
    [2022-07-15 12:27:45.035][Trace][8959][it129s36] 44B video sh,  codec(7, profile=Baseline, level=4, 1920x1080, 0kbps, 0.0fps, 0.0s)
    [2022-07-15 12:27:47.637][Trace][8959][622au47p] Hybrid cpu=3.00%,14MB, cid=3,2, timer=63,0,0, clock=0,49,1,0,0,0,0,0,0, free=1, objs=(pkt:0,raw:0,fua:0,msg:40,oth:0,buf:0)
    [2022-07-15 12:27:52.637][Trace][8959][622au47p] Hybrid cpu=2.00%,21MB, cid=3,2, timer=63,0,0, clock=0,49,1,0,0,0,0,0,0, free=1, objs=(pkt:0,raw:0,fua:0,msg:40,oth:0,buf:0)
    [2022-07-15 12:27:53.134][Trace][8959][vcc4u0t9] RTMP client ip=xxxx, fd=14
    [2022-07-15 12:27:53.203][Trace][8959][vcc4u0t9] complex handshake success
    [2022-07-15 12:27:53.229][Trace][8959][vcc4u0t9] connect app, tcUrl=rtmp://xxxxx/live, pageUrl=, swfUrl=, schema=rtmp, vhost=101.200.224.229, port=1935, app=live, args=null
    [2022-07-15 12:27:53.229][Trace][8959][vcc4u0t9] protocol in.buffer=0, in.ack=0, out.ack=0, in.chunk=128, out.chunk=128
    [2022-07-15 12:27:53.394][Trace][8959][vcc4u0t9] ignore AMF0/AMF3 command message.
    [2022-07-15 12:27:53.487][Trace][8959][vcc4u0t9] ignore AMF0/AMF3 command message.
    [2022-07-15 12:27:53.487][Trace][8959][vcc4u0t9] client identified, type=rtmp-play, vhost=xxx, app=live, stream=livestream, param=, duration=-1ms
    [2022-07-15 12:27:53.487][Trace][8959][vcc4u0t9] connected stream, tcUrl=rtmp://xxxxx/live, pageUrl=, swfUrl=, schema=rtmp, vhost=__defaultVhost__, port=1935, app=live, stream=livestream, param=, args=null
    [2022-07-15 12:27:53.487][Trace][8959][vcc4u0t9] new source, stream_url=/live/livestream
    [2022-07-15 12:27:53.487][Trace][8959][vcc4u0t9] source url=/live/livestream, ip=xxxx cache=1, is_edge=0, source_id=/

  3. SRS Config:

    
    # main config for srs.
    # @see full.conf for detail config.

    listen              1935;
    max_connections     1000;
    srs_log_tank        file;
    srs_log_file        ./objs/server.log;
    daemon              on;
    pid                 objs/server.pid;
    http_api {
        enabled         on;
        listen          1985;
    }
    http_server {
        enabled         on;
        listen          8080;
        dir             ./objs/nginx/html;
    }
    rtc_server {
        enabled on;
        listen 8000; # UDP port
        # @see https://github.com/ossrs/srs/wiki/v4_CN_WebRTC#config-candidate
        candidate $CANDIDATE;
    }
    vhost __defaultVhost__ {
        hls {
            enabled         on;
        }
        http_remux {
            enabled     on;
            mount       [vhost]/[app]/[stream].flv;
        }
        rtc {
            enabled     on;
            # @see https://github.com/ossrs/srs/wiki/v4_CN_WebRTC#rtmp-to-rtc
            rtmp_to_rtc off;
            # @see https://github.com/ossrs/srs/wiki/v4_CN_WebRTC#rtc-to-rtmp
            rtc_to_rtmp off;
        }
        cluster {
            mode            local;
            origin_cluster  on;
        }
    }

**Replay**

> Please describe how to replay the bug.

1. Set up a simple edge SRS -> origin SRS -> play SRS chain, publishing the stream through the edge and pulling it from the play SRS.
2. Use Python combined with FFmpeg to push the stream, with a relatively high video resolution.
3. Pull the stream from the play SRS.

**Expect**

> Please describe your expectation

Initially, playback lags, and then the origin SRS crashes.
The MobaXterm resource monitor shows memory being exhausted almost instantly.
The core dump file shows:

    Core was generated by `./objs/srs -c ./myconfig/server.conf'.
    Program terminated with signal SIGSEGV, Segmentation fault.
    #0  0x0000556fbfec8272 in std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::size (this=0x30)
        at /usr/include/c++/7/bits/stl_vector.h:671
    671       { return size_type(this->_M_impl._M_finish - this->_M_impl._M_start); }

The crash is inside the standard library's vector code, so I suspect that memory allocation was exhausted.

**Additional Information**

Python code for streaming:

    import subprocess

    import cv2

    # RTMP server address. The "123" stream name can be changed to any other
    # value, such as 123321/456.
    rtmp = r'rtmp://localhost/live/123'

    # Read the video and get its properties.
    # You can also replace the file path with camera index 0 or an RTSP address
    # for RTSP streaming.
    cap = cv2.VideoCapture('./video/303.mp4')
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    sizeStr = str(size[0]) + 'x' + str(size[1])

    # FFmpeg reads raw BGR frames from stdin, encodes them with libx264,
    # and publishes the result as FLV over RTMP.
    command = ['ffmpeg', '-y', '-an',
               '-f', 'rawvideo', '-vcodec', 'rawvideo', '-pix_fmt', 'bgr24',
               '-s', sizeStr, '-r', '25', '-i', '-',
               '-c:v', 'libx264', '-pix_fmt', 'yuv420p', '-preset', 'ultrafast',
               '-f', 'flv', rtmp]
    pipe = subprocess.Popen(command, shell=False, stdin=subprocess.PIPE)

    while cap.isOpened():
        success, frame = cap.read()
        if not success:
            break  # end of file or read error; otherwise the loop spins forever
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
        pipe.stdin.write(frame.tobytes())  # tobytes() replaces the deprecated tostring()

    cap.release()
    pipe.terminate()



winlinvip commented 2 years ago

What does the core dump's stack look like? Can you run `bt` in gdb to check?

How high is the video bitrate?


WenzhiShao commented 2 years ago

Hello, the stack trace is below. "High bitrate" was a misstatement on my part: after checking, it is an ordinary 1920x1080, 25 fps, 4081 kbps video, which is just higher than the default source.flv file. In my experiments, the issue does not occur when pushing the source.flv file.

    #1  0x000055ec3249e585 in SrsConfig::get_vhost_coworkers (this=0x55ec33e6d300,
        vhost="__defaultVhost__") at src/app/srs_app_config.cpp:4955
    #2  0x000055ec3243f381 in SrsRtmpConn::playing (this=0x55ec33ffd670, source=
        0x55ec34019660) at src/app/srs_app_rtmp_conn.cpp:615
    #3  0x000055ec3243e6f0 in SrsRtmpConn::stream_service_cycle (this=0x55ec33ffd670)
        at src/app/srs_app_rtmp_conn.cpp:532
    #4  0x000055ec3243d5d8 in SrsRtmpConn::service_cycle (this=0x55ec33ffd670)
        at src/app/srs_app_rtmp_conn.cpp:403
    #5  0x000055ec3243c0d8 in SrsRtmpConn::do_cycle (this=0x55ec33ffd670)
        at src/app/srs_app_rtmp_conn.cpp:216
    #6  0x000055ec32445180 in SrsRtmpConn::cycle (this=0x55ec33ffd670)
        at src/app/srs_app_rtmp_conn.cpp:1457
    #7  0x000055ec3247481e in SrsFastCoroutine::cycle (this=0x55ec33ffd800)
        at src/app/srs_app_st.cpp:272
    #8  0x000055ec324748ba in SrsFastCoroutine::pfn (arg=0x55ec33ffd800)
        at src/app/srs_app_st.cpp:287
    #9  0x000055ec3258a969 in _st_thread_main () at sched.c:363
    #10 0x000055ec3258b205 in st_thread_create (
        start=0x55ec3247489a <SrsFastCoroutine::pfn(void*)>, arg=0x55ec33ffd800, joinable=1,
        stk_size=65536) at sched.c:694


winlinvip commented 2 years ago

You enabled the origin cluster with `origin_cluster on;` but did not complete the rest of the origin cluster configuration, which is what causes the problem.

That said, SRS should not core dump here.
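
For reference, a completed `cluster` block might look like the sketch below. It assumes a second origin server whose HTTP API listens on `127.0.0.1:9091`; that address is a placeholder, not something taken from this report, and should be replaced with the API address of the actual peer origin.

```
vhost __defaultVhost__ {
    cluster {
        mode            local;
        origin_cluster  on;
        # Assumed peer origin (HTTP API address); replace with your own.
        coworkers       127.0.0.1:9091;
    }
}
```

If no peer origin exists, setting `origin_cluster off;` is the simpler way to avoid the incomplete configuration.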


winlinvip commented 2 years ago

Confirmed: this is a configuration issue. Enabling the origin cluster with `origin_cluster on;` without configuring `cluster.coworkers` causes a crash, and it reproduces consistently.

    vector<string> SrsConfig::get_vhost_coworkers(string vhost)
    {
        vector<string> coworkers;

        SrsConfDirective* conf = get_vhost(vhost);
        if (!conf) {
            return coworkers;
        }

        conf = conf->get("cluster");
        if (!conf) {
            return coworkers;
        }

        conf = conf->get("coworkers");
        for (int i = 0; i < (int)conf->args.size(); i++) {

The crash happens because the result of `conf->get("coworkers")` is not checked for null before `conf->args` is used.

This issue is essentially a configuration problem, not a common problem, so it will only be fixed in version 5.0.
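
The missing check can be illustrated with a minimal, self-contained sketch. `ConfDirective` and `get_coworkers` below are hypothetical stand-ins written for this example, not the real `SrsConfDirective` or the actual SRS fix; the only assumption carried over from the thread is that `get()` returns a null pointer when the directive is absent. The `this=0x30` in frame #0 is consistent with calling `args.size()` through such a null pointer.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a config directive: a name, its arguments,
// and nested child directives. This is NOT the real SrsConfDirective.
struct ConfDirective {
    std::string name;
    std::vector<std::string> args;
    std::vector<ConfDirective> children;

    // Returns the first child with the given name, or nullptr when absent.
    const ConfDirective* get(const std::string& child) const {
        for (const ConfDirective& c : children) {
            if (c.name == child) {
                return &c;
            }
        }
        return nullptr;
    }
};

// Guarded lookup: every get() result is checked before it is dereferenced,
// so a vhost without cluster.coworkers yields an empty list, not a crash.
std::vector<std::string> get_coworkers(const ConfDirective& vhost) {
    std::vector<std::string> coworkers;

    const ConfDirective* conf = vhost.get("cluster");
    if (!conf) {
        return coworkers;
    }

    conf = conf->get("coworkers");
    if (!conf) {
        return coworkers;  // the check that the quoted snippet lacks
    }

    for (const std::string& arg : conf->args) {
        coworkers.push_back(arg);
    }
    return coworkers;
}
```

With a `cluster` block that contains only `origin_cluster on;`, `get_coworkers` returns an empty vector instead of dereferencing a null pointer.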
