HLS: Crash when publishing and reloading simultaneously

Note: Please read the FAQ before filing an issue, see #2716

Description

Please describe your issue here

SRS Version: 5.0.170
SRS Log:

[2023-08-19 11:36:36.818][INFO][1][cy9z6u71] new live source, stream_url=/live/stream-21430-8489276324571820827
[2023-08-19 11:36:36.818][INFO][1][cy9z6u71] source url=/live/stream-21430-8489276324571820827, ip=172.17.0.1, cache=1/2500, is_edge=0, source_id=/
[2023-08-19 11:36:36.831][INFO][1][cy9z6u71] http: on_publish ok, client_id=cy9z6u71, url=http://host.docker.internal:2024/terraform/v1/hooks/srs/verify, request={"server_id":"vid-ca06551","service_id":"2i2333g3","action":"on_publish","client_id":"cy9z6u71","ip":"172.17.0.1","vhost":"__defaultVhost__","app":"live","tcUrl":"rtmp://localhost:1935/live","stream":"stream-21430-8489276324571820827","param":"?secret=61ea7c72e9204cb0bbf83e385fbaf9b2","stream_url":"/live/stream-21430-8489276324571820827","stream_id":"vid-q771s27"}, response={"code":0,"data":null,"server":88474}
[2023-08-19 11:36:36.831][INFO][1][cy9z6u71] new rtc source, stream_url=/live/stream-21430-8489276324571820827
[2023-08-19 11:36:36.831][INFO][1][cy9z6u71] RTC bridge from RTMP, rtmp2rtc=1, keep_bframe=0, merge_nalus=0
[2023-08-19 11:36:36.840][INFO][1][cy9z6u71] hls: win=60000ms, frag=10000ms, prefix=, path=./objs/nginx/html, m3u8=[app]/[stream].m3u8, ts=[app]/[stream]-[seq].ts, aof=2.00, floor=0, clean=1, waitk=1, dispose=10000ms, dts_directly=1
[2023-08-19 11:36:36.840][INFO][1][cy9z6u71] ignore disabled exec for vhost=__defaultVhost__
[2023-08-19 11:36:36.841][INFO][1][cy9z6u71] http: mount flv stream for sid=/live/stream-21430-8489276324571820827, mount=/live/stream-21430-8489276324571820827.flv
[2023-08-19 11:36:36.841][INFO][1][cy9z6u71] start publish mr=0/350, p1stpt=20000, pnt=5000, tcp_nodelay=0
[2023-08-19 11:36:36.863][INFO][1][cy9z6u71] got metadata, width=1280, height=720, vcodec=7, acodec=10
[2023-08-19 11:36:36.863][INFO][1][cy9z6u71] 45B video sh,  codec(7, profile=Baseline, level=3.1, 1280x720, 0kbps, 0.0fps, 0.0s)
[2023-08-19 11:36:36.863][INFO][1][cy9z6u71] 7B audio sh, codec(10, profile=LC, 2channels, 0kbps, 44100HZ), flv(16bits, 2channels, 44100HZ)
[2023-08-19 11:36:36.864][INFO][1][cy9z6u71] RTMP2RTC: Init audio codec to 10(AAC)
[2023-08-19 11:36:37.612][INFO][1][48i82096] config parse include containers/data/config/srs.server.conf
[2023-08-19 11:36:37.619][INFO][1][48i82096] config parse complete
[2023-08-19 11:36:37.619][INFO][1][48i82096] config parse include containers/data/config/srs.vhost.conf
[2023-08-19 11:36:37.625][INFO][1][48i82096] config parse complete
[2023-08-19 11:36:37.626][INFO][1][48i82096] config parse complete
[2023-08-19 11:36:37.626][INFO][1][48i82096] srs checking config...
[2023-08-19 11:36:37.626][WARN][1][48i82096][11] stats network use index=0, ip=172.17.0.3, ifname=eth0
[2023-08-19 11:36:37.626][WARN][1][48i82096][11] stats disk not configed, disk iops disabled.
[2023-08-19 11:36:37.626][INFO][1][48i82096] write log to console
[2023-08-19 11:36:37.627][INFO][1][48i82096] reload rtc server success, nothing changed.
[2023-08-19 11:36:37.627][INFO][1][48i82096] vhost __defaultVhost__ maybe modified, reload its detail.
[2023-08-19 11:36:37.649][INFO][1][mo5u8823] HLS: Switch audio codec 16(Other) to 10(AAC)

[2023-08-19 11:36:37.659][ERROR][1][mo5u8823][0] backtrace 18 frames of ./objs/srs SRS/5.0.170(Bee)
AddressSanitizer:DEADLYSIGNAL
=================================================================
==1==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f8e0289e941 bp 0x7f8e02a34588 sp 0x7f8dfc2e1400 T1)
==1==The signal is caused by a READ memory access.
==1==Hint: address points to the zero page.
    #0 0x7f8e0289e940 in abort (/lib/x86_64-linux-gnu/libc.so.6+0x22940)
    #1 0x7f8e0289e728  (/lib/x86_64-linux-gnu/libc.so.6+0x22728)
    #2 0x7f8e028affd5 in __assert_fail (/lib/x86_64-linux-gnu/libc.so.6+0x33fd5)
    #3 0x55e98f7aa811 in SrsCplxError::srs_assert(bool) src/kernel/srs_kernel_error.cpp:446
    #4 0x55e98fa4ec89 in SrsHlsMuxer::is_segment_absolutely_overflow() src/app/srs_app_hls.cpp:535
    #5 0x55e98fa58150 in SrsHlsController::write_audio(SrsAudioFrame*, long) src/app/srs_app_hls.cpp:1008
    #6 0x55e98fa5b772 in SrsHls::on_audio(SrsSharedPtrMessage*, SrsFormat*) src/app/srs_app_hls.cpp:1315
    #7 0x55e98fa1adfe in SrsOriginHub::on_audio(SrsSharedPtrMessage*) src/app/srs_app_source.cpp:966
    #8 0x55e98fa2e82d in SrsLiveSource::on_audio_imp(SrsSharedPtrMessage*) src/app/srs_app_source.cpp:2320
    #9 0x55e98fa2db3f in SrsLiveSource::on_audio(SrsCommonMessage*) src/app/srs_app_source.cpp:2269
    #10 0x55e98fa0a913 in SrsRtmpConn::process_publish_message(SrsLiveSource*, SrsCommonMessage*) src/app/srs_app_rtmp_conn.cpp:1187
    #11 0x55e98fa0a5a0 in SrsRtmpConn::handle_publish_message(SrsLiveSource*, SrsCommonMessage*) src/app/srs_app_rtmp_conn.cpp:1166
    #12 0x55e98fc1fd53 in SrsPublishRecvThread::consume(SrsCommonMessage*) src/app/srs_app_recv_thread.cpp:373
    #13 0x55e98fc1d5b9 in SrsRecvThread::do_cycle() src/app/srs_app_recv_thread.cpp:131
    #14 0x55e98fc1d03a in SrsRecvThread::cycle() src/app/srs_app_recv_thread.cpp:100
    #15 0x55e98fa80e93 in SrsFastCoroutine::cycle() src/app/srs_app_st.cpp:285
    #16 0x55e98fa80fe3 in SrsFastCoroutine::pfn(void*) src/app/srs_app_st.cpp:300
    #17 0x55e98fe320c9 in _st_thread_main /srs/trunk/objs/Platform-SRS5-Linux-5.15.0-GCC9.4.0-x86_64/st-srs/sched.c:380
    #18 0x55e98fe329ef in st_thread_create /srs/trunk/objs/Platform-SRS5-Linux-5.15.0-GCC9.4.0-x86_64/st-srs/sched.c:666
    #19 0x55e990230647  (/usr/local/srs/objs/srs+0xf78647)

SRS Config:

vhost __defaultVhost__ {
    hls {
        enabled on;
        hls_ctx off;
    }

    http_hooks {
        enabled         on;
        on_hls          http://127.0.0.1:2024/terraform/v1/hooks/srs/hls;
    }
}

Replay

Please describe how to replay the bug.

Step 1: Run SRS, with HLS and on_hls callback.

Step 2: Set hls_ctx on; and publish stream.

Step 3: Set the field hls_ctx off; and reload SRS.

Crashed.

To trigger it, sleep 60s in on_hls to simulate a slow API server:

srs_error_t SrsHttpHooks::on_hls(SrsContextId c, string url, SrsRequest* req, string file, string ts_url, string m3u8, string m3u8_url, int sn, srs_utime_t duration) {
    ......
    srs_usleep(60 * SRS_UTIME_SECONDS);

    SrsHttpClient http;
    if ((err = do_post(&http, url, data, status_code, res)) != srs_success) {
        return srs_error_wrap(err, "http: post %s with %s, status=%d, res=%s", url.c_str(), data.c_str(), status_code, res.c_str());
    }
    ......

Expect

No crash.

My reproduction steps:

SRS configuration: srs.conf.ctx_on (hls_ctx is enabled by default)

listen              1935;
max_connections     1000;
srs_log_tank        file;
srs_log_file        ./objs/srs.log;
daemon              on;
http_api {
    enabled         on;
    listen          1985;
    raw_api {
        enabled on;
        allow_reload on;
    }
}
http_server {
    enabled         on;
    listen          8080;
    dir             ./objs/nginx/html;
}
rtc_server {
    enabled on;
    listen 8000; # UDP port
    # @see https://ossrs.net/lts/zh-cn/docs/v4/doc/webrtc#config-candidate
    candidate $CANDIDATE;
}
vhost __defaultVhost__ {
    http_hooks {
        enabled on;
        on_hls http://127.0.0.1:8085/api/v1/hls;
    }
    hls {
        enabled         on;
    }
    http_remux {
        enabled     on;
        mount       [vhost]/[app]/[stream].flv;
    }
    rtc {
        enabled     on;
        # @see https://ossrs.net/lts/zh-cn/docs/v4/doc/webrtc#rtmp-to-rtc
        rtmp_to_rtc on;
        # @see https://ossrs.net/lts/zh-cn/docs/v4/doc/webrtc#rtc-to-rtmp
        rtc_to_rtmp on;
    }

    play {
        gop_cache_max_frames 2500;
    }
}

srs.conf.ctx_off

listen              1935;
max_connections     1000;
srs_log_tank        file;
srs_log_file        ./objs/srs.log;
daemon              on;
http_api {
    enabled         on;
    listen          1985;
    raw_api {
        enabled on;
        allow_reload on;
    }
}
http_server {
    enabled         on;
    listen          8080;
    dir             ./objs/nginx/html;
}
rtc_server {
    enabled on;
    listen 8000; # UDP port
    # @see https://ossrs.net/lts/zh-cn/docs/v4/doc/webrtc#config-candidate
    candidate $CANDIDATE;
}
vhost __defaultVhost__ {
    http_hooks {
        enabled on;
        on_hls http://127.0.0.1:8085/api/v1/hls;
    }
    hls {
        enabled         on;
        hls_ctx         off;
        hls_ts_ctx      off;
    }
    http_remux {
        enabled     on;
        mount       [vhost]/[app]/[stream].flv;
    }
    rtc {
        enabled     on;
        # @see https://ossrs.net/lts/zh-cn/docs/v4/doc/webrtc#rtmp-to-rtc
        rtmp_to_rtc on;
        # @see https://ossrs.net/lts/zh-cn/docs/v4/doc/webrtc#rtc-to-rtmp
        rtc_to_rtmp on;
    }

    play {
        gop_cache_max_frames 2500;
    }
}

Start srs
```
./objs/srs -c conf/srs.conf
```

Start backend server

Make the /api/v1/hls handler wait for 10s before responding, for example:

// handle the hls requests: on_hls stream.
http.HandleFunc("/api/v1/hls", func(w http.ResponseWriter, r *http.Request) {
    if r.Method != "POST" {
        SrsWriteDataResponse(w, struct{}{})
        return
    }

    if err := func() error {
        body, err := ioutil.ReadAll(r.Body)
        if err != nil {
            return fmt.Errorf("read request body, err %v", err)
        }
        log.Printf("post to hls, req=%v", string(body))

        msg := &SrsHlsRequest{}
        if err := json.Unmarshal(body, msg); err != nil {
            return fmt.Errorf("parse message from %v, err %v", string(body), err)
        }
        log.Printf("Got %v", msg.String())

        // Simulate a slow API server so the callback is still pending during reload.
        time.Sleep(10 * time.Second)

        log.Println("after sleep")

        if !msg.IsOnHls() {
            return fmt.Errorf("invalid message %v", msg.String())
        }

        SrsWriteDataResponse(w, &SrsCommonResponse{Code: 0})
        return nil
    }(); err != nil {
        SrsWriteErrorResponse(w, err)
    }
})

Start server
```
cd research/api-server
go run server.go
```

Push stream

ffmpeg -stream_loop -1 -re -i doc/source.200kbps.768x320.flv -c copy -f flv "rtmp://127.0.0.1/live/livestream"

Loop reload
```
#!/bin/bash

i=1

cur_pid=$(pgrep "srs")

while [ $i -le 1000000 ]
do
    let i++
    echo "$i"

    sleep 10

    cp conf/srs.conf.ctx_on conf/srs.conf
    kill -1 $cur_pid

    pid=$(pgrep "srs")
    echo $(date +%T)" reload signo, srs pid=$pid"

    sleep 10

    cp conf/srs.conf.ctx_off conf/srs.conf
    kill -1 $cur_pid

    pid=$(pgrep "srs")
    echo $(date +%T)" reload signo, srs pid=$pid"
done
```


Wait for SRS to crash
