ossrs / srs

SRS is a simple, high-efficiency, real-time video server supporting RTMP, WebRTC, HLS, HTTP-FLV, SRT, MPEG-DASH, and GB28181.
https://ossrs.io
MIT License

HLS: Fix on_hls and hls_dispose critical zone issue. v5.0.174 v6.0.69 #3781

Closed: winlinvip closed this 10 months ago

winlinvip commented 10 months ago

on_hls and hls_dispose run in two separate coroutines, so they can race. While the on_hls callback is waiting on a slow API server, SRS can switch to the hls_dispose coroutine, which starts cleaning up. By the time the API server processes the slice, the file may already have been removed, which produces the following log:

[2023-08-22 12:03:20.309][WARN][40][x5l48q7b][11] ignore task failed code=4005(HttpStatus)(Invalid HTTP status code) : callback on_hls http://localhost:2024/terraform/v1/hooks/srs/hls : http: post http://localhost:2024/terraform/v1/hooks/srs/hls with {"server_id":"vid-5d7dxn8","service_id":"cu153o7g","action":"on_hls","client_id":"x5l48q7b","ip":"172.17.0.1","vhost":"__defaultVhost__","app":"live","tcUrl":"srt://172.17.0.2/live","stream":"stream-44572-2739617660809856576","param":"secret=1ed8e0ffbc53439c8fc8da30ab8c19f0","duration":4.57,"cwd":"/usr/local/srs-stack/platform","file":"./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts","url":"live/stream-44572-2739617660809856576-1.ts","m3u8":"./objs/nginx/html/live/stream-44572-2739617660809856576.m3u8","m3u8_url":"live/stream-44572-2739617660809856576.m3u8","seq_no":1,"stream_url":"/live/stream-44572-2739617660809856576","stream_id":"vid-0n9zoz3"}, status=500, res=invalid ts file ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts: stat ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts: no such file or directory
thread [40][x5l48q7b]: call() [./src/app/srs_app_hls.cpp:122][errno=11]
thread [40][x5l48q7b]: on_hls() [./src/app/srs_app_http_hooks.cpp:401][errno=11]
thread [40][x5l48q7b]: do_post() [./src/app/srs_app_http_hooks.cpp:638][errno=11]

[error] 2023/08/22 12:03:20.076984 [52][1001] Serve /terraform/v1/hooks/srs/hls failed, err is stat ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts: no such file or directory
invalid ts file ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts
main.handleOnHls.func1.1
    /g/platform/srs-hooks.go:684
main.handleOnHls.func1
    /g/platform/srs-hooks.go:720
net/http.HandlerFunc.ServeHTTP
    /usr/local/go/src/net/http/server.go:2084
net/http.(*ServeMux).ServeHTTP
    /usr/local/go/src/net/http/server.go:2462
net/http.serverHandler.ServeHTTP
    /usr/local/go/src/net/http/server.go:2916
net/http.(*conn).serve
    /usr/local/go/src/net/http/server.go:1966
runtime.goexit
    /usr/local/go/src/runtime/asm_amd64.s:1571

Similarly, when the stream stops, on_hls is called once more to handle the last slice. If the API server is slow at that point, the hls_dispose coroutine runs and calls unpublish again. Because the previous unpublish is still blocked in on_hls, the following misleading log appears:

[2023-08-22 12:03:18.748][INFO][40][6498088c] hls cycle to dispose hls /live/stream-44572-2739617660809856576, timeout=10000000ms
[2023-08-22 12:03:18.752][WARN][40][6498088c][115] flush audio ignored, for segment is not open.
[2023-08-22 12:03:18.752][WARN][40][6498088c][115] ignore the segment close, for segment is not open.

This log does not indicate a real problem, but it is noise that can mislead troubleshooting.

The solution is to add an 'unpublishing' state. While the stream is in the 'unpublishing' state, hls_dispose does not clean up the slices.
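
A minimal sketch of this guard, not the actual SRS code: the class `HlsSketch`, its members, and the helper `slice_survives_slow_hook` are hypothetical names chosen for illustration. The blocking on_hls HTTP call is modeled as a callback, and the coroutine switch is simulated by having that callback invoke `dispose()` while the hook is still pending.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for an HLS muxer. Names are illustrative only.
class HlsSketch {
public:
    // `hook` models the blocking on_hls HTTP callback. In a coroutine
    // runtime, control can switch to the dispose coroutine while it waits.
    explicit HlsSketch(std::function<void(HlsSketch&)> hook)
        : hook_(std::move(hook)) {}

    void add_segment(const std::string& file) { segments_.insert(file); }
    bool has_segment(const std::string& file) const { return segments_.count(file) > 0; }
    int unpublish_runs() const { return unpublish_runs_; }

    void on_unpublish() {
        if (!enabled_) return;      // already unpublished
        if (unpublishing_) return;  // re-entered while the hook is pending
        unpublishing_ = true;
        ++unpublish_runs_;
        hook_(*this);               // may "yield" to dispose()
        enabled_ = false;
        unpublishing_ = false;
    }

    // Called by the dispose coroutine after the idle timeout.
    void dispose() {
        if (unpublishing_) return;  // hook still in flight: keep the slices
        on_unpublish();
        assert(!enabled_);          // slices are removed only after unpublish is done
        segments_.clear();
    }

private:
    std::function<void(HlsSketch&)> hook_;
    std::set<std::string> segments_;
    bool enabled_ = true;
    bool unpublishing_ = false;
    int unpublish_runs_ = 0;
};

// Simulates a slow API server: dispose fires while the on_hls hook is pending.
// Returns true if the last slice still exists when the hook looks for it.
bool slice_survives_slow_hook() {
    bool seen = false;
    HlsSketch hls([&seen](HlsSketch& h) {
        h.dispose();                        // coroutine switch during the hook
        seen = h.has_segment("live-1.ts");  // the API server stats the file
    });
    hls.add_segment("live-1.ts");
    hls.on_unpublish();
    return seen;
}
```

In this sketch, `dispose()` returns early while the flag is set, so the slice is still present when the hook inspects it and the unpublish body runs only once. A later `dispose()` call, made after the hook has returned, clears the slices as usual.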




Co-authored-by: Haibo Chen <495810242@qq.com>