FFmpeg pulls the RTSP stream from the camera and pushes it to SRS. When the user unplugs the camera's network cable, FFmpeg may hang, and the HTTP callback then appears to fail.
This should not happen: FFmpeg runs as a separate process and should not be able to block SRS's HTTP callback.
Reproducing this issue will take some time.
I can confirm I'm seeing this bug as well. Here are my logs:
[2023-07-08 13:37:10.024][Warn][212][6692248x][104] client disconnect peer. ret=1008
[2023-07-08 13:37:10.024][Trace][212][65891n6r] TCP: clear zombies=1 resources, conns=4, removing=0, unsubs=0
[2023-07-08 13:37:10.024][Trace][212][6692248x] TCP: disposing #0 resource(RtmpConn)(0x555e1df0ad80), conns=4, disposing=1, zombies=0
[2023-07-08 13:37:11.081][Trace][212][66y15z81] TCP: before dispose resource(RtmpConn)(0x555e1decde50), conns=3, zombies=0, ign=0, inz=0, ind=0
[2023-07-08 13:37:11.081][Error][212][66y15z81][62] serve error code=1011 : service cycle : rtmp: stream service : rtmp: callback on publish : rtmp on_publish http://127.0.0.1:8085/api/v1/streams/publish : http: on_publish failed, client_id=66y15z81, url=http://127.0.0.1:8085/api/v1/streams/publish, request={"server_id":"vid-gk40023","action":"on_publish","client_id":"66y15z81","ip":"188.70.45.172","vhost":"__defaultVhost__","app":"live","tcUrl":"rtmp://live.eyon.tv/live","stream":"oyo1QizEf-gGc_Yj4IYz","param":""}, response=, code=0 : http: client post : http: parse response : parse message : grow buffer : read bytes : timeout 30000 ms
thread [212][66y15z81]: do_cycle() [src/app/srs_app_rtmp_conn.cpp:217][errno=62]
thread [212][66y15z81]: service_cycle() [src/app/srs_app_rtmp_conn.cpp:414][errno=62]
thread [212][66y15z81]: publishing() [src/app/srs_app_rtmp_conn.cpp:830][errno=62]
thread [212][66y15z81]: http_hooks_on_publish() [src/app/srs_app_rtmp_conn.cpp:1338][errno=62]
thread [212][66y15z81]: on_publish() [src/app/srs_app_http_hooks.cpp:147][errno=62]
thread [212][66y15z81]: do_post() [src/app/srs_app_http_hooks.cpp:505][errno=62]
thread [212][66y15z81]: post() [src/protocol/srs_service_http_client.cpp:349][errno=62]
thread [212][66y15z81]: parse_message() [src/protocol/srs_service_http_conn.cpp:100][errno=62]
thread [212][66y15z81]: parse_message_imp() [src/protocol/srs_service_http_conn.cpp:163][errno=62]
thread [212][66y15z81]: grow() [src/protocol/srs_protocol_stream.cpp:162][errno=62]
thread [212][66y15z81]: read() [src/protocol/srs_service_st.cpp:507][errno=62](Timer expired)
I'm going to investigate on my end, but it looks like a race condition or something similar in the hooks.
OK, upon further investigation: the on_publish hook on my end contains authentication logic that calls another server, and if authentication fails I reject the stream.
It turns out the problem is that some streaming software keeps retrying after being rejected, and that continuous retrying causes a memory leak and an eventual blowup. We can work around this at the firewall level, but it would be great to identify and fix the problem in SRS itself.
Description
With ingest pulling an RTSP stream and http_hooks callbacks enabled, unplugging and re-plugging the camera's network cable leaves the callback unable to complete normally, and streaming fails. Setting http_hooks enabled to off and reloading restores normal streaming.
SRS Version: XCORE-SRS/5.0.152(Bee)
SRS Log:
FFmpeg Log:
SRS Config:
Replay
Please describe how to replay the bug?
Expect
After the camera's network cable is unplugged and plugged back in, the callback should work normally and streaming should succeed.