AirenSoft / OvenMediaEngine

OvenMediaEngine (OME) is a Sub-Second Latency Live Streaming Server with Large-Scale and High-Definition. #WebRTC #LLHLS
https://airensoft.com/ome.html
GNU Affero General Public License v3.0
2.54k stars 1.06k forks source link

Client connections get disconnected after ~60sec #806

Closed cloudcover2021 closed 2 years ago

cloudcover2021 commented 2 years ago

PROBLEM

Client connections get disconnected after ~60sec from Tester Server which is used for performance tuning.

  1. OME server is running on a GCE(Google Compute Instance) instance. All ports are open on this server
  2. The OME testing tool(https://airensoft.gitbook.io/ovenmediaengine/performance-tuning) is deployed on another instance(Google Compute Instance).
  3. We are publishing using the OvenStreamEncoder app and then starting the load test from the instance mentioned in step 2.
  4. Issue: a. After 60 secs the connection is disconnected from the testing server b. Following logs are observed on the OME var/log/ovenmediaengine/ovenmediaengine.log file
[2022-07-05 07:53:19.681] I [DQHTTPTimer:106990] WebRTC Publisher | webrtc_publisher.cpp:701  | Stop command received : #default#app/stream/109
[2022-07-05 07:53:19.681] I [DQHTTPTimer:106990] Monitor | stream_metrics.cpp:136  | A session has been stopped playing #default#app/stream on the WebRTC publisher. Concurrent Viewers[WebRTC(4)/Stream total(4)/App total(4)]
[2022-07-05 07:53:19.681] I [DQHTTPTimer:106990] Signalling | rtc_signalling_server.cpp:365  | Client is disconnected: HttpConnection(0x7f55b006d100) : WebSocket <ClientSocket: 0x7f55b000f2a0, #40, Connected, TCP, Nonblocking, <TEST SERVER IP>:47316> TLS(Disabled) (#default#app / stream, ufrag: local: 8jZr9c, remote: UuXfEpflHtPaTGPY)
[2022-07-05 07:53:19.682] I [DQHTTPTimer:106990] WebRTC Publisher | webrtc_publisher.cpp:701  | Stop command received : #default#app/stream/107
[2022-07-05 07:53:19.682] I [DQHTTPTimer:106990] Monitor | stream_metrics.cpp:136  | A session has been stopped playing #default#app/stream on the WebRTC publisher. Concurrent Viewers[WebRTC(3)/Stream total(3)/App total(3)]
[2022-07-05 07:53:19.682] I [DQHTTPTimer:106990] Signalling | rtc_signalling_server.cpp:365  | Client is disconnected: HttpConnection(0x7f55b000af90) : WebSocket <ClientSocket: 0x7f55b0001310, #36, Connected, TCP, Nonblocking, <TEST SERVER IP>:47312> TLS(Disabled) (#default#app / stream, ufrag: local: IFS78g, remote: ECEYHWbzApqJKrZp)
[2022-07-05 07:53:19.682] I [DQHTTPTimer:106990] WebRTC Publisher | webrtc_publisher.cpp:701  | Stop command received : #default#app/stream/106
[2022-07-05 07:53:19.683] I [DQHTTPTimer:106990] Monitor | stream_metrics.cpp:136  | A session has been stopped playing #default#app/stream on the WebRTC publisher. Concurrent Viewers[WebRTC(2)/Stream total(2)/App total(2)]
[2022-07-05 07:53:19.683] I [DQHTTPTimer:106990] Signalling | rtc_signalling_server.cpp:365  | Client is disconnected: HttpConnection(0x7f55b0057480) : WebSocket <ClientSocket: 0x7f55800169f0, #34, Connected, TCP, Nonblocking, <TEST SERVER IP>:47310> TLS(Disabled) (#default#app / stream, ufrag: local: mWl6IT, remote: PbjrArjDaEJFlbfX)
[2022-07-05 07:53:19.683] I [DQHTTPTimer:106990] WebRTC Publisher | webrtc_publisher.cpp:701  | Stop command received : #default#app/stream/108
[2022-07-05 07:53:19.684] I [DQHTTPTimer:106990] Monitor | stream_metrics.cpp:136  | A session has been stopped playing #default#app/stream on the WebRTC publisher. Concurrent Viewers[WebRTC(1)/Stream total(1)/App total(1)]
[2022-07-05 07:53:19.684] I [DQHTTPTimer:106990] Signalling | rtc_signalling_server.cpp:365  | Client is disconnected: HttpConnection(0x7f55b0002e10) : WebSocket <ClientSocket: 0x7f55b006b9f0, #38, Connected, TCP, Nonblocking, <TEST SERVER IP>:47314> TLS(Disabled) (#default#app / stream, ufrag: local: EwjOQs, remote: VXXwogzlPMADozJY)
[2022-07-05 07:53:19.684] I [DQHTTPTimer:106990] WebRTC Publisher | webrtc_publisher.cpp:701  | Stop command received : #default#app/stream/105
[2022-07-05 07:53:19.684] I [DQHTTPTimer:106990] Monitor | stream_metrics.cpp:136  | A session has been stopped playing #default#app/stream on the WebRTC publisher. Concurrent Viewers[WebRTC(0)/Stream total(0)/App total(0)]
[2022-07-05 07:53:19.684] I [DQHTTPTimer:106990] Signalling | rtc_signalling_server.cpp:365  | Client is disconnected: HttpConnection(0x7f55b0034010) : WebSocket <ClientSocket: 0x7f55b0033dd0, #32, Connected, TCP, Nonblocking, <TEST SERVER IP>:47308> TLS(Disabled) (#default#app / stream, ufrag: local: njVeZx, remote: dCXIzcoNxvgkyrJC)
[2022-07-05 07:53:19.726] I [SPICE-T3478:107000] ICE | ice_port.cpp:493  | Turn client has disconnected : <ClientSocket: 0x7f5510006550, #39, Closed, TCP, Nonblocking, <TEST SERVER IP>:44680>
[2022-07-05 07:53:19.726] I [SPICE-T3478:107000] ICE | ice_port.cpp:493  | Turn client has disconnected : <ClientSocket: 0x7f54f00080a0, #33, Closed, TCP, Nonblocking, <TEST SERVER IP>:44674>
[2022-07-05 07:53:19.726] I [SPICE-T3478:107000] ICE | ice_port.cpp:493  | Turn client has disconnected : <ClientSocket: 0x7f5580013a00, #35, Closed, TCP, Nonblocking, <TEST SERVER IP>:44676>
[2022-07-05 07:53:19.726] I [SPICE-T3478:107000] ICE | ice_port.cpp:493  | Turn client has disconnected : <ClientSocket: 0x7f5510003300, #37, Closed, TCP, Nonblocking, <TEST SERVER IP>:44678>
[2022-07-05 07:53:19.727] I [SPICE-T3478:107000] ICE | ice_port.cpp:493  | Turn client has disconnected : <ClientSocket: 0x7f55800131d0, #41, Closed, TCP, Nonblocking, <TEST SERVER IP>:44684>

ENVIRONMENTAL INFORMATION

Version used 0.12.4. Deployed on GCE(Google Compute Engine)

OvenStreamEncoder is used for streaming video

Please tell me which address you used to send and play. You can hide IP or Domain.

Ubuntu 20.04

dimiden commented 2 years ago

@cloudcover2021 Thank you for your detailed bug reporting, but I think you are using old version of OME. If there is no specific reason, could you use the latest OME? A ton of bugs have been fixed in latest OME, and it is highly likely that the bugs you mentioned above have already been fixed. If the same symptoms occur with the latest version, please tell us again.

cloudcover2021 commented 2 years ago

@dimiden By latest you mean this one: https://airensoft.gitbook.io/ovenmediaengine/v/master-1/

cloudcover2021 commented 2 years ago

@dimiden I installed the version mentioned above but still same issue. No changes made to configurations. PFB the OME logs:

root@ovenmediaserver-1:/home/sagarpatil/OME/OvenMediaEngine-master/src# cd /var/log/ovenmediaengine/
root@ovenmediaserver-1:/var/log/ovenmediaengine# tail -f ovenmediaengine.log
[2022-07-05 17:16:57.517] I [DQHTTPTimer:104898] WebRTC Publisher | webrtc_publisher.cpp:701  | Stop command received : #default#app/stream/104
[2022-07-05 17:16:57.518] I [DQHTTPTimer:104898] Monitor | stream_metrics.cpp:136  | A session has been stopped playing #default#app/stream on the WebRTC publisher. Concurrent Viewers[WebRTC(4)/Stream total(4)/App total(4)]
[2022-07-05 17:16:57.518] I [DQHTTPTimer:104898] Signalling | rtc_signalling_server.cpp:365  | Client is disconnected: HttpConnection(0x7f12b0041cc0) : WebSocket <ClientSocket: 0x7f12b006bda0, #40, Connected, TCP, Nonblocking, <TEST SERVER IP>:62625> TLS(Disabled) (#default#app / stream, ufrag: local: m2lePK, remote: FlhOhFLYHSlBhtdO)
[2022-07-05 17:16:57.519] I [DQHTTPTimer:104898] WebRTC Publisher | webrtc_publisher.cpp:701  | Stop command received : #default#app/stream/103
[2022-07-05 17:16:57.519] I [DQHTTPTimer:104898] Monitor | stream_metrics.cpp:136  | A session has been stopped playing #default#app/stream on the WebRTC publisher. Concurrent Viewers[WebRTC(3)/Stream total(3)/App total(3)]
[2022-07-05 17:16:57.519] I [DQHTTPTimer:104898] Signalling | rtc_signalling_server.cpp:365  | Client is disconnected: HttpConnection(0x7f12b00452d0) : WebSocket <ClientSocket: 0x7f12b003c720, #38, Connected, TCP, Nonblocking, <TEST SERVER IP>:62622> TLS(Disabled) (#default#app / stream, ufrag: local: qAK5dT, remote: QJufBKAtAkBPacyl)
[2022-07-05 17:16:57.520] I [DQHTTPTimer:104898] WebRTC Publisher | webrtc_publisher.cpp:701  | Stop command received : #default#app/stream/101
[2022-07-05 17:16:57.520] I [DQHTTPTimer:104898] Monitor | stream_metrics.cpp:136  | A session has been stopped playing #default#app/stream on the WebRTC publisher. Concurrent Viewers[WebRTC(2)/Stream total(2)/App total(2)]
[2022-07-05 17:16:57.520] I [DQHTTPTimer:104898] Signalling | rtc_signalling_server.cpp:365  | Client is disconnected: HttpConnection(0x7f12b003cf30) : WebSocket <ClientSocket: 0x7f12b00152f0, #34, Connected, TCP, Nonblocking, <TEST SERVER IP>:62616> TLS(Disabled) (#default#app / stream, ufrag: local: RzOeqQ, remote: EVLJPjufFsjFoUfV)
[2022-07-05 17:16:57.520] I [DQHTTPTimer:104898] WebRTC Publisher | webrtc_publisher.cpp:701  | Stop command received : #default#app/stream/102
[2022-07-05 17:16:57.521] I [DQHTTPTimer:104898] Monitor | stream_metrics.cpp:136  | A session has been stopped playing #default#app/stream on the WebRTC publisher. Concurrent Viewers[WebRTC(1)/Stream total(1)/App total(1)]
[2022-07-05 17:16:57.521] I [DQHTTPTimer:104898] Signalling | rtc_signalling_server.cpp:365  | Client is disconnected: HttpConnection(0x7f12b0047500) : WebSocket <ClientSocket: 0x7f12b004f310, #36, Connected, TCP, Nonblocking, <TEST SERVER IP>:62619> TLS(Disabled) (#default#app / stream, ufrag: local: xgauVR, remote: ClSfVpfEruQceums)
[2022-07-05 17:16:57.521] I [DQHTTPTimer:104898] WebRTC Publisher | webrtc_publisher.cpp:701  | Stop command received : #default#app/stream/100
[2022-07-05 17:16:57.522] I [DQHTTPTimer:104898] Monitor | stream_metrics.cpp:136  | A session has been stopped playing #default#app/stream on the WebRTC publisher. Concurrent Viewers[WebRTC(0)/Stream total(0)/App total(0)]
[2022-07-05 17:16:57.522] I [DQHTTPTimer:104898] Signalling | rtc_signalling_server.cpp:365  | Client is disconnected: HttpConnection(0x7f12b0002090) : WebSocket <ClientSocket: 0x7f12b0001650, #32, Connected, TCP, Nonblocking, <TEST SERVER IP>:62613> TLS(Disabled) (#default#app / stream, ufrag: local: Qi0euj, remote: vOQdsNeLAqLNYZDq)
[2022-07-05 17:16:57.528] I [SPICE-T3478:104908] ICE | ice_port.cpp:493  | Turn client has disconnected : <ClientSocket: 0x7f128c00d390, #39, Closed, TCP, Nonblocking, <TEST SERVER IP>:62624>
[2022-07-05 17:16:57.528] I [SPICE-T3478:104908] ICE | ice_port.cpp:493  | Turn client has disconnected : <ClientSocket: 0x7f128c00f040, #37, Closed, TCP, Nonblocking, <TEST SERVER IP>:62621>
[2022-07-05 17:16:57.528] I [SPICE-T3478:104908] ICE | ice_port.cpp:493  | Turn client has disconnected : <ClientSocket: 0x7f128c0096c0, #35, Closed, TCP, Nonblocking, <TEST SERVER IP>:62618>
[2022-07-05 17:16:57.529] I [SPICE-T3478:104908] ICE | ice_port.cpp:493  | Turn client has disconnected : <ClientSocket: 0x7f128c008250, #33, Closed, TCP, Nonblocking, <TEST SERVER IP>:62615>
[2022-07-05 17:16:57.529] I [SPICE-T3478:104908] ICE | ice_port.cpp:493  | Turn client has disconnected : <ClientSocket: 0x7f128c011080, #41, Closed, TCP, Nonblocking, <TEST SERVER IP>:62626>

** Also I checked the status of the OME service on ubuntu VM after the load test failure and the following was its status. Error highlighted in BOLD:

● ovenmediaengine.service - OvenMediaEngine
     Loaded: loaded (/lib/systemd/system/ovenmediaengine.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2022-07-05 17:12:27 UTC; 6min ago
   Main PID: 104893 (OvenMediaEngine)
      Tasks: 58 (limit: 4697)
     Memory: 50.5M
     CGroup: /system.slice/ovenmediaengine.service
             └─104893 /usr/bin/OvenMediaEngine -d

Jul 05 17:12:27 ovenmediaserver-1 OvenMediaEngine[104893]: [2022-07-05 17:12:27.513] W [OvenMediaEngine:104893] HLS | hls_publisher.cpp:38   | HLS Publisher is disabled by configuration
Jul 05 17:12:27 ovenmediaserver-1 OvenMediaEngine[104893]: [2022-07-05 17:12:27.514] W [OvenMediaEngine:104893] DASH | dash_publisher.cpp:36   | DASH Publisher is disabled by configuration
Jul 05 17:12:27 ovenmediaserver-1 OvenMediaEngine[104893]: [2022-07-05 17:12:27.514] W [OvenMediaEngine:104893] LLDASH | cmaf_publisher.cpp:39   | LLDASH Publisher is disabled by configuration
Jul 05 17:12:27 ovenmediaserver-1 OvenMediaEngine[104893]: [2022-07-05 17:12:27.517] W [OvenMediaEngine:104893] Thumbnail | thumbnail_publisher.cpp:45   | ThumbnailPublisher is disabled by configuration
Jul 05 17:12:27 ovenmediaserver-1 OvenMediaEngine[104893]: [2022-07-05 17:12:27.533] W [OvenMediaEngine:104893] Publisher | publisher.cpp:69   | File publisher is disabled in #default#app application, so it was not created
Jul 05 17:12:27 ovenmediaserver-1 OvenMediaEngine[104893]: [2022-07-05 17:12:27.533] W [OvenMediaEngine:104893] Publisher | publisher.cpp:69   | MPEGTSPush publisher is disabled in #default#app application, so it was not created
Jul 05 17:12:27 ovenmediaserver-1 OvenMediaEngine[104893]: [2022-07-05 17:12:27.533] W [OvenMediaEngine:104893] Publisher | publisher.cpp:69   | RTMPPush publisher is disabled in #default#app application, so it was not created
Jul 05 17:12:27 ovenmediaserver-1 systemd[1]: Started OvenMediaEngine.
*****
Jul 05 17:15:48 ovenmediaserver-1 OvenMediaEngine[104893]: [2022-07-05 17:15:48.772] E [Decaac:106041] FFmpeg | third_parties.cpp:111  | [AVCodecContext: 0x7f127401ea00] Input buffer exhausted before END element found
Jul 05 17:15:48 ovenmediaserver-1 OvenMediaEngine[104893]: [2022-07-05 17:15:48.772] E [Decaac:106041] Transcoder | decoder_aac.cpp:166  | An error occurred while sending a packet for decoding: Unhandled error (-1094995529:Invalid data found when processing input)
*****
cloudcover2021 commented 2 years ago

Is there any configuration to increase the timeout from 60 sec to a higher value.

cloudcover2021 commented 2 years ago

I changed from the OvenStreamEncoder app to OBS(https://obsproject.com/) to start the stream. Same error but this time the status of the OME service on ubuntu VM after the load test failure was below:

● ovenmediaengine.service - OvenMediaEngine
     Loaded: loaded (/lib/systemd/system/ovenmediaengine.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2022-07-05 17:32:49 UTC; 12min ago
    Process: 504 ExecStart=/usr/bin/OvenMediaEngine -d (code=exited, status=0/SUCCESS)
   Main PID: 520 (OvenMediaEngine)
      Tasks: 58 (limit: 4697)
     Memory: 118.6M
     CGroup: /system.slice/ovenmediaengine.service
             └─520 /usr/bin/OvenMediaEngine -d

Jul 05 17:32:48 ovenmediaserver-1 systemd[1]: Starting OvenMediaEngine...
Jul 05 17:32:49 ovenmediaserver-1 OvenMediaEngine[520]: [2022-07-05 17:32:49.109] W [OvenMediaEngine:520] HLS | hls_publisher.cpp:38   | HLS Publisher is disabled by configuration
Jul 05 17:32:49 ovenmediaserver-1 OvenMediaEngine[520]: [2022-07-05 17:32:49.111] W [OvenMediaEngine:520] DASH | dash_publisher.cpp:36   | DASH Publisher is disabled by configuration
Jul 05 17:32:49 ovenmediaserver-1 OvenMediaEngine[520]: [2022-07-05 17:32:49.111] W [OvenMediaEngine:520] LLDASH | cmaf_publisher.cpp:39   | LLDASH Publisher is disabled by configuration
Jul 05 17:32:49 ovenmediaserver-1 OvenMediaEngine[520]: [2022-07-05 17:32:49.134] W [OvenMediaEngine:520] Thumbnail | thumbnail_publisher.cpp:45   | ThumbnailPublisher is disabled by configuration
Jul 05 17:32:49 ovenmediaserver-1 OvenMediaEngine[520]: [2022-07-05 17:32:49.230] W [OvenMediaEngine:520] Publisher | publisher.cpp:69   | File publisher is disabled in #default#app application, so it was not created
Jul 05 17:32:49 ovenmediaserver-1 OvenMediaEngine[520]: [2022-07-05 17:32:49.230] W [OvenMediaEngine:520] Publisher | publisher.cpp:69   | MPEGTSPush publisher is disabled in #default#app application, so it was not created
Jul 05 17:32:49 ovenmediaserver-1 OvenMediaEngine[520]: [2022-07-05 17:32:49.230] W [OvenMediaEngine:520] Publisher | publisher.cpp:69   | RTMPPush publisher is disabled in #default#app application, so it was not created
Jul 05 17:32:49 ovenmediaserver-1 systemd[1]: Started OvenMediaEngine.

No error this time

dimiden commented 2 years ago

@cloudcover2021

By latest you mean this one: https://airensoft.gitbook.io/ovenmediaengine/v/master-1/

Yes. The version I mentioned is master branch or v0.14.2.

Also I checked the status of the OME service on ubuntu VM after the load test failure and the following was its status. Error highlighted in BOLD

Decoding issues related to OvenStreamEncoder have been assigned to the person in charge. (However, this is not a reason.)

Is there any configuration to increase the timeout from 60 sec to a higher value.

Adjusting timeout is not recommended because increasing the timeout will only cause the problem to appear later.

This is a complex issue, so the exact cause should be analyzed by looking at the system/full log/settings/etc. Please attach the following additional information.

P.S. There was a timeout issue when we tested it before, which was a symptom of packet transmission being slowed down little by little at the point of a certain number of times, and many of the connected clients being disconnected as the slow speed accumulated.

dimiden commented 2 years ago

@cloudcover2021 Oh! We have confirmed that this problem is reproduced. I will tell you again when the contents are updated.

getroot commented 2 years ago

@cloudcover2021

Thanks for reporting this issue. It was a websocket stuck issue in OvenRtcTester, I fixed this issue and committed it.

https://github.com/AirenSoft/OvenMediaEngine/commit/4fb4af2a646f053886908d4cd754756d9fb29bf9

cloudcover2021 commented 2 years ago

Issue fixed. Thanks

Keukhan commented 2 years ago

@cloudcover2021

While trying to figure out the cause, I found a problem with the OvenStreamEncoder app and solved it. The cause of the error in AAC Decoder when using OvenStreamEncoder is that OvenStreamEncoder sends an invalid audio packet.
I need to update the app in the google store.

Thank you so much.

Jul 05 17:15:48 ovenmediaserver-1 OvenMediaEngine[104893]: [2022-07-05 17:15:48.772] E [Decaac:106041] FFmpeg | third_parties.cpp:111 | [AVCodecContext: 0x7f127401ea00] Input buffer exhausted before END element found Jul 05 17:15:48 ovenmediaserver-1 OvenMediaEngine[104893]: [2022-07-05 17:15:48.772] E [Decaac:106041] Transcoder | decoder_aac.cpp:166 | An error occurred while sending a packet for decoding: Unhandled error (-1094995529:Invalid data found when processing input)