bluenviron / mediamtx

Ready-to-use SRT / WebRTC / RTSP / RTMP / LL-HLS media server and media proxy that allows to read, publish, proxy, record and playback video and audio streams.
MIT License

Leak of goroutines with WHEP sources #3118

Closed (RouquinBlanc closed 4 months ago)

RouquinBlanc commented 4 months ago

Which version are you using?

Issue seen on v1.6.0 and on master after the merge of #3110. Issue not present, as far as I can see, on v1.5.1.

Which operating system are you using?

Describe the issue

The configuration contains a mix of WHEP sources (fewer than 10): 3 connect successfully (the remote endpoints are up) and the others do not.

The number of goroutines related to pion/webrtc rises very quickly into the thousands (it reached 11K in less than 24 hours) - see the attached goroutine dump from pprof.

Describe how to replicate the issue

  1. Start the server with some WHEP paths pointing to nothing. It should start leaking quickly.

Did you attach the server logs?

goroutines.txt

yes, goroutines

Did you attach a network dump?

no

aler9 commented 4 months ago

Hello, I tried replicating the issue, but in my case the number of goroutines remained constant, so there must be some specific configuration combination that triggers the leak. This is my configuration:

paths:
  nonexisting_url:
    source: whep://127.0.0.1:8889/nonexisting/whep

  nonexisting_host:
    source: whep://nonexisting:8889/nonexisting/whep

  working:
    source: whep://127.0.0.1:8889/stream/whep

Can you provide a configuration that triggers the leak?

Also, can you provide a goroutine dump using the integrated pprof server? You need to set pprof: yes in the configuration and post the output of:

go tool pprof -text http://localhost:9999/debug/pprof/goroutine
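
To follow the count over time rather than take a single dump, a small poller can help. The following is a minimal Go sketch, not part of mediamtx. It assumes the pprof listener on localhost:9999 is the standard net/http/pprof handler, which returns a text profile whose first line is "goroutine profile: total N" when queried with ?debug=1. The function names, the sample count and the 10-second interval are illustrative assumptions.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"os"
	"time"
)

// parseGoroutineTotal extracts N from the "goroutine profile: total N"
// header line of a text-format goroutine profile.
func parseGoroutineTotal(header string) (int, error) {
	var total int
	if _, err := fmt.Sscanf(header, "goroutine profile: total %d", &total); err != nil {
		return 0, fmt.Errorf("unexpected header %q: %w", header, err)
	}
	return total, nil
}

// fetchGoroutineTotal downloads the profile and parses its first line.
func fetchGoroutineTotal(client *http.Client, url string) (int, error) {
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	sc := bufio.NewScanner(resp.Body)
	if !sc.Scan() {
		if err := sc.Err(); err != nil {
			return 0, err
		}
		return 0, fmt.Errorf("empty response")
	}
	return parseGoroutineTotal(sc.Text())
}

func main() {
	// Assumed address: the pprof listener used in the command above.
	const url = "http://localhost:9999/debug/pprof/goroutine?debug=1"
	client := &http.Client{Timeout: 5 * time.Second}

	// Take a few samples; a steadily growing number suggests a leak.
	for i := 0; i < 3; i++ {
		total, err := fetchGoroutineTotal(client, url)
		if err != nil {
			fmt.Fprintln(os.Stderr, "fetch failed:", err)
			return
		}
		fmt.Printf("%s goroutines=%d\n", time.Now().Format(time.RFC3339), total)
		time.Sleep(10 * time.Second)
	}
}
```

A count that stays flat across samples matches the "remained constant" case, while a count that grows with every reconnect attempt points to a leak.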

RouquinBlanc commented 4 months ago

Hello, the initial report was far from enough, apologies!

While trying to isolate what's happening, I ended up with the following minimal config:

pprof: yes

paths:
    m400_dltv_whep:
        source: whep://1.2.3.4:8889/m400_dltv/whep

The other side is a mediamtx instance (v1.5.1) on another machine with the following relevant config:

paths:
    m400_dltv:
        source: rtsp://5.6.7.8:5000/dltv1

The camera in question is a shitty one in terms of RTSP. It sends RTP/H264 but:

I noticed that in the mediamtx log I get the following error in a loop, which may hint at where the issue is:

2024/03/09 19:07:52 ERR [path m400_dltv_whep] [WebRTC source] deadline exceeded while waiting tracks
2024/03/09 19:07:58 INF [path m400_dltv_whep] [WebRTC source] peer connection established, local candidate: host/udp/172.16.212.245/58274, remote candidate: host/udp/1.2.3.4/8189
[... again and again ...]

Attached is a dump of goroutines after a few minutes (basically the time it took to write this message): goroutine_3118.txt

If that's not enough, I can try to take an anonymized recording of the video in question, or find a way to craft one.

It looks like it does connect successfully with WHEP, but because the video format is crap, it ends up retrying to connect, and probably misses some cleanup in the process?
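
To illustrate the kind of missed cleanup suspected here, the following is a minimal Go sketch. It is not mediamtx or pion/webrtc code and not the fix from #3124. The `session` type is a hypothetical stand-in for a peer connection that starts a background goroutine, and `waitTracks` simulates the "deadline exceeded while waiting tracks" error. All names and the 10 ms timeout are assumptions made for the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// liveWorkers counts background goroutines that have not exited yet.
var liveWorkers atomic.Int64

// session is a hypothetical stand-in for a peer connection: opening it
// starts a goroutine that only exits when Close is called.
type session struct {
	done chan struct{}
	wg   sync.WaitGroup
}

func openSession() *session {
	s := &session{done: make(chan struct{})}
	liveWorkers.Add(1)
	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		defer liveWorkers.Add(-1)
		<-s.done // simulates a read loop that runs until Close
	}()
	return s
}

// Close stops the background goroutine and waits for it to exit.
func (s *session) Close() {
	close(s.done)
	s.wg.Wait()
}

var errNoTracks = errors.New("deadline exceeded while waiting tracks")

// waitTracks simulates a remote that never delivers usable tracks.
func waitTracks(ctx context.Context) error {
	<-ctx.Done()
	return errNoTracks
}

// attempt performs one connection attempt. With cleanup=false the error
// path returns without closing the session, so its goroutine leaks.
func attempt(cleanup bool) error {
	s := openSession()
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond)
	defer cancel()
	if err := waitTracks(ctx); err != nil {
		if cleanup {
			s.Close()
		}
		return err
	}
	s.Close()
	return nil
}

// retry runs n failing attempts and returns how many workers were left behind.
func retry(n int, cleanup bool) int64 {
	before := liveWorkers.Load()
	for i := 0; i < n; i++ {
		_ = attempt(cleanup)
	}
	return liveWorkers.Load() - before
}

func main() {
	fmt.Println("leaked without cleanup:", retry(5, false))
	fmt.Println("leaked with cleanup:", retry(5, true))
}
```

In this sketch, every failed attempt that skips `Close` leaves one goroutine behind, so a retry loop against a source that never yields tracks grows without bound. Closing the session on the error path keeps the count flat.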

aler9 commented 4 months ago

fixed by #3124