Closed tyohan closed 4 months ago
The test failed because the udpmux starts a goroutine to listen on the UDP port, and that goroutine is never closed in the test, so the failure is expected. After running your test code locally, I can't find any data-channel-related goroutine in the test report.
3 @ 0x43e54e 0x46dc19 0x46dbf9 0x47ae45 0x873516 0x878434 0x935aba 0x471a01
# 0x46dbf8 sync.runtime_notifyListWait+0x138 /usr/local/go/src/runtime/sema.go:527
# 0x47ae44 sync.(*Cond).Wait+0x84 /usr/local/go/src/sync/cond.go:70
# 0x873515 github.com/pion/sctp.(*Stream).ReadSCTP+0xd5 /go/pkg/mod/github.com/pion/sctp@v1.8.13/stream.go:146
# 0x878433 github.com/pion/datachannel.(*DataChannel).ReadDataChannel+0x53 /go/pkg/mod/github.com/pion/datachannel@v1.5.5/datachannel.go:193
# 0x935ab9 github.com/pion/webrtc/v3.(*DataChannel).readLoop+0xb9 /go/pkg/mod/github.com/pion/webrtc/v3@v3.2.32/datachannel.go:361
2 @ 0x43e54e 0x44e985 0x815c6f 0x9239fc 0x88cd0f 0x471a01
# 0x815c6e github.com/pion/transport/v2/packetio.(*Buffer).Read+0x1ae /go/pkg/mod/github.com/pion/transport/v2@v2.2.4/packetio/buffer.go:267
# 0x9239fb github.com/pion/webrtc/v3/internal/mux.(*Endpoint).Read+0x1b /go/pkg/mod/github.com/pion/webrtc/v3@v3.2.32/internal/mux/endpoint.go:40
# 0x88cd0e github.com/pion/srtp/v2.(*session).start.func1+0xae /go/pkg/mod/github.com/pion/srtp/v2@v2.0.18/session.go:144
It looks like a PeerConnection leak in your code that keeps the SRTP session open.
@cnderrauber thank you for trying my code. This issue might not be directly related to the data channel; it seems to have more to do with the single-port muxer. The SRTP session stays open because the buffer read function is also stuck waiting for either a new packet or the connection to close, and the connection is never closed with a single-port muxer. I'll dig further and see whether this is on my end rather than Pion-related. I will keep this issue updated.
I'd like to close this for now. It seems the bug is caused by not closing a failed peer connection. I now always close a failed peer connection, and the leak seems to be gone. I will keep monitoring on my side to confirm that not closing a failed peer connection is the real cause, and I will reopen this if the goroutine leak happens again.
Thank you.
Your environment.
What did you do?
I saw goroutine leaks in my SFU when using a data channel with a multi UDP mux on a single port. I am able to reproduce this with a test in my fork. The test fails because of the goroutine check that runs after the test.
What did you expect?
The test that I added to reproduce this issue should pass.
The test does not close the single-port muxer. This is expected, because when running an SFU server with a single-port muxer, the ice.NewMultiUDPMuxFromPort mux listener stays open until the SFU is shut down. The test passes when the mux is closed, but not when the mux is kept open. When I check with pprof, the blocked goroutines are listed like this:
Based on this, I traced the issue to sync.(*Cond).Wait(), which never returns even after the peer connection is closed. I assume the mux endpoint does not receive the close event because the mux is never actually closed when using a single-port muxer. I am happy to fix this bug, since it would help me learn the codebase and contribute more to the Pion project, but it would be helpful if someone could point me to where I should look. Thanks