pion / turn

Pion TURN, an API for building TURN clients and servers
MIT License
1.87k stars 319 forks source link

Crash in HandleInbound #346

Closed jech closed 3 months ago

jech commented 1 year ago

I got myself a crash:

Jul 24 10:18:46 galene galene[987606]: panic: runtime error: invalid memory address or nil pointer dereference
Jul 24 10:18:46 galene galene[987606]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x7bebcd]
Jul 24 10:18:46 galene galene[987606]: goroutine 47302 [running]:
Jul 24 10:18:46 galene galene[987606]: github.com/pion/turn/v2.(*Client).HandleInbound(0xc0006c79e0, {0xc000ea4000?, 0xffff?, 0xffff?}, {0xb449c0?, 0xc001705110?})
Jul 24 10:18:46 galene galene[987606]:         /home/jch/go/pkg/mod/github.com/pion/turn/v2@v2.1.2/client.go:489 +0xad
Jul 24 10:18:46 galene galene[987606]: github.com/pion/turn/v2.(*Client).Listen.func1()
Jul 24 10:18:46 galene galene[987606]:         /home/jch/go/pkg/mod/github.com/pion/turn/v2@v2.1.2/client.go:184 +0x94
Jul 24 10:18:46 galene galene[987606]: created by github.com/pion/turn/v2.(*Client).Listen
Jul 24 10:18:46 galene galene[987606]:         /home/jch/go/pkg/mod/github.com/pion/turn/v2@v2.1.2/client.go:175 +0x105

This is github.com/pion/turn/v2 v2.1.2

rg0now commented 1 year ago

Can you please provide a bit more detail? Why I'm asking this is that the segfault occurs at a quite curious place: client.go:489 contains this code:

case from.String() == c.stunServerAddr.String():

Now my understanding is that when client.HandleInbound is called via client.Listen then both from and c.stunServerAddr come from verified code and can never be nil. So I guess you're getting the segfault on some code that calls HandleInbound by itself, but without further context it is impossible to know what went wrong.

I guess you are using galene: maybe you should ask the galene developers?

jech commented 1 year ago

Can you please provide a bit more detail?

Unfortunately not. The log is all I have.

Now my understanding is that [...] can never be nil

After staring at the code for an hour or so, I've come to the same conclusion. I'm as puzzled as you are.

maybe you should ask the galene developers?

I'm the Galene developer :-/

jech commented 1 year ago

Raja Subramanian on Slack:

We were hitting this in LiveKit server too. I checked Sean's partial revert PR (https://github.com/pion/turn/commit/9ebb7c76ea4102ee4150556b257dc043cdb87bb7) which went into this release and noticed that there was a check for empty STUN server before that revert. So, I added this check a couple of days back - https://github.com/pion/turn/commit/5484f25e4c38729930ed8dc20f3e46a383f09fb5 But, I did not dig to figure out if it can be nil. I just did a code compare to previous version and concluded that it must have been possible and that's why the check was there before and it got missed in the revert.

jech commented 1 year ago

Okay, I'm increasingly confused.

pion/ice creates a TURN client here: https://github.com/pion/ice/blob/master/gather.go#L656. If I read the code correctly, this always sets c.stunServerAddr to nil, right?

Is that correct? Shouldn't we be setting stunServerAddr to the address of the TURN server if no STUN server has been specified? Perhaps @Sean-Der can enlighten us.

CC @boks1971

Sean-Der commented 3 months ago

Sorry I didn't follow up on this earlier :/

Fixed with https://github.com/pion/turn/commit/5484f25e4c38729930ed8dc20f3e46a383f09fb5