philippseith / signalr

SignalR server and client in go
MIT License
131 stars 39 forks source link

Panic when connecting to old servers (or WebTransports) #193

Closed sruehl closed 7 months ago

sruehl commented 8 months ago

When hitting a server which reports that it wants to use WebTransports signalR brings down the whole process due to a programming error. This error can't be gracefully handled as it is a panic within a Goroutine singalR spawns.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x1f912e1]

goroutine 748 [running]:
github.com/philippseith/signalr.(*client).sendHandshakeRequest(0xc00062e640)
    github.com/philippseith/signalr@v0.6.3/client.go:575 +0x41
github.com/philippseith/signalr.(*client).processHandshake(0x2711400?)
    github.com/philippseith/signalr@v0.6.3/client.go:568 +0x18
github.com/philippseith/signalr.(*client).setupConnectionAndProtocol.func1(0xc00062e640)
    github.com/philippseith/signalr@v0.6.3/client.go:299 +0x137
github.com/philippseith/signalr.(*client).setupConnectionAndProtocol(0x7f6384d16d00?)
    github.com/philippseith/signalr@v0.6.3/client.go:305 +0x13
github.com/philippseith/signalr.(*client).run(0xc00062e640)
    github.com/philippseith/signalr@v0.6.3/client.go:230 +0x1c
github.com/philippseith/signalr.(*client).Start.func1()
    github.com/philippseith/signalr@v0.6.3/client.go:188 +0x11d
created by github.com/philippseith/signalr.(*client).Start in goroutine 43
    github.com/philippseith/signalr@v0.6.3/client.go:171 +0xa6

Edit: the stack above came actually when trying to connect to a old server. However in both cases the panic is the same

sruehl commented 8 months ago

As a note on a possible option to have a handle in the future. There could be a global DeferFunc which you can set and then a User of the library can use this to recover from panics in Goroutines spamed by signalR

sruehl commented 8 months ago

@philippseith this is a bit urgent because as I wrote this brings the process down at the moment. So either merge #192 and I rebase #191 on it or merge #191 directly and get user feedback on potential bugs. At the moment I have a local replace directive on #192 which seems to not always work and when it falls back to current master it crashes my process as described in the OP

sruehl commented 8 months ago

~One thing I’ve noticed could be that the analysis might be wrong because that NPE could also be the result of a missing default branch~ Edit: fixed in latest commit in #192