Closed filipochnik closed 2 years ago
@filipochnik Thank you for sharing. I hit the same issue. Would you mind if I asked you to share your fix?
@yudai I ran a separate goroutine waiting on the session's close channel and then closing the server. Something like:
go func() {
<-ln.(*yamux.Session).CloseChan()
srv.Close()
}()
@filipochnik I see. That's a neat idea that doesn't require changing yamux itself. Thanks!
NetError now implements Temporary and Timeout, so this should be fixed.
Consider a situation when we have a client-server connection and a keepalive on the client side fails. The underlying connection gets closed without sending GoAway message.
Now, on the server side, there's a race who notices the connection got closed first and calls
(*Session).exitErr()
. If(*Session).recv
wins (I think it also applies to(*Session).send
) when(*Session).recvLoop
is in theio.ReadFull
call, the(*Session).exitErr()
is called with "connection reset by peer"net.Error
. That error is written toSession.shutdownErr
and will be returned by(*Session).Accept
on all subsequent calls.And here's the problem. If that server
Session
is used as anet.Listener
withhttp.Server
it will cause an infinite error loop. That's because "connection reset by peer" is a temporary error and those are retried indefinitely, see here.Ideally, all yamux errors should implement
net.Error
and setTemporary()
andTimeout()
appropriately, but that would be a breaking API change. The backward compatible solution is to never return errors that implementnet.Error
because the context where they originally happen and the context where they are returned by yamux differ and lead to erroneous situations like this one. In practical terms, it simply means wrapping errors passed to(*Session).exitErr()
if they are not yamux errors.I can submit a PR for this but I would first like to know if that's something you would accept.