hashicorp / go-retryablehttp

Retryable HTTP client in Go
Mozilla Public License 2.0
1.99k stars 251 forks source link

Asynchronous Connection Closing Causes Panic in HTTP/2 #229

Open LeonanCarvalho opened 4 months ago

LeonanCarvalho commented 4 months ago

Versions:

Description: I encountered a panic related to HTTP/2 connections while using the go-retryablehttp library. The panic message is as follows:

panic: id (4) <= evictCount (10)
goroutine 40543 [running]:
vendor/golang.org/x/net/http2/hpack.(*headerFieldTable).idToIndex(0xc000466700, 0xc0007e7320?)
    /usr/local/go/src/vendor/golang.org/x/net/http2/hpack/tables.go:118 +0xbd
vendor/golang.org/x/net/http2/hpack.(*headerFieldTable).search(0xc000466700, {{0x13f4875, 0x6}, {0xc000c0a410, 0x10}, 0x0})
    /usr/local/go/src/vendor/golang.org/x/net/http2/hpack/tables.go:105 +0xe5
vendor/golang.org/x/net/http2/hpack.(*Encoder).searchTable(0xc000466700, {{0x13f4875, 0x6}, {0xc000c0a410, 0x10}, 0x0})
    /usr/local/go/src/vendor/golang.org/x/net/http2/hpack/encode.go:97 +0x85
vendor/golang.org/x/net/http2/hpack.(*Encoder).WriteField(0xc000466700, {{0x13f4875, 0x6}, {0xc000c0a410, 0x10}, 0x0})
    /usr/local/go/src/vendor/golang.org/x/net/http2/hpack/encode.go:62 +0x145
net/http.(*http2ClientConn).writeHeader(0xc000baeec0?, {0x13f4875?, 0xffffffffffffffff?}, {0xc000c0a410?, 0xc000743a70?})
    /usr/local/go/src/net/http/h2_bundle.go:9177 +0x148
net/http.(*http2ClientConn).encodeHeaders.func3({0xc000baeec0?, 0xc000743a70?}, {0xc000c0a410, 0x10})
    /usr/local/go/src/net/http/h2_bundle.go:9111 +0x71
net/http.(*http2ClientConn).encodeHeaders.func1(0xc000a3abe0)
    /usr/local/go/src/net/http/h2_bundle.go:9072 +0x627
net/http.(*http2ClientConn).encodeHeaders(0xc000352900, 0xc00069e000, 0x1, {0x0, 0x0}, 0x0)
    /usr/local/go/src/net/http/h2_bundle.go:9104 +0x65e
net/http.(*http2clientStream).encodeAndWriteHeaders(0xc000352a80, 0xc00069e000)
    /usr/local/go/src/net/http/h2_bundle.go:8578 +0x2ef
net/http.(*http2clientStream).writeRequest(0xc000352a80, 0xc00069e000)
    /usr/local/go/src/net/http/h2_bundle.go:8474 +0x5a7
net/http.(*http2clientStream).doRequest(0xc000352a80, 0x0?)
    /usr/local/go/src/net/http/h2_bundle.go:8392 +0x18
created by net/http.(*http2ClientConn).RoundTrip in goroutine 40484

It's something very complicated to reproduce, it seems that the condition is very specific, perhaps you have more experience with the lib to find a common point for correction.

So far my guess is that this issue is related to the asynchronous closing of connections. According to the discussion in the issue golang/go#50027:

CloseIdleConnections closes all idle connections synchronously.

"Closing" a connection marks the connection as closed (sets *ClientConn.closed), so it should never be used for any future requests. It also closes the connection net.Conn. However, closing a connection does not block until the connection read loop returns, and the read loop is what removes the *ClientConn from the connection pool.

So after CloseIdleConnections, some connections may be no longer in use, but still temporarily in the connection pool.

Perhaps closing a connection should block until the connection goroutines have all returned.

Would it be possible to adjust the implementation in go-retryablehttp to ensure that all connection goroutines are finished before removing the connection from the pool? This would help prevent panics like the one mentioned above.

Thank you!

More related issues :

LeonanCarvalho commented 1 month ago

Just to update this issue: still happening on latest go/lib versions