x/net/http2: raciness in stream close prevents closing of failed connections, blocking subsequent requests #59690

Open btasker opened 1 year ago

btasker commented 1 year ago

What version of Go are you using (go version)?

go version go1.20.3 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env

GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.20.3"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build2968508177=/tmp/go-build -gno-record-gcc-switches"

What did you do?

A connection is established to a remote server (using HTTP/2 where available) and requests are sent periodically over this connection - it's essentially a write pipeline working through a queue sequentially.

This being the internet, sometimes bad things happen to individual connections and packets go awry.

If a timeout occurs, the connection will (well, should) be closed:

resp, err := client.Do(req.WithContext(ctx))
if err != nil {
    // Attempt to tear down the connection
    fmt.Println("Timed out")
    client.CloseIdleConnections()
}

(that CloseIdleConnections() call is sourced from https://github.com/golang/go/issues/36026)

What did you expect to see?

On the next request (generally a retry), a new connection should be established (if possible) and writes should resume.

What did you see instead?

The underlying TCP connection does not get closed - the kernel sits retransmitting the most recent packets until the net.ipv4.tcp_retries2 limit is reached (around 17 minutes later with default settings).

Subsequent requests all attempt to use the dead connection and so time out.

Repro

The following can be used to help repro the behaviour

package main

import (
    "context"
    "fmt"
    "net/http"
    "strings"
    "time"
)

func main() {

    // The url to place requests to
    url := "https://www.bentasker.co.uk/"

    // Create an empty context
    ctx := context.Background()

    // The string used as the request body
    bodystring := "foo\nbar\nsed"

    // build the client
    client := http.Client{
        Timeout: 5 * time.Second,
    }

    // Set up an infinite loop to place requests from
    for {
        // Create a fresh io.Reader each iteration - a single reader
        // would be drained after the first request
        myReader := strings.NewReader(bodystring)

        req, err := http.NewRequest("POST", url, myReader)
        if err != nil {
            fmt.Println("failed to build request:", err)
        }
        req.Header.Set("Content-Type", "text/plain; charset=utf-8")

        resp, err := client.Do(req.WithContext(ctx))

        if err != nil {
            // Attempt to tear down the connection
            fmt.Println("Timed out")
            client.CloseIdleConnections()
        } else {
            fmt.Println(resp.StatusCode)
        }

        time.Sleep(time.Second)
    }
}

If you build and run that, you should see 403 printed on a new line every second or so (the 403 itself isn't pertinent here; it's just an easy response to generate).

Check which source port is being used

root@ratchett:/home/ben/tmp/go_repro# netstat -anp | grep main
tcp        0   2185 192.168.3.217:50446     143.244.38.137:443      ESTABLISHED 2389361/./main   

and then block it in iptables to break the return path

iptables -I INPUT -p tcp --dport 50446 -j DROP

The script should stop printing 403s and after around 5s will print Timed out.

Although the code calls CloseIdleConnections(), it'll silently fail to do anything to the connection (more on that below).

Subsequent attempts will also print Timed out, and re-running netstat will show that same connection still in an ESTABLISHED state. A packet capture will show retransmission of whatever packet was interrupted by adding the rule.

If you remove the iptables rule once the remote server starts sending resets, the kernel will tear down the connection and requests will go back to behaving as they should.


Hacking it out

It is actually possible to work around this at the caller's level, although it's not exactly pleasant to look at.

        if err != nil {
            // Attempt to tear down the connection
            fmt.Println("Timed out")

            // prompt the transport into marking the connection as not
            // being reusable
            req.Header.Set("Connection", "close")
            client.Do(req.WithContext(ctx))

            // this is no longer required
            // client.CloseIdleConnections()
        } else {
            fmt.Println(resp.StatusCode)
        }

This works because net/http2 specifically checks for a Connection: close header before placing a request, and sets doNotReuse on the ClientConn, so when the next request comes around, the connection won't be used even if it is otherwise considered still active.
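
As an aside, setting req.Close = true on the retry request should achieve the same thing, since (as far as I can tell) the transport treats a request with Close set exactly like one carrying a Connection: close header:

        if err != nil {
            // Attempt to tear down the connection
            fmt.Println("Timed out")

            // mark the retry so the transport treats it like a
            // Connection: close request
            req.Close = true
            client.Do(req.WithContext(ctx))
        }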


Underlying Issue

The underlying issue seems to be that there's some raciness between normal stream closure and closeIdleConnections().

The crucial check is here

    if len(cc.streams) > 0 || cc.streamsReserved > 0 {
        cc.mu.Unlock()
        return
    }

If the map cc.streams contains any entries, the connection isn't considered idle and therefore can't be closed.

Dropping some prints in as a quick test shows it's this conditional we're failing at

@@ -7984,6 +7990,8 @@ func (cc *http2ClientConn) forceCloseConn() {
 func (cc *http2ClientConn) closeIfIdle() {
        cc.mu.Lock()
        if len(cc.streams) > 0 || cc.streamsReserved > 0 {
+                fmt.Println("Refusing to close - streams:", len(cc.streams), " reserved:", cc.streamsReserved)
+
                cc.mu.Unlock()
                return
        }

Gives

Refusing to close - streams: 1  reserved: 0

net/http2 does actually try to tidy up the streams: doRequest calls cleanupWriteRequest, which ultimately drops the broken stream.

Adding some more prints, though, shows that this only tends to happen after the caller has had the opportunity to call closeIdleConnections():

Refusing to close - streams: 1  reserved: 0
Doing cleanup context deadline exceeded
Cleanup stream 5 got err context deadline exceeded
Must close body:  false
Got err Sent headers
Writing reset
Forgetting ID  5
werr  <nil>
closing

The string Forgetting ID was added here.


Raciness

Timings definitely seem to be important here - if the time.Sleep call in the repro is adjusted to wait 5 seconds, you'll sometimes see a string of timeouts followed by writes recovering.

That's not the only raciness though.

Stream closures/timeouts etc are handled by a select statement here

    for {
        select {
        case <-cs.peerClosed:
            return nil
        case <-respHeaderTimer:
            return errTimeout
        case <-respHeaderRecv:
            respHeaderRecv = nil
            respHeaderTimer = nil // keep waiting for END_STREAM
        case <-cs.abort:
            return cs.abortErr
        case <-ctx.Done():
            return ctx.Err()
        case <-cs.reqCancel:
            return errRequestCanceled
        }
    }

Sometimes this is triggered by cs.abort and sometimes by ctx.Done(). This appears to be because the transport also attempts to abort the stream when the context hits its deadline, so it's pot luck which of the two channels the select observes first (I've not dug into this in much depth though).


TL;DR

CloseIdleConnections() relies on closeIfIdle(), which will only close a connection if there are no active streams on it.

However, when a timeout occurs, there may be a delay before the stream is closed (leaving it visible to closeIfIdle()).

By the time the stream closure occurs, the timeout has already been bubbled up to the caller, who may then choose to call CloseIdleConnections() knowing that the connection has failed.

Because the dead connection does not get closed by this call, it remains available for use by the next request.

The result is that requests will fail until the kernel itself gives up on the connection, or a RST or FIN makes it through from the remote end.

btasker commented 1 year ago

I think the simplest way to address this would be to have that select statement check whether the current stream is the only one on the ClientConn and, if so, set doNotReuse.

I don't know whether it's the best method though - HTTP/2's multiplexing complicates things.

At the net/http2 level there doesn't seem to be a way to tell whether a timeout is a connection problem or just that the other end is being slow (or the caller setting an overly low timeout).

As a result, receiving a timeout doesn't necessarily mean that the connection itself is dead: tearing it down might therefore impact working requests transiting the same connection via other streams.

Setting doNotReuse on the connection feels like an acceptable compromise: it allows other streams to carry on unimpeded (or hit their own timeouts), whilst ensuring that a potentially problematic connection is not reused.
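
For illustration, the rough shape of that change would be something like the following (a sketch only, reusing the names from the snippets above rather than the code that actually landed):

        case <-ctx.Done():
            // sketch: the context deadline fired while this stream was
            // still in flight, so mark the ClientConn as not reusable
            // before handing the error back to the caller
            cc.mu.Lock()
            cc.doNotReuse = true
            cc.mu.Unlock()
            return ctx.Err()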

It'd probably be tidier to extend ClientConn to include a hasErrored property and achieve the same end, but I'll link a PR in a minute using doNotReuse - I'm happy to go the other way, but it didn't seem worth the effort until it's confirmed that this is the right path.

ianlancetaylor commented 1 year ago

CC @neild @bradfitz

neild commented 1 year ago

Refusing to reuse a connection that's not successfully handling requests seems reasonable to me.

False positives (a request timing out because of a slow server handler, not a bad connection, say) will result in additional connections being unnecessarily created, which seems less bad than continuing to send requests to an unresponsive connection.

gopherbot commented 1 year ago

Change https://go.dev/cl/486156 mentions this issue: http2: don't reuse connections that are experiencing errors

gopherbot commented 1 year ago

Change https://go.dev/cl/490335 mentions this issue: net/http2: don't re-use connections that are experiencing errors

ianlancetaylor commented 1 year ago

@gopherbot Please open backport to 1.19.

Rationale quoting #60301 (which is for the other open minor release):

When encountered, this is a serious issue: requests are blocked until the kernel gives up on the socket (with default settings on a Linux host, that's around 17 minutes). If the operator restarts the application as a result, data loss may occur.

Although the fix is in x/net v0.10.0, it will not be available to most developers, who likely import net/http (and so rely on the h2 bundle rolled into Go's releases).

The only viable workaround is to disable the H2 client entirely via environment variable.
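
(For reference, that environment switch is GODEBUG=http2client=0, which disables the bundled HTTP/2 client so requests fall back to HTTP/1.1.)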

gopherbot commented 1 year ago

Backport issue(s) opened: #60301 (for 1.20), #60662 (for 1.19).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

gopherbot commented 1 year ago

Change https://go.dev/cl/507395 mentions this issue: http2: revert Transport change from CL 486156

serbrech commented 11 months ago

@btasker is

http2Trans, err := http2.ConfigureTransports(transport)
if err == nil {
   http2Trans.ReadIdleTimeout = time.Duration(ReadIdleTimeout)
}

still a valid workaround for go 1.20.x? (I assume yes since I see the fix planned for 1.20.7 Milestone now) What about 1.21?

Is the ReadIdleTimeout of the http2 transport redundant with https://pkg.go.dev/net#Dialer.KeepAlive?

btasker commented 11 months ago

Hey @serbrech

still a valid workaround for go 1.20.x? (I assume yes since I see the fix planned for 1.20.7 Milestone now) What about 1.21?

Yep, still valid for both.

Is the ReadIdleTimeout of the http2 transport redundant with https://pkg.go.dev/net#Dialer.KeepAlive?

I've not checked the interaction, but I suspect it's probably not redundant, no (as the dialer keepalive should be operating at a lower level).
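
For anyone wanting a fuller example, wiring that workaround into a client might look something like the following (a sketch only - the timeout values are arbitrary and should be tuned to your workload):

package main

import (
    "fmt"
    "net/http"
    "time"

    "golang.org/x/net/http2"
)

func newClient() (*http.Client, error) {
    // Clone the default transport so its proxy and TLS defaults are kept
    transport := http.DefaultTransport.(*http.Transport).Clone()

    // ConfigureTransports returns the http2.Transport that will carry
    // HTTP/2 traffic for this http.Transport
    h2, err := http2.ConfigureTransports(transport)
    if err != nil {
        return nil, err
    }

    // Send a health-check PING if no frames are received on a
    // connection for this long...
    h2.ReadIdleTimeout = 10 * time.Second
    // ...and drop the connection if the PING isn't answered in time
    h2.PingTimeout = 5 * time.Second

    return &http.Client{
        Transport: transport,
        Timeout:   5 * time.Second,
    }, nil
}

func main() {
    client, err := newClient()
    if err != nil {
        fmt.Println("failed to configure transport:", err)
        return
    }

    resp, err := client.Get("https://www.bentasker.co.uk/")
    if err != nil {
        fmt.Println("request failed:", err)
        return
    }
    fmt.Println(resp.StatusCode)
}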