Open Timer opened 5 months ago
Similar Issues
(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)
cc @neild @tombergan
r.W.Header().Del("Server") Header doesn't need to be deleted? Because Response creates a new Header each time.
WriteHeader
should block until the headers have been sent, to avoid precisely this type of race. I've looked through this code path a few times now, and I don't see a way for the header map to be touched after WriteHeader
returns, but I might be missing something.
Do you possibly have a middleware layer wrapping the ResponseWriter
that might be deferring the WriteHeader
call to a later time?
If you can run this under the race detector (go build -race
), it might make the failure more reproducible.)
Timed out in state WaitingForInfo. Closing.
(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)
No, we are not wrapping the ResponseWriter
with a middleware that'd delay the write. I'm enabling the race detector in hopes that it prints the information about the access and will report back today!
Got the race! Sharing it here while I continue to evaluate it.
2024-08-09T19:19:38.955255887Z ==================
2024-08-09T19:19:38.955262912Z WARNING: DATA RACE
2024-08-09T19:19:38.955265201Z Write at 0x00c00bfa5740 by goroutine 7675145:
2024-08-09T19:19:38.955267226Z runtime.mapdelete_faststr()
2024-08-09T19:19:38.955269405Z /usr/local/go/src/runtime/map_faststr.go:301 +0x0
2024-08-09T19:19:38.955271096Z net/textproto.MIMEHeader.Del()
2024-08-09T19:19:38.955272820Z /usr/local/go/src/net/textproto/header.go:55 +0x4d
2024-08-09T19:19:38.955274358Z net/http.Header.Del()
2024-08-09T19:19:38.955275804Z /usr/local/go/src/net/http/header.go:81 +0x27
2024-08-09T19:19:38.955277621Z main.main.ProxyUpstream.func14()
2024-08-09T19:19:38.955279964Z /usr/src/https-terminator/middlewares/proxyhttp/http_handler.go:562 +0x17cb
2024-08-09T19:19:38.955281075Z main.main.REDACTED.func15()
2024-08-09T19:19:38.955282521Z /usr/src/https-terminator/middlewares/proxyhttp/REDACTED.go:183 +0xc7e
2024-08-09T19:19:38.955283567Z main.main.REDACTED.func16()
2024-08-09T19:19:38.955284607Z /usr/src/https-terminator/middlewares/proxyhttp/REDACTED.go:26 +0xba4
2024-08-09T19:19:38.955285748Z main.main.REDACTED.func17()
2024-08-09T19:19:38.955286753Z /usr/src/https-terminator/middlewares/proxyhttp/REDACTED.go:24 +0x9cf
2024-08-09T19:19:38.955287988Z main.main.REDACTED.func18()
2024-08-09T19:19:38.955289160Z /usr/src/https-terminator/middlewares/proxyhttp/REDACTED.go:53 +0x2e1
2024-08-09T19:19:38.955290119Z main.main.REDACTED.func19()
2024-08-09T19:19:38.955291243Z /usr/src/https-terminator/middlewares/threats/REDACTED.go:23 +0x273
2024-08-09T19:19:38.955292407Z main.main.REDACTED.func20()
2024-08-09T19:19:38.955295540Z /usr/src/https-terminator/middlewares/proxyhttp/REDACTED.go:8 +0x19c
2024-08-09T19:19:38.955298336Z main.main.REDACTED.func21()
2024-08-09T19:19:38.955299811Z /usr/src/https-terminator/middlewares/proxyhttp/REDACTED.go:17 +0x116
2024-08-09T19:19:38.955301227Z github.com/vercel/proxy/https-terminator/middlewares/proxyhttp.(*server).ServeHTTP()
2024-08-09T19:19:38.955302888Z /usr/src/https-terminator/middlewares/proxyhttp/server.go:201 +0x571
2024-08-09T19:19:38.955305221Z net/http.serverHandler.ServeHTTP()
2024-08-09T19:19:38.955306537Z /usr/local/go/src/net/http/server.go:2943 +0x2a1
2024-08-09T19:19:38.955307491Z net/http.initALPNRequest.ServeHTTP()
2024-08-09T19:19:38.955308475Z /usr/local/go/src/net/http/server.go:3551 +0x35e
2024-08-09T19:19:38.955309699Z net/http.(*initALPNRequest).ServeHTTP()
2024-08-09T19:19:38.955310705Z <autogenerated>:1 +0x7b
2024-08-09T19:19:38.955311715Z net/http.Handler.ServeHTTP-fm()
2024-08-09T19:19:38.955312916Z <autogenerated>:1 +0x67
2024-08-09T19:19:38.955313934Z golang.org/x/net/http2.(*serverConn).runHandler()
2024-08-09T19:19:38.955314966Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:2414 +0x25a
2024-08-09T19:19:38.955315980Z golang.org/x/net/http2.(*serverConn).scheduleHandler.func1()
2024-08-09T19:19:38.955322658Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:2348 +0x5d
2024-08-09T19:19:38.955323910Z
2024-08-09T19:19:38.955325457Z Previous read at 0x00c00bfa5740 by goroutine 7675260:
2024-08-09T19:19:38.955327005Z runtime.mapiterinit()
2024-08-09T19:19:38.955328549Z /usr/local/go/src/runtime/map.go:816 +0x0
2024-08-09T19:19:38.955329933Z golang.org/x/net/http2.(*sorter).Keys()
2024-08-09T19:19:38.955331759Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/http2.go:344 +0x8f
2024-08-09T19:19:38.955333741Z golang.org/x/net/http2.encodeHeaders()
2024-08-09T19:19:38.955334817Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/write.go:340 +0x13d
2024-08-09T19:19:38.955335846Z golang.org/x/net/http2.(*writeResHeaders).writeFrame()
2024-08-09T19:19:38.955337101Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/write.go:217 +0x1cd
2024-08-09T19:19:38.955338182Z golang.org/x/net/http2.(*serverConn).writeFrameAsync()
2024-08-09T19:19:38.955339244Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:890 +0x116
2024-08-09T19:19:38.955340382Z golang.org/x/net/http2.(*serverConn).startFrameWrite.func2()
2024-08-09T19:19:38.955341415Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:1304 +0x6e
2024-08-09T19:19:38.955342347Z
2024-08-09T19:19:38.955343398Z Goroutine 7675145 (running) created at:
2024-08-09T19:19:38.955356990Z golang.org/x/net/http2.(*serverConn).scheduleHandler()
2024-08-09T19:19:38.955358243Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:2348 +0x28a
2024-08-09T19:19:38.955359222Z golang.org/x/net/http2.(*serverConn).processHeaders()
2024-08-09T19:19:38.955360262Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:2070 +0xd30
2024-08-09T19:19:38.955361329Z golang.org/x/net/http2.(*serverConn).processFrame()
2024-08-09T19:19:38.955362448Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:1571 +0x3f9
2024-08-09T19:19:38.955363452Z golang.org/x/net/http2.(*serverConn).processFrameFromReader()
2024-08-09T19:19:38.955364574Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:1509 +0x324
2024-08-09T19:19:38.955365563Z golang.org/x/net/http2.(*serverConn).serve()
2024-08-09T19:19:38.955366580Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:998 +0x17bc
2024-08-09T19:19:38.955367960Z golang.org/x/net/http2.(*Server).serveConn()
2024-08-09T19:19:38.955369001Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:572 +0x1a90
2024-08-09T19:19:38.955370090Z golang.org/x/net/http2.(*Server).ServeConn()
2024-08-09T19:19:38.955371192Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:436 +0x185
2024-08-09T19:19:38.955372317Z golang.org/x/net/http2.ConfigureServer.func1()
2024-08-09T19:19:38.955373408Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:355 +0xbe
2024-08-09T19:19:38.955374353Z net/http.(*conn).serve()
2024-08-09T19:19:38.955375307Z /usr/local/go/src/net/http/server.go:1922 +0x1d41
2024-08-09T19:19:38.955376720Z net/http.(*Server).Serve.func3()
2024-08-09T19:19:38.955377697Z /usr/local/go/src/net/http/server.go:3091 +0x4f
2024-08-09T19:19:38.955378552Z
2024-08-09T19:19:38.955379499Z Goroutine 7675260 (finished) created at:
2024-08-09T19:19:38.955380584Z golang.org/x/net/http2.(*serverConn).startFrameWrite()
2024-08-09T19:19:38.955381555Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:1304 +0x77b
2024-08-09T19:19:38.955382677Z golang.org/x/net/http2.(*serverConn).scheduleFrameWrite()
2024-08-09T19:19:38.955383614Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:1406 +0x3c8
2024-08-09T19:19:38.955384616Z golang.org/x/net/http2.(*serverConn).writeFrame()
2024-08-09T19:19:38.955389980Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:1252 +0x3c7
2024-08-09T19:19:38.955391151Z golang.org/x/net/http2.(*serverConn).serve()
2024-08-09T19:19:38.955392125Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:985 +0x15b1
2024-08-09T19:19:38.955393350Z golang.org/x/net/http2.(*Server).serveConn()
2024-08-09T19:19:38.955394737Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:572 +0x1a90
2024-08-09T19:19:38.955396303Z golang.org/x/net/http2.(*Server).ServeConn()
2024-08-09T19:19:38.955398538Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:436 +0x185
2024-08-09T19:19:38.955399815Z golang.org/x/net/http2.ConfigureServer.func1()
2024-08-09T19:19:38.955401363Z /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:355 +0xbe
2024-08-09T19:19:38.955402833Z net/http.(*conn).serve()
2024-08-09T19:19:38.955404301Z /usr/local/go/src/net/http/server.go:1922 +0x1d41
2024-08-09T19:19:38.955405880Z net/http.(*Server).Serve.func3()
2024-08-09T19:19:38.955407377Z /usr/local/go/src/net/http/server.go:3091 +0x4f
2024-08-09T19:19:38.955408869Z ==================
I may be missing it, but I don't see where WriteHeader
blocks until the write is complete.
The http2 WriteHeader
implementation (golang.org/x/net@v0.26.0/http2/server.go
) uses the writeHeader
impl in responseWriterState
(:2943
).
For 1xx status codes, it calls rws.conn.writeHeaders
:
func (rws *responseWriterState) writeHeader(code int) {
if rws.wroteHeader {
return
}
checkWriteHeaderCode(code)
// Handle informational headers
if code >= 100 && code <= 199 {
// Per RFC 8297 we must not clear the current header map
h := rws.handlerHeader
_, cl := h["Content-Length"]
_, te := h["Transfer-Encoding"]
if cl || te {
h = h.Clone()
h.Del("Content-Length")
h.Del("Transfer-Encoding")
}
rws.conn.writeHeaders(rws.stream, &writeResHeaders{
streamID: rws.stream.id,
httpResCode: code,
h: h,
endStream: rws.handlerDone && !rws.hasTrailers(),
})
return
}
writeHeaders
submits the call to writeFrameFromHandler
(golang.org/x/net@v0.26.0/http2/server.go:2440
). This function does not wait for any signals in the success case:
func (sc *serverConn) writeHeaders(st *stream, headerData *writeResHeaders) error {
sc.serveG.checkNotOn() // NOT on
var errc chan error
if headerData.h != nil {
// If there's a header map (which we don't own), so we have to block on
// waiting for this frame to be written, so an http.Flush mid-handler
// writes out the correct value of keys, before a handler later potentially
// mutates it.
errc = errChanPool.Get().(chan error)
}
if err := sc.writeFrameFromHandler(FrameWriteRequest{
write: headerData,
stream: st,
done: errc,
}); err != nil {
return err
}
if errc != nil {
select {
case err := <-errc:
errChanPool.Put(errc)
return err
case <-sc.doneServing:
return errClientDisconnected
case <-st.cw:
return errStreamClosed
}
}
return nil
}
And the writeFrameFromHandler
impl simply drops it on the wantWriteFrameCh
channel for writing (a buffered channel).
// writeFrameFromHandler sends wr to sc.wantWriteFrameCh, but aborts
// if the connection has gone away.
//
// This must not be run from the serve goroutine itself, else it might
// deadlock writing to sc.wantWriteFrameCh (which is only mildly
// buffered and is read by serve itself). If you're on the serve
// goroutine, call writeFrame instead.
func (sc *serverConn) writeFrameFromHandler(wr FrameWriteRequest) error {
sc.serveG.checkNotOn() // NOT
select {
case sc.wantWriteFrameCh <- wr:
return nil
case <-sc.doneServing:
// Serve loop is gone.
// Client has closed their connection to the server.
return errClientDisconnected
}
}
Perhaps the header map sent in to rws.conn.writeHeaders
needs to clone h
since it all seems to happen async? Or am I missing what causes the blocking write?
cc @neild in case you unsubscribed from the issue after being closed 🙏, hope you don't mind.
Thanks for the race detector output; that was very useful.
The problem is that while writeHeaders usually waits for the write to complete before returning, it stops waiting if the stream is reset by the peer. The race occurs when the header write has already started at the time the peer's reset is processed. (This is the <-st.cw
path in writeHeaders.)
I'm still trying to figure out the right fix; there's a fairly complex set of interactions between the different goroutines involved in writing a frame. Most of this complexity is in service of efficiency, and I suspect the more obvious fixes leave us with all the complexity and none of the efficiency.
If it's of any anecdotal value, for latency/efficiency sake, we'd prefer WriteHeader
to be async so we can continue processing the request without introducing a blocking operation.
Also, I did not catch that errc
was a chan and thus responsible for the blocking operation. Makes sense!
In either case, the most trivial fix appears to be moving the h = h.Clone()
out of the special condition that activates for CL & TE... but perhaps not the most performant.
Go version
go version go1.22.1 linux/arm64
Output of
go env
in your module/workspace:What did you do?
We run a HTTP/2 server that flushes an early headers map (status 103 - http.StatusEarlyHints) using
func (http.ResponseWriter) WriteHeader(statusCode int)
.After calling WriteHeader, we may need to mutate the Header map for the final 2xx response based on information we discover. In our application, we simply clear the Headers after flushing the early hint:
What did you see happen?
After >60 minutes of runtime and millions of requests (this is hard to reproduce), we were faced with this crash:
This appears to be due to
writeFrameAsync
running on a separate goroutine and theencodeHeaders
function doing a naive iteration over the values (vv := h[k]
, line 343 from above).What did you expect to see?
Encoding headers needs to be concurrent-safe now that a server can send and mutate its headers multiple times before sending the final response (e.g. 103, 103, 103, 200).