Closed ncw closed 1 year ago
The h2_bundle.go file should have instructions on how it was generated in its header
It does indeed!
If I try
/opt/go/go-tip/src/net/http$ bundle -o=h2_bundle.go -prefix=http2 '-tags=!nethttpomithttp2' golang.org/x/net/http2
It works, but I get a whole lot of diff lines like this showing that it doesn't believe it is part of the net/http
package.
@@ -742,7 +743,7 @@ type http2ClientConnPool interface {
// call, so the caller should not omit it. If the caller needs
// to, ClientConn.RoundTrip can be called with a bogus
// new(http.Request) to release the stream reservation.
- GetClientConn(req *Request, addr string) (*http2ClientConn, error)
+ GetClientConn(req *http.Request, addr string) (*http2ClientConn, error)
MarkDead(*http2ClientConn)
}
I tried mucking around with the -pkg
header but I couldn't make it work.
Next hint?
Try -dst net/http
cc @neild
It looks like this is the correct command line now
@@ -2,7 +2,7 @@
// +build !nethttpomithttp2
// Code generated by golang.org/x/tools/cmd/bundle. DO NOT EDIT.
-// $ bundle -o=h2_bundle.go -prefix=http2 -tags=!nethttpomithttp2 golang.org/x/net/http2
+// $ bundle -dst net/http -o=h2_bundle.go -prefix=http2 -tags=!nethttpomithttp2 golang.org/x/net/http2
// Package http2 implements the HTTP/2 protocol.
//
However I can't get it to check out any other version than latest which is scuppering my plans to try a bisect
/opt/go/go-tip/src/net/http$ bundle -dst net/http -o=h2_bundle.go -prefix=http2 '-tags=!nethttpomithttp2' golang.org/x/net/http2@master
-: can only use path@version syntax with 'go get' and 'go install' in module-aware mode
bundle: failed to load source package
/opt/go/go-tip/src/net/http$ bundle -dst net/http -o=h2_bundle.go -prefix=http2 '-tags=!nethttpomithttp2' golang.org/x/net/http2@c960675eff93b7ce235fa4b1578d83fa48ea129c
-: can only use path@version syntax with 'go get' and 'go install' in module-aware mode
bundle: failed to load source package
Any ideas?
the version should be controlled by src/go.mod
Though it may have been easier for your project to do use http2.ConfigureServer instead of changing the bundled version?
For the benefit of those who come after I used http.ConfigureTransport
to plumb the http2 client code into rclone with
t.TLSNextProto = nil
err := http2.ConfigureTransport(t)
if err != nil {
log.Fatal(err)
}
And used a replace
directive in go.mod
replace golang.org/x/net => /opt/go/x/net
And that make it a whole lot easier to bisect - thanks for the hint @seankhliao
The commit which causes the regression is this one by @neild
Unfortunately it is a large commit...
cedda3a722ddc29587099e7176bb992c7c072a60 is the first bad commit
commit cedda3a722ddc29587099e7176bb992c7c072a60
Author: Damien Neil <dneil@google.com>
Date: Thu Sep 2 13:22:38 2021 -0700
http2: refactor request write flow
Move the entire request write into a new writeRequest function,
which runs as its own goroutine.
The writeRequest function handles all indefintely-blocking
operations (in particular, network writes), as well as all
post-request cleanup: Closing the request body, sending a
RST_STREAM when necessary, releasing the concurrency slot
held by the stream, etc.
Consolidates several goroutines used to wait for stream
slots, write the body, and close response bodies.
Ensures that RoundTrip does not block past request cancelation.
Change-Id: Iaf8bb3e17de89384b031ec4f324918b5720f5877
Reviewed-on: https://go-review.googlesource.com/c/net/+/353390
Trust: Damien Neil <dneil@google.com>
Trust: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Damien Neil <dneil@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
http2/client_conn_pool.go | 1 +
http2/pipe.go | 11 +
http2/transport.go | 965 +++++++++++++++++++++-------------------------
http2/transport_test.go | 99 +++--
4 files changed, 520 insertions(+), 556 deletions(-)
I've been trying to pin down exactly what the slowdown is. It seems that each HTTP request / response has a roughly 5 second time penalty.
I've managed to make a little reproduction of this which you can run yourselves. I put the code in a gist also (along with go.mod and go.sum): https://gist.github.com/ncw/016e5ccdc69000225b03db43e00f8c3a
package main
import (
"flag"
"fmt"
"io"
"log"
"net/http"
"os"
"time"
"golang.org/x/net/http2"
)
func get(url string) {
t0 := time.Now()
resp, err := http.Head(url)
dt := time.Since(t0)
if err != nil {
log.Printf("Failed to fetch %q: %v", url, err)
return
}
defer resp.Body.Close()
_, err = io.Copy(os.Stdout, resp.Body)
if err != nil && err != io.EOF {
log.Printf("Failed to read from %q: %v", url, err)
return
}
fmt.Println()
log.Printf("%q: read OK", url)
log.Printf("%q: request -> response took %v", url, dt)
}
func main() {
flag.Parse()
// Use the specific version of x/net/http2 in go.mod rather
// than the one bundled with the go compiler
t := http.DefaultTransport.(*http.Transport)
t.TLSNextProto = nil
err := http2.ConfigureTransport(t)
if err != nil {
log.Fatal(err)
}
for _, arg := range flag.Args() {
get(arg)
get(arg)
get(arg)
get(arg)
get(arg)
}
}
If you run this at x/net cedda3a722ddc29587099e7176bb992c7c072a60
with the parameter https://rclone-test2.s3.fr-par.scw.cloud/hello.txt
(which is a small public object I made) you get about 16 seconds for the first response and 5 seconds thereafter.
$ ./go-issue-53253 https://rclone-test2.s3.fr-par.scw.cloud/hello.txt
2022/06/07 16:22:16 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": read OK
2022/06/07 16:22:16 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": request -> response took 16.680566276s
2022/06/07 16:22:21 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": read OK
2022/06/07 16:22:21 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": request -> response took 5.094247611s
2022/06/07 16:22:26 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": read OK
2022/06/07 16:22:26 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": request -> response took 5.170931364s
2022/06/07 16:22:32 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": read OK
2022/06/07 16:22:32 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": request -> response took 5.189671016s
2022/06/07 16:22:37 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": read OK
2022/06/07 16:22:37 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": request -> response took 5.085239191s
Whereas if you run it at x/net cedda3a722ddc29587099e7176bb992c7c072a60~1
you get about 400 mS for the first request and 75 mS for each subsequent one.
./go-issue-53253 https://rclone-test2.s3.fr-par.scw.cloud/hello.txt
2022/06/07 16:23:33 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": read OK
2022/06/07 16:23:33 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": request -> response took 408.18085ms
2022/06/07 16:23:33 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": read OK
2022/06/07 16:23:33 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": request -> response took 62.775752ms
2022/06/07 16:23:33 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": read OK
2022/06/07 16:23:33 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": request -> response took 77.093739ms
2022/06/07 16:23:33 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": read OK
2022/06/07 16:23:33 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": request -> response took 82.372166ms
2022/06/07 16:23:33 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": read OK
2022/06/07 16:23:33 "https://rclone-test2.s3.fr-par.scw.cloud/hello.txt": request -> response took 83.287918ms
Hopefully this is enough info for you to reproduce and investigate.
Just turning on GODEBUG highlights the difference:
require golang.org/x/net v0.0.0-20210929193557-e81a3d93ecf6
2022/06/07 18:56:44 http2: Framer 0xc000477ea0: read HEADERS flags=END_HEADERS stream=1 len=310
...
2022/06/07 18:56:44 http2: Transport received HEADERS flags=END_HEADERS stream=1 len=310
require golang.org/x/net v0.0.0-20220607020251-c690dde0001d
2022/06/07 18:45:16 http2: Transport received HEADERS flags=END_HEADERS stream=1 len=310
2022/06/07 18:45:21 http2: Framer 0xc00047fce0: read DATA flags=END_STREAM stream=1 len=0 data=""
2022/06/07 18:45:21 http2: Transport received DATA flags=END_STREAM stream=1 len=0 data=""
It looks more like the current http2 code is correct and expects an END_STREAM on the final HEADERS frame which isn't happening.
When I upgraded to go1.18.3, I also had problems with tls speed, but go1.18.2 was fine
Thanks for the nice reproduction case.
It looks like the problem is, as @fraenkel observes, that the client is waiting for an END_STREAM that the server isn't sending. Since this is a HEAD request, and responses to HEAD requests aren't allowed to contain a body, there's no reason for the server not to immediately send an END_STREAM. I don't know why it isn't.
The Go HTTP/2 client is waiting for the stream to be cleaned up before returning from RoundTrip
:
https://go.googlesource.com/net/+/refs/heads/master/http2/transport.go#1164
I don't remember exactly why this wait is here. It does ensure that in the simple case of a request and response with no body, the request stream has been cleaned up before RoundTrip
returns. However, closing the response body also serves this purpose (Response.Body.Close
blocks until the stream has been cleaned up).
Possible options:
RoundTrip
. This seems like it should be correct, but I'm worried that I might be forgetting something.I'm leaning towards the second option (reset the stream).
Change https://go.dev/cl/411474 mentions this issue: http2: don't wait for END_STREAM after reading HEAD response headers
I just noticed curl
does exactly the same thing with a 5 second delay
$ time curl --http2 -X HEAD -vvv https://rclone-test2.s3.fr-par.scw.cloud/hello.txt
[snip TLS negotiation, then 5 second pause]
< HTTP/2 200
< content-length: 6
< x-amz-id-2: txc30a9b01c3604793bdaf1-0062a32862
< x-amz-meta-mtime: 1654615037.739069812
< last-modified: Tue, 07 Jun 2022 15:17:18 GMT
< etag: "b1946ac92492d2347c6235b4d2611184"
< x-amz-request-id: txc30a9b01c3604793bdaf1-0062a32862
< x-amz-version-id: 1654615038095033
< content-type: text/plain; charset=utf-8
< accept-ranges: bytes
< date: Fri, 10 Jun 2022 11:17:54 GMT
<
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* transfer closed with 6 bytes remaining to read
* stopped the pause stream!
* Connection #0 to host rclone-test2.s3.fr-par.scw.cloud left intact
curl: (18) transfer closed with 6 bytes remaining to read
real 0m5.182s
user 0m0.027s
sys 0m0.011s
$ echo $?
18
$ curl -V
curl 7.81.0 (x86_64-pc-linux-gnu) libcurl/7.81.0 OpenSSL/3.0.2 zlib/1.2.11 brotli/1.0.9 zstd/1.4.8 libidn2/2.3.2 libpsl/0.21.0 (+libidn2/2.3.2) libssh/0.9.6/openssl/zlib nghttp2/1.43.0 librtmp/2.3 OpenLDAP/2.5.11
Release-Date: 2022-01-05
Protocols: dict file ftp ftps gopher gophers http https imap imaps ldap ldaps mqtt pop3 pop3s rtmp rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS brotli GSS-API HSTS HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz NTLM NTLM_WB PSL SPNEGO SSL TLS-SRP UnixSockets zstd
This seems like it might be a bug in curl though as the debug says it is waiting for 6 more bytes (the content-length
) which should not be arriving since this was a HEAD request. The server clearly didn't close the stream so maybe it is arguable.
I'll try reporting this to Scaleway also now I have a curl repro.
Interesting, --head completes in .763s and -X hEAD finishes in 5s.
I think the -X HEAD
results are spurious - they are there with AWS S3 also. Scaleway support pointed out this bug report on curl https://sourceforge.net/p/curl/bugs/694/ - using --head
does not have that 5 second delay.
I made a similar object on AWS s3 if you want to compare https://rclone-2.s3.eu-west-2.amazonaws.com/hello.txt
So I don't think you can reproduce the problem with curl
meaning that whatever infelicities the Scaleway servers may or may not have, curl has a workaround for them, which probably means the Go http2 library should too.
@neild What is the current status here? This issue is currently in the 1.19 milestone. Should it move to 1.20? To Backlog? Thanks.
This can move to 1.20; the condition has existed since 1.18 and is limited to interactions with buggy (IMO) servers. We should still put in a workaround.
Change https://go.dev/cl/415454 mentions this issue: http2: don't wait for streams to close in RoundTrip
It's not obvious to me what the right fix here is, or if we should change anything at all in the Go HTTP/2 client.
I tested a few major websites to see how they respond to HTTP/2 HEAD requests. www.google.com sends a combined END_HEADERS+END_STREAM frame, but www.apache.org, www.amazon.com, and www.microsoft.com all send a frame with END_HEADERS immediately followed by a frame with END_STREAM. Nobody I can find (aside from Scaleway) delays between END_HEADERS and END_STREAM, which I still think seems like a clear bug. (AWS S3 doesn't appear to directly support HTTP/2, by the way, so I can't directly compare it.)
It is important that a request stream be closed by the time we return from Response.Body.Close
.
Things we could do:
Response.Body.Close
if the stream is still active. This requires the user to realize they need to call Response.Body.Close
for HEAD responses, even though the body will always be empty, and will still generate unnecessary stream resets.Regarding curl:
I haven't looked into curl's source to see exactly what it's doing, but it looks like curl --head
reads response headers and immediately exits without waiting for the stream to complete. This is reasonable behavior for a program which performs a single HTTP request and terminates. net/http
, in contrast, can perform many requests on a single HTTP/2 connection.
An HTTP/2 connection can support some number of simultaneous streams, where each stream corresponds to an HTTP request. HTTP/2 clients must take care not to exceed the maximum number of streams permitted on a connection.
An HTTP/2 stream ends either when both endpoints send a frame with the END_STREAM
flag set, or when either endpoint sends a RST_STREAM
frame.
Scaleway's HTTP/2 server is responding to HEAD requests by sending response headers, pausing for several seconds, and finally sending a frame with the END_STREAM
flag set. This leaves the stream open until the END_STREAM
is sent. A client talking to this server needs to count the stream against the connection stream limit until the stream is closed. Sending many HEAD requests in a short interval will either require blocking until old streams are closed or creating a new connection; neither of these options are optimal.
A client could note that the server has finished responding to the HEAD request without closing the stream and send a RST_STREAM
frame to close the stream. However, this is the client working around the server's failure to close the stream. Note that the client has already sent its own END_STREAM
at this point, so the server knows the client isn't going to be sending more data. And choosing to send a RST_STREAM
in this case makes for a worse experience when communicating with non-buggy servers that close streams in a timely fashion.
The closest thing we could do to matching curl's behavior is to return immediately from RoundTrip
without waiting for the stream to close, and to then wait in Response.Body.Close
. A program which sends a HEAD request and does not close the response body will then behave like curl, reading the response headers and exiting immediately with the stream left open. However, this behavior won't help programs that send many HEAD requests before exiting; either they close response bodies and block there, or don't close response bodies and rapidly consume all available stream quota on the connection.
So I don't think it's accurate to say that curl has a workaround, and it is not feasible for net/http
to mimic curl's behavior.
Hello :wave: So indeed it looks like there is an "issue" on our (Scaleway) S3 endpoint. I'll take a look and see what can be done. However, a RST_STREAM
could be sent when the headers arrives (and doesnt have the END_STREAM
flag) on HEAD
(even 204/304 return codes), as it's guarantee to not have any data sent after. Not familiar with go http2 internals, but it might be the fastest (and not that illogic) solution?
The problem with sending a RST_STREAM
when headers arrive with no END_STREAM
is that we'd end up sending a RST_STREAM
for most HEAD requests, because most of the servers I've tested send the END_HEADERS
and END_STREAM
in different frames. That'd be unnecessary network chatter at best.
True! I might have a fix though, still need to dig a bit but it should fix this issue. Can't really say for sure when it'll go to production, but hopefully in the next 2 months!
Do we think this could be the root cause for this nasty gRPC bug[1]?
[1] https://github.com/grpc/grpc-go/issues/5358
cc @neild
Do we think this could be the root cause for this nasty gRPC bug[1]?
AFAIK, the Go gRPC implementation shares no code with net/http
or x/net/http2
, so that seems unlikely.
Hello! This is now fixed on Scaleway's S3 ~(on fr-par and pl-waw endpoints, nl-ams will come in the next weeks)~ It's deployed everywhere now.
nghttp -vvv -H':method: HEAD' https://s3.fr-par.scw.cloud | tail -n 5
; END_STREAM | END_HEADERS
(padlen=0)
; First response header
[ 0.045] send GOAWAY frame <length=8, flags=0x00, stream_id=0>
(last_stream_id=0, error_code=NO_ERROR(0x00), opaque_data(0)=[])
I think the issue can be closed!
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
A user reported a performance regression between rclone v1.57 and v1.58 in the forum.
I ascertained that this wasn't a regression in rclone, but turned out to be a regression in the go standard library from
go1.17
togo1.18
. I managed to bisect the problem to this commit:7109323af5a8c7e076fa2af7185caa6185df97cd is the first bad commit from #48564 and #23559
It turns out that this http2 upgrade tanked the performance of rclone with this HTTP2 server. I haven't measured exactly the performance but it seems at least 10 times slower per file uploaded by rclone. (Edit it seems to add 5 seconds of latency to each HTTP roundtrip - see later).
I'm not really sure why though, and I need some help investigating further. I tried turning on
GODEBUG=http2debug=2
and I couldn't see anything obvious (no error messages). I can post these, but I need to sanitize tokens out of them first.I could bisect the
golang.org/x/net
bundle commit further, but I don't know how to gluex/net
into the go source - a hint here would be appreciated too!