golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
122.88k stars 17.52k forks source link

net/http: bundled x/net/http2.responseWriter hangs forever #42534

Open dtelyukh opened 3 years ago

dtelyukh commented 3 years ago

What version of Go are you using (go version)?

go version go1.15.4 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/dionysius/.cache/go-build"
GOENV="/home/dionysius/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/dionysius/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/dionysius/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build977072123=/tmp/go-build -gno-record-gcc-switches"

What did you do?

So, let's begin

High-Level Problem Description I'm creating reverse proxy with Caddy Server and my own plugin. I use http/2 and Server Push. Sometimes requests hang forever. Here is screenshot from Chrome DevTools: scr2020-11-12 15-02-06

Low-Level Problem Description So, I started to debug this situation. I found that my code execution stuck at (http.responseWriter).Write(), which is an instance of http2responseWriter. With help of pprof I found that lockup happens in two functions: http2serverConn.writeHeaders and http2serverConn.writeDataFromHandler - endless waiting of data from done channel. Here is an illustration from pprof: out02

Next I built go from source with adding some debug messages and start to dive deeper. I found a problem with frames are sent to output. At this line N frames were pushed: https://github.com/golang/go/blob/go1.15.4/src/net/http/h2_bundle.go#L4692. After push-function scheduleFrameWrite-function is called. I watched into it and found that it often exit here: https://github.com/golang/go/blob/go1.15.4/src/net/http/h2_bundle.go#L4817. And only M (M < N) frames were popped from queue here: https://github.com/golang/go/blob/go1.15.4/src/net/http/h2_bundle.go#L4837

Pushed Frames
2020/11/12 12:12:35 push writeFrame for streamID=0 size=0
2020/11/12 12:12:35 push writeFrame for streamID=119 size=0
2020/11/12 12:12:35 push writeFrame for streamID=26 size=0
2020/11/12 12:12:35 push writeFrame for streamID=119 size=0
2020/11/12 12:12:35 push writeFrame for streamID=26 size=15169
2020/11/12 12:12:35 push writeFrame for streamID=28 size=0
2020/11/12 12:12:35 push writeFrame for streamID=119 size=0
2020/11/12 12:12:35 push writeFrame for streamID=26 size=0
2020/11/12 12:12:35 push writeFrame for streamID=28 size=5566
2020/11/12 12:12:35 push writeFrame for streamID=30 size=0
2020/11/12 12:12:35 push writeFrame for streamID=119 size=0
2020/11/12 12:12:35 push writeFrame for streamID=28 size=0
2020/11/12 12:12:35 push writeFrame for streamID=30 size=77162
2020/11/12 12:12:35 push writeFrame for streamID=32 size=0
2020/11/12 12:12:35 push writeFrame for streamID=119 size=0
2020/11/12 12:12:35 push writeFrame for streamID=32 size=354
2020/11/12 12:12:35 push writeFrame for streamID=30 size=0
2020/11/12 12:12:35 push writeFrame for streamID=119 size=0
2020/11/12 12:12:35 push writeFrame for streamID=34 size=0
2020/11/12 12:12:35 push writeFrame for streamID=119 size=0
2020/11/12 12:12:35 push writeFrame for streamID=34 size=483
2020/11/12 12:12:35 push writeFrame for streamID=119 size=0
2020/11/12 12:12:35 push writeFrame for streamID=38 size=0
2020/11/12 12:12:35 push writeFrame for streamID=36 size=0
2020/11/12 12:12:35 push writeFrame for streamID=38 size=27485
2020/11/12 12:12:35 push writeFrame for streamID=40 size=0
2020/11/12 12:12:35 push writeFrame for streamID=119 size=0
2020/11/12 12:12:35 push writeFrame for streamID=36 size=6726
2020/11/12 12:12:35 push writeFrame for streamID=40 size=6293
2020/11/12 12:12:35 push writeFrame for streamID=119 size=0
2020/11/12 12:12:35 push writeFrame for streamID=42 size=0
2020/11/12 12:12:35 push writeFrame for streamID=119 size=0
2020/11/12 12:12:35 push writeFrame for streamID=119 size=0
2020/11/12 12:12:35 push writeFrame for streamID=44 size=0
2020/11/12 12:12:35 push writeFrame for streamID=42 size=11249
2020/11/12 12:12:35 push writeFrame for streamID=46 size=0
2020/11/12 12:12:35 push writeFrame for streamID=40 size=0
2020/11/12 12:12:35 push writeFrame for streamID=48 size=0
2020/11/12 12:12:35 push writeFrame for streamID=119 size=0
2020/11/12 12:12:35 push writeFrame for streamID=44 size=9293
2020/11/12 12:12:35 push writeFrame for streamID=46 size=2626
2020/11/12 12:12:35 push writeFrame for streamID=48 size=346
2020/11/12 12:12:35 push writeFrame for streamID=119 size=6381
2020/11/12 12:12:35 push writeFrame for streamID=42 size=0
2020/11/12 12:12:35 push writeFrame for streamID=44 size=0
2020/11/12 12:12:35 push writeFrame for streamID=36 size=0
2020/11/12 12:12:35 push writeFrame for streamID=119 size=0
2020/11/12 12:12:35 push writeFrame for streamID=38 size=0
2020/11/12 12:12:36 push writeFrame for streamID=121 size=0
2020/11/12 12:12:36 push writeFrame for streamID=121 size=1021
2020/11/12 12:12:36 push writeFrame for streamID=123 size=0
2020/11/12 12:12:36 push writeFrame for streamID=125 size=0
2020/11/12 12:12:36 push writeFrame for streamID=127 size=0
2020/11/12 12:12:36 push writeFrame for streamID=129 size=0
2020/11/12 12:12:36 push writeFrame for streamID=131 size=0
2020/11/12 12:12:36 push writeFrame for streamID=133 size=0
2020/11/12 12:12:36 push writeFrame for streamID=135 size=0
2020/11/12 12:12:36 push writeFrame for streamID=137 size=0
2020/11/12 12:12:36 push writeFrame for streamID=139 size=0
2020/11/12 12:12:36 push writeFrame for streamID=141 size=0
2020/11/12 12:12:36 push writeFrame for streamID=143 size=0
2020/11/12 12:12:36 push writeFrame for streamID=145 size=0
2020/11/12 12:12:36 push writeFrame for streamID=147 size=0
2020/11/12 12:12:36 push writeFrame for streamID=149 size=0
2020/11/12 12:12:36 push writeFrame for streamID=151 size=0
2020/11/12 12:12:36 push writeFrame for streamID=153 size=0
2020/11/12 12:12:36 push writeFrame for streamID=155 size=0
2020/11/12 12:12:36 push writeFrame for streamID=157 size=0
2020/11/12 12:12:36 push writeFrame for streamID=159 size=0
2020/11/12 12:12:36 push writeFrame for streamID=161 size=0
2020/11/12 12:12:36 push writeFrame for streamID=163 size=0
2020/11/12 12:12:36 push writeFrame for streamID=165 size=0
2020/11/12 12:12:36 push writeFrame for streamID=167 size=0
2020/11/12 12:12:36 push writeFrame for streamID=169 size=0
2020/11/12 12:12:36 push writeFrame for streamID=171 size=0
2020/11/12 12:12:36 push writeFrame for streamID=173 size=0
2020/11/12 12:12:36 push writeFrame for streamID=175 size=0
2020/11/12 12:12:36 push writeFrame for streamID=201 size=0
2020/11/12 12:12:36 push writeFrame for streamID=179 size=0
2020/11/12 12:12:36 push writeFrame for streamID=181 size=0
2020/11/12 12:12:36 push writeFrame for streamID=183 size=0
2020/11/12 12:12:36 push writeFrame for streamID=185 size=0
2020/11/12 12:12:36 push writeFrame for streamID=187 size=0
2020/11/12 12:12:36 push writeFrame for streamID=207 size=0
2020/11/12 12:12:36 push writeFrame for streamID=203 size=0
2020/11/12 12:12:36 push writeFrame for streamID=209 size=0
2020/11/12 12:12:36 push writeFrame for streamID=205 size=0
2020/11/12 12:12:36 push writeFrame for streamID=189 size=0
2020/11/12 12:12:36 push writeFrame for streamID=213 size=0
2020/11/12 12:12:36 push writeFrame for streamID=211 size=0
2020/11/12 12:12:36 push writeFrame for streamID=221 size=0
2020/11/12 12:12:36 push writeFrame for streamID=193 size=0
2020/11/12 12:12:36 push writeFrame for streamID=191 size=0
2020/11/12 12:12:36 push writeFrame for streamID=197 size=0
2020/11/12 12:12:36 push writeFrame for streamID=199 size=0
2020/11/12 12:12:36 push writeFrame for streamID=195 size=0
2020/11/12 12:12:36 push writeFrame for streamID=215 size=0
2020/11/12 12:12:36 push writeFrame for streamID=217 size=0
2020/11/12 12:12:36 push writeFrame for streamID=223 size=0
2020/11/12 12:12:36 push writeFrame for streamID=219 size=0
2020/11/12 12:12:36 push writeFrame for streamID=225 size=0
2020/11/12 12:12:36 push writeFrame for streamID=227 size=0
2020/11/12 12:12:36 push writeFrame for streamID=123 size=3464
2020/11/12 12:12:36 push writeFrame for streamID=225 size=312351
2020/11/12 12:12:36 push writeFrame for streamID=225 size=0
2020/11/12 12:12:36 push writeFrame for streamID=229 size=0
2020/11/12 12:12:36 push writeFrame for streamID=231 size=0
2020/11/12 12:12:37 push writeFrame for streamID=177 size=0
Popped Frames
2020/11/12 12:12:35 sched pop writeFrame for streamID=0 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=119 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=26 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=119 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=26 size=15169
2020/11/12 12:12:35 sched pop writeFrame for streamID=28 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=119 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=26 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=28 size=5566
2020/11/12 12:12:35 sched pop writeFrame for streamID=30 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=119 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=28 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=30 size=16384
2020/11/12 12:12:35 sched pop writeFrame for streamID=30 size=16384
2020/11/12 12:12:35 sched pop writeFrame for streamID=30 size=16384
2020/11/12 12:12:35 sched pop writeFrame for streamID=30 size=16384
2020/11/12 12:12:35 sched pop writeFrame for streamID=30 size=11626
2020/11/12 12:12:35 sched pop writeFrame for streamID=32 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=119 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=32 size=354
2020/11/12 12:12:35 sched pop writeFrame for streamID=30 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=119 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=34 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=119 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=119 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=38 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=34 size=483
2020/11/12 12:12:35 sched pop writeFrame for streamID=36 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=119 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=40 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=38 size=16384
2020/11/12 12:12:35 sched pop writeFrame for streamID=119 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=40 size=6293
2020/11/12 12:12:35 sched pop writeFrame for streamID=119 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=42 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=119 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=44 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=42 size=11249
2020/11/12 12:12:35 sched pop writeFrame for streamID=46 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=119 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=48 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=44 size=9293
2020/11/12 12:12:35 sched pop writeFrame for streamID=48 size=346
2020/11/12 12:12:35 sched pop writeFrame for streamID=46 size=2626
2020/11/12 12:12:35 sched pop writeFrame for streamID=36 size=6726
2020/11/12 12:12:35 sched pop writeFrame for streamID=119 size=6381
2020/11/12 12:12:35 sched pop writeFrame for streamID=38 size=11101
2020/11/12 12:12:35 sched pop writeFrame for streamID=40 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=42 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=44 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=36 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=119 size=0
2020/11/12 12:12:35 sched pop writeFrame for streamID=38 size=0
2020/11/12 12:12:36 sched pop writeFrame for streamID=121 size=0
2020/11/12 12:12:36 sched pop writeFrame for streamID=121 size=1021
2020/11/12 12:12:36 sched pop writeFrame for streamID=123 size=0
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=0
2020/11/12 12:12:36 sched pop writeFrame for streamID=123 size=3464
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=16384
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=1055
2020/11/12 12:12:36 sched pop writeFrame for streamID=225 size=0

What did you expect to see?

No lockups.

What did you see instead?

Random lockups.

dtelyukh commented 3 years ago

Possible related issue https://github.com/golang/go/issues/23559

networkimprov commented 3 years ago

cc @fraenkel

fraenkel commented 3 years ago

@dtelyukh We are going to need something that can reproduce the issue. It would also help to enable http2 debug and a thread dump when it hangs.

fraenkel commented 3 years ago

Your title says the bundled version of http2 has this issue. Are you implying that if you use the latest x/net/http2 you dont?

dtelyukh commented 3 years ago

@dtelyukh We are going to need something that can reproduce the issue. It would also help to enable http2 debug and a thread dump when it hangs.

It's not easy to prepare code, which can reproduce this problem with 100% guarantee, but I think I could try. Here is debug.log for GODEBUG=http2debug=2 http2.debug.log and goroutines dump with pprof goroutine.dump.log

Your title says the bundled version of http2 has this issue. Are you implying that if you use the latest x/net/http2 you dont?

h2_bundle.go used by third-party code. I didn't try to use x/net/http2 directly. Do you mean that I should do that?

fraenkel commented 3 years ago

Don't worry. I am going to need something that can reproduce this issue. I can see why nothing is making progress but I don't know why. The debug log is incomplete or slightly broken, but from what I do see there is an oddity.

2020/11/13 10:44:00 http2: Framer 0xc000af61c0: read HEADERS flags=END_STREAM|END_HEADERS|PRIORITY stream=1511 len=21
2020/11/13 10:44:00 http2: Framer 0xc000af61c0: wrote PUSH_PROMISE flags=END_HEADERS stream=1511 len=293
2020/11/13 10:44:00 http2: Framer 0xc000af61c0: read PRIORITY stream=314 len=5
2020/11/13 10:44:00 http2: Framer 0xc000af61c0: wrote PUSH_PROMISE flags=END_HEADERS stream=1511 len=33
2020/11/13 10:44:00 http2: Framer 0xc000af61c0: wrote HEADERS flags=END_HEADERS stream=314 len=120
2020/11/13 10:44:00 http2: Framer 0xc000af61c0: wrote PUSH_PROMISE flags=END_HEADERS stream=1511 len=44
2020/11/13 10:44:00 http2: Framer 0xc000af61c0: wrote HEADERS flags=END_HEADERS stream=316 len=115

Notice the stream for the PUSH is 1511 but the above is the first time I see that Framer. And the rest are in the 300s. I don't exactly see how this happened. There are multiple PUSH_PROMISE all with the same stream id which is also odd.

dtelyukh commented 3 years ago

I truncated log-file after each successful request until it was hanged. Maybe that is why the log-file was broken.

I attach here other log-file, which was made when I caught problem from the first time. This log-file was never truncated. http2.debug.log

dtelyukh commented 3 years ago

@fraenkel, we prepared test application for problem reproducing. My apologies for so complicated app. We cannot extract some small piece of code, because we don't know where is the problem exactly. I sent credentials to michael.fraenkel@gmail.com.

  1. Point Chromium-based browser to https://cardonecapital.hc04.dorofeev.me/.
  2. Press F12 and check "Disable cache" box.
  3. Press F5 until request will hangs.

To have more chances to catch the problem it should to remove proxy cache:

  1. Stop Caddy Server kill -SIGTERM <caddy process id>
  2. rm -fR /home/user/caddy-cache
  3. ./caddy run&

To patch or debug server:

  1. Custom cache plugin code is here: /home/user/smart-cache
  2. Caddy Server code is here: /home/user/go/src/github.com/caddyserver/caddy
  3. To rebuild Caddy:
    • cd /home/user/go/src/github.com/caddyserver/caddy/cmd/caddy
    • CGO_ENABLED=0 go build
    • mv ./caddy ~/caddy
    • cd
    • sudo setcap 'cap_net_bind_service=+ep' ./caddy
    • kill -SIGTERM <caddy process id>
    • ./caddy run&
fraenkel commented 3 years ago

@dtelyukh I did find a way to cause the hang locally, from my machine it would never happen. for i in {1..1000}; do echo $i; nghttp https://cardonecapital.hc04.dorofeev.me/ -n; done would eventually hang. Once I attempted to compile a new Go, I could never get caddy to rebuild. I always ended up with qtls init failure. Attempting to fix that, I didn't realize at the time that the src tree was a bit special so I can no longer make any progress since I cannot download your smart-cache module. If you could fix the tree, at least next time I know to make a copy of the entire tree before doing anything. I am a bit concerned that a simple go mod tidy prevented any further compilation of caddy.

fraenkel commented 3 years ago

Never mind, I got it working again....

fraenkel commented 3 years ago

So one thing I did verify is that using the latest golang/x/net/http2 code does not cause the hang I see with my simple testcase.

dtelyukh commented 3 years ago

@fraenkel, what can I help?

fraenkel commented 3 years ago

You can see the one line change I made to caddyserver with a go mod tidy. See if this new version hangs for you as well.

dtelyukh commented 3 years ago

@fraenkel, this new version is never hangs for us. And also we noticed that app become faster. 👍 Thank you!

dtelyukh commented 3 years ago

Full page load time (with all resources) is 2% less than with old http/2, and median absolute deviation is 1% less too. So it's both faster, and shows more stable performance.

dtelyukh commented 3 years ago

@fraenkel, should I close this ticket?

fraenkel commented 3 years ago

yes, given there is a solution and this should be fixed in 1.16 although one should verify that is true.

dtelyukh commented 3 years ago

@fraenkel, this bug is still exist. But we found more clear way to reproduce it.

An issue in Caddy's repository: https://github.com/caddyserver/caddy/issues/3896

How to reproduce

It depends on the proxied website and caddy config, and some random factors, thus it occurs with different frequency on different hardware. The steps are:

  1. Add this line to /etc/hosts 127.0.0.1 terem-pro.localhost
  2. Clone Caddy repository git clone https://github.com/caddyserver/caddy.git
  3. Build binary from master (a patch with the version from Nov 10, 2020 of x/net is already included) cd caddy/cmd/caddy && go build
  4. sudo setcap 'cap_net_bind_service=+ep' ./caddy
  5. Create Caddyfile with this content

    https://terem-pro.localhost {                                                 
    handle {                                                                                                                                                           
        reverse_proxy https://www.terem-pro.ru {
            header_up host {http.reverse_proxy.upstream.host}
        }                                
    
        push / {                         
            /local/components/terem/catalog.list/templates/index.best.seller/style.css
            /local/components/terem/new_services.content/templates/home.banner.lots/style.css
            /local/components/terem/slider.blocks/templates/slider.useful/style.css
            /local/components/terem/standard.blocks/templates/call.action.white/style.css
            /local/components/terem/review.list/templates/carousel.home/style.css
            /local/components/terem/standard.blocks/templates/promo.red.home/style.css
            /local/components/terem/promotion.list/templates/home.slider/style.css
            /local/components/terem/form.form/templates/template.pdf/style.css
            /local/templates/terem/components/bitrix/menu/template.header.menu.top/desktop-menu.css
            /local/components/terem/form.form/templates/template.taxi/style.css
            /assets/resources/css/home.css
            /local/templates/terem/components/bitrix/menu/template.header.menu-mobile/style_menu.css
            /local/components/terem/catalog.type.list/templates/.default/style.css
            /assets/resources/css/styles.css
            /bitrix/cache/css/s1/terem/template_ad73b02503569e1113abf0b013fdbb28/template_ad73b02503569e1113abf0b013fdbb28_v1.css?16067202133580
            /bitrix/cache/css/s1/terem/page_074396ca6d41424fe878cb365c109aa1/page_074396ca6d41424fe878cb365c109aa1_v1.css?160672023225970
        }
    }
    }
  6. Run Caddy server ./caddy run
  7. Open developer tools network tab, to visually see the hangs; checking the "disable cache" toggle will help reproduce the problem faster, but is not necessary
  8. Navigate to / page in a browser (i.e., https://terem-pro.localhost)
  9. Wait till it fully loads
  10. If it didn't hang on step 8 - hit f5, and again, wait till it fully loads; repeat several times if needed
  11. On our test server, it usually hangs after 2-3 reloads. On some devices, it might require 10-15 attempts but still hangs at some point.
fraenkel commented 3 years ago

I reproduced a hang but its using the bundle http2 stack.

goroutine 5252 [select, 3 minutes]:
net/http.(*http2serverConn).writeHeaders(0xc000582900, 0xc000feec60, 0xc0003c93b0, 0xa, 0x24a6800)
    /snap/go/6745/src/net/http/h2_bundle.go:5753 +0x172
net/http.(*http2responseWriterState).writeChunk(0xc000be7200, 0xc000fc1000, 0xdf, 0x1000, 0xc000fcfdf0, 0x1, 0xc0009a8180)
    /snap/go/6745/src/net/http/h2_bundle.go:6020 +0x3bf
dtelyukh commented 3 years ago

Write me, please, to fix it should bundle a recent version of http2-library? Or does this library itself need to be fixed? Is there a some workaround? Should we use http2-library directly?

dtelyukh commented 3 years ago

Looks like with this hack it never hangs http2.ConfigureServer(s, nil) With it bundled version is not use, doesn't it?

dtelyukh commented 3 years ago

The solution is to use explicitly x/net/http2.

dtelyukh commented 3 years ago

The problem is not solved neither by 1.16rc1, nor by 1.15.8. Easy steps for reproducing are here: https://github.com/caddyserver/caddy/issues/3896

WIthout explicit usage of /x/net/http2 http2.ConfigureServer(s, nil) hangs still happen.

ReillyTevera commented 3 years ago

I was able to reproduce this with the reverse proxy Traefik, and can confirm that directly calling the /x/net/http2 ConfigureServer method fixes this issue. I can also confirm that this is not fixed in the golang 1.16.0 stdlib, @fraenkel any idea when we can expect the fix to make its way there?

networkimprov commented 3 years ago

cc @toothrot @dmitshur re possible release issue

dmitshur commented 3 years ago

CC @bradfitz, @tombergan, @rsc, @empijei via owners. Also CC @neild.

geropl commented 3 years ago

:wave: Any update on this issue? Any idea how and when this might be resolved?

dtelyukh commented 2 years ago

I hope it will help: https://github.com/golang/go/issues/45435

ReillyBrogan commented 2 years ago

I believe this is fixed with Go 1.17 considering that the bundled http2 library was updated to a snapshot of x/net/http2 from May of this year.

@dtelyukh Can you give 1.17 a try with your test case (without manually using x/net/http2)?

dtelyukh commented 2 years ago

@ReillyBrogan , I tried 1.17.3 - the problem still exists. I think, 1.18 could help.

bakape commented 2 years ago

Potentially fixed with https://github.com/golang/go/issues/49921. It's part of the Go 1.17.6 release.

dtelyukh commented 2 years ago

This is still an issue on Go 1.18

whitewindmills commented 1 year ago

I had the same problem using Go 1.19.

divanodestiny commented 4 months ago

any progress now?i had the same problems with golang.org/x/net v0.0.0-20210813160813-60bc85c4be6d in branch internal-branch.go1.18-vendor

Rohsichan commented 1 week ago

I had the same problem using Go 1.22.6