Closed agui2200 closed 1 year ago
Yeah, I think I found the same problem. Was testing a proxy which uses this library:
goroutine 1298 [IO wait, 9 minutes]:
internal/poll.runtime_pollWait(0x7f6072fcaad0, 0x72)
/home/jsn/sdk/go1.19.3/src/runtime/netpoll.go:305 +0x89
internal/poll.(*pollDesc).wait(0xc00105f480?, 0xc001468000?, 0x0)
/home/jsn/sdk/go1.19.3/src/internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
/home/jsn/sdk/go1.19.3/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc00105f480, {0xc001468000, 0xa000, 0xa000})
/home/jsn/sdk/go1.19.3/src/internal/poll/fd_unix.go:167 +0x25a
net.(*netFD).Read(0xc00105f480, {0xc001468000?, 0xc000c45000?, 0xc00146ffc3?})
/home/jsn/sdk/go1.19.3/src/net/fd_posix.go:55 +0x29
net.(*conn).Read(0xc0001342a0, {0xc001468000?, 0xc001124828?, 0xc001124808?})
/home/jsn/sdk/go1.19.3/src/net/net.go:183 +0x45
github.com/bogdanfinn/utls.(*atLeastReader).Read(0xc00115d5f0, {0xc001468000?, 0x0?, 0x4a5908?})
/home/jsn/go/pkg/mod/github.com/bogdanfinn/utls@v1.5.15/conn.go:791 +0x3d
bytes.(*Buffer).ReadFrom(0xc0011322b0, {0xa60940, 0xc00115d5f0})
/home/jsn/sdk/go1.19.3/src/bytes/buffer.go:202 +0x98
github.com/bogdanfinn/utls.(*Conn).readFromUntil(0xc001132000, {0xa60be0?, 0xc0001342a0}, 0x7f607b9c1fff?)
/home/jsn/go/pkg/mod/github.com/bogdanfinn/utls@v1.5.15/conn.go:813 +0xe5
github.com/bogdanfinn/utls.(*Conn).readRecordOrCCS(0xc001132000, 0x0)
/home/jsn/go/pkg/mod/github.com/bogdanfinn/utls@v1.5.15/conn.go:620 +0x116
github.com/bogdanfinn/utls.(*Conn).readRecord(...)
/home/jsn/go/pkg/mod/github.com/bogdanfinn/utls@v1.5.15/conn.go:586
github.com/bogdanfinn/utls.(*Conn).Read(0xc001132000, {0xc00117b000, 0x1000, 0xc000dea060?})
/home/jsn/go/pkg/mod/github.com/bogdanfinn/utls@v1.5.15/conn.go:1300 +0x16f
bufio.(*Reader).Read(0xc000dea000, {0xc000c19df8, 0x9, 0xc000c78ce8?})
/home/jsn/sdk/go1.19.3/src/bufio/bufio.go:237 +0x1bb
io.ReadAtLeast({0xa60360, 0xc000dea000}, {0xc000c19df8, 0x9, 0x9}, 0x9)
/home/jsn/sdk/go1.19.3/src/io/io.go:332 +0x9a
io.ReadFull(...)
/home/jsn/sdk/go1.19.3/src/io/io.go:351
github.com/bogdanfinn/fhttp/http2.readFrameHeader({0xc000c19df8?, 0x9?, 0xc00139e390?}, {0xa60360?, 0xc000dea000?})
/home/jsn/go/pkg/mod/github.com/bogdanfinn/fhttp@v0.5.19/http2/frame.go:238 +0x6e
github.com/bogdanfinn/fhttp/http2.(*Framer).ReadFrame(0xc000c19dc0)
/home/jsn/go/pkg/mod/github.com/bogdanfinn/fhttp@v0.5.19/http2/frame.go:494 +0x95
github.com/bogdanfinn/fhttp/http2.(*clientConnReadLoop).run(0xc001124fa0)
/home/jsn/go/pkg/mod/github.com/bogdanfinn/fhttp@v0.5.19/http2/transport.go:2020 +0x165
github.com/bogdanfinn/fhttp/http2.(*ClientConn).readLoop(0xc00114e1a0)
/home/jsn/go/pkg/mod/github.com/bogdanfinn/fhttp@v0.5.19/http2/transport.go:1942 +0x79
created by github.com/bogdanfinn/fhttp/http2.(*Transport).newClientConn
/home/jsn/go/pkg/mod/github.com/bogdanfinn/fhttp@v0.5.19/http2/transport.go:821 +0xc4a
There are many goroutines which keep running and consume server resources:
They are closed in 10m or so however. Is there a way to decrease this period? :thinking:
@bogdanfinn I have prepared a fix in forked repositories (fhttp and tls-client). To explain it, it does the following:
If the changes seem reasonable for you, I could create prs
@jsnjack thank you for contributing. i will take a look at the changes.
@agui2200 does one of you maybe have an example how to reproduce this issue?
Sure, I think the following should be enough. This example opens 15 https urls (http works fine):
package main
import (
"fmt"
"net/http"
_ "net/http/pprof"
fhttp "github.com/bogdanfinn/fhttp"
tls_client "github.com/bogdanfinn/tls-client"
)
var urls = []string{
"https://ftp.lysator.liu.se/pub/opensuse/distribution/leap/15.4/repo/oss/noarch/0ad-data-0.0.25b-bp154.1.17.noarch.rpm",
"https://ftp.lysator.liu.se/pub/opensuse/distribution/leap/15.4/repo/oss/noarch/2ping-4.5.1-bp154.1.28.noarch.rpm",
"https://ftp.lysator.liu.se/pub/opensuse/distribution/leap/15.4/repo/oss/noarch/4pane-lang-7.0-bp154.1.87.noarch.rpm",
"https://ftp.lysator.liu.se/pub/opensuse/distribution/leap/15.4/repo/oss/noarch/AppStream-lang-0.15.1-150400.1.5.noarch.rpm",
"https://ftp.lysator.liu.se/pub/opensuse/distribution/leap/15.4/repo/oss/noarch/Botan-doc-2.18.2-bp154.1.30.noarch.rpm",
"https://ftp.lysator.liu.se/pub/opensuse/distribution/leap/15.4/repo/oss/noarch/CSXCAD-matlab-0.6.2-bp154.2.22.noarch.rpm",
"https://ftp.lysator.liu.se/pub/opensuse/distribution/leap/15.4/repo/oss/noarch/ClusterTools2-3.1.2-150100.8.9.1.noarch.rpm",
"https://ftp.lysator.liu.se/pub/opensuse/distribution/leap/15.4/repo/oss/noarch/Crystalcursors-0.5-bp154.1.22.noarch.rpm",
"https://ftp.lysator.liu.se/pub/opensuse/distribution/leap/15.4/repo/oss/noarch/DVDStyler-lang-3.1.2-bp154.1.53.noarch.rpm",
"https://ftp.lysator.liu.se/pub/opensuse/distribution/leap/15.4/repo/oss/noarch/FlightGear-data-2020.3.11-bp154.1.14.noarch.rpm",
"https://ftp.lysator.liu.se/pub/opensuse/distribution/leap/15.4/repo/oss/noarch/FlightGear-docs-2020.3.11-bp154.1.14.noarch.rpm",
"https://ftp.lysator.liu.se/pub/opensuse/distribution/leap/15.4/repo/oss/noarch/GeoIP-data-1.6.12-6.3.1.noarch.rpm",
"https://ftp.lysator.liu.se/pub/opensuse/distribution/leap/15.4/repo/oss/noarch/ImageMagick-doc-7.1.0.9-150400.4.7.noarch.rpm",
"https://ftp.lysator.liu.se/pub/opensuse/distribution/leap/15.4/repo/oss/noarch/J7Z-1.4.2-bp154.1.48.noarch.rpm",
"https://ftp.lysator.liu.se/pub/opensuse/distribution/leap/15.4/repo/oss/noarch/J7Z-kf5-1.4.2-bp154.1.48.noarch.rpm",
}
func main() {
for _, item := range urls {
client, _ := tls_client.NewHttpClient(tls_client.NewNoopLogger())
freq, _ := fhttp.NewRequest("GET", item, nil)
resp, _ := client.Do(freq)
fmt.Println(item, resp.Status)
resp.Body.Close()
}
http.ListenAndServe("localhost:6060", nil)
}
The example setups pprof endpoint on 6060. You can see hanging goroutines here http://127.0.0.1:6060/debug/pprof/goroutine?debug=1
Doing something like ls -l /proc/<pid>/fd/ | grep socket | wc -l
will show you open sockets for the process
@bogdanfinn I have prepared a fix in forked repositories (fhttp and tls-client). To explain it, it does the following:
- Adds IdleConnTimeout field to Transport struct. Use it for both http2 and http1 requests bogdanfinn/fhttp@f58254f
- Expose IdleConnTimeout to transport options 54921c8 and 6618950
If the changes seem reasonable for you, I could create prs
Added just on comment on your changeset. Let me know what you think about it. Then you can create prs if you like :-)
@jsnjack
Oh I see that you already have everything in order here https://github.com/bogdanfinn/tls-client/commit/e7f07b8eaef4588af9e714f82cd78d6b88dce165
I think it might be simpler for both of us if you just use this commit :) Updating module dependencies is no fun! :D
@jsnjack i was using your suggested fix yesterday to reproduce and debug this issue yes.
i will implement your comment you wrote on my commit and then create a new public version then. Thank you very much!
Thanks!
mabe hangup to