Open oliverbestmann opened 6 years ago
Hi @oliverbestmann! I happened to be looking into this while debugging a seemingly related bug in another project that uses the Consul API client (https://github.com/hashicorp/consul-terraform-sync/issues/146). I wanted to add my findings from following the reproduction steps you provided, in the hope that they help anyone else who runs into this.
I wrote the test below to capture concurrent and serial requests made with the Consul API client. It shows that each concurrent request establishes a new connection. Once a pool of connections is established, subsequent serial requests reuse those existing connections.
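As a rough illustration of that measurement (not the test that produced the output in this comment), here is a minimal sketch that counts dials with plain `net/http`. It assumes a throwaway `httptest` server in place of a Consul agent; the `dialsFor` helper, the `/concurrent` and `/serial` paths, and the placeholder payload are all hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
)

// dialsFor issues n concurrent requests followed by n serial requests against
// a local test server (a stand-in for a Consul agent) and returns the number
// of TCP connections dialed after each phase.
func dialsFor(n int) (afterConcurrent, afterSerial int64) {
	// The barrier holds every concurrent request in its handler until all n
	// have arrived, so no request can borrow another request's connection.
	var barrier sync.WaitGroup
	barrier.Add(n)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/concurrent" {
			barrier.Done()
			barrier.Wait()
		}
		fmt.Fprintln(w, `[{"Service":"my-service"}]`) // placeholder payload
	}))
	defer srv.Close()

	var dials int64
	dialer := &net.Dialer{}
	tr := &http.Transport{
		MaxIdleConnsPerHost: n, // keep all n connections in the idle pool
		DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
			atomic.AddInt64(&dials, 1) // count every new connection
			return dialer.DialContext(ctx, network, addr)
		},
	}
	defer tr.CloseIdleConnections()
	client := &http.Client{Transport: tr}

	get := func(path string) {
		resp, err := client.Get(srv.URL + path)
		if err != nil {
			panic(err)
		}
		defer resp.Body.Close()
		io.Copy(io.Discard, resp.Body) // read to EOF so the connection can be pooled
	}

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			get("/concurrent")
		}()
	}
	wg.Wait()
	afterConcurrent = atomic.LoadInt64(&dials)

	for i := 0; i < n; i++ {
		get("/serial")
	}
	afterSerial = atomic.LoadInt64(&dials)
	return afterConcurrent, afterSerial
}

func main() {
	c, s := dialsFor(5)
	fmt.Printf("dials after concurrent phase: %d, after serial phase: %d\n", c, s)
}
```

If the serial phase reuses pooled connections, the two counts are equal. The log output that follows comes from the original test against a real agent, not from this sketch.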
$ go test ./client_test/ -v
=== RUN TestAPIClient_reusableConns
2020/12/08 11:22:24 [INFO] Dialing tcp 127.0.0.1:8500
2020/12/08 11:22:24 [INFO] Dialing tcp 127.0.0.1:8500
2020/12/08 11:22:24 [INFO] Dialing tcp 127.0.0.1:8500
2020/12/08 11:22:24 [INFO] Dialing tcp 127.0.0.1:8500
2020/12/08 11:22:24 [INFO] Dialing tcp 127.0.0.1:8500
client_test.go:49: Number of services: 100
client_test.go:50: Request time: 9.65514ms
client_test.go:49: Number of services: 100
client_test.go:50: Request time: 9.994027ms
client_test.go:49: Number of services: 100
client_test.go:50: Request time: 9.494824ms
client_test.go:49: Number of services: 100
client_test.go:50: Request time: 9.994659ms
client_test.go:49: Number of services: 100
client_test.go:50: Request time: 10.192487ms
client_test.go:61: Number of services: 100
client_test.go:62: Request time: 4.494578ms
client_test.go:61: Number of services: 100
client_test.go:62: Request time: 3.266552ms
client_test.go:61: Number of services: 100
client_test.go:62: Request time: 3.105789ms
client_test.go:61: Number of services: 100
client_test.go:62: Request time: 3.33691ms
client_test.go:61: Number of services: 100
client_test.go:62: Request time: 3.260026ms
--- PASS: TestAPIClient_reusableConns (0.05s)
PASS
ok github.com/hashicorp/consul/client_test 0.384s
$ go test ./client_test/ -v -count 1
=== RUN TestAPIClient_reusableConns
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
client_test.go:49: Nubmer of services: 100
client_test.go:50: Request time: 4.261913ms
client_test.go:49: Nubmer of services: 100
client_test.go:50: Request time: 4.370877ms
client_test.go:49: Nubmer of services: 100
client_test.go:50: Request time: 5.019223ms
client_test.go:49: Nubmer of services: 100
client_test.go:50: Request time: 4.301784ms
client_test.go:49: Nubmer of services: 100
client_test.go:50: Request time: 4.88288ms
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
client_test.go:61: Number of services: 100
client_test.go:62: Request time: 2.380345ms
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
client_test.go:61: Number of services: 100
client_test.go:62: Request time: 2.111134ms
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
client_test.go:61: Number of services: 100
client_test.go:62: Request time: 2.150876ms
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
client_test.go:61: Number of services: 100
client_test.go:62: Request time: 2.486228ms
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
client_test.go:61: Number of services: 100
client_test.go:62: Request time: 1.968497ms
client_test.go:66:
Error Trace: client_test.go:66
Error: Not equal:
expected: 10
actual : 5
Test: TestAPIClient_reusableConns
Messages: unexpected # of connections reused
--- FAIL: TestAPIClient_reusableConns (0.03s)
FAIL
FAIL github.com/hashicorp/consul/client_test 0.159s
FAIL
I tested this against consul/api v1.4.0 with Go v1.14.6, and compared the relevant code to Consul v0.8.5. Between v0.8.5 and the v1.4.0 module, there were no major changes to the API client related to reading the response body.
One difference, however, is the version of Go used for the build, which includes the standard `encoding/json` library. A trailing `\n` may be the culprit: the JSON decoding of the response stops processing the reader and leaves some bytes unread, causing the TLS connection to be closed rather than reused. (Sourced from bradfitz's comment at https://github.com/golang/go/issues/20528#issuecomment-309170928 and the Consul source code.)
The JSON decoding library may have changed since then. I didn't dive deep into this, but it might now account for the `\n` and read the whole response body.
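To make the hypothesis concrete, here is a minimal sketch of the decoder behaviour on its own, outside of any HTTP or TLS connection. The `eofTracker` wrapper, the `decodeOnce` helper, and the payload are hypothetical. Whether this leftover actually prevents connection reuse depends on the transfer encoding and the Go version, which the sketch does not test.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// eofTracker records whether the wrapped reader has ever returned io.EOF.
type eofTracker struct {
	r      io.Reader
	sawEOF bool
}

func (t *eofTracker) Read(p []byte) (int, error) {
	n, err := t.r.Read(p)
	if err == io.EOF {
		t.sawEOF = true
	}
	return n, err
}

// decodeOnce decodes a single JSON value and reports what the decoder left
// behind: the bytes still sitting in its buffer, and whether the underlying
// reader was ever read to EOF.
func decodeOnce(payload string) (leftover string, sawEOF bool, err error) {
	body := &eofTracker{r: strings.NewReader(payload)}
	dec := json.NewDecoder(body)

	var out map[string]int
	if err := dec.Decode(&out); err != nil {
		return "", false, err
	}
	// Buffered exposes data the decoder pulled from the reader but did not
	// consume as part of the decoded value.
	rest, err := io.ReadAll(dec.Buffered())
	if err != nil {
		return "", false, err
	}
	return string(rest), body.sawEOF, nil
}

func main() {
	leftover, sawEOF, err := decodeOnce("{\"services\":100}\n")
	if err != nil {
		panic(err)
	}
	fmt.Printf("leftover=%q sawEOF=%v\n", leftover, sawEOF)
}
```

A single `Decode` call stops at the end of the JSON value, so the trailing newline is left in the decoder's buffer and the reader is never asked for its EOF.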
Please do share if you're observing that TLS connections are not being reused with newer versions of Consul.
`consul version` for both Client and Server
Client: 0.8.5
Server: 0.8.5
Operating system and Environment details
Go 1.8.3
Description of the Issue (and unexpected/desired result)
The Consul API client for Go does not consume the complete body of the response received from the Consul server. Because of that, the HTTP keep-alive connection cannot be reused and is closed.
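The effect can be sketched with plain `net/http`, independent of the Consul client. This is a minimal illustration that assumes a local `httptest` server and a placeholder payload; `countDials` is a hypothetical helper, not part of consul/api.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync/atomic"
)

// countDials sends n serial requests to a local test server and returns how
// many TCP connections were dialed. With drain=true the body is read to EOF
// before Close; with drain=false only the first 16 bytes are read.
func countDials(n int, drain bool) int64 {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, strings.Repeat("x", 1024)) // placeholder payload
	}))
	defer srv.Close()

	var dials int64
	dialer := &net.Dialer{}
	tr := &http.Transport{
		DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
			atomic.AddInt64(&dials, 1) // count every new connection
			return dialer.DialContext(ctx, network, addr)
		},
	}
	defer tr.CloseIdleConnections()
	client := &http.Client{Transport: tr}

	for i := 0; i < n; i++ {
		resp, err := client.Get(srv.URL)
		if err != nil {
			panic(err)
		}
		if drain {
			// Consume the whole body so the transport can pool the connection.
			io.Copy(io.Discard, resp.Body)
		} else {
			// Stop early, leaving most of the body unread.
			io.ReadFull(resp.Body, make([]byte, 16))
		}
		resp.Body.Close()
	}
	return atomic.LoadInt64(&dials)
}

func main() {
	fmt.Println("dials with full read:   ", countDials(5, true))
	fmt.Println("dials with partial read:", countDials(5, false))
}
```

Closing a body that has not been read to EOF makes the transport discard the connection, so the partial-read case dials once per request, while the drained case keeps using a single connection.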
Reproduction steps
Use a connection that prints a message each time a new connection is created, then call some endpoints, like `client.Health().Service("my-service", "", true, nil)`: