tsenart / vegeta

HTTP load testing tool and library. It's over 9000!
http://godoc.org/github.com/tsenart/vegeta/lib
MIT License

Vegeta does not honour the connections flag when rate is 0 #660

Open rsevilla87 opened 8 months ago

rsevilla87 commented 8 months ago

Version and Runtime

v12.11.0

Expected Behaviour

When I run vegeta with a fixed number of connections (-connections and -max-connections) and rate=0, I expect vegeta to honour that number of connections and reuse them for subsequent requests via keep-alive.

Actual Behaviour

Running vegeta with a fixed number of connections/max-connections and rate=0:

$ echo "GET $passthrough" | ./vegeta attack -workers=100 -max-workers=100  -duration=10s -insecure --connections=100  -max-connections=100 -rate=0  |./vegeta report
Requests      [total, rate, throughput]         2409, 240.77, 232.55
Duration      [total, attack, wait]             10.359s, 10.005s, 353.603ms
Latencies     [min, mean, 50, 90, 95, 99, max]  224.204ms, 422.294ms, 365.46ms, 543.652ms, 865.569ms, 1.635s, 1.782s
Bytes In      [total, mean]                     308352, 128.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:2409  
Error Set:

While in a different window I run a loop counting the number of established connections by vegeta:

$ while true; do ss -p state established dst :https | grep -c vegeta; sleep 1 ; done
0
0
100
70
61
64
73
66
66
68
68
0
0
0
^C

As the snippet above shows, the number of ESTABLISHED connections fluctuates during the benchmark instead of staying at 100.

This behavior leads to different issues:

Steps to Reproduce

Run vegeta with a fixed number of connections and an unlimited rate (-rate=0), and check the number of established connections while the benchmark is running.

Additional Context

tsenart commented 8 months ago

Just looked into the code and at first glance this shouldn't be happening. Can you run the same test against another test HTTP server, like the one in the Vegeta repo (go run internal/cmd/echosrv/main.go)?

I wonder if your server is closing connections in the middle of the test.

rsevilla87 commented 8 months ago

> Just looked into the code and at first glance this shouldn't be happening. Can you run the same test against another test HTTP server, like the one in the Vegeta repo (go run internal/cmd/echosrv/main.go)?
>
> I wonder if your server is closing connections in the middle of the test.

done:

client-side:

$ echo "GET http://localhost:8080" | ./vegeta attack -workers=100 -max-workers=100 -duration=10s -rate=0 -connections=100 -max-connections=100 | ./vegeta report
Requests      [total, rate, throughput]         494708, 49471.42, 49467.68
Duration      [total, attack, wait]             10.001s, 10s, 757.083µs
Latencies     [min, mean, 50, 90, 95, 99, max]  28.177µs, 1.403ms, 768.966µs, 3.813ms, 4.879ms, 7.047ms, 21.498ms
Bytes In      [total, mean]                     46997260, 95.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:cou

server-side

$ go run internal/cmd/echosrv/main.go  :8080                                                                                                                                                             
2023/10/20 18:29:59 Rate: 0.000/s
2023/10/20 18:30:02 Rate: 0.000/s
2023/10/20 18:30:03 Rate: 0.000/s
2023/10/20 18:30:04 Rate: 116794.970/s
2023/10/20 18:30:05 Rate: 81491.505/s
2023/10/20 18:30:06 Rate: 37320.733/s
2023/10/20 18:30:07 Rate: 33983.366/s
2023/10/20 18:30:08 Rate: 34049.093/s
2023/10/20 18:30:09 Rate: 33844.695/s
2023/10/20 18:30:10 Rate: 35762.164/s
2023/10/20 18:30:11 Rate: 35349.909/s
2023/10/20 18:30:12 Rate: 36349.133/s
2023/10/20 18:30:13 Rate: 39743.791/s
2023/10/20 18:30:14 Rate: 10099.974/s
^Csignal: interrupt

Counting connections

$ while true; do ss -p state established  | grep -c vegeta; sleep 1 ; done                                                                                               
0                                                          
0                                                          
100                                                        
100                                                        
100                                                        
100                                                        
100                                                        
100                                                        
100                                                        
100                                                        
0                                                          

I repeated the original test against a plain HTTP endpoint and got a constant 100 connections. However, when I use an HTTPS endpoint, some of the connections get recreated. For example:

Using a public http endpoint

client-side:

$ echo "GET http://httpbin.org/ip" | ./vegeta attack -workers=100 -max-workers=100 -duration=10s -rate=0 -connections=100 -max-connections=100 -http2=false  | ./vegeta report ; date
Requests      [total, rate, throughput]         8206, 820.54, 777.70
Duration      [total, attack, wait]             10.552s, 10.001s, 550.909ms
Latencies     [min, mean, 50, 90, 95, 99, max]  97.764ms, 123.147ms, 104.539ms, 131.087ms, 277.649ms, 444.418ms, 1.141s
Bytes In      [total, mean]                     262592, 32.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:8206  
Error Set:
Fri Oct 20 07:15:29 PM CEST 2023     # The test finished at this exact moment

And counting the time-wait sockets on the client side:

$ while true; date; do ss -np state time-wait dst :http | wc -l  ; sleep 1;  done
Fri Oct 20 07:15:17 PM CEST 2023
1
Fri Oct 20 07:15:18 PM CEST 2023
1
Fri Oct 20 07:15:20 PM CEST 2023
12
Fri Oct 20 07:15:21 PM CEST 2023
12
Fri Oct 20 07:15:22 PM CEST 2023
12
Fri Oct 20 07:15:23 PM CEST 2023
12
Fri Oct 20 07:15:24 PM CEST 2023
12
Fri Oct 20 07:15:25 PM CEST 2023
12
Fri Oct 20 07:15:26 PM CEST 2023
12
Fri Oct 20 07:15:27 PM CEST 2023
12
Fri Oct 20 07:15:28 PM CEST 2023
12
Fri Oct 20 07:15:29 PM CEST 2023
12
Fri Oct 20 07:15:30 PM CEST 2023   # when the test finished the number of time-wait sockets got increased by 100, as expected
112
Fri Oct 20 07:15:31 PM CEST 2023
112
Fri Oct 20 07:15:32 PM CEST 2023
112
Fri Oct 20 07:15:33 PM CEST 2023
112
^C                 

And now using the HTTPS version of the same endpoint:

client-side:

$ echo "GET https://httpbin.org/ip" | ./vegeta attack -workers=100 -max-workers=100 -duration=10s -rate=0 -connections=100 -max-connections=100 -http2=false  | ./vegeta report ; date
Requests      [total, rate, throughput]         7399, 739.66, 720.00
Duration      [total, attack, wait]             10.276s, 10.003s, 273.135ms
Latencies     [min, mean, 50, 90, 95, 99, max]  97.65ms, 136.042ms, 109.395ms, 175.29ms, 312.528ms, 522.59ms, 4.305s
Bytes In      [total, mean]                     236768, 32.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:7399  
Error Set:
Fri Oct 20 07:17:12 PM CEST 2023   # finished at this moment

And counting the time-wait sockets on the client side:

$ while true; date; do ss -np state time-wait dst :https | wc -l  ; sleep 1;  done                                                                                       
Fri Oct 20 07:16:58 PM CEST 2023
7
Fri Oct 20 07:16:59 PM CEST 2023
7
Fri Oct 20 07:17:00 PM CEST 2023
7
Fri Oct 20 07:17:01 PM CEST 2023
7
Fri Oct 20 07:17:02 PM CEST 2023
6
Fri Oct 20 07:17:03 PM CEST 2023
9
Fri Oct 20 07:17:04 PM CEST 2023
24
Fri Oct 20 07:17:05 PM CEST 2023
44
Fri Oct 20 07:17:06 PM CEST 2023
53
Fri Oct 20 07:17:07 PM CEST 2023
56
Fri Oct 20 07:17:08 PM CEST 2023
59
Fri Oct 20 07:17:09 PM CEST 2023
63
Fri Oct 20 07:17:10 PM CEST 2023
63
Fri Oct 20 07:17:11 PM CEST 2023
63
Fri Oct 20 07:17:12 PM CEST 2023  # Test finished here, however there were multiple sockets already in time-wait state
163
Fri Oct 20 07:17:13 PM CEST 2023
163
Fri Oct 20 07:17:14 PM CEST 2023
163
Fri Oct 20 07:17:15 PM CEST 2023
163
Fri Oct 20 07:17:16 PM CEST 2023
163

For some reason the HTTPS endpoints don't work as expected: connections don't persist and get recreated. This misbehaviour is more noticeable at higher scale (more workers).

Hope this helps to clarify the issue

tsenart commented 8 months ago

Does your server run HTTP2? It could be that Vegeta doesn't need to grow the connection pool because HTTP2 has multiplexing in a single TCP connection.

rsevilla87 commented 8 months ago

> Does your server run HTTP2? It could be that Vegeta doesn't need to grow the connection pool because HTTP2 has multiplexing in a single TCP connection.

The last test I shared was run with -http2=false to prevent that behavior.
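To rule out HTTP/2 multiplexing independently of Vegeta, the negotiated protocol can be checked with a plain Go client. This is a sketch under stated assumptions: it does not show how Vegeta implements -http2=false, only the opt-out documented for net/http (setting Transport.TLSNextProto to a non-nil, empty map). The helpers newClient, negotiatedProto and newTLSServer are hypothetical names, and the local httptest TLS server stands in for the real endpoint.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newClient (hypothetical helper) builds a client that trusts pool and either
// attempts HTTP/2 or has it disabled through an empty TLSNextProto map.
func newClient(pool *x509.CertPool, h2 bool) *http.Client {
	t := &http.Transport{
		TLSClientConfig:   &tls.Config{RootCAs: pool},
		ForceAttemptHTTP2: h2,
	}
	if !h2 {
		// A non-nil, empty map is the documented net/http way to turn HTTP/2 off.
		t.TLSNextProto = map[string]func(string, *tls.Conn) http.RoundTripper{}
	}
	return &http.Client{Transport: t}
}

// negotiatedProto performs one GET and returns the protocol of the response,
// for example "HTTP/1.1" or "HTTP/2.0".
func negotiatedProto(client *http.Client, url string) (string, error) {
	defer client.CloseIdleConnections()
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body)
	return resp.Proto, nil
}

// newTLSServer starts a local HTTPS test server that also offers HTTP/2 and
// returns it together with a cert pool that trusts its certificate.
func newTLSServer() (*httptest.Server, *x509.CertPool) {
	srv := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
	srv.EnableHTTP2 = true
	srv.StartTLS()
	pool := x509.NewCertPool()
	pool.AddCert(srv.Certificate())
	return srv, pool
}

func main() {
	srv, pool := newTLSServer()
	defer srv.Close()

	for _, h2 := range []bool{true, false} {
		proto, err := negotiatedProto(newClient(pool, h2), srv.URL)
		if err != nil {
			fmt.Println("request failed:", err)
			continue
		}
		fmt.Printf("http2=%v -> %s\n", h2, proto)
	}
}
```

If the second client still reports HTTP/1.1 against the real HTTPS endpoint while connections keep being recreated, multiplexing is unlikely to be the explanation.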