I realised the poor performance may be due to CPU contention between the two Caddy instances, so I have repeated the tests with caddy_upstream running on another machine. The results are better; however, the performance drop is still about 50%.
Baseline results for caddy_upstream v0.8.3 without proxy, command .\ab.exe -n 50000 -c 1000 -k http://remote-machine-upstream/
(note reduced concurrency and request number because of issue #938):
Concurrency Level: 1000
Time taken for tests: 5.597 seconds
Complete requests: 50000
Failed requests: 0
Keep-Alive requests: 50000
Total transferred: 21650000 bytes
HTML transferred: 11200000 bytes
Requests per second: 8932.93 [#/sec] (mean)
Time per request: 111.945 [ms] (mean)
Time per request: 0.112 [ms] (mean, across all concurrent requests)
Transfer rate: 3777.30 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.4 0 16
Processing: 0 106 33.6 109 535
Waiting: 0 106 33.6 109 535
Total: 0 106 33.6 109 535
Percentage of the requests served within a certain time (ms)
50% 109
66% 109
75% 109
80% 109
90% 109
95% 110
98% 125
99% 267
100% 535 (longest request)
And results with proxy, command .\ab.exe -n 50000 -c 1000 -k http://remote-machine-running-caddy/:
Concurrency Level: 1000
Time taken for tests: 11.020 seconds
Complete requests: 50000
Failed requests: 0
Keep-Alive requests: 50000
Total transferred: 22400000 bytes
HTML transferred: 11200000 bytes
Requests per second: 4537.37 [#/sec] (mean)
Time per request: 220.392 [ms] (mean)
Time per request: 0.220 [ms] (mean, across all concurrent requests)
Transfer rate: 1985.10 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.4 0 16
Processing: 16 162 92.4 156 3304
Waiting: 16 162 92.4 156 3304
Total: 16 162 92.4 156 3304
Percentage of the requests served within a certain time (ms)
50% 156
66% 187
75% 203
80% 207
90% 240
95% 273
98% 375
99% 461
100% 3304 (longest request)
Results for Caddy v0.9.0. Baseline without proxy:
Concurrency Level: 1000
Time taken for tests: 8.428 seconds
Complete requests: 50000
Failed requests: 0
Keep-Alive requests: 50000
Total transferred: 22800000 bytes
HTML transferred: 11200000 bytes
Requests per second: 5932.95 [#/sec] (mean)
Time per request: 168.550 [ms] (mean)
Time per request: 0.169 [ms] (mean, across all concurrent requests)
Transfer rate: 2642.02 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.4 0 16
Processing: 0 162 81.0 156 2728
Waiting: 0 162 81.0 156 2728
Total: 0 162 81.0 156 2728
Percentage of the requests served within a certain time (ms)
50% 156
66% 172
75% 188
80% 203
90% 235
95% 285
98% 358
99% 404
100% 2728 (longest request)
And with proxy:
Concurrency Level: 1000
Time taken for tests: 26.484 seconds
Complete requests: 50000
Failed requests: 0
Keep-Alive requests: 50000
Total transferred: 23550000 bytes
HTML transferred: 11200000 bytes
Requests per second: 1887.95 [#/sec] (mean)
Time per request: 529.675 [ms] (mean)
Time per request: 0.530 [ms] (mean, across all concurrent requests)
Transfer rate: 868.38 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 19.0 0 3009
Processing: 16 404 653.0 313 12093
Waiting: 16 404 653.0 313 12093
Total: 16 404 653.6 313 12093
Percentage of the requests served within a certain time (ms)
50% 313
66% 359
75% 394
80% 418
90% 473
95% 531
98% 740
99% 3317
100% 12093 (longest request)
Ping between bench, proxy and upstream machines is max 2 ms.
Mind using https://github.com/wg/wrk or https://github.com/rakyll/boom ?
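For reference, a roughly equivalent wrk run (a hypothetical invocation; the thread count is arbitrary and wrk would need a non-Windows load-generator box) would be wrk -t8 -c1000 -d30s http://remote-machine-upstream/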
@abiosoft Thank you for suggesting other tools; I know ab is not exactly state of the art. I repeated the tests with boom (again with the proxy and upstream on different machines) and the performance differences look relatively the same.
Baseline for Caddy v0.8.3 without proxy (.\boom.exe -n 50000 -c 1000 http://remote-machine-upstream/):
Summary:
Total: 5.5015 secs
Slowest: 5.2515 secs
Fastest: 0.0000 secs
Average: 0.1024 secs
Requests/sec: 9088.4154
Total data: 11200000 bytes
Size/request: 224 bytes
Status code distribution:
[200] 50000 responses
Response time histogram:
0.000 [1536] |∎
0.525 [48399] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
1.050 [7] |
1.575 [6] |
2.101 [5] |
2.626 [6] |
3.151 [5] |
3.676 [5] |
4.201 [5] |
4.726 [5] |
5.252 [21] |
Latency distribution:
10% in 0.0625 secs
25% in 0.0937 secs
50% in 0.0938 secs
75% in 0.1094 secs
90% in 0.1108 secs
95% in 0.1406 secs
99% in 0.2812 secs
And with proxy (.\boom.exe -n 50000 -c 1000 http://remote-machine-with-caddy/):
Summary:
Total: 11.3033 secs
Slowest: 9.0447 secs
Fastest: 0.0000 secs
Average: 0.1403 secs
Requests/sec: 4423.4731
Total data: 11200000 bytes
Size/request: 224 bytes
Status code distribution:
[200] 50000 responses
Response time histogram:
0.000 [1790] |∎
0.904 [47981] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
1.809 [0] |
2.713 [0] |
3.618 [13] |
4.522 [215] |
5.427 [0] |
6.331 [0] |
7.236 [0] |
8.140 [0] |
9.045 [1] |
Latency distribution:
10% in 0.0312 secs
25% in 0.0781 secs
50% in 0.1250 secs
75% in 0.1719 secs
90% in 0.2031 secs
95% in 0.2343 secs
99% in 0.3906 secs
Baseline Caddy v0.9.0 without proxy:
Summary:
Total: 10.1248 secs
Slowest: 9.8588 secs
Fastest: 0.0000 secs
Average: 0.1360 secs
Requests/sec: 4938.3901
Total data: 11200000 bytes
Size/request: 224 bytes
Status code distribution:
[200] 50000 responses
Response time histogram:
0.000 [5909] |∎∎∎∎∎
0.986 [43676] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
1.972 [10] |
2.958 [47] |
3.944 [38] |
4.929 [7] |
5.915 [67] |
6.901 [31] |
7.887 [93] |
8.873 [0] |
9.859 [122] |
Latency distribution:
25% in 0.0156 secs
50% in 0.0781 secs
75% in 0.1250 secs
90% in 0.1562 secs
95% in 0.1875 secs
99% in 0.3947 secs
And with proxy:
Summary:
Total: 23.5891 secs
Slowest: 11.4876 secs
Fastest: 0.0000 secs
Average: 0.2954 secs
Requests/sec: 2119.6269
Total data: 11200000 bytes
Size/request: 224 bytes
Status code distribution:
[200] 50000 responses
Response time histogram:
0.000 [642] |
1.149 [48942] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
2.298 [96] |
3.446 [26] |
4.595 [46] |
5.744 [27] |
6.893 [12] |
8.041 [15] |
9.190 [15] |
10.339 [15] |
11.488 [164] |
Latency distribution:
10% in 0.0797 secs
25% in 0.1718 secs
50% in 0.2344 secs
75% in 0.3125 secs
90% in 0.4062 secs
95% in 0.4687 secs
99% in 0.7049 secs
Does https://github.com/mholt/caddy/pull/880 play any role in this?
@tomasdeml With #984, you should be able to pass the ab benchmarks by increasing the keepalive directive in your proxy. By default it is 2, and you should increase it depending on how many concurrent connections you expect. I'm not 100% sure what the correct value is here (I'm not super familiar with the pooling code in net/http/transport.go), so it will take some testing.
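As a minimal sketch of what that could look like in the proxy's Caddyfile (the upstream address and the value 1024 are illustrative assumptions): with ab run at -c 1000, the value should be at least on the order of the concurrency level, since the default of 2 matches Go's net/http DefaultMaxIdleConnsPerHost.

```
# Hypothetical proxy Caddyfile: enlarge the keep-alive connection pool so
# ~1000 concurrent clients are not serialized onto 2 idle upstream connections.
:80
proxy / remote-machine-upstream:580 {
    keepalive 1024
}
```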
Feel free to reopen this - if you do, I'd also like to see some tests with nginx on the same hardware.
1. What version of Caddy are you running (caddy -version)?
v0.8.3 / v0.9.0, on Windows Server 2012 R2 running on an Azure 'Standard D3 v2' instance (4 cores, 14 GB memory).
2. What are you trying to do?
Measure the performance of Caddy's reverse proxy middleware using Apache Bench.
3. What is your entire Caddyfile?
caddyfile_proxy:
caddyfile_upstream:
The WebRoot folder contains the file index.html. (A hypothetical reconstruction of both Caddyfiles and the launch commands is sketched after this list.)
4. How did you run Caddy (give the full command and describe the execution environment)?
I started one Caddy instance and then the other in a new shell.
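A minimal, hypothetical sketch of a setup consistent with what is described in this issue (an upstream instance serving index.html from WebRoot on port 580, and a proxy instance on port 80 forwarding to it); the directives, file layout, and caddy.exe invocation below are assumptions, not the original configuration:

```
# caddyfile_upstream (hypothetical): serve static files from WebRoot on port 580
:580
root WebRoot
```

```
# caddyfile_proxy (hypothetical): listen on port 80 and forward everything upstream
:80
proxy / remote-machine-upstream:580
```

Both instances could then be started along the lines of .\caddy.exe -conf caddyfile_upstream and, in a new shell, .\caddy.exe -conf caddyfile_proxy.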
Rationale: We would like to use Caddy as a simple reverse proxy for one of our backend services. To measure the performance of the proxy, I used Apache Bench and got sub-optimal results. For the benchmark I created a setup with one Caddy instance acting as the proxy and another Caddy instance representing the backend (upstream).
To establish a baseline for Caddy performance, I ran ab.exe -n 1000000 -c 1000 -k http://remote-machine-running-caddy:580/ from another machine against Caddy v0.8.3 and got the following result (one of three runs; the other results were pretty similar):
Then I started the proxy and ran ab.exe -n 1000000 -c 1000 -k http://remote-machine-running-caddy/ and got the following result (again one of three runs):
I re-ran the tests against Caddy v0.9.0 and got even worse results. Without proxy:
Unfortunately I could not execute a run with the proxy as it did not complete (see Failed requests), most likely because of issue #938:
Is this kind of performance expected?