gruns opened this issue 2 years ago
Spoke with @DiegoRBaquero to tackle this issue. We concluded the following:
Test Setup:
Both servers use the same SSL certs and both serve the same generic HTML file. The test setup is documented and made to be easily reproducible in this repo: https://github.com/filecoin-saturn/http-testing.
Tests were run for a fixed duration of 10 minutes. The number of concurrent users was changed for each test.
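For reference, a minimal sketch of one such run, assuming h2load (the client used later in this thread; the repo's actual tool and flags may differ) and placeholder hostnames:
# 10-minute run, 30 concurrent clients: HTTP/2 against nginx, HTTP/3 against caddy
h2load --duration=600 --warm-up-time=5 -c 30 --npn-list h2 https://nginx.example
h2load --duration=600 --warm-up-time=5 -c 30 --npn-list h3 https://caddy.example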
Results:
NOTE: These results are now stale, please check comments below for updated results
Service | Protocol | TTFB Mean | Failure Rate | Reqs/s | Concurrent Clients |
---|---|---|---|---|---|
Nginx | HTTP/2 | 4.77ms | 4% | 16215.11 | 5 |
Caddy | HTTP/3 | 9.38ms | 0% | 10942.86 | 5 |
Nginx | HTTP/2 | 55.7ms | 6.7% | 2143.27 | 30 |
Caddy | HTTP/3 | 53.22ms | 0% | 1866.49 | 30 |
Nginx | HTTP/2 | 123.8ms | 6.5% | 637.36 | 100 |
Caddy | HTTP/3 | 116.78ms | 0% | 573.19 | 100 |
Nginx | HTTP/2 | 289.95ms | 7.4% | 212.81 | 300 |
Caddy | HTTP/3 | 300.40ms | 0% | 223.59 | 300 |
Interesting that the failure rate was 0 with Caddy but significant with Nginx 🤔
@gruns @DiegoRBaquero Update on this issue. I noticed that I was pinging each service using a different host on my local machine. I changed it to be the same for both of them, which is the docker host for the benchmarking tool I was running. I re-ran all the tests with the same setup and the results now tell a different story on Caddy's HTTP/3 capabilities.
At 300 concurrent clients, we halve the TTFB compared to Nginx. One odd thing is that Nginx's failure rate decreases as the load increases on the webserver.
Service | Protocol | TTFB Mean | Failure Rate | Avg Reqs/s | Concurrent Clients |
---|---|---|---|---|---|
Nginx | HTTP/2 | 18.08ms | 8% | 7874.58 | 5 |
Caddy | HTTP/3 | 10.71ms | 0% | 12600.37 | 5 |
Nginx | HTTP/2 | 53.44ms | 5.3% | 1429.01 | 30 |
Caddy | HTTP/3 | 44.31ms | 0% | 2161.78 | 30 |
Nginx | HTTP/2 | 204.73ms | 3% | 416.33 | 100 |
Caddy | HTTP/3 | 129.87ms | 0% | 633.46 | 100 |
Nginx | HTTP/2 | 730.78ms | 1.3% | 123.98 | 300 |
Caddy | HTTP/3 | 363.12ms | 0% | 156.38 | 300 |
@gruns @DiegoRBaquero I also set up a docker image that runs our nginx setup with http3 enabled. The image is in the http-testing repo here. For some reason, nginx's HTTP3 implementation is much slower than what we have right now. I contacted one of the developers and they mentioned that QUIC on nginx will probably not be ready until the end of 2023.
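For reproducibility, roughly how that image gets run (image name and paths are hypothetical; the http-testing repo is authoritative). One easy-to-miss detail: HTTP/3 runs over QUIC/UDP, so UDP 443 has to be published alongside TCP 443:
docker build -t nginx-http3 .
docker run --rm -p 443:443/tcp -p 443:443/udp nginx-http3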
Here are some of the testing results:
Service | Protocol | TTFB Mean | Failure Rate | Avg Reqs/s | Concurrent Clients |
---|---|---|---|---|---|
Nginx | HTTP/2 | 53.44ms | 5.3% | 1429.01 | 30 |
Nginx | HTTP/3 | 271.17ms | 0.2% | 519.04 | 30 |
Nginx | HTTP/2 | 204.73ms | 3% | 416.33 | 100 |
Nginx | HTTP/3 | 1130ms | 0.4% | 141.82 | 100 |
Nginx | HTTP/2 | 730.78ms | 1.3% | 123.98 | 300 |
Nginx | HTTP/3 | 4460ms | 0% | 100.7 | 300 |
I ran more tests by deploying Nginx and Caddy on AWS lightsail instances in Frankfurt (eu-central-1).
With 5 Concurrent Clients:
Service | Protocol | TTFB Mean | Failure Rate | Reqs/S | Concurrent Clients |
---|---|---|---|---|---|
Nginx | HTTP/2 | 348.20ms | 0.7% | 80.4 | 5 |
Caddy | HTTP/2 | 432.36ms | 0% | 71 | 5 |
Nginx | HTTP/3 | 222.2ms | 0% | 66.35 | 5 |
Caddy | HTTP/3 | 330.82ms | 0% | 88.6 | 5 |
With 30 Concurrent Clients:
Service | Protocol | TTFB Mean | Failure Rate | Reqs/S | Concurrent Clients |
---|---|---|---|---|---|
Nginx | HTTP/2 | 389.85ms | 0.6% | 76.29 | 30 |
Caddy | HTTP/2 | 456.13ms | 0% | 62.32 | 30 |
Nginx | HTTP/3 | 250.52ms | 0% | 70.59 | 30 |
Caddy | HTTP/3 | 379.93ms | 0% | 82.13 | 30 |
With 100 Concurrent Clients:
Service | Protocol | TTFB Mean | Failure Rate | Reqs/S | Concurrent Clients |
---|---|---|---|---|---|
Nginx | HTTP/2 | 454.82ms | 0.6% | 65.27 | 100 |
Caddy | HTTP/2 | 563.34ms | 0% | 36.8 | 100 |
Nginx | HTTP/3 | 751.78ms | 0.02% | 67 | 100 |
Caddy | HTTP/3 | 448.59ms | 0% | 32.59 | 100 |
promising!! 🎉
what's the ping/RTT between you (the client) and the frankfurt lightsail instance (the server)?
also interesting that it's inconsistent whether caddy or nginx is faster, depending on the # of concurrent clients. at 5 and 30 concurrent clients, nginx's http3 ttfb (220ms, 250ms) is faster than caddy's http3 ttfb (330ms, 380ms), but at 100 concurrent clients caddy's http3 ttfb (450ms) is lower than nginx's http3 ttfb (750ms). interesting. and weird. 🤔
also nginx's http3 ttfb (750ms) was way worse than nginx's http2 ttfb (450ms) at 100 concurrent clients. besides that one measurement (nginx's http3 ttfb at 100 concurrents) the data makes sense and http3's ttfb is lower across the board for all # of concurrents for both nginx and caddy
@AmeanAsad worth re-running the experiment with 100 concurrent clients and, perhaps, 200, too. if your computer was the client, maybe something else was using bw in the background while the nginx http3 ttfb test ran?
jk. likely not an aberration. from looking at the local benchmarks posted above, nginx's http3 implementation seems to choke on high numbers of concurrent clients. as shown in these benchmarks:
Service | Protocol | TTFB Mean | Failure Rate | Reqs/S | Concurrent Clients |
---|---|---|---|---|---|
Nginx | HTTP/2 | 204.73ms | 3% | 416.33 | 100 |
Nginx | HTTP/3 | 1130ms | 0.4% | 141.82 | 100 |
Nginx | HTTP/2 | 730.78ms | 1.3% | 123.98 | 300 |
Nginx | HTTP/3 | 4460ms | 0% | 100.7 | 300 |
@gruns Pinged both servers using the ping <server-ip> command. The time hovers at 100ms consistently.
Result from terminal:
round-trip min/avg/max/stddev = 99.916/101.709/103.361/1.251 ms
@gruns @DiegoRBaquero
In the previous update, http3 performs worse than http2 at 100 concurrent clients. The intuitive assumption is that the trend continues as concurrency rises, but further testing shows otherwise. I tested this multiple times and even set up another VPS in Singapore to verify, but it seems like there is a concurrency range where http2 is just better. That part is still puzzling. Otherwise, across the board, http3 is a clear winner.
Service | Protocol | TTFB Mean | Reqs/s | Concurrent Clients |
---|---|---|---|---|
Nginx | HTTP/2 | 2.88s | 31 | 200 |
Nginx | HTTP/3 | 1.05s | 46.27 | 200 |
Nginx | HTTP/2 | 10.81s | 24.4 | 300 |
Nginx | HTTP/3 | 2.46s | 35.1 | 300 |
Service | Protocol | TTFB Mean | Avg Reqs/s | Concurrent Clients |
---|---|---|---|---|
Nginx | HTTP/2 | 2.88s | 31 | 200 |
Caddy | HTTP/2 | 1.05s | 12.59 | 200 |
Nginx | HTTP/2 | 729.21ms | 13.87 | 200 |
Nginx's CPU consumption:
Caddy's CPU consumption under the same load:
next questions:
how do browsers, eg chrome and firefox, know when to connect to a site with http3 over http2?
if only after the Alt-Svc header (https://http3-explained.haxx.se/en/h3/h3-altsvc) has been sent over http2, do browsers remember that header so subsequent connections are made with http3 instead of http2?
what are the benchmarks for one client, one request at a time? like with curl (https://curl.se/docs/http3.html)? (see the sketch after these questions)
do we see a clear 1RTT for http3 and 3RTT for http2?
nginx's http3 implementation outperforms caddy's http3 implementation, while both http3 implementations are in beta. can we use nginx's http3 support in production while in beta? what work remains outstanding in nginx's http3 support? what is missing? incomplete? broken? what risks do we take on by adopting nginx's current http3 support in production?
what work needs to be done to benchmark http3 vs http2 in our test network?
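A sketch for the one-client curl question above (assumes a curl build with HTTP/3 support; the domain is a placeholder):
# single request each way, printing connect time and TTFB
curl --http3 -o /dev/null -sS -w 'connect=%{time_connect}s ttfb=%{time_starttransfer}s\n' https://nginx.example
curl --http2 -o /dev/null -sS -w 'connect=%{time_connect}s ttfb=%{time_starttransfer}s\n' https://nginx.example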
@gruns Addressing some of the questions above:
Once an HTTP3 connection is established, does the browser remember that for subsequent connections?: Yes. Browsers cache the Alt-Svc header. The caching is identified by a param called ma -> max age. This param is configurable in our server setup. Here is an example of how I used it in the Nginx http3 setup.
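A hedged sketch of both halves of the mechanism (the directive and domain are illustrative, not the actual repo config):
# server side (nginx): advertise HTTP/3 on UDP 443, cached by clients for 86400s (24h)
#   add_header Alt-Svc 'h3=":443"; ma=86400';
# client side: curl's Alt-Svc cache records the mapping on the first request,
# and an h3-capable curl can then use HTTP/3 for the repeat request
curl --alt-svc altsvc.txt -sI https://nginx.example >/dev/null
cat altsvc.txt   # cached h3 entry plus its expiry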
Source
Benchmarks for 1 Client:
Protocol | TTFB | Time to Connect |
---|---|---|
HTTP/2 | 331.97ms | 221.38ms |
HTTP/3 | 217.76ms | 111.79ms |
Does the benchmarking tool open new connections for each request?:
How ready is nginx's http3?:
What work needs to be done to benchmark http3 vs http2 in our test network?: Gonna also tag @DiegoRBaquero, since he probably has a better idea of this than me. Here is what I imagine the work will be like:
Can a page, which loaded over http2 (ie a page with arc), request an asset from another domain, eg strn.pl, over http3?: Yes. If a client receives an Alt-Svc header that indicates HTTP/3, it has the option to attempt to set up a QUIC connection to that destination and, if successful, continue communicating with the origin like that instead of the initial HTTP version. This is of course conditional on the fact that the client and server must both support QUIC and HTTP/3.
sickkkk. remaining todos now for @joaosa with the baton 💪:
[x] test how chrome and firefox (https://caniuse.com/http3) actually implement http3 connections
- do they only use http3 after receiving an Alt-Svc upgrade response header?
- Alt-Svc's caniuse score is 50% (https://caniuse.com/mdn-http_headers_alt-svc) while http3's caniuse score is 75% (https://caniuse.com/http3)
[x] determine the state of nginx's http3 support. what is missing? what is broken? can we put nginx's http3 implementation into production in saturn? hopefully 🤞
[x] if everything above is good, add http3 support to l1s
[x] deploy the http3 l1s to the test network
[x] begin logging the ttfb performance difference between http2 and http3 in the test network (for @guanzo)
[ ] graph the http2 vs http3 log data on the grafana dashboards (for @joaosa or @AmeanAsad)
[ ] when all looks good and things have proven stable in the test network, push to production and pop that champagne 🎉
@joaosa once you get a sense of the above items, sync with me (@gruns) on an estimated timeline for the above 👍
Alright, I ran some benchmarks with Envoy and HAProxy acting as reverse-proxies and with simplehttp2server as their backend. These proxies use http3/http2 downstream and http2 upstream.
I tried to replicate Amean's setup in order to have "comparable" results (refer to here). I ran the benchmarking tool (h2load) from my machine towards a couple of lightsail Ubuntu 20.04 machines I deployed.
I generated certs for them with mkcert and modified my /etc/hosts file to be able to reach them.
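Roughly, the reproduction steps look like this (IPs are placeholders for the lightsail machines):
mkcert envoy.io haproxy.io          # locally-trusted certs for the fake hostnames
echo '203.0.113.10 envoy.io'   | sudo tee -a /etc/hosts
echo '203.0.113.11 haproxy.io' | sudo tee -a /etc/hosts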
I'm going to comment on the results scenario by scenario (http2 vs http3), explain a couple of things I tried and then go for the final benchmarks. @DiegoRBaquero @AmeanAsad @gruns It would be great to have your input on this. Hopefully, together we'll uncover more stuff from these results!
docker run --rm -it --network=host h2load-http3 -n 10000 -c 100 -m 10 --npn-list h3 https://haproxy.io
finished in 6.37s, 1570.85 req/s, 1.12MB/s
requests: 10000 total, 10000 started, 10000 done, 10000 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 10000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 7.13MB (7480000) total, 2.91MB (3050000) headers (space savings -5.90%), 4.17MB (4370000) data
UDP datagram: 6351 sent, 21426 received
min max mean sd +/- sd
time for request: 43.68ms 2.06s 526.60ms 323.54ms 80.59%
time for connect: 56.76ms 1.12s 311.52ms 380.30ms 81.00%
time to 1st byte: 314.43ms 2.63s 1.02s 556.59ms 76.00%
req/s : 15.72 26.87 17.86 2.10 85.00%
docker run --rm -it --network=host h2load-http3 -n 10000 -c 100 -m 10 --npn-list h2 https://haproxy.io
finished in 6.38s, 1566.83 req/s, 1.01MB/s
requests: 10000 total, 10000 started, 10000 done, 10000 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 10000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 6.45MB (6763600) total, 2.11MB (2210000) headers (space savings 23.26%), 4.17MB (4370000) data
min max mean sd +/- sd
time for request: 46.24ms 2.96s 543.88ms 377.94ms 80.79%
time for connect: 92.19ms 209.09ms 153.40ms 31.77ms 60.00%
time to 1st byte: 241.91ms 1.93s 721.64ms 343.79ms 77.00%
req/s : 15.67 27.09 17.74 2.15 89.00%
Clearly, h2's TTFB is way better than h3's. Also, those max values on h3 aren't looking good. Support for h3 in HAProxy is experimental, so this probably needs to be revisited as it evolves.
docker run --rm -it --network=host h2load-http3 -n 10000 -c 100 -m 10 --npn-list h3 https://envoy.io
finished in 9.12s, 1096.93 req/s, 499.14KB/s
requests: 10000 total, 10000 started, 10000 done, 10000 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 10000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 4.44MB (4659527) total, 127.00KB (130050) headers (space savings 95.70%), 4.21MB (4410000) data
UDP datagram: 10436 sent, 16728 received
min max mean sd +/- sd
time for request: 46.87ms 1.18s 283.23ms 171.84ms 72.76%
time for connect: 78.60ms 7.28s 2.64s 2.89s 74.00%
time to 1st byte: 197.01ms 7.46s 2.98s 2.80s 74.00%
req/s : 10.99 34.94 20.29 6.55 57.00%
docker run --rm -it --network=host h2load-http3 -n 10000 -c 100 -m 10 --npn-list h2 https://envoy.io
finished in 6.18s, 1618.58 req/s, 764.33KB/s
requests: 10000 total, 10000 started, 10000 done, 10000 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 10000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 4.61MB (4835551) total, 234.42KB (240051) headers (space savings 93.58%), 4.21MB (4410000) data
min max mean sd +/- sd
time for request: 50.65ms 2.50s 550.51ms 381.71ms 84.86%
time for connect: 94.38ms 258.03ms 159.05ms 43.32ms 65.00%
time to 1st byte: 447.23ms 1.10s 675.79ms 251.45ms 75.00%
req/s : 16.20 23.58 17.26 1.27 88.00%
Envoy seems to have a slightly higher per-client h3 req/s (20.29 mean) when compared to its h2 run and both approaches for HAProxy, even though its total req/s is lower. The TTFB/connect values in this attempt are terrible though. Thankfully, we later managed to solve this.
By now, we have established h2 is working fine, so let's try to improve h3. One thing we can improve is the size of the UDP buffers (which Envoy thankfully complained about). I took the values from here.
net.core.rmem_max and net.core.rmem_default tweaks
Ran sudo sysctl -w net.core.rmem_max=26214400 and sudo sysctl -w net.core.rmem_default=26214400 beforehand.
I tried larger values, but that didn't seem to make a relevant difference.
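Side note: sysctl -w is not persistent. One standard way to keep the buffer sizes across reboots (file name is arbitrary):
printf 'net.core.rmem_max=26214400\nnet.core.rmem_default=26214400\n' | sudo tee /etc/sysctl.d/99-quic-buffers.conf
sudo sysctl --system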
finished in 7.23s, 1383.58 req/s, 631.24KB/s
requests: 10000 total, 10000 started, 10000 done, 10000 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 10000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 4.46MB (4671842) total, 126.95KB (130000) headers (space savings 95.71%), 4.21MB (4410000) data
UDP datagram: 10907 sent, 16830 received
min max mean sd +/- sd
time for request: 47.21ms 2.31s 633.94ms 297.75ms 71.00%
time for connect: 78.47ms 565.65ms 397.77ms 112.46ms 55.00%
time to 1st byte: 384.95ms 1.47s 898.50ms 255.63ms 53.00%
req/s : 13.87 17.09 14.41 0.57 88.00%
This is great! Envoy's TTFB performance looks like what we would expect, even if its throughput values decreased.
finished in 83.14s, 14.43 req/s, 10.54KB/s
requests: 10000 total, 2080 started, 1200 done, 1200 succeeded, 8800 failed, 8800 errored, 0 timeout
status codes: 1200 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 876.56KB (897600) total, 357.42KB (366000) headers (space savings -5.90%), 512.11KB (524400) data
UDP datagram: 3474 sent, 3772 received
min max mean sd +/- sd
time for request: 47.75ms 386.50ms 98.43ms 75.74ms 90.00%
time for connect: 50.83ms 185.47ms 114.69ms 41.94ms 55.00%
time to 1st byte: 298.00ms 437.64ms 374.51ms 46.10ms 58.33%
req/s : 0.00 98.87 11.16 30.39 88.00%
RIP HAProxy. This approach clearly didn't help and it even introduced consistent failed requests into the mix. Definitely a no-go.
With this in mind, let's focus a bit more on the tweaked Envoy and regular HAProxy and see where this goes.
From here on, I tried reproducing Amean's test setup and 10-minute benchmarks for Envoy with 100, 200, and 300 clients.
One thing I immediately noticed with docker run --rm -it --network=host h2load-http3 -c100 -m200 --duration=600 --warm-up-time=5 --npn-list h3 https://envoy.io is that Envoy started spiking gloriously to 99% CPU usage. When the test finished we had about 93% failed reqs (bad Envoy).
I initially assumed this meant Envoy was a lot less resource efficient than Nginx. I spawned a more powerful machine to try to make this work (4 vs 1 vCPUs and 16x more memory as a side-effect). Still, I got failed requests.
In the end, it turned out it was the max concurrent streams to issue per client (aka -m) causing all kinds of mischief (we should probably revisit stream concurrency later).
I went back to the smaller machine, so as not to give Envoy an unfair advantage.
For clarity's sake here are the runs and their results:
docker run --rm -it --network=host h2load-http3 -c100 --duration=600 --npn-list h3 https://envoy.io
finished in 600.02s, 1116.12 req/s, 507.63KB/s
requests: 669675 total, 669775 started, 669675 done, 669675 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 669675 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 297.44MB (311889144) total, 8.41MB (8820584) headers (space savings 95.64%), 281.65MB (295326675) data
UDP datagram: 2353223 sent, 2091152 received
min max mean sd +/- sd
time for request: 42.22ms 883.66ms 89.54ms 21.89ms 81.58%
time for connect: 74.85ms 426.47ms 356.05ms 65.12ms 71.00%
time to 1st byte: 353.05ms 548.92ms 448.12ms 60.78ms 71.00%
req/s : 10.62 11.82 11.16 0.35 62.00%
docker run --rm -it --network=host h2load-http3 -c200 --duration=600 --npn-list h3 https://envoy.io
finished in 600.03s, 1177.70 req/s, 539.16KB/s
requests: 706622 total, 706822 started, 706622 done, 706622 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 706691 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 315.91MB (331259764) total, 8.86MB (9292259) headers (space savings 95.65%), 297.18MB (311620302) data
UDP datagram: 2767907 sent, 2295509 received
min max mean sd +/- sd
time for request: 42.84ms 1.27s 169.59ms 47.65ms 72.94%
time for connect: 73.29ms 1.36s 667.19ms 348.45ms 71.00%
time to 1st byte: 463.84ms 1.52s 845.13ms 338.81ms 79.50%
req/s : 5.63 6.21 5.89 0.16 60.00%
docker run --rm -it --network=host h2load-http3 -c300 --duration=600 --npn-list h3 https://envoy.io
finished in 600.05s, 1225.51 req/s, 564.06KB/s
requests: 735307 total, 735607 started, 735307 done, 735307 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 735318 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 330.50MB (346559021) total, 9.20MB (9643006) headers (space savings 95.67%), 309.25MB (324270387) data
UDP datagram: 2672506 sent, 2158533 received
min max mean sd +/- sd
time for request: 43.50ms 811.13ms 244.33ms 62.76ms 73.98%
time for connect: 74.73ms 3.35s 1.04s 690.06ms 76.00%
time to 1st byte: 366.25ms 3.62s 1.24s 702.55ms 87.33%
req/s : 3.96 4.21 4.08 0.07 59.67%
Following the same approach as with Envoy:
docker run --rm -it --network=host h2load-http3 -c100 --duration=600 --npn-list h3 https://haproxy.io
finished in 600.04s, 1424.00 req/s, 1.02MB/s
requests: 854398 total, 854498 started, 854398 done, 854398 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 854421 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 609.49MB (639096788) total, 248.53MB (260598405) headers (space savings -5.90%), 356.08MB (373371926) data
UDP datagram: 2524842 sent, 2647903 received
min max mean sd +/- sd
time for request: 42.13ms 374.40ms 70.15ms 7.74ms 74.16%
time for connect: 53.00ms 3.07s 569.44ms 733.84ms 95.00%
time to 1st byte: 171.35ms 3.14s 689.34ms 708.79ms 95.00%
req/s : 13.93 14.59 14.24 0.21 55.00%
docker run --rm -it --network=host h2load-http3 -c200 --duration=600 --npn-list h3 https://haproxy.io
finished in 600.06s, 1545.79 req/s, 1.10MB/s
requests: 927476 total, 927676 started, 927476 done, 927476 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 927529 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 661.63MB (693768372) total, 269.79MB (282897565) headers (space savings -5.90%), 386.53MB (405307012) data
UDP datagram: 2755539 sent, 2891841 received
min max mean sd +/- sd
time for request: 42.96ms 613.39ms 129.14ms 18.31ms 85.19%
time for connect: 60.12ms 3.12s 1.04s 1.09s 81.00%
time to 1st byte: 197.06ms 3.51s 1.26s 1.10s 81.00%
req/s : 7.65 7.80 7.73 0.03 65.50%
docker run --rm -it --network=host h2load-http3 -c300 --duration=600 --npn-list h3 https://haproxy.io
finished in 600.07s, 1481.49 req/s, 1.06MB/s
requests: 888892 total, 889151 started, 888892 done, 888892 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 888906 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 634.09MB (664895528) total, 258.56MB (271116635) headers (space savings -5.90%), 370.46MB (388451922) data
UDP datagram: 2667025 sent, 2788067 received
min max mean sd +/- sd
time for request: 44.44ms 616.11ms 174.32ms 22.75ms 89.36%
time for connect: 70.68ms 7.14s 1.65s 2.02s 91.51%
time to 1st byte: 226.89ms 7.23s 1.88s 1.98s 91.51%
req/s : 0.00 5.78 4.94 1.97 86.33%
Taking Amean's results and these, here we go:
Service | Protocol | TTFB Mean | Failure Rate | Reqs/S | Concurrent Clients |
---|---|---|---|---|---|
Caddy | HTTP/3 | 448.59ms | 0.00% | 32.59 | 100 |
Nginx | HTTP/3 | 751.78ms | 0.02% | 67 | 100 |
Envoy | HTTP/3 | 448.12ms | 0.00% | 11.16 | 100 |
HAProxy | HTTP/3 | 689.34ms | 0.00% | 14.24 | 100 |
Envoy | HTTP/3 | 845.13ms | 0.00% | 5.89 | 200 |
HAProxy | HTTP/3 | 1.26s | 0.00% | 7.73 | 200 |
Envoy | HTTP/3 | 1.24s | 0.00% | 4.08 | 300 |
HAProxy | HTTP/3 | 1.88s | 0.00% | 4.94 | 300 |
Envoy's TTFB values seem pretty decent given what we've seen in terms of h3 overall (it uses Google's QUICHE library). Also, the devs say "HTTP/3 downstream support is ready for production use, but continued improvements are coming (...)" over here. HAProxy appears to be doing less well, but with a slightly higher throughput.
One aspect I'm concerned about in both scenarios is perceived throughput, which seems to be lower than with NGINX/Caddy. Given I'm using a reverse-proxy backend (with h2) for both HAProxy and Envoy, results may not be exactly comparable, as there is another moving piece whose performance could be impacting the results. What are your thoughts on this?
Either way, I warrant these are good enough reasons to try Envoy (and possibly HAProxy) out in more realistic scenarios. What do you all think?
I decided to benchmark nginx as well to allow for comparisons following the same methodology as described above. I added a proxy backend for nginx. We're not caching responses, so as to get fairer values.
I should note I'm using httpd as an http1.1 backend here.
nginx (without rmem changes)
docker run --rm -it --network=host h2load-http3 -c100 --duration=600 --npn-list h3 https://nginx.io
finished in 600.04s, 166.53 req/s, 122.54KB/s
requests: 99917 total, 100000 started, 99917 done, 99917 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 100000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 71.80MB (75291700) total, 24.13MB (25300000) headers (space savings 27.92%), 46.83MB (49100000) data
UDP datagram: 236898 sent, 300403 received
min max mean sd +/- sd
time for request: 44.84ms 1.27s 116.58ms 53.29ms 81.02%
time for connect: 53.03ms 3.12s 749.35ms 841.66ms 92.00%
time to 1st byte: 161.89ms 3.23s 905.40ms 849.89ms 92.00%
req/s : 8.26 8.97 8.52 0.11 75.00%
docker run --rm -it --network=host h2load-http3 -c200 --duration=600 --npn-list h3 https://nginx.io
finished in 600.11s, 47.81 req/s, 35.13KB/s
requests: 28685 total, 28872 started, 28685 done, 28685 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 28685 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 20.58MB (21583959) total, 6.92MB (7257305) headers (space savings 27.92%), 13.43MB (14084335) data
UDP datagram: 61918 sent, 87504 received
min max mean sd +/- sd
time for request: 44.94ms 1.10s 214.38ms 66.03ms 85.44%
time for connect: 52.72ms 7.26s 2.67s 2.75s 76.47%
time to 1st byte: 178.57ms 8.35s 3.02s 2.90s 76.47%
req/s : 0.00 2.79 2.19 0.65 93.50%
docker run --rm -it --network=host h2load-http3 -c300 --duration=600 --npn-list h3 https://nginx.io
finished in 600.09s, 326.53 req/s, 256.78KB/s
requests: 195919 total, 196000 started, 195919 done, 195919 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 196000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 150.46MB (157763732) total, 47.29MB (49588000) headers (space savings 27.92%), 101.50MB (106428000) data
UDP datagram: 398033 sent, 589867 received
min max mean sd +/- sd
time for request: 44.05ms 1.05s 202.65ms 56.23ms 82.53%
time for connect: 67.43ms 7.24s 2.54s 2.71s 78.06%
time to 1st byte: 193.00ms 8.27s 2.87s 2.86s 78.06%
req/s : 0.00 5.04 3.18 2.32 65.33%
These last two results aren't good. Thankfully, h2load decided to present me with ERR_DRAINING errors. After some googling, I decided to try out tweaking rmem, as we had previously done with Envoy and HAProxy.
nginx (with rmem changes)
sudo sysctl -w net.core.rmem_max=26214400 && sudo sysctl -w net.core.rmem_default=26214400
docker run --rm -it --network=host h2load-http3 -c100 --duration=600 --npn-list h3 https://nginx.io
finished in 600.05s, 166.54 req/s, 131.01KB/s
requests: 99923 total, 100000 started, 99923 done, 99923 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 100000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 76.76MB (80491700) total, 24.13MB (25300000) headers (space savings 27.92%), 51.78MB (54300000) data
UDP datagram: 235805 sent, 300736 received
min max mean sd +/- sd
time for request: 65.04ms 884.09ms 117.02ms 52.10ms 80.88%
time for connect: 53.70ms 256.58ms 153.87ms 59.01ms 56.00%
time to 1st byte: 258.63ms 395.89ms 326.88ms 41.13ms 56.00%
req/s : 8.36 8.75 8.53 0.08 71.00%
docker run --rm -it --network=host h2load-http3 -c200 --duration=600 --npn-list h3 https://nginx.io
finished in 600.07s, 48.59 req/s, 38.17KB/s
requests: 29156 total, 29356 started, 29156 done, 29156 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 29156 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 22.37MB (23453633) total, 7.03MB (7376468) headers (space savings 27.92%), 15.10MB (15831708) data
UDP datagram: 60628 sent, 88193 received
min max mean sd +/- sd
time for request: 171.05ms 930.30ms 245.49ms 46.77ms 82.83%
time for connect: 49.91ms 452.35ms 250.84ms 117.38ms 58.50%
time to 1st byte: 452.46ms 725.45ms 589.88ms 79.27ms 57.50%
req/s : 2.13 2.25 2.21 0.02 69.50%
docker run --rm -it --network=host h2load-http3 -c300 --duration=600 --npn-list h3 https://nginx.io
finished in 600.11s, 499.93 req/s, 393.03KB/s
requests: 299957 total, 300000 started, 299957 done, 299957 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 300000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 230.29MB (241475100) total, 72.38MB (75900000) headers (space savings 27.92%), 155.35MB (162900000) data
UDP datagram: 603054 sent, 902439 received
min max mean sd +/- sd
time for request: 66.37ms 1.24s 374.83ms 31.22ms 97.20%
time for connect: 68.27ms 670.26ms 355.15ms 175.99ms 57.67%
time to 1st byte: 651.95ms 1.12s 884.57ms 131.39ms 57.33%
req/s : 2.65 2.68 2.67 0.01 69.33%
Including the nginx results with the rmem tweaks. It seems like nginx is after all the best performer in terms of TTFB (even if with a smaller throughput).
I suppose that this means there's another thing we can try nginx-wise. We can tweak net.core.rmem_max and net.core.rmem_default server-side and see where that goes. Any thoughts?
Service | Protocol | TTFB Mean | Failure Rate | Reqs/S | Concurrent Clients |
---|---|---|---|---|---|
Nginx | HTTP/3 | 326.88ms | 0.00% | 8.53 | 100 |
Envoy | HTTP/3 | 448.12ms | 0.00% | 11.16 | 100 |
HAProxy | HTTP/3 | 689.34ms | 0.00% | 14.24 | 100 |
Nginx | HTTP/3 | 589.88ms | 0.00% | 2.21 | 200 |
Envoy | HTTP/3 | 845.13ms | 0.00% | 5.89 | 200 |
HAProxy | HTTP/3 | 1.26s | 0.00% | 7.73 | 200 |
Nginx | HTTP/3 | 884.57ms | 0.00% | 2.67 | 300 |
Envoy | HTTP/3 | 1.24s | 0.00% | 4.08 | 300 |
HAProxy | HTTP/3 | 1.88s | 0.00% | 4.94 | 300 |
@joaosa Really interesting results. The rmem changes for nginx seem to give really good ttfb results compared to everything else. I am curious how concurrent streams impact the results and how applicable that is to Saturn, since all the recent results use 1 concurrent stream by default?
Let's look into concurrent streams (the results above used h2load's default of m=1). From here and the RFC: "It is recommended that this value be no smaller than 100, so as to not unnecessarily limit parallelism."
docker run --rm -it --network=host h2load-http3 -m10 -c100 --duration=600 --npn-list h3 https://nginx.io
finished in 600.07s, 166.66 req/s, 131.00KB/s
requests: 100002 total, 100143 started, 100002 done, 99995 succeeded, 7 failed, 4 errored, 0 timeout
status codes: 99997 2xx, 0 3xx, 0 4xx, 3 5xx
traffic: 76.76MB (80489033) total, 24.13MB (25299403) headers (space savings 27.92%), 51.78MB (54298821) data
UDP datagram: 229913 sent, 266437 received
min max mean sd +/- sd
time for request: 42.17ms 53.25s 1.06s 789.42ms 88.03%
time for connect: 48.69ms 262.68ms 162.26ms 59.54ms 58.00%
time to 1st byte: 271.19ms 15.58s 1.25s 2.16s 97.00%
req/s : 9.21 9.95 9.39 0.15 79.00%
Interesting to see how performance degraded substantially here.
docker run --rm -it --network=host h2load-http3 -m10 -c100 --duration=600 --npn-list h3 https://envoy.io
finished in 600.04s, 1491.68 req/s, 680.32KB/s
requests: 895009 total, 896009 started, 895009 done, 894608 succeeded, 401 failed, 0 errored, 0 timeout
status codes: 894750 2xx, 0 3xx, 0 4xx, 401 5xx
traffic: 398.62MB (417987614) total, 11.13MB (11667962) headers (space savings 95.70%), 376.29MB (394563598) data
UDP datagram: 958395 sent, 1286893 received
min max mean sd +/- sd
time for request: 84.11ms 3.86s 669.28ms 297.25ms 73.73%
time for connect: 98.56ms 885.24ms 459.46ms 133.31ms 65.00%
time to 1st byte: 411.02ms 1.40s 899.76ms 252.07ms 59.00%
req/s : 14.55 15.19 14.92 0.13 68.00%
Performance degraded.
docker run --rm -it --network=host h2load-http3 -m10 -c100 --duration=600 --npn-list h3 https://haproxy.io
finished in 600.06s, 1721.59 req/s, 1.23MB/s
requests: 1032956 total, 1033956 started, 1032956 done, 1032956 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 1033070 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 736.89MB (772686200) total, 300.49MB (315086350) headers (space savings -5.90%), 430.49MB (451401772) data
UDP datagram: 568336 sent, 2225495 received
min max mean sd +/- sd
time for request: 48.81ms 6.32s 580.12ms 485.98ms 86.04%
time for connect: 50.72ms 1.14s 334.37ms 419.83ms 77.00%
time to 1st byte: 281.95ms 2.70s 948.17ms 500.56ms 63.00%
req/s : 15.86 18.19 17.22 0.49 70.00%
Performance degraded.
docker run --rm -it --network=host h2load-http3 -m50 -c100 --duration=600 --npn-list h3 https://nginx.io
finished in 600.04s, 123.32 req/s, 95.16KB/s
requests: 74163 total, 75683 started, 74163 done, 72126 succeeded, 2037 failed, 171 errored, 0 timeout
status codes: 72126 2xx, 0 3xx, 0 4xx, 1866 5xx
traffic: 55.76MB (58464682) total, 17.50MB (18348642) headers (space savings 28.01%), 37.63MB (39463148) data
UDP datagram: 150490 sent, 171274 received
min max mean sd +/- sd
time for request: 42.60ms 92.65s 3.35s 11.72s 96.20%
time for connect: 50.27ms 249.07ms 149.91ms 59.04ms 57.00%
time to 1st byte: 261.25ms 17.12s 2.22s 2.19s 86.52%
req/s : 0.00 21.98 9.67 4.79 75.00%
Worse than m=10.
docker run --rm -it --network=host h2load-http3 -m50 -c100 --duration=600 --npn-list h3 https://envoy.io
finished in 600.07s, 5799.28 req/s, 742.80KB/s
requests: 3479569 total, 3484569 started, 3479569 done, 313151 succeeded, 3166418 failed, 0 errored, 0 timeout
status codes: 313348 2xx, 0 3xx, 0 4xx, 3166439 5xx
traffic: 435.24MB (456377946) total, 28.08MB (29443038) headers (space savings 93.55%), 376.32MB (394598162) data
UDP datagram: 219142 sent, 512249 received
min max mean sd +/- sd
time for request: 47.83ms 10.43s 848.50ms 1.06s 88.02%
time for connect: 80.27ms 1.30s 759.14ms 354.78ms 61.00%
time to 1st byte: 663.80ms 4.04s 1.77s 1.11s 75.00%
req/s : 52.71 62.81 57.99 2.16 71.00%
Worse than m=10 and with lots of failed requests.
docker run --rm -it --network=host h2load-http3 -m50 -c100 --duration=600 --npn-list h3 https://haproxy.io
finished in 600.04s, 2086.55 req/s, 1.40MB/s
requests: 1251928 total, 1256928 started, 1251928 done, 1156956 succeeded, 94972 failed, 0 errored, 0 timeout
status codes: 1157233 2xx, 0 3xx, 0 4xx, 94972 5xx
traffic: 842.06MB (882963252) total, 343.04MB (359702737) headers (space savings -5.84%), 491.86MB (515755272) data
UDP datagram: 354715 sent, 1956774 received
min max mean sd +/- sd
time for request: 43.53ms 33.43s 1.51s 2.38s 89.70%
time for connect: 49.88ms 1.12s 375.18ms 444.18ms 73.00%
time to 1st byte: 225.79ms 10.39s 3.45s 2.34s 73.44%
req/s : 0.00 39.82 20.87 15.91 56.00%
Worse than m=10 and with quite a few failed requests.
docker run --rm -it --network=host h2load-http3 -m100 -c100 --duration=600 --npn-list h3 https://nginx.io
finished in 600.09s, 60.68 req/s, 46.81KB/s
requests: 37467 total, 44776 started, 37467 done, 35480 succeeded, 1987 failed, 1058 errored, 0 timeout
status codes: 35480 2xx, 0 3xx, 0 4xx, 929 5xx
traffic: 27.43MB (28760970) total, 8.61MB (9026606) headers (space savings 28.01%), 18.52MB (19418020) data
UDP datagram: 72017 sent, 81116 received
min max mean sd +/- sd
time for request: 40.41ms 73.59s 4.89s 13.20s 93.95%
time for connect: 48.85ms 248.64ms 149.19ms 59.13ms 58.00%
time to 1st byte: 261.89ms 4.79s 1.71s 819.79ms 82.54%
req/s : 0.00 64.01 9.31 13.15 93.00%
Way worse than m=50. Lots of ERR_DRAINING and ERR_CALLBACK_FAILURE errors to note here.
docker run --rm -it --network=host h2load-http3 -m100 -c100 --duration=600 --npn-list h3 https://envoy.io
finished in 600.09s, 6284.10 req/s, 748.49KB/s
requests: 3770462 total, 3780462 started, 3770462 done, 245039 succeeded, 3525423 failed, 0 errored, 0 timeout
status codes: 245111 2xx, 0 3xx, 0 4xx, 3525464 5xx
traffic: 438.57MB (459872382) total, 30.05MB (31511292) headers (space savings 93.39%), 375.44MB (393677033) data
UDP datagram: 171388 sent, 483205 received
min max mean sd +/- sd
time for request: 61.39ms 12.32s 1.31s 1.45s 88.24%
time for connect: 78.20ms 5.22s 957.55ms 674.79ms 72.00%
time to 1st byte: 484.15ms 6.46s 1.76s 1.16s 72.00%
req/s : 55.18 70.63 62.84 2.91 70.00%
Similar to m=50, but with a really high amount of failed requests.
docker run --rm -it --network=host h2load-http3 -m100 -c100 --duration=600 --npn-list h3 https://haproxy.io
finished in 600.06s, 2.63 req/s, 1.92KB/s
requests: 1579 total, 11579 started, 1579 done, 1579 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 1579 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 1.13MB (1181092) total, 470.31KB (481595) headers (space savings -5.90%), 673.85KB (690023) data
UDP datagram: 2532 sent, 2589 received
min max mean sd +/- sd
time for request: 72.63ms 10.41s 2.37s 2.63s 85.75%
time for connect: 52.97ms 1.16s 372.71ms 452.19ms 75.00%
time to 1st byte: 1.35s 9.95s 3.62s 2.34s 91.67%
req/s : 0.00 19.01 0.41 2.05 95.00%
finished in 600.05s, 5.26 req/s, 3.88KB/s
requests: 3154 total, 13154 started, 3154 done, 3154 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 3229 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 2.27MB (2382292) total, 961.76KB (984845) headers (space savings -5.90%), 1.31MB (1378298) data
UDP datagram: 2419 sent, 4472 received
min max mean sd +/- sd
time for request: 137.99ms 8.10s 3.13s 1.79s 56.37%
time for connect: 60.54ms 3.07s 441.80ms 890.00ms 90.00%
time to 1st byte: 2.12s 6.01s 4.53s 981.35ms 60.00%
req/s : 0.00 6.27 0.88 1.76 82.00%
We get really low throughput or no results at all.
Increasing the number of max concurrent streams seems to have a negative impact on both Nginx and HAProxy. That was the case on Envoy (from 1 to 50), but then it seemed to stabilize. Given both HAProxy and Nginx started getting really low throughput, I think something is clearly off. I'm going to try more things (namely enabling GSO for Nginx. See more here).
Here's the summary:
Service | Protocol | M | TTFB Mean | Failure Rate | Reqs/S | Concurrent Clients |
---|---|---|---|---|---|---|
Nginx | HTTP/3 | 1 | 326.88ms | 0.00% | 8.53 | 100 |
Nginx | HTTP/3 | 10 | 1.25s | 1.00% | 9.39 | 100 |
Nginx | HTTP/3 | 50 | 2.22s | 3.00% | 9.67 | 100 |
Nginx | HTTP/3 | 100 | 1.71s | 6.00% | 9.31 | 100 |
Envoy | HTTP/3 | 1 | 448.12ms | 0.00% | 11.16 | 100 |
Envoy | HTTP/3 | 10 | 899.76ms | 1.00% | 14.92 | 100 |
Envoy | HTTP/3 | 50 | 1.77s | 92.00% | 57.99 | 100 |
Envoy | HTTP/3 | 100 | 1.76s | 94.00% | 62.84 | 100 |
HAProxy | HTTP/3 | 1 | 689.34ms | 0.00% | 14.24 | 100 |
HAProxy | HTTP/3 | 10 | 948.17ms | 0.00% | 17.22 | 100 |
HAProxy | HTTP/3 | 50 | 3.45s | 8.00% | 20.87 | 100 |
HAProxy | HTTP/3 | 100 | 4.53s | 0.00% | 0.88 | 100 |
The surprisingly high throughput for Envoy is explained by the substantial amount of failed requests (which was consistent over multiple runs). I did not try increasing concurrent clients, as I assumed that would further degrade the results (given more clients * more streams).
It looks like all solutions deal poorly with an increasing amount of max concurrent streams. Nginx seems to be the most reliable in terms of throughput and TTFB in this scenario. Given I'm using different backends (i.e. httpd for nginx and simplehttp2server for both envoy/haproxy), I'll try to assess if there is a backend limitation going on here.
Note that using httpd as the backend was the case for nginx from the start.
docker run --rm -it --network=host h2load-http3 -m1 -c100 --duration=600 --npn-list h3 https://envoy.io
finished in 600.04s, 885.30 req/s, 565.02KB/s
requests: 531182 total, 531282 started, 531182 done, 531175 succeeded, 7 failed, 0 errored, 0 timeout
status codes: 531179 2xx, 0 3xx, 0 4xx, 7 5xx
traffic: 331.07MB (347150046) total, 4.13MB (4329421) headers (space savings 94.26%), 320.66MB (336235073) data
UDP datagram: 1315737 sent, 1825199 received
min max mean sd +/- sd
time for request: 44.29ms 828.50ms 112.87ms 25.00ms 76.08%
time for connect: 104.72ms 512.28ms 401.39ms 81.85ms 56.00%
time to 1st byte: 414.16ms 683.57ms 536.35ms 83.36ms 56.00%
req/s : 8.71 9.12 8.85 0.10 66.00%
docker run --rm -it --network=host h2load-http3 -m50 -c100 --duration=600 --npn-list h3 https://envoy.io
finished in 600.07s, 3736.18 req/s, 655.75KB/s
requests: 2241708 total, 2246708 started, 2241708 done, 326895 succeeded, 1914813 failed, 0 errored, 0 timeout
status codes: 326925 2xx, 0 3xx, 0 4xx, 1914819 5xx
traffic: 384.23MB (402890993) total, 17.24MB (18081472) headers (space savings 93.14%), 346.37MB (363195742) data
UDP datagram: 188617 sent, 469440 received
min max mean sd +/- sd
time for request: 45.12ms 11.28s 1.33s 1.60s 82.66%
time for connect: 74.50ms 2.75s 993.31ms 613.24ms 31.00%
time to 1st byte: 555.01ms 6.67s 2.56s 1.48s 69.00%
req/s : 33.51 42.47 37.36 1.63 70.00%
docker run --rm -it --network=host h2load-http3 -m1 -c100 --duration=600 --npn-list h3 https://haproxy.io
finished in 600.06s, 1267.84 req/s, 984.31KB/s
requests: 760706 total, 760806 started, 760706 done, 760706 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 760706 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 576.75MB (604761270) total, 116.07MB (121712960) headers (space savings -3.90%), 456.32MB (478484074) data
UDP datagram: 786631 sent, 1587185 received
min max mean sd +/- sd
time for request: 40.77ms 513.23ms 58.31ms 14.26ms 84.27%
time for connect: 55.04ms 1.11s 394.40ms 448.80ms 71.00%
time to 1st byte: 175.01ms 1.20s 477.20ms 400.47ms 77.17%
req/s : 0.00 17.66 13.71 6.15 80.00%
docker run --rm -it --network=host h2load-http3 -m50 -c100 --duration=600 --npn-list h3 https://haproxy.io
finished in 600.05s, 67.75 req/s, 52.60KB/s
requests: 40651 total, 45651 started, 40651 done, 40651 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 40651 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 30.82MB (32317545) total, 6.20MB (6504160) headers (space savings -3.90%), 24.38MB (25569479) data
UDP datagram: 7551 sent, 48259 received
min max mean sd +/- sd
time for request: 42.83ms 18.39s 346.95ms 1.64s 97.34%
time for connect: 56.36ms 1.07s 172.90ms 207.51ms 95.00%
time to 1st byte: 224.63ms 18.19s 3.28s 4.15s 90.22%
req/s : 0.00 100.55 7.35 17.67 91.00%
Switched everyone to have httpd (http/1.1) upstream, so the test scenario is absolutely even. No big improvements here, as Nginx still fares best. Envoy fails a lot of requests and this could be config-related (might be worth looking into).
Service | Protocol | M | TTFB Mean | Failure Rate | Reqs/S | Concurrent Clients |
---|---|---|---|---|---|---|
Nginx | HTTP/3 | 1 | 326.88ms | 0.00% | 8.53 | 100 |
Nginx | HTTP/3 | 50 | 2.22s | 3.00% | 9.67 | 100 |
Envoy | HTTP/3 | 1 | 536.35ms | 1.00% | 8.85 | 100 |
Envoy | HTTP/3 | 50 | 2.56s | 86.00% | 37.36 | 100 |
HAProxy | HTTP/3 | 1 | 477.20ms | 0.00% | 13.71 | 100 |
HAProxy | HTTP/3 | 50 | 3.28s | 0.00% | 7.35 | 100 |
Decided to verify if enabling GSO would affect performance values for maximum concurrent streams given the poor results. The idea came from this blog post.
For now, I only tried nginx as it was the best performer (also the most comparable to itself as I only used one backend for it in these benchmarks). See the posts above.
I enabled GSO with ethtool -K eth2 gso on. I also changed the nginx config as shown here.
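For the record, the toggle plus a quick way to verify it took (interface name is machine-specific; the linked nginx config is authoritative for the server-side part):
sudo ethtool -K eth2 gso on
ethtool -k eth2 | grep generic-segmentation-offload   # should report: on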
rmem tweaks
I ran this: sudo sysctl -w net.core.rmem_max=26214400 && sudo sysctl -w net.core.rmem_default=26214400. This way, I could test both changes and see if they provided a better cumulative gain.
docker run --rm -it --network=host h2load-http3 -m1 -c100 --duration=600 --npn-list h3 https://nginx.io
finished in 600.08s, 166.53 req/s, 131.01KB/s
requests: 99921 total, 100000 started, 99921 done, 99921 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 100000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 76.76MB (80491700) total, 24.13MB (25300000) headers (space savings 27.92%), 51.78MB (54300000) data
UDP datagram: 224465 sent, 300059 received
min max mean sd +/- sd
time for request: 64.78ms 3.44s 123.58ms 53.83ms 71.65%
time for connect: 52.47ms 254.05ms 153.00ms 59.17ms 57.00%
time to 1st byte: 254.18ms 417.71ms 331.06ms 45.03ms 53.00%
req/s : 7.87 8.50 8.08 0.11 73.00%
docker run --rm -it --network=host h2load-http3 -m10 -c100 --duration=600 --npn-list h3 https://nginx.io
finished in 600.04s, 166.66 req/s, 130.98KB/s
requests: 99999 total, 100135 started, 99999 done, 99968 succeeded, 31 failed, 3 errored, 0 timeout
status codes: 99972 2xx, 0 3xx, 0 4xx, 28 5xx
traffic: 76.75MB (80474425) total, 24.12MB (25294428) headers (space savings 27.92%), 51.77MB (54289216) data
UDP datagram: 228480 sent, 266165 received
min max mean sd +/- sd
time for request: 43.38ms 61.29s 1.11s 1.31s 97.68%
time for connect: 51.56ms 249.94ms 151.07ms 58.75ms 57.00%
time to 1st byte: 267.12ms 4.37s 1.07s 869.09ms 85.00%
req/s : 8.68 9.57 8.96 0.16 71.00%
docker run --rm -it --network=host h2load-http3 -m50 -c100 --duration=600 --npn-list h3 https://nginx.io
finished in 600.08s, 116.19 req/s, 90.20KB/s
requests: 70143 total, 71930 started, 70143 done, 68529 succeeded, 1614 failed, 432 errored, 0 timeout
status codes: 68529 2xx, 0 3xx, 0 4xx, 1182 5xx
traffic: 52.85MB (55418715) total, 16.60MB (17401665) headers (space savings 27.98%), 35.67MB (37401227) data
UDP datagram: 142350 sent, 160535 received
min max mean sd +/- sd
time for request: 41.36ms 81.46s 3.07s 10.53s 96.20%
time for connect: 55.24ms 254.08ms 153.21ms 59.11ms 58.00%
time to 1st byte: 265.83ms 26.89s 2.17s 2.89s 98.85%
req/s : 0.00 16.94 10.06 5.44 75.00%
docker run --rm -it --network=host h2load-http3 -m100 -c100 --duration=600 --npn-list h3 https://nginx.io
finished in 600.08s, 59.65 req/s, 46.47KB/s
requests: 37177 total, 44882 started, 37177 done, 35352 succeeded, 1825 failed, 1389 errored, 0 timeout
status codes: 35352 2xx, 0 3xx, 0 4xx, 436 5xx
traffic: 27.23MB (28548762) total, 8.55MB (8967600) headers (space savings 27.96%), 18.38MB (19268936) data
UDP datagram: 66069 sent, 75999 received
min max mean sd +/- sd
time for request: 42.62ms 68.27s 3.06s 8.27s 94.13%
time for connect: 53.07ms 252.59ms 151.83ms 59.19ms 58.00%
time to 1st byte: 266.24ms 15.73s 1.80s 1.89s 98.36%
req/s : 0.00 63.25 10.14 12.60 85.00%
GSO without rmem tweaks
Just to make sure tweaking both GSO and rmem produces a better outcome than just one of the tweaks.
docker run --rm -it --network=host h2load-http3 -m100 -c100 --duration=600 --npn-list h3 https://nginx.io
finished in 600.09s, 69.50 req/s, 53.15KB/s
requests: 43365 total, 50839 started, 43365 done, 40140 succeeded, 3225 failed, 1665 errored, 0 timeout
status codes: 40140 2xx, 0 3xx, 0 4xx, 1560 5xx
traffic: 31.15MB (32657987) total, 9.77MB (10239660) headers (space savings 28.06%), 21.03MB (22052790) data
UDP datagram: 81077 sent, 91223 received
min max mean sd +/- sd
time for request: 42.66ms 66.22s 4.03s 11.86s 94.39%
time for connect: 56.00ms 7.07s 1.45s 1.62s 73.00%
time to 1st byte: 154.81ms 8.09s 2.01s 1.40s 85.45%
req/s : 0.00 39.86 8.70 10.17 84.00%
Didn't explore this further as this result seemed to indicate worse performance.
Service | Protocol | M | TTFB Mean | Failure Rate | Reqs/S | Concurrent Clients | Tweaks |
---|---|---|---|---|---|---|---|
Nginx | HTTP/3 | 1 | 331.06ms | 0.00% | 8.08 | 100 | GSO+rmem |
Nginx | HTTP/3 | 1 | 326.88ms | 0.00% | 8.53 | 100 | rmem |
Nginx | HTTP/3 | 10 | 1.07s | 1.00% | 8.96 | 100 | GSO+rmem |
Nginx | HTTP/3 | 10 | 1.25s | 1.00% | 9.39 | 100 | rmem |
Nginx | HTTP/3 | 50 | 2.17s | 3.00% | 10.16 | 100 | GSO+rmem |
Nginx | HTTP/3 | 50 | 2.22s | 3.00% | 9.67 | 100 | rmem |
Nginx | HTTP/3 | 100 | 1.71s | 6.00% | 9.31 | 100 | rmem |
Nginx | HTTP/3 | 100 | 1.80s | 5.00% | 10.14 | 100 | GSO+rmem |
Nginx | HTTP/3 | 100 | 2.01s | 8.00% | 8.70 | 100 | GSO |
Results seem better for GSO+rmem, but the differences aren't enough to exclude sampling error. I find this inconclusive, but if I had to choose I would take both tweaks.
http3 could be a huge, quick bang for our buck to significantly improve ttfb without adding more nodes/PoPs
adding http3 support should be broken into three pieces:
[x] do a quick benchmark of nginx's tcp+tls vs http3. for example, fire up an aws ec2 instance far away and run two web servers there, one standard tcp+tls with nginx and the other with quic/http3. then benchmark how long it takes to establish connections to each of those web servers in a browser that supports http3, like chrome (https://caniuse.com/http3)
http3 should be much faster. but is it? this can also serve as a quick test-bed of the variegated http3 tools, libraries, and software that exist right now. see below
ping @DiegoRBaquero to get a production ssl cert to use
[x] understand the architecture of the l1 and investigate the http3 landscape to determine the best tool, library, or software to add http3 support to the l1. various tools:
- caddy -> nginx -> l1 shim
- litespeed -> nginx -> l1 shim. fwiw according to https://w3techs.com/technologies/segmentation/ce-quic/web_server the majority of quic traffic online is currently served from litespeed. though if cloudflare supports http3, which it supposedly does, these stats dont make sense unless cloudflare uses litespeed. which they dont
- discuss viable avenues for http3 implementation with @gruns and @DiegoRBaquero
note here that, for simplicity, any non-nginx http3 implementation will likely be replaced with nginx's native http3 support once ready, so long as nginx's implementation suffices. so we shouldn't get too crazy adding http3 support wrt the amount of time, effort, or complexity undertaken here
[x] once http3 has shown itself superior and we've determined the implementation battleplan, add http3 support to the l1 node and ship it, baby 🚀