Open naturallymitchell opened 5 years ago
mitchell@archlinux ~> wrk -t4 -c400 -d10s http://localhost:8000
Running 10s test @ http://localhost:8000
4 threads and 400 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 339.37ms 44.35ms 629.22ms 87.93%
Req/Sec 325.75 218.23 0.99k 69.31%
11443 requests in 10.07s, 16.85MB read
Requests/sec: 1136.32
Transfer/sec: 1.67MB
mitchell@archlinux ~> wrk -t4 -c400 -d10s http://localhost:3000
Running 10s test @ http://localhost:3000
4 threads and 400 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 814.10ms 138.35ms 960.36ms 92.74%
Req/Sec 168.05 151.22 585.00 69.44%
4670 requests in 10.04s, 1.71MB read
Requests/sec: 465.23
Transfer/sec: 174.01KB
mitchell@archlinux ~> wrk -t4 -c40 -d10s http://localhost:3000
Running 10s test @ http://localhost:3000
4 threads and 40 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 84.49ms 13.21ms 126.02ms 68.53%
Req/Sec 118.28 31.26 202.00 78.00%
4719 requests in 10.04s, 1.72MB read
Requests/sec: 470.09
Transfer/sec: 175.82KB
mitchell@archlinux ~> wrk -t4 -c40 -d10s http://localhost:8000
Running 10s test @ http://localhost:8000
4 threads and 40 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 37.13ms 7.56ms 78.00ms 76.98%
Req/Sec 269.43 47.42 404.00 70.25%
10741 requests in 10.03s, 16.88MB read
Requests/sec: 1071.05
Transfer/sec: 1.68MB
looks like less than 50% of the performance of bare Rust
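As a rough check on that estimate, here is a minimal Python sketch that divides the Requests/sec figures from the four runs above. It assumes port 3000 is simple-webserver (torchbear) and port 8000 is the bare Rust http-server, which is how the ports are labeled later in this thread; the dictionary keys and the function name are illustrative only.

```python
# Requests/sec figures copied from the four wrk runs above.
# Assumption: port 3000 = simple-webserver (torchbear), port 8000 = bare Rust http-server.
runs = {
    400: {"torchbear": 465.23, "bare_rust": 1136.32},
    40: {"torchbear": 470.09, "bare_rust": 1071.05},
}


def relative_throughput(torchbear: float, bare_rust: float) -> float:
    """Return torchbear throughput as a fraction of the bare Rust throughput."""
    return torchbear / bare_rust


for connections, r in runs.items():
    ratio = relative_throughput(r["torchbear"], r["bare_rust"])
    print(f"{connections} connections: {ratio:.1%} of bare Rust")
```

On these numbers the ratio comes out at roughly 41% with 400 connections and 44% with 40, so "a bit under half" is a fair summary of this one rough measurement.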
What specs are you running it on? I got something completely different when testing them both.
just on my laptop with lots of other things running, so just one rough measure
what tests did you run, and what results did you get?
I asked for the specs (mainly core/thread count) so I can emulate the environment and see whether I get a similar result. Here is what I get when running a benchmark with both of them:
simple-webserver
wrk -t4 -c400 -d10s http://localhost:3000
Running 10s test @ http://localhost:3000
4 threads and 400 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 98.51ms 5.72ms 106.40ms 97.08%
Req/Sec 1.01k 20.24 1.08k 79.00%
40418 requests in 10.04s, 14.76MB read
Requests/sec: 4026.58
Transfer/sec: 1.47MB
http-server
wrk -t4 -c400 -d10s http://localhost:8000
Running 10s test @ http://localhost:8000
4 threads and 400 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 9.62ms 6.91ms 30.06ms 83.44%
Req/Sec 11.58k 303.71 12.80k 69.75%
460935 requests in 10.05s, 753.88MB read
Requests/sec: 45857.64
Transfer/sec: 75.00MB
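To compare runs like these without reading the numbers by eye, the `Requests/sec:` summary line can be pulled out of each wrk report. The following is only a sketch: the function name is made up for illustration, and the two sample strings are trimmed copies of the output above.

```python
import re


def requests_per_sec(wrk_output: str) -> float:
    """Extract the value of the 'Requests/sec:' summary line from wrk output."""
    match = re.search(r"Requests/sec:\s*([\d.]+)", wrk_output)
    if match is None:
        raise ValueError("no Requests/sec line found")
    return float(match.group(1))


# Trimmed copies of the two reports above.
simple_webserver = """40418 requests in 10.04s, 14.76MB read
Requests/sec: 4026.58
Transfer/sec: 1.47MB"""

http_server = """460935 requests in 10.05s, 753.88MB read
Requests/sec: 45857.64
Transfer/sec: 75.00MB"""

slow = requests_per_sec(simple_webserver)
fast = requests_per_sec(http_server)
print(f"simple-webserver reaches {slow / fast:.1%} of http-server throughput")
```

For this pair of runs, simple-webserver lands at a little under 9% of http-server's throughput, i.e. http-server handles roughly 11 times as many requests per second.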
I'm unable to emulate that CPU specifically (my workstation has an AMD processor, and my Intel NUC is running other tasks at the moment), but using the same core count (2, though I also tested with 1) I got the following:
simple-webserver (2 cpu)
./wrk -t4 -c400 -d10s http://localhost:3000
Running 10s test @ http://localhost:3000
4 threads and 400 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 102.12ms 4.66ms 126.10ms 93.50%
Req/Sec 0.98k 74.23 1.13k 90.75%
38978 requests in 10.03s, 11.34MB read
Requests/sec: 3884.37
Transfer/sec: 1.13MB
simple-webserver (1 cpu)
./wrk -t4 -c400 -d10s http://localhost:3000
Running 10s test @ http://localhost:3000
4 threads and 400 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 168.40ms 12.00ms 192.68ms 96.74%
Req/Sec 595.17 294.16 1.01k 60.78%
23570 requests in 10.04s, 8.43MB read
Requests/sec: 2348.73
Transfer/sec: 860.13KB
http-server (2 cpu)
./wrk -t4 -c400 -d10s http://localhost:8000
Running 10s test @ http://localhost:8000
4 threads and 400 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 27.80ms 12.11ms 97.10ms 73.96%
Req/Sec 3.61k 289.44 4.53k 75.50%
143872 requests in 10.05s, 144.20MB read
Requests/sec: 14318.07
Transfer/sec: 14.35MB
http-server (1 cpu)
./wrk -t4 -c400 -d10s http://localhost:8000
Running 10s test @ http://localhost:8000
4 threads and 400 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 49.88ms 2.84ms 109.38ms 94.20%
Req/Sec 2.01k 106.56 2.44k 84.75%
80032 requests in 10.03s, 80.22MB read
Requests/sec: 7979.13
Transfer/sec: 8.00MB
(I left the wrk thread count as is, since changing it to match the core count would have given pretty much the same results.) The processor you tested on is a mobile processor, so YMMV, and real-world performance (as opposed to a straight-shot benchmark) may vary. Regardless, this still shows there is some overhead when it comes to scaling.
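To put rough numbers on the scaling point, here is a small Python sketch using only the Requests/sec values from the 1-CPU and 2-CPU runs above. The dictionary layout and the `speedup` helper are illustrative, not part of either project.

```python
# Requests/sec from the 1-CPU and 2-CPU wrk runs above.
results = {
    "simple-webserver": {1: 2348.73, 2: 3884.37},
    "http-server": {1: 7979.13, 2: 14318.07},
}


def speedup(by_cpu: dict) -> float:
    """Throughput gain when going from 1 CPU to 2 CPUs."""
    return by_cpu[2] / by_cpu[1]


for name, by_cpu in results.items():
    print(f"{name}: {speedup(by_cpu):.2f}x from 1 to 2 CPUs")

# simple-webserver throughput as a fraction of http-server at each core count.
for cpus in (1, 2):
    ratio = results["simple-webserver"][cpus] / results["http-server"][cpus]
    print(f"{cpus} CPU: simple-webserver at {ratio:.0%} of http-server")
```

On these figures http-server gains about 1.79x from the second core while simple-webserver gains about 1.65x, and simple-webserver sits at roughly 27-29% of http-server's throughput at both core counts.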
yep.. this looks like extreme overhead
I think we're gonna need a constellation of builds
would you please try a pared down build with only web server and fs bindings to see how that optimized build performs?
simple-webserver vs http-server-rs
to test torchbear's overhead