Open KernelPryanic opened 6 years ago
I just published a small WebWire benchmarking tool.
go run test-server.go
go run benchmark.go
The following parameters are available:
bench-dur: benchmark duration in seconds (default: 10)
addr: address of the target test server, like localhost:80 (default: :8081)
clients: number of concurrent clients (default: 10)
req-timeo: default request timeout (default: 10000)
min-req-itv: min interval between each request in milliseconds (default: 250)
max-req-itv: max interval between each request in milliseconds (default: 500)
min-pld-sz: min request payload size in bytes (default: 32)
max-pld-sz: max request payload size in bytes (default: 128)
Here's an example of a 60-second benchmark with 1,000 concurrent connections, each sending requests with a 1 KiB payload at intervals of 10 to 30 milliseconds:
go run benchmark.go -clients 1000 -min-req-itv 10 -max-req-itv 30 -min-pld-sz 1024 -max-pld-sz 1024 -req-timeo 60000 -bench-dur 60
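For illustration, here is a minimal sketch of how the parameters listed above could be declared with Go's standard flag package. This is not the actual benchmark.go source: the config struct, its field names, the parseConfig helper, and the min/max sanity checks are all hypothetical. Only the flag names and defaults are taken from the list.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// config is a hypothetical container for the benchmark parameters.
// Only the flag names and default values come from the list above.
type config struct {
	BenchDur  int    // benchmark duration in seconds
	Addr      string // address of the target test server
	Clients   int    // number of concurrent clients
	ReqTimeo  int    // default request timeout
	MinReqItv int    // min interval between requests in milliseconds
	MaxReqItv int    // max interval between requests in milliseconds
	MinPldSz  int    // min request payload size in bytes
	MaxPldSz  int    // max request payload size in bytes
}

// parseConfig parses args into a config and rejects inverted ranges.
// The range checks are an assumption, not documented tool behavior.
func parseConfig(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("benchmark", flag.ContinueOnError)
	fs.IntVar(&c.BenchDur, "bench-dur", 10, "benchmark duration in seconds")
	fs.StringVar(&c.Addr, "addr", ":8081", "address of the target test server")
	fs.IntVar(&c.Clients, "clients", 10, "number of concurrent clients")
	fs.IntVar(&c.ReqTimeo, "req-timeo", 10000, "default request timeout")
	fs.IntVar(&c.MinReqItv, "min-req-itv", 250, "min request interval in milliseconds")
	fs.IntVar(&c.MaxReqItv, "max-req-itv", 500, "max request interval in milliseconds")
	fs.IntVar(&c.MinPldSz, "min-pld-sz", 32, "min request payload size in bytes")
	fs.IntVar(&c.MaxPldSz, "max-pld-sz", 128, "max request payload size in bytes")
	if err := fs.Parse(args); err != nil {
		return c, err
	}
	if c.MinReqItv > c.MaxReqItv {
		return c, fmt.Errorf("min-req-itv (%d) exceeds max-req-itv (%d)", c.MinReqItv, c.MaxReqItv)
	}
	if c.MinPldSz > c.MaxPldSz {
		return c, fmt.Errorf("min-pld-sz (%d) exceeds max-pld-sz (%d)", c.MinPldSz, c.MaxPldSz)
	}
	return c, nil
}

func main() {
	c, err := parseConfig(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("%+v\n", c)
}
```

Setting min-pld-sz and max-pld-sz to the same value, as in the command above, pins the payload to a fixed size.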
And here are the results of the 60-second benchmark described above:
2018/04/02 21:20:19 Benchmark finished (60s)
Requests performed: 1892900
Requests timed out: 0
Data sent: 1.81 GiB (1938329600 bytes)
Data received: 1.81 GiB (1938329600 bytes)
Avg payload size: 1.00 KiB
Avg req itv: 19.955008ms
Max req itv: 29ms
Min req itv: 10ms
Avg req time: 9.420078ms
Max req time: 832.1403ms
Min req time: 1.0004ms
Req/s: 31548
Bytes/s: 32305493
Throughput: 30.81 MiB/s
System: Intel i7 3930K hexa-core @ 3.8 GHz; 64.0 GB DDR3 RAM @ 1833 MHz
As you can see, I'm currently able to achieve around 31.5k requests per second with an average reply time of about 9 milliseconds at 1k concurrent clients.
The benchmark runs amok on Windows 10 when there are many concurrent connections.
TCP/IP connection establishment seems to be very slow on Windows, which causes huge problems when creating many concurrent connections (> 1000). That many connections trigger an excessive number of syscalls on Windows, so the Go runtime spawns thousands of OS threads for syscall-blocked goroutines, and the machine becomes unresponsive once it reaches 10k threads.
In the above screenshot, trace demonstrates the excessive number of syscalls, the slowly degrading performance, and the ever-growing number of spawned OS threads.
I've also tested the same configuration on macOS High Sierra and got very different results:
The Mac performed just fine with only 27 OS threads. No degrading performance, no syscall spam.
It looks more like a Windows-related problem than a WebWire server/client problem.
I performed a load test using the latest revision and got the following results:
Concurrent Connections | 10,000 |
Request Payload | 1 - 64 KiB |
Requests Performed | 5,919,046 |
Timeout Rate | 0.00% |
Sent | 183.44 GiB |
Received | 183.44 GiB |
Throughput | 313.07 MiB/s |
Requests per Second | 9,865 rps |
Average Latency | 1 millisecond |
Maximum Latency | 4.23 seconds |
System: Intel i7 3930K (12 threads @ 3.8 GHz, reached full load at 72°C); 64 GB DDR3 RAM @ 1833 MHz (around 4.75 GB were used during the benchmark)
Note that both the benchmark and the server ran on this same machine, which distorts the results. The numbers could potentially be higher if the two ran on separate machines.
Are there already any performance benchmarking results available?