Closed: ioquatix closed this issue 5 years ago
For poor accuracy, check its output formats - nothing below millisecond timing is reported at all. For the kind of testing RSB does, that kind of accuracy loss is completely unacceptable, and I need more specific timing than straight-up iterations/second for a lot of what I do.
For bugs, I'll start by citing Phusion Passenger's benchmarking recommendations, which make the same point but without specifics: https://www.phusionpassenger.com/library/config/nginx/optimization/#benchmarking-recommendations
I've also had trouble with its KeepAlive mode. Separately, Charles Nutter and Tom Enebo have reported (pers. comm) AB's KeepAlive is normally HTTP 1.0 only, which means it's underspecified and quirky (the HTTP 1.0 KeepAlive spec isn't good), but you also can't easily switch it to the better-specified, better-supported HTTP 1.1 KeepAlive behavior.
I personally use ab as well as wrk and have never had any general problems with either.
I've heard that Puma has bugs handling Connection: keep-alive. But I'm not sure why, since it's a fairly straightforward header to support, e.g. https://github.com/puma/puma/issues/1565
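As an illustration of why the header reads as straightforward, here is a toy sketch (my own, not Puma's code or its actual bug) of the keep-alive contract: a server reads multiple requests on one socket and only closes when the client says Connection: close or hangs up, while Net::HTTP reuses a single TCP connection inside a start block.

```ruby
require "socket"
require "net/http"

# Toy HTTP/1.1 server: serve multiple requests over one connection.
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]

server_thread = Thread.new do
  client = server.accept
  loop do
    # Read one request's header block (GET requests have no body).
    request = +""
    while (line = client.gets)
      request << line
      break if line == "\r\n"
    end
    break if request.empty?  # client closed the socket
    close = request =~ /^Connection:\s*close/i
    client.write "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n" \
                 "Connection: #{close ? 'close' : 'keep-alive'}\r\n\r\nok"
    break if close
  end
  client.close
end

# Net::HTTP keeps the TCP connection open for both requests in the block.
bodies = Net::HTTP.start("127.0.0.1", port) do |http|
  2.times.map { http.get("/").body }
end
server_thread.join
puts bodies.inspect
```

Both responses arrive over the single accepted connection; the subtleties in real servers tend to come from timeouts, pipelining, and request-body framing rather than from the header itself.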
"nothing below millisecond timing is reported at all"
That's actually a legitimate concern - the only one here - and I'd suggest you just keep to specifics, because I personally find ab a really great tool for testing the trade-off between persistent and non-persistent connections. Yes, it's not perfect, but what tool is?
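To make that trade-off concrete: comparing throughput from two runs, one with keep-alive (ab's -k flag) and one without, gives a back-of-envelope estimate of the per-connection setup cost. The numbers below are hypothetical, purely for illustration.

```ruby
# Hypothetical throughputs from two ab runs against the same endpoint:
#   with keep-alive:    ab -k -n 1600 -c 8 http://...   (illustrative)
#   without keep-alive: ab -n 1600 -c 8 http://...      (illustrative)
rps_keepalive    = 5600.0
rps_no_keepalive = 3500.0

# Mean time per request is the inverse of throughput; the difference
# approximates TCP setup/teardown cost per connection.
ms_per_request = ->(rps) { 1000.0 / rps }
overhead_ms = ms_per_request.(rps_no_keepalive) - ms_per_request.(rps_keepalive)
puts format("approx. connection overhead: %.3f ms/request", overhead_ms)
```

This only estimates the mean overhead, of course; it says nothing about its distribution.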
Looking at the Passenger documentation:
Enable HTTP keep-alive in both the server and in your benchmarking tool. Otherwise you will end up benchmarking how quickly the kernel can set up TCP connections, which is a non-trivial part of the request time.
Yes, that may be true, but Passenger also suffers more here by design: using Nginx as a proxy requires establishing multiple connections, and by the design choices they made, the overhead per connection is pretty big. So logically it makes sense that they'd recommend avoiding this kind of benchmark.
I can't tell you why Puma has trouble with KeepAlive, but I've been at a large table full of Puma (and JRuby and various other) developers trying to track it down, and it seems to be a nontrivial issue. While the final fix may be something easy, finding that "something easy" does not seem to be straightforward.
I'm not suggesting that Phusion is 100% unbiased (nobody is.) But if I've found odd bugs with AB and they've found odd bugs with AB and the JRuby guys have found odd bugs with AB, maybe AB has some bugs?
I am assuming that when you say "I'd suggest you just keep to specifics" and declare the other concerns illegitimate you mean "please remove the idea that ApacheBench is buggy from your README", presumably based on, as you say, the fact that you "personally find ab a really great tool for testing the trade-off between persistent and non-persistent connections." I'm not sure how to respond to that, though leaving this bug here as documentation of somebody disagreeing with me is one possibility.
I'm also planning to add a patch for a better KeepAlive mode to wrk, in order to get a tool I find really great for testing that tradeoff. I'm not wild about how either wrk or AB does it currently, and the wrk code for that looks easy to patch. But that's not really what you're asking about here.
When I saw the comment that ab was buggy with no citation, I wondered what was wrong with it. Yes, it's old. Yes, it's HTTP/1.0. I don't say the other concerns are illegitimate, but that I don't have enough experience and you don't provide enough evidence. What kind of message are you trying to send with such a statement?
I wondered if you'd only tested ab with puma - have the same bugs occurred in some other server? If it's puma that's buggy and not ab, isn't it a bit rough to perpetuate the same story as Phusion with little/no evidence? (They also don't provide any evidence, but thanks for the link.)
I personally use both ab and wrk in my specs and they have caught different issues at different times, both performance and protocol regressions. So, I have respect for those tools having a place :) I'll leave it up to you whether you want to do anything about it, but unless you start fielding lots of questions about why you didn't use ab (which I can hardly imagine is a problem), I'd suggest just removing that statement.
I was just taking a look at the output of ab, since it's part of my standard test suite:
Server Software:
Server Hostname: 127.0.0.1
Server Port: 9294
Document Path: /
Document Length: 0 bytes
Concurrency Level: 8
Time taken for tests: 0.286 seconds
Complete requests: 1600
Failed requests: 0
Total transferred: 107200 bytes
HTML transferred: 0 bytes
Requests per second: 5598.50 [#/sec] (mean)
Time per request: 1.429 [ms] (mean)
Time per request: 0.179 [ms] (mean, across all concurrent requests)
Transfer rate: 366.31 [Kbytes/sec] received
It does seem to include sub-ms precision.
It includes sub-ms precision only for the mean - that is, a single sub-ms measurement summarizing all requests, which doesn't allow for checking other statistics (percentiles, variance, etc.)
It's possible to have it output the timing for more than a single measurement by specifying one of its two output formats (CSV, GNUplot). However, those output formats are both only ms-accurate.
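To illustrate the precision loss being described: here is a sketch that parses a percentile table in the shape ab's CSV output (-e flag) produces. The sample data is invented, but per the claim above, the timings are whole milliseconds, so a service answering in well under a millisecond collapses to 0s and 1s and the percentile spread is lost.

```ruby
require "csv"

# Illustrative sample in the shape of ab's -e percentile CSV output.
# Whole-millisecond timings flatten a fast service's latency profile.
sample = <<~CSV
  Percentage served,Time in ms
  50,0
  90,1
  99,1
  100,2
CSV

rows  = CSV.parse(sample, headers: true)
times = rows.map { |r| r["Time in ms"].to_i }
puts "distinct latency values visible: #{times.uniq.size}"
```

Four percentiles, but only three distinct values - everything interesting about the distribution below 1 ms is invisible.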
It would be possible to run ab once for each request, but then I lose the low overhead which was the reason to not just use RestClient in Ruby.
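The alternative alluded to here can be sketched as a hand-rolled harness: take a monotonic-clock timestamp pair around each request yourself, keeping sub-millisecond resolution per request, then compute whatever percentiles you want. The "request" below is a stub workload purely for illustration; a real harness would make an HTTP call instead.

```ruby
# Time a single request with sub-ms resolution, returning milliseconds.
def time_request
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  (Process.clock_gettime(Process::CLOCK_MONOTONIC) - start) * 1000.0
end

# Stub workload standing in for an HTTP request.
latencies = Array.new(100) { time_request { 1000.times { |i| i * i } } }

sorted = latencies.sort
p50 = sorted[sorted.size / 2]
p99 = sorted[(sorted.size * 99) / 100]
puts format("p50=%.4f ms  p99=%.4f ms", p50, p99)
```

The open question is whether a pure-Ruby loop like this keeps the per-request overhead low enough, which is exactly the trade-off against RestClient mentioned above.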
Citation?