Look into better ways for measuring performance.

bhansconnect commented 4 years ago

wrk is an OK tool for measuring throughput, but it can be misleading. The problem with looking only at throughput is that you ignore latency: one server may have higher throughput than another, but at much worse latency.

I think that instead of measuring raw throughput, you get much nicer and more meaningful results when you try to figure out the max requests per second a framework can handle while also staying below a specific latency. For example, max req/s with a 99.9th percentile latency of at most 300 ms. This ensures that all the frameworks are competing against the same bar and gives more meaning to the results.
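To make the latency bar concrete, here is a minimal sketch of the check, assuming you already have per-request latencies in milliseconds from some load generator. The function names, the nearest-rank percentile method, and the sample values are my own illustration, not something any of the tools mentioned here expose.

```python
import math


def percentile(latencies_ms, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # 1-based rank; multiply before dividing to limit float rounding error.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_target(latencies_ms, pct=99.9, limit_ms=300.0):
    """True if the pct-th percentile latency is at or below limit_ms."""
    return percentile(latencies_ms, pct) <= limit_ms


# Made-up samples: 995 fast requests and 5 slow ones.
samples = [10.0] * 995 + [400.0] * 5
print(meets_target(samples, pct=99.0))   # the 99th percentile hides the slow tail
print(meets_target(samples, pct=99.9))   # the 99.9th percentile does not
```

With these made-up samples the run passes a 99th percentile check but fails the 99.9th percentile one, which is why the percentile you pick matters as much as the limit.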

If you want more details on why you should measure latency with throughput, here are a few good articles: Your Load Generator Is Probably Lying To You and Everything You Know About Latency Is Wrong. Also, this is an amazing talk on the issue: How NOT to Measure Latency.

I would suggest that this repo look into defining a latency goal and then measuring the max throughput a framework can reach while meeting that goal. To go along with that, I would suggest that the repo use either wrk2 or vegeta for measuring. Vegeta tends to get better and more accurate results, but it is more resource intensive, so it is more likely to cause the client computer to run out of resources when benchmarking. I also wrote a wrapper for vegeta, called vegeta-break, that simply searches for the max throughput that is possible for a specific latency target.
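The search idea can be sketched roughly like this. This is not the actual vegeta-break code, just a plain binary search under the assumption that latency never gets better as the request rate goes up. `measure` is a hypothetical callback that would run a fixed-rate load test (for example with wrk2 or vegeta) and return the observed 99.9th percentile latency in ms; `fake_measure` and its numbers are stand-ins so the example runs on its own.

```python
from typing import Callable


def max_rate_under_target(
    measure: Callable[[int], float],
    limit_ms: float,
    lo: int = 1,
    hi: int = 100_000,
) -> int:
    """Return the largest rate in [lo, hi] whose measured latency is
    at or below limit_ms, or 0 if no rate in the range qualifies.

    Assumes measure(rate) is non-decreasing in rate."""
    best = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        if measure(mid) <= limit_ms:
            best = mid        # mid meets the target, try a higher rate
            lo = mid + 1
        else:
            hi = mid - 1      # mid is too slow, try a lower rate
    return best


def fake_measure(rate: int) -> float:
    """Stand-in for a real load test: flat latency until the server
    saturates at 4200 req/s, then a sharp jump. Numbers are made up."""
    return 20.0 if rate <= 4200 else 1500.0


print(max_rate_under_target(fake_measure, limit_ms=300.0))
```

Real measurements are noisy, so in practice each step would probably need a long enough run (or a few repeats) before trusting the comparison against the limit.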

waghanza commented 4 years ago

Hi @bhansconnect,

It's quite easy to change the sieger.

However, the idea is to gather some feedback about tools that we can use.

Could you please summarize some tools and the arguments for each?

I'll do my best to test all the siegers mentioned in this thread.

In my opinion, the req/s column SHOULD correspond to the maximum (successful) req/s I could get per framework.

bhansconnect commented 4 years ago

If the only goal is to discover the maximum successful req/s (which, to be fair, is a nice and simple metric), then a number of the tools are not useful because they focus on measuring latency at a specific req/s. Namely, vegeta and wrk2 are focused more on latency and on measuring the application at a specific req/s. Personally, I think excluding latency is a mistake, but I understand if it is not within the scope of this repo.

Another set of tools is not really that useful because they focus on more complex sites and interactions: Gatling, Drill, and JMeter. Though these tools can be used to test a single endpoint, they are complete overkill for that, and generally slower than the tools made to test a single endpoint.

Since you only care about max req/s, it is extremely important that you set up your sieger in a way that gets the best out of each framework. For example, test at a few different concurrency levels, because different frameworks will perform differently depending on concurrency (especially sync vs. async frameworks). It would also be good to test the frameworks with various numbers of threads, but that is fine-tuning of the frameworks.
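A rough sketch of what I mean by sweeping concurrency levels, assuming a hypothetical `run(framework, concurrency)` callback that runs the sieger once and returns the successful req/s. The framework names and numbers below are placeholders, not real results.

```python
def best_throughput(run, frameworks, concurrency_levels):
    """For each framework, try every concurrency level and keep the
    level that produced the highest req/s.

    Returns {framework: (best_concurrency, best_req_per_s)}."""
    results = {}
    for fw in frameworks:
        per_level = {c: run(fw, c) for c in concurrency_levels}
        best_c = max(per_level, key=per_level.get)
        results[fw] = (best_c, per_level[best_c])
    return results


# Placeholder numbers only, standing in for real sieger runs.
FAKE_RESULTS = {
    ("sync_fw", 16): 900.0, ("sync_fw", 64): 1200.0, ("sync_fw", 256): 700.0,
    ("async_fw", 16): 800.0, ("async_fw", 64): 2000.0, ("async_fw", 256): 3500.0,
}


def fake_run(framework, concurrency):
    """Hypothetical stand-in for invoking the sieger against a framework."""
    return FAKE_RESULTS[(framework, concurrency)]


best = best_throughput(fake_run, ["sync_fw", "async_fw"], [16, 64, 256])
for fw, (conc, rps) in best.items():
    print(f"{fw}: best at concurrency {conc} with {rps} req/s")
```

Reporting the best level per framework, rather than one fixed level for all of them, avoids penalizing a framework just because the chosen concurrency does not suit its model.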

As for tools, most of them copy the interface of Apache Bench. With that in mind, I would run the exact same test with all of the siegers that are similar to Apache Bench. Spin up the best framework that you know of on one machine, and on another machine use each tool to see how many req/s it gets. Then simply pick the sieger that records the highest req/s. I would advise running your tests on multicore systems so that threading is exercised properly.

There are a ton of siegers that are extremely similar to apache bench. Here are some of the more popular/performant ones to my knowledge:

the-benchmarker / web-frameworks

Look into better ways for measuring performance. #2182