guzba / mummy

An HTTP and WebSocket server for Nim that returns to the ancient ways of threads.
MIT License

wrk_mummy using 100 threads which is unfair to httpbeast #83

Closed bung87 closed 1 year ago

bung87 commented 1 year ago

I ran wrk tests on my M1 MacBook, and both came in around 7k qps. When I changed wrk_mummy to workerThreads = countProcessors(), which is httpbeast's default setting, it dropped to about 600 qps.

From reading the readme, Mummy has some pros over httpbeast, so I expected it to perform better than httpbeast. But even with 100 threads against httpbeast's default settings, it still has lower performance.

treeform commented 1 year ago

Httpbeast and Mummy use threads differently, so they can't be compared directly.

Httpbeast uses an async dispatch system to essentially simulate async style green threads. Nim's async dispatch system is single-threaded. What Httpbeast does is simply start multiple threads, each with an independent async dispatch system. Httpbeast needs to start only as many threads as there are cores.

On the other hand, Mummy uses a single IO thread and multiple worker threads to handle requests. Limiting Mummy to only a few threads will significantly limit its performance.

Setting Mummy to use only 8 threads, for example, is like setting Httpbeast to only allow 8 concurrent requests.
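
The thread-count point can be illustrated with a small model. This is a minimal sketch in Python, not Mummy or Httpbeast code: `measure_peak`, `handler`, and the 20 ms blocking delay are all hypothetical stand-ins, and a `ThreadPoolExecutor` plays the role of the worker threads. It only shows that when each handler blocks its thread, the number of requests in flight can never exceed the number of workers.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def measure_peak(worker_threads: int, requests: int, block_seconds: float = 0.02) -> int:
    """Return the highest number of handlers that ran at the same time."""
    lock = threading.Lock()
    in_flight = 0
    peak = 0

    def handler() -> None:
        nonlocal in_flight, peak
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        # Hypothetical blocking work (stand-in for a DB call or file read).
        time.sleep(block_seconds)
        with lock:
            in_flight -= 1

    # The pool models the worker threads; it is an assumption of this sketch,
    # not how either server is implemented.
    with ThreadPoolExecutor(max_workers=worker_threads) as pool:
        futures = [pool.submit(handler) for _ in range(requests)]
        for future in futures:
            future.result()  # re-raise any handler error
    return peak
```

Calling `measure_peak(8, 200)` versus `measure_peak(100, 200)` should show the cap moving with the worker count, which is roughly why dropping wrk_mummy to `countProcessors()` threads cuts throughput so sharply when handlers spend their time waiting.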

I actually expect Httpbeast to win in pure IO or synthetic benchmarks. Where Mummy really wins is in hybrid tasks that require calling the OS, such as file reads, db calls, DNS resolution, or CPU work together with IO work. You know, the real world? And the readme even states that Httpbeast, and things based on it like Jester and Prologue, do win by a little bit.
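
One way to read the "hybrid tasks" claim, offered as an assumption rather than something the thread spells out, is that a blocking call made from inside a single-threaded async loop stalls every other request on that loop, while the same call on a worker thread only ties up that one thread. The sketch below models this in Python with `asyncio` and a thread pool. All names (`blocking_work`, `run_on_event_loop`, `run_on_worker_threads`) and the 50 ms delay are hypothetical, and neither function is a benchmark of the real servers.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

BLOCK_SECONDS = 0.05  # hypothetical cost of one blocking OS call


def blocking_work() -> None:
    # Stand-in for a blocking file read, DB call, or DNS lookup.
    time.sleep(BLOCK_SECONDS)


async def _handler() -> None:
    # A blocking call inside a coroutine holds the whole event loop.
    blocking_work()


async def _serve_on_loop(requests: int) -> None:
    await asyncio.gather(*(_handler() for _ in range(requests)))


def run_on_event_loop(requests: int) -> float:
    """Seconds taken when every handler blocks one shared event loop."""
    start = time.perf_counter()
    asyncio.run(_serve_on_loop(requests))
    return time.perf_counter() - start


def run_on_worker_threads(requests: int, worker_threads: int) -> float:
    """Seconds taken when each handler blocks only its own worker thread."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=worker_threads) as pool:
        list(pool.map(lambda _: blocking_work(), range(requests)))
    return time.perf_counter() - start
```

With 8 requests, the event-loop version runs the blocking calls one after another, while 8 worker threads overlap them. Truly non-blocking IO would not show this gap, which fits the expectation that Httpbeast wins pure IO benchmarks.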

bung87 commented 1 year ago

Thanks for the explanation. Increasing httpbeast's number of threads did not seem to affect the result, so I understand now.

guzba commented 1 year ago

Thanks for taking a look at Mummy @bung87 and thanks @treeform for such a quick and great answer.

I would note that 100 is kind of arbitrary. I've run it with 1k threads and that was fine too. 10k threads caused a system exception, if I recall correctly. Either way, it's unlikely a single real-world server will have 1k database connections or whatever other thing is the actual limiting factor on performance, so it's kind of moot.