c410-f3r / wtx

A collection of different transport implementations and related tools focused primarily on web technologies.
https://c410-f3r.github.io/wtx
Apache License 2.0

[WebSocket] The use of pools slows down benchmarks #223

Open c410-f3r opened 1 month ago

c410-f3r commented 1 month ago

The purpose of a buffer pool is to avoid repeated memory allocation and to limit excessive hardware utilization. That is the approach taken in the opinionated WebSocket server example (https://github.com/c410-f3r/wtx/blob/main/wtx/examples/web-socket-server-raw-tokio-rustls.rs).

Instead of allocating memory on multiple occasions during its lifetime, the WebSocket struct uses the heap memory of the WebSocketBuffer that is passed to its constructor. For example, on a CPU with 8 threads at most 8 tasks run in parallel, so a pool of 8 to 12 static buffers synchronized by a Mutex could make a program run faster, because memory is allocated only once and reused by later instances.
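To make the idea concrete, here is a minimal sketch of a fixed-size pool of Mutex-protected buffers. It is not the actual SimplePool from wtx: `Buffer`, `BufferPool`, `try_get` and `handle_connection` are hypothetical names used only for illustration, and `Buffer` is a plain byte vector standing in for WebSocketBuffer.

```rust
use std::sync::{Mutex, MutexGuard};

/// Hypothetical stand-in for WebSocketBuffer: a reusable byte buffer.
#[derive(Debug, Default)]
pub struct Buffer {
    pub bytes: Vec<u8>,
}

/// Hypothetical fixed-size pool (NOT wtx's SimplePool).
/// Every slot is allocated once, up front, and guarded by its own Mutex.
pub struct BufferPool {
    slots: Vec<Mutex<Buffer>>,
}

impl BufferPool {
    /// Creates `len` buffers, each with `capacity` bytes pre-allocated.
    pub fn new(len: usize, capacity: usize) -> Self {
        let slots = (0..len)
            .map(|_| Mutex::new(Buffer { bytes: Vec::with_capacity(capacity) }))
            .collect();
        Self { slots }
    }

    /// Returns the first slot that is not currently locked,
    /// or `None` when every slot is busy (or poisoned).
    pub fn try_get(&self) -> Option<MutexGuard<'_, Buffer>> {
        self.slots.iter().find_map(|slot| slot.try_lock().ok())
    }
}

/// Copies `payload` into a pooled buffer when one is free and
/// falls back to a fresh allocation otherwise.
pub fn handle_connection(pool: &BufferPool, payload: &[u8]) -> usize {
    match pool.try_get() {
        Some(mut guard) => {
            // `clear` drops the old contents but keeps the allocated capacity.
            guard.bytes.clear();
            guard.bytes.extend_from_slice(payload);
            guard.bytes.len()
        }
        None => {
            let mut fresh = Buffer::default();
            fresh.bytes.extend_from_slice(payload);
            fresh.bytes.len()
        }
    }
}
```

Even in this small version, every acquisition scans the slots and performs at least one atomic lock operation, so the pool is not free of cost compared with a plain allocation.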

Surprisingly (at least to me), allocating a brand new WebSocketBuffer for every WebSocket instance is faster than using static pools, both in the Autobahn cases (9.7.6, 12.1.8, etc.) and in the wtx-bench scenarios (https://c410-f3r.github.io/wtx-bench/).
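A rough way to compare the two strategies in isolation is a small multi-threaded timing loop like the one below. This is only a sketch under stated assumptions: it does not reproduce the Autobahn or wtx-bench setups, the 4096-byte buffer size is arbitrary, `time_threads` and `compare` are hypothetical helpers, and the pooled side is a plain vector of mutexes rather than SimplePool. Results will vary with the allocator, the thread count and the machine.

```rust
use std::hint::black_box;
use std::sync::Mutex;
use std::thread;
use std::time::{Duration, Instant};

/// Assumed buffer size; not taken from wtx.
const BUFFER_LEN: usize = 4096;

/// Runs `work` `iters` times on each of `threads` threads and
/// returns the total wall-clock time, including thread start-up.
fn time_threads<F>(threads: usize, iters: usize, work: F) -> Duration
where
    F: Fn() + Sync,
{
    let start = Instant::now();
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..iters {
                    work();
                }
            });
        }
    });
    start.elapsed()
}

/// Returns (time with a fresh allocation per iteration, time with a pool).
pub fn compare(threads: usize, iters: usize) -> (Duration, Duration) {
    // Strategy 1: allocate a new buffer on every iteration.
    let fresh = time_threads(threads, iters, || {
        let mut buf: Vec<u8> = Vec::with_capacity(BUFFER_LEN);
        buf.push(1);
        black_box(&buf);
    });

    // Strategy 2: one pre-allocated buffer per thread, each behind a Mutex.
    let pool: Vec<Mutex<Vec<u8>>> = (0..threads)
        .map(|_| Mutex::new(Vec::with_capacity(BUFFER_LEN)))
        .collect();
    let pooled = time_threads(threads, iters, || loop {
        // Scan for the first unlocked slot; retry if all are busy.
        if let Some(mut buf) = pool.iter().find_map(|m| m.try_lock().ok()) {
            buf.clear();
            buf.push(1);
            black_box(&*buf);
            break;
        }
    });

    (fresh, pooled)
}
```

Calling `compare(8, 1_000_000)` in a release build and printing both durations gives a first impression of how allocation cost compares with lock contention on a given machine.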

Unless heap allocation is genuinely faster than the synchronization mechanisms involved, the problem most likely lies in the design of SimplePool.