[WebSocket] The use of pools slows down benchmarks

The intention behind a pool of buffers is to avoid repeated memory allocations and limit excessive hardware utilization. That is, for example, what the opinionated WebSocket server does (https://github.com/c410-f3r/wtx/blob/main/wtx/examples/web-socket-server-raw-tokio-rustls.rs).

Instead of allocating memory on multiple occasions during its lifetime, the WebSocket struct uses the heap memory of the WebSocketBuffer that is passed to its constructor. For example, on a CPU with 8 threads at most 8 tasks execute at the same time. As such, a pool of 8~12 static buffers synchronized by a Mutex could make a program run faster, because memory is allocated only once and then reused by other instances.
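To make the idea concrete, here is a minimal sketch of a fixed pool of Mutex-guarded buffers using only the standard library. It is not the actual SimplePool from wtx: `FixedPool`, `try_get`, and `reuse_demo` are hypothetical names, and a plain `Vec<u8>` stands in for WebSocketBuffer.

```rust
use std::sync::{Mutex, MutexGuard};

/// Hypothetical fixed-size pool. Each slot owns one heap buffer that is
/// allocated once, when the pool is created.
pub struct FixedPool {
    slots: Vec<Mutex<Vec<u8>>>,
}

impl FixedPool {
    /// Creates `len` slots, each with at least `capacity` bytes reserved.
    pub fn new(len: usize, capacity: usize) -> Self {
        let slots = (0..len)
            .map(|_| Mutex::new(Vec::with_capacity(capacity)))
            .collect();
        Self { slots }
    }

    /// Returns a guard for the first slot that is not currently locked,
    /// or `None` when every slot is busy (or poisoned).
    pub fn try_get(&self) -> Option<MutexGuard<'_, Vec<u8>>> {
        self.slots.iter().find_map(|slot| slot.try_lock().ok())
    }
}

/// Borrows a buffer twice in a row and returns the capacity seen each time.
pub fn reuse_demo() -> (usize, usize) {
    let pool = FixedPool::new(2, 1024);
    let first = {
        let mut buf = pool.try_get().expect("a free slot");
        buf.clear(); // `clear` drops the contents but keeps the allocation
        buf.extend_from_slice(b"hello");
        buf.capacity()
    }; // the guard is dropped here, which releases the slot
    let second = {
        let mut buf = pool.try_get().expect("a free slot");
        buf.clear();
        buf.extend_from_slice(b"world!");
        buf.capacity()
    };
    (first, second)
}
```

Every borrow in this sketch pays for a scan over the slots plus one lock attempt per visited slot, which is the synchronization cost that has to be weighed against a fresh allocation.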

Surprisingly (at least for me), allocating a brand new WebSocketBuffer in every instantiation of WebSocket is faster than using static pools, both in Autobahn test cases (9.7.6, 12.1.8, etc.) and in wtx-bench scenarios (https://c410-f3r.github.io/wtx-bench/).
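A rough way to isolate the two costs outside of Autobahn or wtx-bench is a micro-benchmark along the following lines. This is only a sketch under stated assumptions: `time_loop` and `compare` are hypothetical helpers, a `Vec<u8>` stands in for WebSocketBuffer, and the loop is single-threaded, so it measures an uncontended lock and does not reproduce the contention of a real server.

```rust
use std::hint::black_box;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Runs `work` `iterations` times and returns the total elapsed time.
fn time_loop(iterations: u32, mut work: impl FnMut()) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        work();
    }
    start.elapsed()
}

/// Returns `(fresh, pooled)`: the time spent allocating a new buffer on
/// every iteration versus locking and reusing a single shared buffer.
pub fn compare(iterations: u32, capacity: usize) -> (Duration, Duration) {
    let fresh = time_loop(iterations, || {
        let mut buf: Vec<u8> = Vec::with_capacity(capacity);
        buf.push(1);
        black_box(&buf); // discourage the optimizer from removing the work
    });

    let shared = Mutex::new(Vec::<u8>::with_capacity(capacity));
    let pooled = time_loop(iterations, || {
        let mut buf = shared.lock().unwrap();
        buf.clear();
        buf.push(1);
        black_box(&*buf);
    });

    (fresh, pooled)
}
```

Calling `compare(1_000_000, 4096)` in a release build and printing both durations gives a first impression of the two costs. The numbers depend on the allocator, the platform, and the buffer size, so they should be treated as indicative only.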

Unless a heap allocation is actually cheaper than going through the synchronization mechanism, the problem likely lies in the design of SimplePool.

[WebSocket] The use of pools slows down benchmarks #223