Instead of allocating memory on multiple occasions during its lifetime, the WebSocket struct uses the heap memory of the WebSocketBuffer that is passed to its constructor. For example, on a CPU with 8 threads at most 8 tasks run in parallel at any given moment, so a pool of 8~12 static buffers synchronized by a Mutex could make a program run faster because memory is allocated only once and then reused by other instances.
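To make the pooling idea concrete, here is a minimal sketch of a Mutex-guarded static pool. The names (BUFFER_POOL, acquire, release) and the plain Vec<u8> buffers are illustrative assumptions only; they are not part of wtx or its SimplePool.

```rust
// Illustrative only: a fixed set of reusable byte buffers shared by all tasks,
// guarded by a std Mutex. Not wtx's actual API.
use std::sync::{Mutex, OnceLock};

static BUFFER_POOL: OnceLock<Mutex<Vec<Vec<u8>>>> = OnceLock::new();

/// Takes a buffer out of the pool, falling back to a fresh allocation if the
/// pool is momentarily empty.
fn acquire() -> Vec<u8> {
    let pool = BUFFER_POOL.get_or_init(|| {
        // One buffer per expected concurrent task, e.g. 8~12 on an 8-thread CPU.
        Mutex::new((0..12).map(|_| Vec::with_capacity(64 * 1024)).collect())
    });
    pool.lock().unwrap().pop().unwrap_or_default()
}

/// Returns a buffer so another task can reuse its capacity later.
fn release(mut buffer: Vec<u8>) {
    buffer.clear();
    if let Some(pool) = BUFFER_POOL.get() {
        pool.lock().unwrap().push(buffer);
    }
}
```

The trade-off is that every acquire and release goes through the lock, so under contention the pool's synchronization cost competes with whatever the allocator would have charged for a fresh buffer.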
Surprisingly (at least for me), allocating a brand new WebSocketBuffer for every instantiation of WebSocket is faster than using static pools, both in the autobahn cases (9.7.6, 12.1.8, etc.) and in the wtx-bench scenarios (https://c410-f3r.github.io/wtx-bench/).
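For contrast, the allocate-per-instantiation approach boils down to something like the following hypothetical handler: each connection gets its own freshly allocated buffer, uses it for the lifetime of the task, and never touches a lock.

```rust
use std::net::TcpStream;

// Hypothetical per-connection handler, shown only to illustrate the
// allocation pattern: every call allocates its own buffer and drops it when
// the connection ends, so no synchronization is needed at all.
fn handle_connection(stream: TcpStream) {
    let mut buffer: Vec<u8> = Vec::with_capacity(64 * 1024);
    // ... the WebSocket handshake and frame processing would use `stream`
    // and `buffer` here ...
    let _ = (stream, &mut buffer);
}
```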
Unless the heap allocator really is faster than the synchronization mechanisms, the problem likely lies in the design of SimplePool.
The intention behind a pool of buffers is to avoid repeated memory allocation and to limit excessive hardware utilization. That is, for example, what happens in the opinionated WebSocket server (https://github.com/c410-f3r/wtx/blob/main/wtx/examples/web-socket-server-raw-tokio-rustls.rs).