ry closed this issue 1 year ago
Hmm, this would require some investigation on our side. As far as I can remember, it was the fastest library when we initially released it, but things might have changed in the meantime. Or maybe we are as performant as the competitors (or close to them), and something fishy is going on.
I have not done any tests with the benchmarking tool in question, but here are some wild guesses:
- Why the single-threaded runtime for `tokio-tungstenite`? What if we compare it with a regular Tokio runtime? What is the difference between `uWebSockets` and `tokio-tungstenite` then?
- Does `uWebSockets` use permessage-deflate for the messages? `tokio-tungstenite` does not support it yet; there is an open issue for that.
- `uWebSockets` may have more sophisticated buffer management that beats `tokio-tungstenite` in this particular case.

Dumb question: did you compare the release builds of both executables?
UPD: One difference I can see in the code: the echo server of `uWebSockets` uses some compression options, while `tokio-tungstenite` does not support compression yet.
Also, I noticed this comment in your code:
// Sent 115255 messages in 1 sec, throughput: 576275 bytes/sec
On the chart, however, the throughput is much lower than that. Is the typo in the comment or in the chart?
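For what it's worth, the two numbers in that comment are at least consistent with each other: 576275 / 115255 = 5, so they imply 5 bytes per message. Below is a minimal sketch of that sanity check; the function name `avg_payload_bytes` is made up for illustration and is not part of any of the libraries or tools discussed here.

```rust
/// Average payload size implied by a throughput figure.
/// Hypothetical helper for illustration only.
/// Returns `None` when no messages were sent, to avoid dividing by zero.
fn avg_payload_bytes(throughput_bytes_per_sec: u64, messages_per_sec: u64) -> Option<f64> {
    if messages_per_sec == 0 {
        return None;
    }
    Some(throughput_bytes_per_sec as f64 / messages_per_sec as f64)
}

fn main() {
    // Figures copied from the code comment quoted above.
    let avg = avg_payload_bytes(576_275, 115_255).expect("at least one message was sent");
    println!("average payload: {avg} bytes per message");
}
```

Running the same check on the figures read off the chart would show whether the chart and the comment were produced with the same message size.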
Yes, both are release builds. I think that comment is out of date.
We're comparing against single-threaded Tokio because in Deno, where we use this, we use the single-threaded runtime. In general, for an apples-to-apples comparison with uWebSockets, it is appropriate to use just a single thread.
We'll investigate the compression options. Good find. Thank you.
It seems the Deno folks decided to write a new performance-focused WebSocket library, so I think this issue can be closed.
However, this question has been asked again recently, so I've just added some notes on performance to the README, with a link to the relevant comment that summarises the improvements `tungstenite` must implement in order to get closer to `fastwebsockets` in terms of performance.
We are comparing echo server throughput against uWebSockets, using the `current_thread` runtime and 256 connections, and we are seeing a notable 30% throughput difference.
Here's the source code we're using:
We're comparing it to https://github.com/uNetworking/uWebSockets/blob/45e9ca2372a3758ece3384ac52797da3a3c8fa48/examples/EchoServer.cpp
And we're using this tool to benchmark it: https://github.com/uNetworking/uWebSockets/blob/45e9ca2372a3758ece3384ac52797da3a3c8fa48/benchmarks/load_test.c
Any ideas what is causing this suboptimal performance?