Yes, ws does not strictly follow the WHATWG specification. I don't think this is a bug and, to be honest, I find the ws behavior less surprising.
To match the browser behavior, only a single 'message' event per microtask should be emitted. We could do this at the parser level, for example by using a Transform stream instead of a Writable stream, pausing it after each 'data' event and resuming it after the queueMicrotask() callback is called. However, this should be thought through carefully as it might affect performance and have side effects, and I'm not sure it's worth it. I think there is not much code that depends on this behavior.
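As a much-simplified sketch of that general idea (at most one 'message' per microtask), and not the actual ws Receiver or the Transform-based approach mentioned above: a toy Writable parser that treats each chunk as newline-delimited messages, stops its loop after each one, and resumes from a queueMicrotask() callback. Names and framing here are purely illustrative.

import { Writable } from 'node:stream';

class SketchParser extends Writable {
  _write(chunk, encoding, callback) {
    // Pretend each chunk carries several newline-delimited messages.
    const messages = chunk.toString().split('\n').filter(Boolean);

    const emitNext = () => {
      if (messages.length === 0) return callback();
      this.emit('message', messages.shift());
      // Yield so that microtasks queued by the listener run before the
      // next message is emitted.
      queueMicrotask(emitNext);
    };

    emitNext();
  }
}

const parser = new SketchParser();
parser.on('message', (msg) => {
  console.log('message', msg);
  Promise.resolve().then(() => console.log('microtask for', msg));
});
parser.write('one\ntwo\nthree\n');

With the queueMicrotask() yield, the output interleaves ('message one', 'microtask for one', 'message two', ...); emitting all messages synchronously from the loop would instead print all the 'message' lines first and all the microtask lines afterwards.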
@KhafraDev fyi, the WebSocket implementation in https://github.com/nodejs/undici has the same non-spec-compliant behavior.
@OrKoN https://github.com/websockets/ws/pull/2160 should fix this, but I did not run any benchmarks and have not thought about possible side effects yet.
@lpinca thanks!
In Puppeteer, we worked around the problem by scheduling all event handlers triggered by 'ws' using setImmediate. We have not noticed any performance regressions (we don't handle a lot of traffic, though).
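A sketch of that kind of workaround (not Puppeteer's actual code; the URL is just an example): the handler re-schedules its work with setImmediate(), so microtasks queued while handling one message get to run before the next one is processed.

import WebSocket from 'ws';

const ws = new WebSocket('ws://127.0.0.1:8080');

ws.on('message', (data, isBinary) => {
  // Defer the real work to the next event-loop iteration; microtasks
  // scheduled while handling earlier messages run first.
  setImmediate(() => {
    console.log('handling', isBinary ? data : data.toString());
  });
});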
Here are some benchmark results run under WSL 2.
20-byte messages - websockets/ws#8.13.0
$ ./load_test 32 127.0.0.1 8080 0 0
Using message size of 20 bytes
Running benchmark now...
Msg/sec: 59225.000000
Msg/sec: 60525.500000
Msg/sec: 59986.500000
Msg/sec: 59474.250000
Msg/sec: 59477.250000
20-byte messages - websockets/ws#fix/issue-2159
$ ./load_test 32 127.0.0.1 8080 0 0
Using message size of 20 bytes
Running benchmark now...
Msg/sec: 52816.500000
Msg/sec: 54566.250000
Msg/sec: 54582.500000
Msg/sec: 54458.750000
Msg/sec: 54205.500000
1024-byte messages - websockets/ws#8.13.0
$ ./load_test 32 127.0.0.1 8080 0 0 1024
Using message size of 1024 bytes
Running benchmark now...
Msg/sec: 49375.750000
Msg/sec: 49289.250000
Msg/sec: 49235.000000
Msg/sec: 48787.750000
Msg/sec: 49216.750000
1024-byte messages - websockets/ws#fix/issue-2159
$ ./load_test 32 127.0.0.1 8080 0 0 1024
Using message size of 1024 bytes
Running benchmark now...
Msg/sec: 47471.500000
Msg/sec: 46715.250000
Msg/sec: 46860.250000
Msg/sec: 47072.250000
Msg/sec: 47157.500000
load_test is this tool: https://github.com/uNetworking/uWebSockets/blob/master/benchmarks/load_test.c. The server is a simple echo server:
import { WebSocketServer } from 'ws';

const server = new WebSocketServer({ port: 8080 }, function () {
  console.log('Server listening on *:8080');
});

server.on('connection', function (ws) {
  ws.on('message', function (data, isBinary) {
    ws.send(data, { binary: isBinary });
  });
});
The impact for small messages (20 bytes) is noticeable (~10%).
On macOS with 250 clients I see no significant overhead. If you are using Linux on bare metal, may I ask you to run some benchmarks like the ones above (increasing the number of clients up to 1000 might make sense)? Thank you.
Unfortunately, I don't have Linux running on bare metal.
These are the numbers I get on an Intel MacBook (Node 20):
# 8.13.0
./load_test 32 127.0.0.1 8080 0 0 1024
Using message size of 1024 bytes
Running benchmark now...
Msg/sec: 34488.250000
Msg/sec: 34700.750000
Msg/sec: 34885.500000
Msg/sec: 34716.750000
Msg/sec: 31545.000000
Msg/sec: 34272.500000
Msg/sec: 34531.500000
# fix
./load_test 32 127.0.0.1 8080 0 0 1024
Using message size of 1024 bytes
Running benchmark now...
Msg/sec: 36548.500000
Msg/sec: 33161.750000
Msg/sec: 32867.000000
Msg/sec: 32287.500000
Msg/sec: 32432.750000
Msg/sec: 31692.500000
Msg/sec: 32399.750000
Msg/sec: 32129.500000
# 8.13.0
./load_test 32 127.0.0.1 8080 0 0
Using message size of 20 bytes
Running benchmark now...
Msg/sec: 46923.750000
Msg/sec: 42516.000000
Msg/sec: 41810.500000
Msg/sec: 42102.500000
Msg/sec: 42115.000000
Msg/sec: 42091.000000
# fix
./load_test 32 127.0.0.1 8080 0 0
Using message size of 20 bytes
Running benchmark now...
Msg/sec: 38408.500000
Msg/sec: 38996.250000
Msg/sec: 39031.250000
Msg/sec: 39019.750000
Msg/sec: 38887.500000
Msg/sec: 37871.500000
Msg/sec: 38127.000000
It seems to be consistent with my results (~10% impact).
We can make this behavior optional via an option that can be passed to the server and the client, like the maxPayload option, but I'm not very happy with it. Another option is an environment variable like WS_NO_BUFFER_UTIL and WS_NO_UTF_8_VALIDATE, but I'm also not very happy with that.
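For illustration only, an opt-in passed the same way as maxPayload might look like this; specCompliantEvents is a hypothetical name and not an existing ws option.

import WebSocket, { WebSocketServer } from 'ws';

// `specCompliantEvents` is hypothetical; it only shows where such an opt-in
// would sit, next to existing options like `maxPayload`.
const wss = new WebSocketServer({ port: 8080, specCompliantEvents: true });
const ws = new WebSocket('ws://localhost:8080', { specCompliantEvents: true });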
My opinion would be that it's better to match the spec by default and have performance optimisations as an opt-in. Also, I think the change does not have to apply to the server implementation as only the client-side events are spec'ed. I assume that clients handle fewer messages than servers, so the change might have a lower impact on performance there and would match the browsers' implementations. Maybe if the change applies only to clients, there will be no need to make this behaviour configurable?
My opinion would be that it's better to match the spec by default and have performance optimisations as an opt-in.
I'm not sure I agree. The current behavior is not a bug and there are other areas where we diverge from the WHATWG spec. A client in another language with a completely different interface is still a valid client as long as it follows RFC 6455, RFC 7692, etc. We try to be compatible with the WHATWG spec for convenience.
Also, I think the change does not have to apply to the server implementation as only the client-side events are spec'ed.
The same WebSocket class is used by both the server and the client. We can make only WebSocket clients have this behavior but it does not make much sense in my opinion. It would be inconsistent and confusing.
I'm not sure I agree. The current behavior is not a bug and there are other areas where we diverge from the WHATWG spec. A client in another language with a completely different interface is still a valid client as long as it follows RFC 6455, RFC 7692, etc. We try to be compatible with the WHATWG spec for convenience.
That makes sense to me too.
The same WebSocket class is used by both the server and the client. We can make only WebSocket clients have this behavior but it does not make much sense in my opinion. It would be inconsistent and confusing.
Yeah, it might be confusing if both client and server are used (especially within one codebase). So far we have only needed this for the client.
Yes, that is basically my point. In all these years this is the first time this is reported and as written in my first comment I don't think there is much code that depends on this behavior. This is why I would prefer to make it opt-in. On the other hand making it opt-in kind of defeats the compatibility goal as the code would not be portable. I wonder if there is some way to reduce the performance impact.
Surprisingly this
diff --git a/lib/receiver.js b/lib/receiver.js
index c7dcf8a..f4497b4 100644
--- a/lib/receiver.js
+++ b/lib/receiver.js
@@ -165,17 +165,12 @@ class Receiver extends Writable {
       // `WAIT_MICROTASK`.
       this._loop = false;
 
-      const next = () => {
+      // `queueMicrotask()` is a little slower and is not available in
+      // Node.js < 11.
+      Promise.resolve().then(() => {
         this._state = GET_INFO;
         this.startLoop(cb);
-      };
-
-      if (typeof queueMicrotask === 'function') {
-        queueMicrotask(next);
-      } else {
-        // `queueMicrotask()` is not available in Node.js < 11.
-        Promise.resolve().then(next);
-      }
+      });
 
       return;
     }
is a little faster. I'll keep investigating.
Some numbers from a Linux VM:
# main
./load_test 32 127.0.0.1 8080 0 0
Using message size of 20 bytes
Running benchmark now...
Msg/sec: 40121.250000
Msg/sec: 44508.750000
Msg/sec: 43910.750000
Msg/sec: 41611.250000
Msg/sec: 44852.750000
Msg/sec: 37529.750000
Msg/sec: 37837.500000
Msg/sec: 35302.250000
Msg/sec: 35914.750000
Msg/sec: 37568.250000
# https://github.com/websockets/ws/pull/2160
./load_test 32 127.0.0.1 8080 0 0
Using message size of 20 bytes
Running benchmark now...
Msg/sec: 42166.250000
Msg/sec: 43883.000000
Msg/sec: 39499.750000
Msg/sec: 40256.250000
Msg/sec: 44213.000000
Msg/sec: 40567.500000
Msg/sec: 36763.500000
Msg/sec: 36296.000000
Msg/sec: 38419.750000
Msg/sec: 35596.75000
I have quickly tried wrapping all emit calls in setImmediate to schedule them for the next event-loop iteration and it seems to perform better (with a caveat: I am not 100% sure I did it correctly, and I only did it for emit('message')).
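A runnable illustration of that experiment, on a plain EventEmitter rather than the ws internals: deferring each emit('message') with setImmediate() lets microtasks queued by one handler run before the next message is delivered.

import { EventEmitter } from 'node:events';

const emitter = new EventEmitter();

emitter.on('message', (data) => {
  console.log('message', data);
  Promise.resolve().then(() => console.log('microtask for', data));
});

for (const data of ['one', 'two', 'three']) {
  // The synchronous version would be `emitter.emit('message', data)`,
  // which would print all 'message' lines before any microtask line.
  setImmediate(() => emitter.emit('message', data));
}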
Simply deferring the 'message' event to the next iteration of the event loop does not stop the parser, and this makes me a bit uncomfortable as it might have unintended consequences. Anyway, I noticed that the impact shrinks as the number of clients grows. If it is in the order of ~5%, it is acceptable.
Is there an existing issue for this?
Description
I am not sure whether this is a bug or rather a topic for discussion. I have noticed that there is a subtle difference between ws and the WebSockets standard. The standard says that when a message is received, the agent has to queue a new task. This has an important side effect: the microtask queue is processed after that task is over and before the next task is picked from the queue. This behaviour is important in cases where messages need to be processed in a specified order. This can be demonstrated by the following server code:
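As a sketch of a server along those lines (port and payloads are arbitrary, not necessarily the snippet originally posted), it might send a few messages back-to-back so the client is likely to parse them from a single chunk:

import { WebSocketServer } from 'ws';

const wss = new WebSocketServer({ port: 8080 });

wss.on('connection', (ws) => {
  // Several sends in a row; on the client they will often arrive in one
  // TCP chunk and be parsed in a single pass.
  ws.send('one');
  ws.send('two');
  ws.send('three');
});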
And the client code:
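Again as a sketch rather than the original snippet: a client that logs both from the handler and from a chained microtask to expose the ordering. In Node the WebSocket class would come from the ws package; in the browser the global is used.

// In Node: import WebSocket from 'ws';
const socket = new WebSocket('ws://localhost:8080');

socket.addEventListener('message', (event) => {
  console.log('message', event.data);
  // Per the WHATWG spec this microtask should run before the next
  // 'message' event is dispatched; with ws it runs after all of them.
  Promise.resolve().then(() => console.log('microtask for', event.data));
});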
If you run this code in Node and in the browser, you should observe that the order of console messages is different. It's probably not feasible to change the behaviour of 'ws' as it could be a breaking change, but I wonder if there could be some compatibility mode for this? I believe it's possible to work around this using setTimeout or setImmediate, but the downside is that the same code cannot be used across platforms. If I have overlooked an existing issue about this topic, I apologize.
ws version
8.13.0
Node.js Version
v20.3.1
System
System:
  OS: macOS 13.5
  CPU: (10) arm64 Apple M1 Max
  Memory: 696.63 MB / 64.00 GB
  Shell: 5.9 - /bin/zsh
Expected result
The order of console messages is the same in Node and in the browser and matches the behaviour defined by the standard:
Actual result
The order of console messages in Node is different (microtasks are processed after all messages):
Attachments
No response