websockets / ws

Simple to use, blazing fast and thoroughly tested WebSocket client and server for Node.js
MIT License
21.32k stars 2.3k forks source link

Revisiting Invalid WebSocket frame: RSV1 must be clear #2109

Closed timschwab closed 1 year ago

timschwab commented 1 year ago

Is there an existing issue for this?

Description

There are a few other issues for this, but it seems that in all of them, the answer has been, "The client sent bad data, so the error is expected." However, after doing some testing with several standard web browsers, unless I am missing something very basic, this is difficult to believe.

My server code is dead simple:

const {WebSocketServer} = require("ws");
const port = 8080;

const server = new WebSocketServer({
    port: port
});

server.on('connection', function(sock) {
    sock.on('message', function(data) {
        console.log('--- message ---');
        console.log(data);
    });

    sock.on("error", function(error) {
        console.log("--- error ---");
        console.log(error);
    });

    console.log("--- connection ---");

    sock.send("hello");
});

console.log(`waiting for connections on port ${port}...`);

I've opened the JS console in Chrome, Safari, and Firefox and typed these three commands with identical results:

let sock = new WebSocket('ws://localhost:8080');
console.log(sock.extensions);
sock.send("hello");

sock.extensions is empty, but the logging results on the server are:

--- connection ---
--- error ---
RangeError: Invalid WebSocket frame: RSV1 must be clear
    etc.

So I just want to clarify that all three of these browsers on Mac have a bad websocket implementation? Or am I missing something fundamental?

ws version

8.11.0

Node.js Version

v17.1.0

System

System: OS: macOS 13.0.1 CPU: (4) x64 Intel(R) Core(TM) i5-8210Y CPU @ 1.60GHz Memory: 50.64 MB / 8.00 GB Shell: 5.8.1 - /bin/zsh

Expected result

A message from a standard browser's WebSocket implementation would function properly

Actual result

Each browsers caused an error in the ws library

Attachments

No response

timschwab commented 1 year ago

Also here is Chrome's network tab if it is useful

image
timschwab commented 1 year ago

Alright, I tried this on Linux and it worked just fine. So it seems the problem is with MacOS, incredibly. I come to this conclusion because 1) all Mac browsers are affected (Chromium, Safari, and Firefox), and 2) I found a little bit of discussion of this in Apple forums.

My computer is pretty up to date, but I see that it is a little bit behind now. I'm going to try updating and see if that resolves the issue.

lpinca commented 1 year ago

I can't reproduce.

$ npx envinfo --system

  System:
    OS: macOS 13.1
    CPU: (16) x64 Intel(R) Xeon(R) W-2140B CPU @ 3.20GHz
    Memory: 21.54 GB / 32.00 GB
    Shell: 5.2.15 - /usr/local/bin/bash
const { WebSocketServer } = require('ws');
const http = require('http');

const data = `<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
  </head>
  <body>
    <script>
      (function () {
        var ws = new WebSocket('ws://localhost:8080');

        ws.onopen = function () {
          ws.send('hello');
        };
      })();
    </script>
  </body>
</html>`;

const server = http.createServer();

const wss = new WebSocketServer({ server });

server.on('request', function (req, res) {
  res.setHeader('Content-Type', 'text/html');
  res.end(data);
});

wss.on('connection', function (ws) {
  ws.on('message', function (buf) {
    console.log(buf.toString());
  });

  ws.send('hello');
});

server.listen(8080, function () {
  console.log('Open http://127.0.0.1:8080 in the brower');
});

Tested on Chrome 108.0.5359.124 and Safari 16.2 (18614.3.7.1.5).

timschwab commented 1 year ago

Very strange, I can still reproduce with your code, after updating to MacOS 13.1 and checking with npx envinfo, and with the exact same Chrome build. So we have the same OS and browser. My node is a bit out of date, so let me update that and see if that somehow helps. If not, I'm not sure what the issue could possibly be.

timschwab commented 1 year ago

Tried with Node v18.12.1, with the same problem still. So our OS, browser, node, and ws versions are all identical, and yet we have different behavior.

timschwab commented 1 year ago

Okay, new discovery. I ran my original code in a remote server, hit it using Chrome, and it all worked properly. So it appears that the problem only happens when I am using localhost. Don't know why my localhost is bad yet.

lpinca commented 1 year ago

I guess it is some software installed on you Mac. Anyway you can use wireshark or something like that to verify if the RSV1 bit is set. The server is started without permessage-deflate so the client should not set that bit. If the server receives a frame with that bit set, the issue on the client.

le4onardo commented 1 year ago

Hello there, I just came across the same issue on localhost Mac Monterrey v12.4 @timschwab Did you find any solution?

timschwab commented 1 year ago

@le4onardo Unfortunately no, I have not. I haven't been able to look into it too deeply - I have just been using a live website for my project, to avoid localhost altogether. Would love to hear about any resolution you may find though, and I might have time in the next few days to look into it myself as well.

lpinca commented 1 year ago

I'm closing this. Discussion con continue in the closed thread if needed.

timschwab commented 1 year ago

Alright, I was on a plane and gave debugging this another shot. I found something pretty interesting.

To see more clearly what is happening, I decided to just inject some code in the ws library. I injected this logic in the startLoop() function of receiver.js, at the beginning of the do:

const buf = this._buffers[0];
let str = "";
for (let i = 0 ; i < buf.length ; i++) {
  str += String.fromCharCode(buf[i]);
}
console.log(buf);
console.log(str);

So if I understand the code correctly, this will print out the raw bytes of every message the websocketserver receives post-upgrade.

Super strangely, this is what I get on the first message sent:

GET / HTTP/1.1
Host: localhost:8080
Connection: Upgrade
Pragma: no-cache
Cache-Control: no-cache
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36
Upgrade: websocket
Origin: http://localhost:8357
Sec-WebSocket-Version: 13
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Sec-WebSocket-Key: iZWLPmgk5K+1lEzgJr8G0Q==
Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits

... and then some binary data.

As is obvious, this isn't the WS message binary - this is the initial HTTP request! This is why it says RSV1 must be clear. The first two bytes aren't what they are supposed to be, they are GE.


After that, I tried something else. I found that the number of bytes of the GET request is 497. So, at the beginning of startLoop(), before it enters the do, I inject this:

this.consume(497);

And amazingly, the initial request actually worked now.


I tried sending two messages back to back, and yet another interesting thing happened in my logs:

On the first message log, I can see the GET request and the first message. On the second message log, I can see the GET request, the first message, AND the second message.

I added a few more messages, and it started indeterminately failing at some point after the first message.


So what it seems like is happening is that somehow, the buffer that the receiver is reading from doesn't flush after being read or something? So it reads from the beginning of the sockets history multiple times.

This sounds like it could be any one of these things:

Do you have any suggestions on how I could continue investigating this further? Thanks so much!

lpinca commented 1 year ago

Do you have any suggestions on how I could continue investigating this further?

Use tcpdump or wireshark and see the actual data that is sent by the client.

timschwab commented 1 year ago

Okay, finally did that. I used tcpdump to capture this browser code:

let sock = new WebSocket("ws://localhost:8080");
sock.onopen = function() {
    sock.send('"test"');
};

So, simply opening the connection and sending a single message. The results of the tcpdump:

sh-3.2# tcpdump -i lo0 port 8080
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on lo0, link-type NULL (BSD loopback), snapshot length 524288 bytes
16:04:03.064671 IP6 localhost.56285 > localhost.http-alt: Flags [S], seq 177896051, win 65535, options [mss 16324,nop,wscale 6,nop,nop,TS val 3213282305 ecr 0,sackOK,eol], length 0
16:04:03.064821 IP6 localhost.http-alt > localhost.56285: Flags [S.], seq 1374648243, ack 177896052, win 65535, options [mss 16324,nop,wscale 6,nop,nop,TS val 1258666383 ecr 3213282305,sackOK,eol], length 0
16:04:03.064839 IP6 localhost.56285 > localhost.http-alt: Flags [.], ack 1, win 6371, options [nop,nop,TS val 3213282305 ecr 1258666383], length 0
16:04:03.064853 IP6 localhost.http-alt > localhost.56285: Flags [.], ack 1, win 6371, options [nop,nop,TS val 1258666383 ecr 3213282305], length 0
16:04:03.070828 IP6 localhost.56285 > localhost.http-alt: Flags [P.], seq 1:498, ack 1, win 6371, options [nop,nop,TS val 3213282312 ecr 1258666383], length 497: HTTP: GET / HTTP/1.1
16:04:03.070867 IP6 localhost.http-alt > localhost.56285: Flags [.], ack 498, win 6364, options [nop,nop,TS val 1258666390 ecr 3213282312], length 0
16:04:03.075263 IP6 localhost.http-alt > localhost.56285: Flags [P.], seq 1:130, ack 498, win 6364, options [nop,nop,TS val 1258666395 ecr 3213282312], length 129: HTTP: HTTP/1.1 101 Switching Protocols
16:04:03.075293 IP6 localhost.56285 > localhost.http-alt: Flags [.], ack 130, win 6369, options [nop,nop,TS val 3213282317 ecr 1258666395], length 0
16:04:03.114177 IP6 localhost.56285 > localhost.http-alt: Flags [P.], seq 498:995, ack 130, win 6369, options [nop,nop,TS val 3213282356 ecr 1258666395], length 497: HTTP: GET / HTTP/1.1
16:04:03.114210 IP6 localhost.http-alt > localhost.56285: Flags [.], ack 995, win 6356, options [nop,nop,TS val 1258666434 ecr 3213282356], length 0
16:04:03.114227 IP6 localhost.56285 > localhost.http-alt: Flags [P.], seq 995:1007, ack 130, win 6369, options [nop,nop,TS val 3213282356 ecr 1258666434], length 12: HTTP
16:04:03.114231 IP6 localhost.http-alt > localhost.56285: Flags [.], ack 1007, win 6356, options [nop,nop,TS val 1258666434 ecr 3213282356], length 0

So, this seems pretty definitive that the client is somehow sending the GET request again with the first WebSocket message.

timschwab commented 1 year ago

Filed a Chromium bug here: https://bugs.chromium.org/p/chromium/issues/detail?id=1421590

Perhaps they will be able to reproduce, and dig into it further.

mattvb91 commented 1 year ago

@timschwab any update on this?

I am currently seeing this exact behaviour on chrome client but also on a node client? so its not related to chromium in my case.

timschwab commented 1 year ago

@mattvb91 Hey, I did eventually find a solution - use a port other than 8080.

I still don't know why that port did not work, while others did. My theory is that Mac Docker was doing something weird with it that caused it to exhibit the behavior I found above, but once I found I could simply switch ports, I stopped investigating.

timschwab commented 1 year ago

Also just to close the loop, my post on the Apple dev forum is here: https://developer.apple.com/forums/thread/726731

mattvb91 commented 1 year ago

thanks @timschwab for the response. In my case it seems to be with the express library. As soon as I went back to native http.createServer and the websocket server its working fine.

lvonk commented 8 months ago

Thanks @timschwab for pointing us in the right direction. Just as a reference for others you are banging their heads on this issue. We had a similar issue and it turned out to be the Parental Control was enabled on the laptop which hijacked port 8080 (https://discussions.apple.com/thread/250500940). Conclusion: don't use 8080 on mac osx...