websockets / ws

Simple to use, blazing fast and thoroughly tested WebSocket client and server for Node.js
MIT License
21.33k stars 2.3k forks source link

Detecting broken connections fails with `RSV1 must be clear` on node v20.11.0 #2193

Closed mStirner closed 5 months ago

mStirner commented 6 months ago

Is there an existing issue for this?

Description

When the "How to detect and close broken connections?" code from the FAQ is used, the client disconnects after ~1:20 (1 Minute, 20 seconds).

Minimal reproducible example: index.js

const WebSocket = require("ws");
const http = require("http");
const express = require("express");
const app = express();

let wss = new WebSocket.Server({
    noServer: true
});

wss.on("connection", (ws) => {

    console.log("Client open");

    let interval = setInterval(() => {
        ws.send(`${Date.now()}`);
    }, 3000);

    ws.once("close", () => {
        console.log("Client close")
        clearInterval(interval);
    });

});

// PROBLEM START HERE
let interval = setInterval(() => {
    wss.clients.forEach((ws) => {

        if (!ws.isAlive) {
            ws.terminate();
            return;
        }

        ws.isAlive = false;
        ws.ping();

    });
}, 5000);

wss.on("close", () => {
    clearInterval(interval);
});
// PROBLEM ENDS HERE

app.get("/", (req, res) => {
    if ((!req.headers["upgrade"] || !req.headers["connection"])) {

        // no websocket handshake
        res.status(421).end();

    } else {

        wss.handleUpgrade(req, req.socket, req.headers, (ws) => {

            ws.isAlive = true;

            ws.on("pong", () => {
                ws.isAlive = true;
            });

            wss.emit("connection", ws, req);

        });

    }
});

const server = http.createServer(app);

server.listen(8123, "0.0.0.0", (err) => {
    console.log(err || "Server listening on http://0.0.0.0:8123");
});

['SIGINT', 'SIGQUIT', 'SIGTERM'].forEach((signal) => {
    process.on(signal, () => {

        wss.clients.forEach((ws) => {
            ws.terminate();
        });

        server.close(() => {
            console.log("HTTP Server closed");
            process.exit(0);
        });

    });
});

package.json

{
  "name": "test-ws-timeout",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1",
    "docker": "docker run --rm -it --network=host -v ./:/app node:20.11.0-bullseye node /app/index.js"
  },
  "author": "",
  "license": "ISC",
  "dependencies": {
    "express": "^4.18.2",
    "ws": "8.10"
  }
}

When the following code is commented, nothing happens and it works like expected:

let interval = setInterval(() => {
    wss.clients.forEach((ws) => {

        if (!ws.isAlive) {
            ws.terminate();
            return;
        }

        ws.isAlive = false;
        ws.ping();

    });
}, 5000);

wss.on("close", () => {
    clearInterval(interval);
});

I only noticed because i installed a new OS and node.js version. On node v16.20.2 nothing happens, and the broken connection detection works.

ws version

8.10

Node.js Version

20.11.0

System

System: OS: Linux 5.4 Ubuntu 18.04.6 LTS (Bionic Beaver) CPU: (8) x64 Intel(R) Core(TM) i5-8365U CPU @ 1.60GHz Memory: 7.18 GB / 15.45 GB Container: Yes Shell: 4.4.20 - /bin/bash

Expected result

The client should work and do not exit with the error. Since node 20.11.0 does not work on my OS, i used docker to test it. But it also happens on Ubuntu Unity 23.10 with node 20.11.0.

Test with the client that this package provides and wscat --connect=127.0.0.1:8123 (v5.1.0)

Actual result

No response

Attachments

test-ws-timeout.zip

mStirner commented 6 months ago

Same happens with module version v8.10 on node 20.11.0

When multiple connections are made to the server, they do not drop all at the same time, but more after they are done. E.g:

lpinca commented 5 months ago

The error you are seeing happens when a client sends an invalid frame. See https://github.com/websockets/ws/issues/2169. Also this snippet

app.get('/', (req, res) => {
  if (!req.headers['upgrade'] || !req.headers['connection']) {
    // no websocket handshake
    res.status(421).end();
  } else {
    wss.handleUpgrade(req, req.socket, req.headers, (ws) => {
      ws.isAlive = true;

      ws.on('pong', () => {
        ws.isAlive = true;
      });

      wss.emit('connection', ws, req);
    });
  }
});

is wrong. app.get() does not handle the 'upgrade' event of the server. It handles the 'request' event. This means that you are calling wss.handleUpgrade() with the socket of a normal (non-upgrade) request which is probably why you see that error.

lpinca commented 5 months ago

On second look, it seems that the else branch is never taken. I think wss.handleUpgrade() is never called with the above code.

mStirner commented 5 months ago

Thanks for the response. But what i do not understand, why this just happens after a update from node 16 to 20. In v16 everything works fine and like expected. Why is this not the case after a update to v20?

he error you are seeing happens when a client sends an invalid frame

This happens the other way around. Seems like the server sends a invalid frame. Without this snippet, it works:

// PROBLEM START HERE
let interval = setInterval(() => {
    wss.clients.forEach((ws) => {

        if (!ws.isAlive) {
            ws.terminate();
            return;
        }

        ws.isAlive = false;
        ws.ping();

    });
}, 5000);

wss.on("close", () => {
    clearInterval(interval);
});
// PROBLEM ENDS HERE

For the upgrade handling part, i never had any issues doing it this way. req.socket socket is the underlying net.Socket used for the http request, and nothing else is send over the tcp socket but the websocket. And it worked like a charm in v16.

I used the approach with express because i have multiple websocket endpoints and like the url/http request handling that express provides.

Regardless how the upgrade is done, why does it (not) work when the broken connection detection part is (un-)commented? The error only happens when ping events are send and node =<16. At nearly exact after ~80 seconds.

Makes no sense to me. I dont want to bother you, just try to understand it better.

mStirner commented 5 months ago

On second look, it seems that the else branch is never taken. I think wss.handleUpgrade() is never called with the above code.

What do you mean? The websocket handshake is done, and wscat does output the periodically send timestamp.

You can run it on your own. Its a minimal reproducible example.

mStirner commented 5 months ago

You are right with the part how i handle the websocket upgrade. When i listen for the upgrade event, it works too in newer node versions.

What i find interesting is, that, when i add to my existing code (found here: https://github.com/expressjs/express/issues/2594#issuecomment-228970643):

server.on("upgrade", (req, socket, head) => {

    let res = new http.ServerResponse(req);
    res.assignSocket(socket)

    res.on("finish", () => {
        res.socket.destroy();
    });

    app(req, res);

});

Without changing anything else, the problem is resolved and it works. What fascinates me, is the fact, that i use in the express route handler still:

        wss.handleUpgrade(req, req.socket, req.headers, (ws) => {

            ws.isAlive = true;

            ws.on("pong", () => {
                ws.isAlive = true;
            });

            wss.emit("connection", ws, req);

        });

What/Why is req.socket something different if handled by "upgrade" or "request" event. They refer to the same underlying tcp/net.Socket, even in the docs they say that. And as said in earlier comments, it worked till node version 18/20.

lpinca commented 5 months ago

What/Why is req.socket something different if handled by "upgrade" or "request" event.

Because the socket is freed when the 'upgrade' event is emitted. The 'request' handler keeps all the HTTP machinery. In older versions of Node.js the 'request' event was not emitted for upgrade requests. It seems that now it is emitted when there is no 'upgrade' listener.

mStirner commented 5 months ago

Do you see any downsites how i handle the upgrade/routing stuff this way?

lpinca commented 5 months ago

Keep all the WebSocket stuff in the 'upgrade' handler. Once you add it the express route handler will no longer be used for upgrade requests.

lpinca commented 5 months ago

The error you are seeing is probably caused by some spurious HTTP stuff written to the raw socket used by the WebSocket.

mStirner commented 5 months ago

Makes somehow sense, but why "exactly" after ~1:20 and only when the ping events from How to detect and close broken connections? are used?

lpinca commented 5 months ago

Maybe your issue started after this change https://github.com/nodejs/node/pull/43522.

lpinca commented 5 months ago

Makes somehow sense, but why "exactly" after ~1:20 and only when the ping events from ...

I don't know, probably because there are other open sockets. ws.terminate() destroys the socket, but there might be other sockets open for non upgrade requests.

mStirner commented 5 months ago

Thank you for helping. I close this issue. its resolved for me and not a problem of the ws package :)