socketio / socket.io

Realtime application framework (Node.JS server)
https://socket.io
MIT License

socket.io-client 4.8.0 automatic reconnect is not working #5197

Open sruetzler opened 2 months ago

sruetzler commented 2 months ago

Describe the bug: I have a client that connects to a server. If the server stops and restarts, the connect event is no longer emitted.

In version 4.7.5, the connect event was emitted once the server had restarted and reopened the WebSocket.

Socket.IO client version: 4.8.0

Expected behavior: The connect event should be emitted automatically once the server restarts and reopens the WebSocket.

Platform: Node.js 16 on Ubuntu 20.04, and also Chromium 128.0.6613.119

Additional context: On 4.7.5, I get this connect_error event multiple times until the client reconnects:

TransportError: xhr poll error      
    at Polling.onError (/home/sruetzler/data/workspace/gitlab/target/installer/rauc-api-client/node_modules/engine.io-client/build/cjs/transport.js:47:37)                                                         
    at Request.<anonymous> (/home/sruetzler/data/workspace/gitlab/target/installer/rauc-api-client/node_modules/engine.io-client/build/cjs/transports/polling.js:238:18)                                           
    at Request.Emitter.emit (/home/sruetzler/data/workspace/gitlab/target/installer/rauc-api-client/node_modules/@socket.io/component-emitter/lib/cjs/index.js:143:20)
    at Request.onError (/home/sruetzler/data/workspace/gitlab/target/installer/rauc-api-client/node_modules/engine.io-client/build/cjs/transports/polling.js:343:14)
    at Timeout._onTimeout (/home/sruetzler/data/workspace/gitlab/target/installer/rauc-api-client/node_modules/engine.io-client/build/cjs/transports/polling.js:316:30)
    at listOnTimeout (node:internal/timers:557:17)
    at processTimers (node:internal/timers:500:7) {
  description: 0,
  context: XMLHttpRequest {
    UNSENT: 0,
    OPENED: 1,
    HEADERS_RECEIVED: 2,
    LOADING: 3,
    DONE: 4,
    readyState: 4,
    onreadystatechange: [Function (anonymous)],
    responseText: 'Error: connect ECONNREFUSED 192.168.70.132:443\n' +
      '    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1161:16)',
    responseXML: '',
    status: 0,
    statusText: Error: connect ECONNREFUSED 192.168.70.132:443
        at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1161:16) {
      errno: -111,
      code: 'ECONNREFUSED',
      syscall: 'connect',
      address: '192.168.70.132',
      port: 443
    },
    open: [Function (anonymous)],
    setDisableHeaderCheck: [Function (anonymous)],
    setRequestHeader: [Function (anonymous)],
    getResponseHeader: [Function (anonymous)],
    getAllResponseHeaders: [Function (anonymous)],
    getRequestHeader: [Function (anonymous)],
    send: [Function (anonymous)],
    handleError: [Function (anonymous)],
    abort: [Function (anonymous)],
    addEventListener: [Function (anonymous)],
    removeEventListener: [Function (anonymous)],
    dispatchEvent: [Function (anonymous)]
  },
  type: 'TransportError'
}

On 4.8.0, I get this error once, and after that I get the following endlessly:

No transports available

FredrikAugust commented 1 month ago

We're also seeing that it doesn't automatically reconnect when the transport is broken using websockets.

isarikaya commented 1 month ago

Is it possible for you to create a minimal repo that simulates the issue? @sruetzler

jsilvawbc commented 1 month ago

Hi. Just commenting here so I get further notifications as this potentially affects our product.

In the meantime, we have pinned the socket.io-related dependencies to ~4.7.5.

Because of the new npm audit reports, we also added an override that sets cookie to ^0.7.2.

Thanks.

sruetzler commented 1 month ago

I tried to reproduce this in a short example, but so far I have not managed to. At this point I don't know what is different in my code that makes it fail. Perhaps someone else can help. @jsilvawbc or @FredrikAugust, do you know what is different, or do you have a simple code example that reproduces the problem?

FredrikAugust commented 1 month ago

@sruetzler

We initialise the io client like this:

        // Options passed when creating the client; the `auth` value is redacted.
        const options = {
            auth: xxx,
            transports: ['websocket', 'polling'],
            withCredentials: true,
            reconnectionDelay: 100,
            reconnectionDelayMax: 1000,
            rememberUpgrade: true,
            closeOnBeforeunload: true
        };

We then observe that if you, for example, kill the backend server, the client simply does not attempt to reconnect, even though the docs say it should.

AndersRobstad commented 1 month ago

@sruetzler

We initialise the io client like this:

      // Options passed when creating the client; the `auth` value is redacted.
      const options = {
          auth: xxx,
          transports: ['websocket', 'polling'],
          withCredentials: true,
          reconnectionDelay: 100,
          reconnectionDelayMax: 1000,
          rememberUpgrade: true,
          closeOnBeforeunload: true
      };

We then observe that if you, for example, kill the backend server, the client simply does not attempt to reconnect, even though the docs say it should.

this.client.on('connect_error', (error) => {
    // `active` is true when the failure is transient and the client is expected to retry on its own.
    // When it is false, the connection was denied and we have to reconnect manually.
    if (TypeUtils.isFalse(this.client?.active)) {
        this.logger.warning(`[SocketClient] Connection error occurred: ${error.message}`, {
            error
        });
        this.client.io.connect();
        return;
    }

    this.logger.error(`[SocketClient] Transient connection error occurred: ${error.message}`, {
        error
    });
});

We concluded that the socket should reconnect automatically based on this listener for the connect_error event. Even though it logged the error as transient, a reconnect attempt was never initiated.

None of the listeners below were ever triggered when the backend was killed and the transient connection error above was logged:

client.io.on('reconnect_attempt', () => {
    this.logger.info('[SocketClient] Initiating attempt to reconnect to socket...');
});

client.io.on('reconnect', (attempt) => {
    this.logger.info(`[SocketClient] Reconnected after ${attempt} attempts`);
});

client.io.on('reconnect_failed', () => {
    this.logger.error('[SocketClient] Reconnection failed after allotted attempts');
});

client.io.on('reconnect_error', (error) => {
    this.logger.error(`[SocketClient] Reconnection error: ${error.message}`, error);
});
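
When the reconnection never starts, the listeners above simply stay silent, which makes the failure easy to miss. A small watchdog can turn that silence into an explicit signal. The following is only a sketch under stated assumptions: `watchReconnect`, `graceMs`, `onStalled` and `timers` are hypothetical names, and `socket` is assumed to expose `on()`, the `active` flag, and a Manager at `socket.io`, as the snippets in this thread do. It reports the stall; it does not fix it.

```javascript
// Sketch only: `watchReconnect`, `graceMs`, `onStalled` and `timers` are hypothetical names.
// `socket` is expected to look like a socket.io-client Socket: it has `on()`,
// an `active` flag, and a Manager at `socket.io` that also has `on()`.
function watchReconnect(socket, { graceMs = 10000, onStalled = () => {}, timers = globalThis } = {}) {
  let timer = null;

  // Cancel the pending deadline, if any.
  const disarm = () => {
    if (timer !== null) {
      timers.clearTimeout(timer);
      timer = null;
    }
  };

  socket.on("connect_error", (error) => {
    // Only transient failures are expected to be retried automatically,
    // and an already armed deadline is left untouched.
    if (!socket.active || timer !== null) return;
    timer = timers.setTimeout(() => {
      timer = null;
      onStalled(error); // no reconnect_attempt and no connect within graceMs
    }, graceMs);
  });

  // Either event means the client is still making progress.
  socket.io.on("reconnect_attempt", disarm);
  socket.on("connect", disarm);

  return disarm;
}
```

`graceMs` should be larger than the configured `reconnectionDelayMax`, otherwise a healthy retry cycle would be reported as stalled. What to do inside `onStalled` (log, alert, or try a manual reconnect) is left open here, since the thread does not establish what recovers the connection on 4.8.0.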

wes337 commented 1 month ago

I am also facing this issue. It's pretty major, since our "online users" counter drops to 0 every time we push updates to our code base. I think going back to 4.7.5 is the move for now.

darrachequesne commented 1 month ago

Hi everyone, sorry for the delay.

I was not able to reproduce the issue:

"No transports available" suggests the transports array is empty, but I don't know how this could happen. This might be linked to this change.

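Since the remark above points at the transports array, one thing that can be checked on the application side is the `transports` option itself, before it is handed to the client. This is a minimal sketch, assuming the built-in transport names are `polling`, `websocket` and `webtransport`; `checkTransports` and `KNOWN_TRANSPORTS` are hypothetical names, not part of socket.io-client, and the check cannot see anything that happens inside the library.

```javascript
// Sketch only: `checkTransports` and `KNOWN_TRANSPORTS` are hypothetical names.
const KNOWN_TRANSPORTS = ["polling", "websocket", "webtransport"];

// Returns a list of human-readable problems found in `options.transports`.
function checkTransports(options = {}) {
  const { transports } = options;
  // Leaving the option out lets the client fall back to its defaults.
  if (transports === undefined) return [];
  if (!Array.isArray(transports)) return ["`transports` is not an array"];

  const problems = [];
  if (transports.length === 0) problems.push("`transports` is empty");
  for (const entry of transports) {
    // Non-string entries (for example transport classes) are skipped, not judged.
    if (typeof entry === "string" && !KNOWN_TRANSPORTS.includes(entry)) {
      problems.push(`unknown transport name: ${entry}`);
    }
  }
  return problems;
}

console.log(checkTransports({ transports: ["websocket", "polling"] }));
console.log(checkTransports({ transports: ["websockets"] }));
```

An empty result only means the options look sane at the call site. If "No transports available" still appears, the list is presumably being emptied somewhere later.
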
darrachequesne commented 1 month ago

I would need some additional information.

Does it happen with the client bundle? Or with a bundler (webpack, rollup, ...)? In that case, could you please provide your configuration?

Does it happen randomly? Always?

Does it happen in all browsers?

Thanks in advance.
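
To make answers to the questions above easier to compare across reports, a few environment facts can be collected programmatically and pasted into a reply. This is a rough sketch: `describeEnvironment` is a hypothetical helper, it uses only standard globals, and it cannot tell whether a bundler is involved (a bundler that polyfills `process` could even make a browser look like Node.js).

```javascript
// Sketch only: `describeEnvironment` is a hypothetical helper, not a socket.io API.
function describeEnvironment() {
  const isNode =
    typeof process !== "undefined" && Boolean(process.versions && process.versions.node);
  const hasUserAgent =
    typeof navigator !== "undefined" && typeof navigator.userAgent === "string";

  return {
    runtime: isNode ? "node" : "browser",
    nodeVersion: isNode ? process.versions.node : null,
    userAgent: hasUserAgent ? navigator.userAgent : null,
    // Whether the runtime ships its own WebSocket implementation.
    nativeWebSocket: typeof WebSocket !== "undefined",
  };
}

console.log(JSON.stringify(describeEnvironment(), null, 2));
```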

wes337 commented 4 weeks ago

Hey @darrachequesne

Here's how our frontend connects:

export const socket = io(SOCKET_SERVER_URL, {
  autoConnect: false,
  transports: ["websocket"], // <--- Could it be related to this?
  timeout: 10000,
  auth: (callback) => {
    const token = getToken()
    if (token) {
      callback({
        token,
      });
    } else {
      console.log("No token found when trying to connect to Socket");
    }
  },
  parser: customParser, // <--- We're using socket.io-msgpack-parser here
});
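
One detail in the snippet above may be worth ruling out: when `auth` is a function, the client sends its connect packet from inside the callback, so the branch that only logs and never calls `callback` leaves that connection attempt pending. Whether this plays any part in the behaviour reported in this thread is not established. A minimal sketch of an `auth` function that invokes the callback on every path, where `makeAuth` is a hypothetical wrapper and `getToken` stands in for the application's own function:

```javascript
// Sketch only: `makeAuth` is a hypothetical wrapper; `getToken` is application code.
function makeAuth(getToken, log = console.log) {
  return (callback) => {
    const token = getToken();
    if (!token) {
      log("No token found when trying to connect to Socket");
    }
    // Invoke the callback on every path so the attempt is not left waiting.
    // Without a token the server may still reject the connection.
    callback(token ? { token } : {});
  };
}

// Intended usage, assuming the same names as in the snippet above:
//   io(SOCKET_SERVER_URL, { auth: makeAuth(getToken), /* other options */ });
```

Sending an empty payload is a design choice for this sketch. An application could just as well delay the connection until a token exists, as long as some path eventually calls the callback.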