slackapi / node-slack-sdk

Slack Developer Kit for Node.js
https://slack.dev/node-slack-sdk
MIT License
3.27k stars 662 forks source link

Unhandled event 'websocket open' in state 'connected'. #1771

Closed gaurav512 closed 5 months ago

gaurav512 commented 6 months ago

Description

There were a couple of random crashes on my slack bot in the past few days. Adding the traces below (both are similar, but not identical). Not really reproducible.

Trace 1:

/app/node_modules/finity/lib/core/StateMachine.js:76    
      throw new Error('Unhandled event \'' + event + '\' in state \'' + this.currentState + '\'.'); 
      ^ 

Error: Unhandled event 'websocket open' in state 'connected'.   
    at StateMachine.handleUnhandledEvent (/app/node_modules/finity/lib/core/StateMachine.js:76:13)  
    at /app/node_modules/finity/lib/core/HierarchicalStateMachine.js:79:33  
    at TaskScheduler.execute (/app/node_modules/finity/lib/core/TaskScheduler.js:29:7)  
    at TaskScheduler.enqueue (/app/node_modules/finity/lib/core/TaskScheduler.js:19:12) 
    at HierarchicalStateMachine.handle (/app/node_modules/finity/lib/core/HierarchicalStateMachine.js:72:24)    
    at WebSocket.<anonymous> (/app/node_modules/@slack/socket-mode/dist/SocketModeClient.js:418:31) 
    at WebSocket.onOpen (/app/node_modules/ws/lib/event-target.js:144:16)   
    at WebSocket.emit (node:events:513:28)  
    at WebSocket.setSocket (/app/node_modules/ws/lib/websocket.js:225:10)   
  at ClientRequest.<anonymous> (/app/node_modules/ws/lib/websocket.js:882:15)   
    [20:42:58.960 INFO]  Now connected to Slack 
    [20:42:58.608 INFO]  Reconnecting to Slack ...  
    [20:42:58.608 INFO]  Going to establish a new connection to Slack ...   
    [20:42:57.738 INFO]  Reconnecting to Slack ...  
    [20:42:57.738 INFO]  Going to establish a new connection to Slack ...   
    [20:42:53.610 ERROR]  An unhandled error occurred while Bolt processed (type: undefined, error: TypeError: Cannot read properties of undefined (reading 'event'))   
    [20:42:52.818 INFO]  Reconnecting to Slack ...  
    [20:42:52.818 INFO]  Going to establish a new connection to Slack ...   
    [20:42:52.736 ERROR]  An unhandled error occurred while Bolt processed (type: undefined, error: TypeError: Cannot read properties of undefined (reading 'event'))   
    [20:42:51.921 INFO]  Reconnecting to Slack ...  
    [20:42:51.921 INFO]  Going to establish a new connection to Slack ...   
    [20:42:47.816 ERROR]  An unhandled error occurred while Bolt processed (type: undefined, error: TypeError: Cannot read properties of undefined (reading 'event'))   
    [20:42:46.993 INFO]  Reconnecting to Slack ...  
    [20:42:46.993 INFO]  Going to establish a new connection to Slack ...   
    [20:42:46.918 ERROR]  An unhandled error occurred while Bolt processed (type: undefined, error: TypeError: Cannot read properties of undefined (reading 'event'))   
    [20:42:46.079 INFO]  Reconnecting to Slack ...  
    [20:42:46.079 INFO]  Going to establish a new connection to Slack ...   
    [20:42:41.992 ERROR]  An unhandled error occurred while Bolt processed (type: undefined, error: TypeError: Cannot read properties of undefined (reading 'event'))   
    [20:42:41.076 ERROR]  An unhandled error occurred while Bolt processed (type: undefined, error: TypeError: Cannot read properties of undefined (reading 'event'))   
    [20:42:41.032 INFO]  Reconnecting to Slack ...  
    [20:42:41.032 INFO]  Going to establish a new connection to Slack ...   
    [20:42:40.228 INFO]  Reconnecting to Slack ...  
    [20:42:40.228 INFO]  Going to establish a new connection to Slack ...   
    [20:42:36.028 ERROR]  An unhandled error occurred while Bolt processed (type: undefined, error: TypeError: Cannot read properties of undefined (reading 'event'))   
    [20:42:35.144 INFO]  Reconnecting to Slack ...  
    [20:42:35.145 INFO]  Going to establish a new connection to Slack ...

Trace 2:

  /app/node_modules/finity/lib/core/StateMachine.js:76  
          throw new Error('Unhandled event \'' + event + '\' in state \'' + this.currentState + '\'.'); 
                ^   

    Error: Unhandled event 'server disconnect old socket' in state 'connecting'.    
        at StateMachine.handleUnhandledEvent (/app/node_modules/finity/lib/core/StateMachine.js:76:13)  
        at /app/node_modules/finity/lib/core/HierarchicalStateMachine.js:79:33  
        at TaskScheduler.execute (/app/node_modules/finity/lib/core/TaskScheduler.js:29:7)  
        at TaskScheduler.enqueue (/app/node_modules/finity/lib/core/TaskScheduler.js:19:12) 
        at HierarchicalStateMachine.handle (/app/node_modules/finity/lib/core/HierarchicalStateMachine.js:72:24)    
        at SocketModeClient.onWebSocketMessage (/app/node_modules/@slack/socket-mode/dist/SocketModeClient.js:609:31)   
        at WebSocket.onMessage (/app/node_modules/ws/lib/event-target.js:132:16)    
        at WebSocket.emit (node:events:513:28)  
        at Receiver.receiverOnMessage (/app/node_modules/ws/lib/websocket.js:1068:20)   
        at Receiver.emit (node:events:513:28)   
    [09:34:54.388 ERROR]  An unhandled error occurred while Bolt processed (type: undefined, error: TypeError: Cannot read properties of undefined (reading 'event'))   
    [09:34:53.568 INFO]  Reconnecting to Slack ...  
    [09:34:53.568 INFO]  Going to establish a new connection to Slack ...   
    [09:34:43.560 ERROR]  An unhandled error occurred while Bolt processed (type: undefined, error: TypeError: Cannot read properties of undefined (reading 'event'))   
    [09:34:42.744 INFO]  Reconnecting to Slack ...  
    [09:34:42.745 INFO]  Going to establish a new connection to Slack ...   
    [09:34:32.745 ERROR]  An unhandled error occurred while Bolt processed (type: undefined, error: TypeError: Cannot read properties of undefined (reading 'event'))   
    [09:34:31.971 INFO]  Going to establish a new connection to Slack ...   
    [09:34:31.970 INFO]  Reconnecting to Slack ...  
    [09:34:21.970 ERROR]  An unhandled error occurred while Bolt processed (type: undefined, error: TypeError: Cannot read properties of undefined (reading 'event'))   
    [09:34:21.105 INFO]  Going to establish a new connection to Slack ...   
    [09:34:21.105 INFO]  Reconnecting to Slack ...  
    [09:34:11.100 ERROR]  An unhandled error occurred while Bolt processed (type: undefined, error: TypeError: Cannot read properties of undefined (reading 'event'))   
    [09:34:10.200 INFO]  Reconnecting to Slack ...  
    [09:34:10.200 INFO]  Going to establish a new connection to Slack ...   
    [09:34:00.195 ERROR]  An unhandled error occurred while Bolt processed (type: undefined, error: TypeError: Cannot read properties of undefined (reading 'event'))   
    [09:33:59.386 INFO]  Reconnecting to Slack ...  
    [09:33:59.387 INFO]  Going to establish a new connection to Slack ...

Packages:

Select all that apply:

Reproducible in:

The Slack SDK version

"slack/bolt": "^3.13.3" "slack/logger": "^3.0.0" "slack/web-api": "^6.9.1"

Node.js runtime version

v16.8.1

OS info

Red Hat Enterprise Linux 8

Steps to reproduce:

The app was running since a long time, and these errors led it to crash (twice in the past few days)

Expected result:

The app should've worked fine.

Actual result:

The app crashed.

Requirements

For general questions/issues about Slack API platform or its server-side, could you submit questions at https://my.slack.com/help/requests/new instead. :bow:

Please read the Contributing guidelines and Code of Conduct before creating this issue or pull request. By submitting, you are agreeing to those rules.

filmaj commented 6 months ago

Thanks for the report, I believe this may occur during the brief window when the app is attempting to reconnect, it will have two simultaneous websocket connections open, which may confuse the state machine, as the state machine assumes a single connection exists.

I will mark this as a bug and will try to reproduce the scenario using the integration tests.

gaurav512 commented 6 months ago

Any clue why this error: An unhandled error occurred while Bolt processed (type: undefined, error: TypeError: Cannot read properties of undefined (reading 'event')) is happening so frequently?

It doesn't seem to directly affect anything (and I guess it only happens when there are no incoming requests), but still curious.

filmaj commented 6 months ago

What version of the @slack/socket-mode package are you using?

Also, if it is possible to turn on debug logging for your app (it looks like you are using bolt? If so: instructions on how to do so here) and post debug logs, that would be helpful too.

I am not sure on the exact cause of the logs, it may also depend on any additional code you may have integrated in your bolt app. Can you post your App initialization (and if you are using a custom receiver, the initialization of the receiver as well).

jwoo0122 commented 6 months ago

I'm also seeing this trace recently. There're so many reconnection trial, and after that it always said Unhandled event 'server disconnect old socket' in state 'connecting'., and process crashed. I use @slack/bolt@3.13.0 without any custom receiver.

const app = new App({
    signingSecret: process.env.SECRET,
    token: process.env.TOKEN,
    socketMode: true,
    appToken: process.env.APP_TOKEN,
    port: Number(process.env.PORT || 3000),
    logLevel: LogLevel.INFO,
  });

app.start();
gaurav512 commented 5 months ago

What version of the @slack/socket-mode package are you using?

I use @slack/web-api@6.9.1 and it installs @slack/socket-mode@1.3.2.

This is my app initialization:

const app = new App({
  port: parseInt(process.env.PORT || '', 10) || 6543,
  socketMode: true,
  token: slackSecrets?.slackBotToken,
  appToken: slackSecrets?.slackAppToken,
  signingSecret: slackSecrets?.slackSigningSecret,
  logLevel: appLogLevel,
  logger,
  customRoutes: [
    ...getActuatorEndpoints(),
  ],
  extendedErrorHandler: true,
});

customRoutes has some health check related endpoints. Noting that I do de-reference the slack secrets once the app is initialized. Let me know if this might be a problem.

filmaj commented 5 months ago

Noting that I do de-reference the slack secrets once the app is initialized. Let me know if this might be a problem.

That should be fine. If there are easily detectable problems with the secrets (like missing or empty values), it would be raised upon initialization.

We've had more reports from folks about problems with reconnecting and this package recently. In my investigations, it is partly due to the very old implementation in the node socket-mode package (based on a several-years-old version of the old @slack/rtm-api module).

We are considering migrating the python-slack-sdk implementation/design to node and releasing it as the next major version.

filmaj commented 5 months ago

An RC was released to try to address this:

filmaj commented 5 months ago

The fix has now been released to mainline in 1.3.5 and is out of release candidacy. It will also be present in the soon-to-come 2.0 release of @slack/socket-mode.

spencer-melo-lp commented 4 months ago

Is this fixed and available on @slack/bolt >= 3.18.0 ? I'm facing the exact same error trying to start the application.

/Users/***/***/***/node_modules/finity/lib/core/StateMachine.js:76
      throw new Error('Unhandled event \'' + event + '\' in state \'' + this.currentState + '\'.');
            ^
Error: Unhandled event 'server explicit disconnect' in state 'connecting'.
    at StateMachine.handleUnhandledEvent (/Users/***/***/***t/node_modules/finity/lib/core/StateMachine.js:76:13)
filmaj commented 4 months ago

The class of bugs "unhandled event X in state Y" is a difficult, intermittent bug in socket-mode 1.x. The underlying issue is in the design of socket-mode 1.x: there is a single state machine for the entire module, but the module sometimes can briefly use more than one WebSocket connection (specifically during reconnections). These multiple WS connections pump state transitions based on WS connection state into the single state machine. This is not an issue when only one WS connection is active, but sometimes during or around reconnection, when two WS connections are briefly active, they can send their own connection event state transitions into the single state machine, which can be contradictory and lead to this issue.

Socket-mode 2.0 uses a completely new design that is not susceptible to this class of bugs.

@slack/bolt does not (yet) use socket-mode 2 (mainly because socket-mode 2 is a breaking change). However, I do have a release candidate for bolt v4 up on npm, where you can give socket-mode 2 a try. I believe it should address this issue. https://www.npmjs.com/package/@slack/bolt/v/4.0.0-rc.1

FYI @spencer-melo-lp