slackapi / java-slack-sdk

Slack Developer Kit (including Bolt for Java) for any JVM language
https://slack.dev/java-slack-sdk/
MIT License
571 stars 212 forks source link

Repeated websocket connection resets from Slack's API, causing slack application to create multiple concurrent sessions #1206

Closed sonalranjit closed 10 months ago

sonalranjit commented 12 months ago

Our slack application periodically spams the channels it is in with repeated retries of commands it has been issued. Further inspection on logs shows it is repeatedly creating new sessions forever until the application is restarted manually and a stable websocket connection is formed.

Reproducible in:

The Slack SDK version

+--- com.slack.api:slack-app-backend:1.29.2
|    +--- com.slack.api:slack-api-model:1.29.2
|    +--- com.slack.api:slack-api-client:1.29.2
|    |    +--- com.slack.api:slack-api-model:1.29.2 (*)
+--- com.slack.api:slack-api-client:1.29.2 (*)
+--- com.slack.api:slack-api-model:1.29.2 (*)
+--- com.slack.api:bolt:1.29.2
|    +--- com.slack.api:slack-api-model:1.29.2 (*)
|    +--- com.slack.api:slack-api-client:1.29.2 (*)
|    +--- com.slack.api:slack-app-backend:1.29.2 (*)
+--- com.slack.api:bolt-socket-mode:1.29.2
|    +--- com.slack.api:slack-api-model:1.29.2 (*)
|    +--- com.slack.api:slack-api-client:1.29.2 (*)
|    +--- com.slack.api:slack-app-backend:1.29.2 (*)
|    \--- com.slack.api:bolt:1.29.2 (*)
+--- com.slack.api:slack-app-backend:1.29.2 (n)
+--- com.slack.api:slack-api-client:1.29.2 (n)
+--- com.slack.api:slack-api-model:1.29.2 (n)
\--- com.slack.api:bolt:1.29.2 (n)
+--- com.slack.api:bolt-socket-mode:1.29.2
|    +--- com.slack.api:slack-api-model:1.29.2
|    +--- com.slack.api:slack-api-client:1.29.2
|    |    +--- com.slack.api:slack-api-model:1.29.2 (*)
|    +--- com.slack.api:slack-app-backend:1.29.2
|    |    +--- com.slack.api:slack-api-model:1.29.2 (*)
|    |    +--- com.slack.api:slack-api-client:1.29.2 (*)
|    \--- com.slack.api:bolt:1.29.2
|         +--- com.slack.api:slack-api-model:1.29.2 (*)
|         +--- com.slack.api:slack-api-client:1.29.2 (*)
|         +--- com.slack.api:slack-app-backend:1.29.2 (*)
+--- com.slack.api:bolt-socket-mode:1.29.2
|    +--- com.slack.api:slack-api-model:1.29.2
|    +--- com.slack.api:slack-api-client:1.29.2
|    |    +--- com.slack.api:slack-api-model:1.29.2 (*)
|    +--- com.slack.api:slack-app-backend:1.29.2
|    |    +--- com.slack.api:slack-api-model:1.29.2 (*)
|    |    +--- com.slack.api:slack-api-client:1.29.2 (*)
|    \--- com.slack.api:bolt:1.29.2
|         +--- com.slack.api:slack-api-model:1.29.2 (*)
|         +--- com.slack.api:slack-api-client:1.29.2 (*)
|         +--- com.slack.api:slack-app-backend:1.29.2 (*)
+--- com.slack.api:bolt-socket-mode:1.29.2
|    +--- com.slack.api:slack-api-model:1.29.2
|    +--- com.slack.api:slack-api-client:1.29.2
|    |    +--- com.slack.api:slack-api-model:1.29.2 (*)
|    +--- com.slack.api:slack-app-backend:1.29.2
|    |    +--- com.slack.api:slack-api-model:1.29.2 (*)
|    |    +--- com.slack.api:slack-api-client:1.29.2 (*)
|    \--- com.slack.api:bolt:1.29.2
|         +--- com.slack.api:slack-api-model:1.29.2 (*)
|         +--- com.slack.api:slack-api-client:1.29.2 (*)
|         +--- com.slack.api:slack-app-backend:1.29.2 (*)
+--- com.slack.api:bolt-socket-mode:1.29.2 (n)
+--- com.slack.api:bolt-socket-mode:1.29.2
|    +--- com.slack.api:slack-api-model:1.29.2
|    +--- com.slack.api:slack-api-client:1.29.2
|    |    +--- com.slack.api:slack-api-model:1.29.2 (*)
|    +--- com.slack.api:slack-app-backend:1.29.2
|    |    +--- com.slack.api:slack-api-model:1.29.2 (*)
|    |    +--- com.slack.api:slack-api-client:1.29.2 (*)
|    \--- com.slack.api:bolt:1.29.2
|         +--- com.slack.api:slack-api-model:1.29.2 (*)
|         +--- com.slack.api:slack-api-client:1.29.2 (*)
|         +--- com.slack.api:slack-app-backend:1.29.2 (*)
+--- com.slack.api:bolt-socket-mode:1.29.2
|    +--- com.slack.api:slack-api-model:1.29.2
|    +--- com.slack.api:slack-api-client:1.29.2
|    |    +--- com.slack.api:slack-api-model:1.29.2 (*)
|    +--- com.slack.api:slack-app-backend:1.29.2
|    |    +--- com.slack.api:slack-api-model:1.29.2 (*)
|    |    +--- com.slack.api:slack-api-client:1.29.2 (*)
|    \--- com.slack.api:bolt:1.29.2
|         +--- com.slack.api:slack-api-model:1.29.2 (*)
|         +--- com.slack.api:slack-api-client:1.29.2 (*)
|         +--- com.slack.api:slack-app-backend:1.29.2 (*)
+--- com.slack.api:bolt-socket-mode:1.29.2
|    +--- com.slack.api:slack-api-model:1.29.2
|    +--- com.slack.api:slack-api-client:1.29.2
|    |    +--- com.slack.api:slack-api-model:1.29.2 (*)
|    +--- com.slack.api:slack-app-backend:1.29.2
|    |    +--- com.slack.api:slack-api-model:1.29.2 (*)
|    |    +--- com.slack.api:slack-api-client:1.29.2 (*)
|    \--- com.slack.api:bolt:1.29.2
|         +--- com.slack.api:slack-api-model:1.29.2 (*)
|         +--- com.slack.api:slack-api-client:1.29.2 (*)
|         +--- com.slack.api:slack-app-backend:1.29.2 (*)
+--- com.slack.api:bolt-socket-mode:1.29.2
|    +--- com.slack.api:slack-api-model:1.29.2
|    +--- com.slack.api:slack-api-client:1.29.2
|    |    +--- com.slack.api:slack-api-model:1.29.2 (*)
|    +--- com.slack.api:slack-app-backend:1.29.2
|    |    +--- com.slack.api:slack-api-model:1.29.2 (*)
|    |    +--- com.slack.api:slack-api-client:1.29.2 (*)
|    \--- com.slack.api:bolt:1.29.2
|         +--- com.slack.api:slack-api-model:1.29.2 (*)
|         +--- com.slack.api:slack-api-client:1.29.2 (*)
|         +--- com.slack.api:slack-app-backend:1.29.2 (*)
+--- com.slack.api:bolt-socket-mode:1.29.2
|    +--- com.slack.api:slack-api-model:1.29.2
|    +--- com.slack.api:slack-api-client:1.29.2
|    |    +--- com.slack.api:slack-api-model:1.29.2 (*)
|    +--- com.slack.api:slack-app-backend:1.29.2
|    |    +--- com.slack.api:slack-api-model:1.29.2 (*)
|    |    +--- com.slack.api:slack-api-client:1.29.2 (*)
|    \--- com.slack.api:bolt:1.29.2
|         +--- com.slack.api:slack-api-model:1.29.2 (*)
|         +--- com.slack.api:slack-api-client:1.29.2 (*)
|         +--- com.slack.api:slack-app-backend:1.29.2 (*)

Java Runtime version

openjdk version "11.0.18" 2023-01-17
OpenJDK Runtime Environment (build 11.0.18+10-post-Debian-1deb11u1)
OpenJDK 64-Bit Server VM (build 11.0.18+10-post-Debian-1deb11u1, mixed mode, sharing)

OS info

#1 SMP Debian 5.10.179-2 (2023-07-14)

Steps to reproduce:

Expected result:

The application should gracefully handle connection resets from Slack's API and recreate and maintain a stable websocket connection.

Actual result:

The application repeatedly creates new sessions:

Sep 14 02:49:46 bot bot[159562]: [WebSocketConnectReadThread-7821] INFO com.slack.api.socket_mode.SocketModeClient - onClose listener is called (code: 1000, reason: )
Sep 14 02:49:46 bot bot[159562]: [WebSocketConnectReadThread-8071] INFO com.slack.api.socket_mode.SocketModeClient - New session is open
Sep 14 02:49:47 bot bot[159562]: [WebSocketConnectReadThread-8071] INFO com.slack.api.socket_mode.SocketModeClient - onClose listener is called (code: 1000, reason: )
Sep 14 02:49:47 bot bot[159562]: [WebSocketConnectReadThread-8074] INFO com.slack.api.socket_mode.SocketModeClient - New session is open
Sep 14 02:49:48 bot bot[159562]: [WebSocketConnectReadThread-8074] INFO com.slack.api.socket_mode.SocketModeClient - onClose listener is called (code: 1000, reason: )
Sep 14 02:49:48 bot bot[159562]: [WebSocketConnectReadThread-8077] INFO com.slack.api.socket_mode.SocketModeClient - New session is open
Sep 14 02:49:48 bot bot[159562]: [WebSocketConnectReadThread-8077] INFO com.slack.api.socket_mode.SocketModeClient - onClose listener is called (code: 1000, reason: )
Sep 14 02:49:48 bot bot[159562]: [WebSocketConnectReadThread-8080] INFO com.slack.api.socket_mode.SocketModeClient - New session is open
Sep 14 02:49:49 bot bot[159562]: [WebSocketConnectReadThread-8080] INFO com.slack.api.socket_mode.SocketModeClient - onClose listener is called (code: 1000, reason: )
Sep 14 02:49:49 bot bot[159562]: [WebSocketConnectReadThread-8083] INFO com.slack.api.socket_mode.SocketModeClient - New session is open
Sep 14 02:49:50 bot bot[159562]: [WebSocketConnectReadThread-8083] INFO com.slack.api.socket_mode.SocketModeClient - onClose listener is called (code: 1000, reason: )
Sep 14 02:49:50 bot bot[159562]: [WebSocketConnectReadThread-8086] INFO com.slack.api.socket_mode.SocketModeClient - New session is open
Sep 14 02:49:50 bot bot[159562]: [WebSocketConnectReadThread-8086] INFO com.slack.api.socket_mode.SocketModeClient - onClose listener is called (code: 1000, reason: )
Sep 14 02:49:51 bot bot[159562]: [WebSocketConnectReadThread-8089] INFO com.slack.api.socket_mode.SocketModeClient - New session is open
Sep 14 02:49:51 bot bot[159562]: [WebSocketConnectReadThread-8089] INFO com.slack.api.socket_mode.SocketModeClient - onClose listener is called (code: 1000, reason: )
Sep 14 02:49:51 bot bot[159562]: [WebSocketConnectReadThread-8092] INFO com.slack.api.socket_mode.SocketModeClient - New session is open
Sep 14 02:49:52 bot bot[159562]: [WebSocketConnectReadThread-8092] INFO com.slack.api.socket_mode.SocketModeClient - onClose listener is called (code: 1000, reason: )
Sep 14 02:49:52 bot bot[159562]: [WebSocketConnectReadThread-8095] INFO com.slack.api.socket_mode.SocketModeClient - New session is open
Sep 14 02:49:52 bot bot[159562]: [WebSocketConnectReadThread-8095] INFO com.slack.api.socket_mode.SocketModeClient - onClose listener is called (code: 1000, reason: )
Sep 14 02:49:53 bot bot[159562]: [WebSocketConnectReadThread-8098] INFO com.slack.api.socket_mode.SocketModeClient - onClose listener is called (code: -1, reason: )
Sep 14 02:49:53 bot bot[159562]: [WebSocketConnectReadThread-8100] INFO com.slack.api.socket_mode.SocketModeClient - New session is open
Sep 14 02:49:53 bot bot[159562]: [WebSocketConnectReadThread-8100] INFO com.slack.api.socket_mode.SocketModeClient - onClose listener is called (code: 1000, reason: )
Sep 14 02:49:53 bot bot[159562]: [WebSocketConnectReadThread-8103] INFO com.slack.api.socket_mode.SocketModeClient - New session is open
Sep 14 02:49:54 bot bot[159562]: [WebSocketConnectReadThread-8103] INFO com.slack.api.socket_mode.SocketModeClient - onClose listener is called (code: 1000, reason: )
Sep 14 02:49:54 bot bot[159562]: [WebSocketConnectReadThread-8106] INFO com.slack.api.socket_mode.SocketModeClient - New session is open
Sep 14 02:49:55 bot bot[159562]: [WebSocketConnectReadThread-8106] INFO com.slack.api.socket_mode.SocketModeClient - onClose listener is called (code: 1000, reason: )
Sep 14 02:49:55 bot bot[159562]: [WebSocketConnectReadThread-8109] INFO com.slack.api.socket_mode.SocketModeClient - New session is open
Sep 14 02:49:55 bot bot[159562]: [WebSocketConnectReadThread-8109] INFO com.slack.api.socket_mode.SocketModeClient - onClose listener is called (code: 1000, reason: )
Sep 14 02:49:55 bot bot[159562]: [WebSocketConnectReadThread-8112] INFO com.slack.api.socket_mode.SocketModeClient - New session is open
Sep 14 02:49:56 bot bot[159562]: [WebSocketConnectReadThread-8112] INFO com.slack.api.socket_mode.SocketModeClient - onClose listener is called (code: 1000, reason: )
Sep 14 02:49:56 bot bot[159562]: [WebSocketConnectReadThread-8115] INFO com.slack.api.socket_mode.SocketModeClient - New session is open
Sep 14 02:49:57 bot bot[159562]: [WebSocketConnectReadThread-8115] INFO com.slack.api.socket_mode.SocketModeClient - onClose listener is called (code: 1000, reason: )
Sep 14 02:49:57 bot bot[159562]: [socket-mode-session-monitor-worker-36] INFO com.slack.api.socket_mode.SocketModeClient - Failed to send a ping message (exception: org.java_websocket.exceptions.WebsocketNotConnectedException, message: null)
Sep 14 02:49:57 bot bot[159562]: [WebSocketConnectReadThread-8118] INFO com.slack.api.socket_mode.SocketModeClient - New session is open

Further inspection of the network traffic in our environment shows the connection being severed from Slack's API after a payload of 3MB is sent from the slack server side.

Is there any mitigation for scenarios like this ? It's causing the application to spam the channels it is in perhaps because the connection is being severed continually and the slack's api is retrying and the case here is new sessions are being created as well repeatedly so the retries go on and on.

seratch commented 12 months ago

Hi @sonalranjit, thank you for writing in.

Have you experienced the same issue in a different network setting, such as on your local machine? If not, this problem may be specific to the infra where you're running the app. We've never received this type of error report in the last few years since this project started. Therefore, I'm wondering if there might be a specific reason causing the situation you're experiencing.

Additionally, if switching from JavaWebsocket to the default option can resolve the issue, it could be related to the underlying WebSocket client library.

sonalranjit commented 12 months ago

@seratch It's difficult to reproduce this on a local machine because, we only encounter this when the application runs for a long period of time. It happens randomly, for months it will have no issues and then suddenly an event like this would occur. Our infra team notice just the connection resets.

I can switch the backend websocket client library to the tyrus standalone and give it a try.

github-actions[bot] commented 10 months ago

👋 It looks like this issue has been open for 30 days with no activity. We'll mark this as stale for now, and wait 10 days for an update or for further comment before closing this issue out. If you think this issue needs to be prioritized, please comment to get the thread going again! Maintainers also review issues marked as stale on a regular basis and comment or adjust status if the issue needs to be reprioritized.

sonalranjit commented 10 months ago

Switching to tyrus websocket client, we haven't encountered the issue so far the app has been running stably since the switch last over last month.