torgeirl closed this issue 9 months ago
Are you using plain bot scope or the new granular scopes?
I'm using the old, plain bot scope. How would that affect the gracefulness of this error? :confused:
I need to understand your setup fully so that I can attempt to replicate the issue you're having.
I was able to reproduce the issue. I'll dig into the code to see what can be done to handle this case cleanly.
I've got a tentative solution here https://github.com/errbotio/errbot/pull/1463. Would you be able to test it on your setup @torgeirl ?
@nzlosh: I tested it out, but it didn't change anything. :disappointed:
It does however look like the changes are made for the SlackRTM backend, and not the regular Slack backend? Both backends reference RTM streams, but this particular error occurs in errbot/backends/slack.py. Would this solution also work for the regular Slack backend?
(It was my understanding that the SlackRTM backend is beta, so once we got the regular Slack backend working again last month I switched back to it.)
You're quite right, I mistook the RTM stream errors to be related to the slack_rtm backend. The issue is the same between both backends, but the code is not the same, so the changes I've made won't apply to slack. I'll take a look at patching the slack backend when I get some free time.
OK. Thanks for taking the time to look into this!
@torgeirl I also hit this issue: when an RTM websocket connection reaches its maximum duration (8 hours), Slack closes the connection. https://api.slack.com/events/goodbye
Do you have any update on this one? Thanks
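The goodbye event linked above is documented by Slack as an event whose type is "goodbye", sent shortly before the server closes the connection. A minimal sketch of spotting it in a batch of already-parsed events, assuming events arrive as plain dicts; the function names are hypothetical and not part of Errbot or slackclient:

```python
def is_goodbye(event):
    """Return True if `event` looks like Slack's RTM goodbye event."""
    return isinstance(event, dict) and event.get("type") == "goodbye"


def needs_reconnect(events):
    """Return True if any event in the batch announces an upcoming close."""
    return any(is_goodbye(event) for event in events)
```

A caller could check needs_reconnect on each batch it reads and reconnect proactively instead of waiting for the read to fail.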
Would you mind trying this backend https://github.com/nzlosh/err-backend-slackv3 and letting me know if you encounter the problem?
I'm using Errbot v5.2 and slackclient v1.3.1. Our Slack workspace is huge, and rtm.start is naturally more difficult to use with Enterprise Grid and other large workspaces.
We often got the rtm_read error. After troubleshooting, we found that the bottleneck was CPU: the old instance type, t3a.medium, often crashed when Errbot restarted and needed 6 minutes to pull information from Slack.
After changing to c5.large, I just tested 3 restarts: no more crashes, and pulling information from Slack is faster, taking only 4 minutes.
I have been receiving the same error message in 2 different bots (one running locally and one running in a k8s cluster), both using the 'Slack' backend. Unlimited mem/cpu, errbot 6.1.7, slackclient 1.3.2.
Should I be using Slack_rtm instead?
@drahamim The websocket closing issue was resolved in the slackv3 backend, if you'd like to try it out.
Updated the opening post with my current environment and included the slackv3 backend as a workaround, to help users who find this issue when looking for answers.
I do hope this issue can either be fixed for the slack backend, or that the slackv3 backend becomes stable enough for inclusion in the main repository.
Can confirm the connection re-establishing itself is gracefully handled as of Errbot v6.2.0 using the SlackV3 backend. Every few hours it logs the following info messages (INFO, not ERROR as before):
2024-01-13 21:08:57,031 INFO slack_sdk.socket_mode.builtin.client The connection seems to be stale. Disconnecting... (session id: <old-session-ID>, reason: disconnected for 25+ seconds)
2024-01-13 21:08:59,230 INFO slack_sdk.socket_mode.builtin.client The connection has been closed (session id: <old-session-ID>)
2024-01-13 21:08:59,230 INFO slack_sdk.socket_mode.builtin.client The session seems to be already closed. Reconnecting... (session id: <old-session-ID>)
2024-01-13 21:08:59,231 INFO slack_sdk.socket_mode.builtin.client Connecting to a new endpoint...
2024-01-13 21:08:59,902 INFO slack_sdk.socket_mode.builtin.client The connection has been closed (session id: <old-session-ID>)
2024-01-13 21:08:59,902 INFO slack_sdk.socket_mode.builtin.client A new session has been established (session id: <new-session-ID>)
2024-01-13 21:08:59,902 INFO slack_sdk.socket_mode.builtin.client Connected to a new endpoint...
2024-01-13 21:09:00,052 INFO slack_sdk.socket_mode.builtin.client Stopped receiving messages from a connection (session id: <old-session-ID>)
2024-01-13 21:09:00,153 INFO slack_sdk.socket_mode.builtin.client Starting to receive messages from a new connection (session id: <new-session-ID>)
2024-01-13 21:09:00,154 INFO errbot.core Activate internal commands
2024-01-13 21:09:00,155 INFO errbot.plugin_manager Activate bot plugins...
The graceful handling is, however, immediately followed by some errors, but those seem to be caused by the CORE_PLUGINS issues explained in #1591:
INFO errbot.core Activate internal commands
INFO errbot.plugin_manager Activate bot plugins...
ERROR errbot.plugin_manager Error loading VersionChecker.
(...)
AttributeError: 'NoneType' object has no attribute 'is_activated'
ERROR errbot.plugin_manager Error loading TextCmds.
(etc, etc, etc)
I am...
I am running...
Errbot version: 6.1.8
OS version: Debian (python:3.9-slim container)
Python version: 3.9
Using a virtual environment: no (using Docker instead)
Backend: Slack
Issue description
The Slack RTM websocket closing (Slack goodbye event) causes a Python error. This error should be handled more gracefully (e.g. with a try/except), logging a warning instead when the websocket is closed and reopened.
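To illustrate the kind of handling requested here, the following is a minimal sketch of a read loop that catches a closed connection, logs a warning, and reconnects. It is not the Errbot or slackclient implementation: ConnectionClosed, connect, and read are hypothetical stand-ins for whatever the real backend raises and calls.

```python
import logging
import time

log = logging.getLogger("reconnect_sketch")


class ConnectionClosed(Exception):
    """Hypothetical stand-in for the error raised when the websocket closes."""


def read_forever(connect, read, max_reconnects=5, delay=1.0, sleep=time.sleep):
    """Collect events until `read` returns None, reconnecting on close.

    `connect` and `read` are caller-supplied callables (assumptions for this
    sketch), not functions from Errbot or slackclient.
    """
    events = []
    reconnects = 0
    connect()
    while True:
        try:
            event = read()
        except ConnectionClosed as exc:
            if reconnects >= max_reconnects:
                raise  # give up after too many reconnect attempts
            reconnects += 1
            # Log a warning rather than letting the error propagate.
            log.warning("Websocket closed (%s); reconnecting (attempt %d)",
                        exc, reconnects)
            sleep(delay)
            connect()
            continue
        if event is None:
            return events
        events.append(event)
```

The retry cap keeps a permanently broken connection from looping forever; a real backend would likely also want backoff between attempts.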
Steps to reproduce
Run a Slack bot using the master branch, and leave it running for a few hours. After 3-12 hours you should get a Python error due to the Slack RTM websocket being closed.
Workaround
Switch to the slackv3 backend.
Additional info