slack-ruby / slack-ruby-bot

The easiest way to write a Slack bot in Ruby.
MIT License

Duplicate responses to commands after bot is left idling w/ celluloid #236

Closed jillguyonnet closed 9 months ago

jillguyonnet commented 4 years ago

Hi there,

We are using this gem in a simple project that uses a database. We've noticed that when we let the app idle for some time (unsure exactly for how long, maybe about an hour) then the bot processes text commands multiple times. The gems slack-ruby-bot and slack-ruby-client are on the latest version.

I have created a minimal setup with which I am able to reproduce this issue: https://github.com/jillguyonnet/slack_ruby_bot_minimal. This is a very simple app that manages a list of items. Note that it was based on https://github.com/slack-ruby/slack-ruby-bot/blob/master/TUTORIAL.md and uses the MVC approach. The available commands are:

Similar to our real project, the issue of multiple command processing happens when I let the app run for a while. Here is a screenshot where the bot responds twice to hi:

[Image: Screenshot 2019-09-09 at 16 33 58]

The steps to reproduce are:

  1. Run the app, check that it works by running some commands.
  2. Let it run for a while (at least an hour, but possibly several hours).
  3. Run some commands again. More often than not (but not always), the issue will present itself.

We are currently investigating potential sources for this issue. Without confirmation, we think these might include:

Would you, based on this information, be able to suggest the cause of this issue and the best way to fix it?

Thanks

hjanuschka commented 4 years ago

same here! [Image: K3N_-_Kubernetes_Dashboard]

dblock commented 4 years ago

Without debugging I am going to guess that it's reconnecting, and re-registering commands when that happens. I'll take a look when I get a chance.
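A minimal sketch of the failure mode guessed at above, assuming the duplicates come from a handler being registered again on every reconnect. This is not the slack-ruby-client or slack-ruby-bot code: `FakeClient`, `NaiveBot`, and `GuardedBot` are hypothetical names used only for illustration, and the thread never confirms this as the actual cause.

```ruby
# Hypothetical stand-in for a real-time client; NOT the slack-ruby-client API.
class FakeClient
  def initialize
    @handlers = Hash.new { |hash, key| hash[key] = [] }
  end

  # Register a callback for an event type.
  def on(type, &block)
    @handlers[type] << block
  end

  # Deliver an event to every registered callback and collect the replies.
  def dispatch(type, data)
    @handlers[type].map { |handler| handler.call(data) }
  end
end

# Naive bot (hypothetical): registers its message handler on every (re)connect.
class NaiveBot
  def initialize(client)
    @client = client
  end

  def connect!
    @client.on(:message) { |text| "reply to #{text}" }
  end
end

# Guarded bot (hypothetical): registers its handler once, however often it reconnects.
class GuardedBot < NaiveBot
  def initialize(client)
    super
    @registered = false
  end

  def connect!
    return if @registered

    super
    @registered = true
  end
end

naive_client = FakeClient.new
naive = NaiveBot.new(naive_client)
3.times { naive.connect! } # initial connect plus two simulated reconnects
naive_replies = naive_client.dispatch(:message, 'hi')

guarded_client = FakeClient.new
guarded = GuardedBot.new(guarded_client)
3.times { guarded.connect! }
guarded_replies = guarded_client.dispatch(:message, 'hi')

puts "naive bot replies:   #{naive_replies.size}"
puts "guarded bot replies: #{guarded_replies.size}"
```

If the real bug has this shape, the number of duplicate replies would grow with the number of reconnects, which would fit the reports of more duplicates after longer idle periods.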

hjanuschka commented 4 years ago

yeah seems to be that way, i would love to have an option/ENV to disable reconnect et al. and just throw an exception and die! in our case the cluster would take care of the bot and recreate it anyway 👍

hjanuschka commented 4 years ago

if you can lead me to the file/place where the reconnect happens, i am happy to contribute such a change!

dblock commented 4 years ago

Reconnect happens in slack-ruby-client with a ping worker. There's a lot of detail in https://code.dblock.org/2019/03/04/solving-slack-side-disconnects-in-slack-ruby-client.html with links. But I think the problem here is simpler and is something about the reconnect semantics changing over the last few versions of the client. I would add a bunch of logs to see what's being reloaded and when.

hjanuschka commented 4 years ago

changing to CONCURRENCY=async-websocket seems to fix it for my scenario, it reconnects multiple times but only a single bot is responding to messages

dblock commented 4 years ago

Were you using celluloid before or faye-websocket?

hjanuschka commented 4 years ago

default, i think it was celluloid

jillguyonnet commented 4 years ago

We're happy to report that, after a week of testing, using async-websocket with CONCURRENCY=async-websocket seems to have fixed the issue for us as well. 🎉

dblock commented 4 years ago

I would appreciate if someone could get to the bottom of this with celluloid. Good project to dive deep!

oliverswitzer commented 4 years ago

+1! Have also been experiencing this behavior after switching to celluloid-io.

If I have uninstalled celluloid-io and installed async-websocket again, do I still need to set CONCURRENCY=async-websocket or will async-websocket get picked up as a default?

dblock commented 4 years ago

If your Gemfile has async-websocket you're all good. The ENV setting is to run tests in this project.
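For reference, the workaround discussed in this thread might look like the Gemfile fragment below. It is a sketch only: version constraints are left out on purpose and should follow the README for the gem versions in use, and it assumes no other dependency pulls celluloid-io back in.

```ruby
# Gemfile (sketch; add version constraints per the slack-ruby-bot README)
source 'https://rubygems.org'

gem 'slack-ruby-bot'
gem 'async-websocket' # concurrency library for the websocket connection
# gem 'celluloid-io'  # removed: duplicate replies were reported with celluloid
```

After editing the Gemfile, run `bundle install` and restart the bot so the change takes effect.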

Startouf commented 4 years ago

We left our bot alone this weekend, and when we came back to work this Monday, it started replying 12 times to the same question 😱. Will try the async-websocket trick.

oliverswitzer commented 4 years ago

@dblock thanks! Though I had to upgrade to Rails 6 because of a version conflict with, switching back to using async-websocket in my Gemfile seemed to work.

dblock commented 4 years ago

I wish someone actually fixed the celluloid bug. Or maybe someone can do the work to deprecate celluloid usage everywhere and hardcode async-websocket?