BlakeWilliams / Elixir-Slack

Slack real time messaging and web API client in Elixir
MIT License
674 stars 183 forks source link

Bot crashes when user joins a channel #220

Closed keathley closed 4 years ago

keathley commented 4 years ago

It looks like the bot module still expects to have knowledge of channels and users. When a channel join event is received it attempts to update the bot's internal state. But this data was removed in #177 and causes the bot to crash.

As an aside, I'm not sure how this is a reasonable solution. It's not possible to update the bots slack state in the init callback. The only return value is the users state. If the expectation is really to do these calls, inline, for every request, that means the bot will only be able to receive 4 text messages a minute before a user hits their rate limit. Obviously, it's possible to make more targeted requests and to cache the results. I'm using both of these strategies in fawkes. But that leaves me wondering if the bot is supposed to be doing this sort of book keeping still. It seems to me like most of the slack state code can be removed but maybe I'm missing something.

dylanmei commented 4 years ago

With a decent-sized Slack workspace, crashing due to this problem is near-constant. How can we initialize things here in Slack.Bot with reasonable defaults so that we don't badmap?

09:32:20.055 [error] ** Websocket client Slack.Bot terminating in :websocket_handle/3
   for the reason :error:{:badmap, nil}
** Websocket message was {:text, "{\"type\":\"user_change\",\"user\":{\"id\": ....}}
** Stacktrace: [{Map, :get, [nil, "UABCXYZ", %{}], [file: 'lib/map.ex', line: 450]}, {Access, :"-key/2-fun-0-", 5, [file: 'lib/access.ex', line: 462]}, {Map, :get_and_update, 3, [file: 'lib/map.ex', line: 837]}, {Access, :get_and_update, 3, [file: 'lib/access.ex', line: 346]}, {Kernel, :update_in, 3, [file: 'lib/kernel.ex', line: 2351]}, {Slack.Bot, :websocket_handle, 3, [file: 'lib/slack/bot.ex', line: 156]}, {:websocket_client, :handle_websocket_frame, 2, [file: '/home/foo/deps/websocket_client/src/websocket_client.erl', line: 465]}, {:gen_fsm, :handle_msg, 8, [file: 'gen_fsm.erl', line: 497]}]
acconrad commented 4 years ago

Hi @keathley part of the upgrade was due to how the Slack API operates. Needing to switch functions was a requirement coming from Slack rather than how we wanted to connect to the app. It is necessary to reduce the amount of bookkeeping in favor of making specific calls to things like Users and Channels.

Further, rate limits have been more strictly enforced over the last few months with the need for pagination for collections is now a standard for most standard models in Slack.

Do you have a recommendation for how you would like this fixed? Could you add in a PR to resolve this?

keathley commented 4 years ago

Sorry, I'm not questioning why these calls were remove. I understand the reasoning for reducing the bookkeeping that this library is doing. I'm confused about what the expectations are for users of this library. More importantly, I'm wondering if this is a bug or if I'm doing it wrong.

There is no supported way to update the slack bot's internal state. You could issue all of these calls when the bot boots and store it in the user's state. But, as stated in the issue, the bot expects this data to be in its internal state. It's also not possible to issue each of these calls on a per-message basis due to the rate limits. It seems to me that the proposed solution is for the user to manage whatever slack state they require. If that's the case then it seems that there's no point in having the bot do any bookkeeping. Which is fine. But it's not clear what's expected of the user here.

For Fawkes, I ended up writing my own slack adapter that manages the bits that I need.

acconrad commented 4 years ago

Aha okay makes more sense @keathley I believe I had addressed this in the breaking changes update but essentially you'll need to make individual calls to grab the latest data. So if the original one grabbed all of the users and all of the channels, you'll need to make individual calls now to Channels.list and Users.list

keathley commented 4 years ago

@acconrad Sorry, I'm still not being clear. The issue is that the bot crashes whenever a user joins or leaves a channel because the bot is trying to update its internal state which is no longer created.

barkerja commented 4 years ago

The issue is that the bot crashes whenever a user joins or leaves a channel because the bot is trying to update its internal state which is no longer created.

Joining the conversation to add my personal confusion on this topic.

I recently setup a bot and noticed it kept crashing on a lot of user status changes, joins, etc. After reading the documentation and recent changes, I thought it was simply a matter of making those additional Slack calls in your handle_connect/2 callback to populate the state, which to my surprise, did nothing to help the issue.

entertainyou commented 4 years ago

I think this issue could be fixed by #223 , at least the case for @dylanmei

dylanmei commented 4 years ago

Oh nice! I've been running a fork that has almost this exact change. There are still a couple messages that crash the bot like this:

Websocket client Slack.Bot terminating in :websocket_handle/3
   for the reason :error:%ArgumentError{message: "could not put/update key :name on a nil value"}

The messages that I have logged are:

acconrad commented 4 years ago

@dylanmei I'm pretty active on approving PRs so if you have a proposal on how to fix it please go ahead and submit and we'll get it up on Hex ASAP!

entertainyou commented 4 years ago

I'll see if I can reproduce and maybe fix issues @dylanmei provided.

acconrad commented 4 years ago

@entertainyou @dylanmei just pushed another fix so if you pull 0.23.4 that may resolve this Issue here and we can close this?

dylanmei commented 4 years ago

:+1: My bot's workspace has 12k users and 9k channels. After 3 hours on weekday I've seen no issues.

@entertainyou @acconrad thank you!

acconrad commented 4 years ago

Great I'm going to close this issue since that's a pretty hefty test to ensure the bot isn't crashing