rampatra / jbot

Make Slack and Facebook Bots in Java.
GNU General Public License v3.0

Same events are received in all JBot instances #144

Open mmartinadanx opened 5 years ago

mmartinadanx commented 5 years ago

Hello,

I have created a Slack bot and deployed it to a K8s cluster with multiple instances. When using RTM, all instances receive the same events and therefore all reply with the same messages.

Is this the expected behaviour? Is there any feature that allows JBot instances to compete for messages, so that an event is only received by a single instance?

Thanks!

rampatra commented 5 years ago

Hi @mmartinadanx, this is an interesting problem. Is there some kind of load balancer in front of all the bot instances?

Also, do keep in mind that, in a typical setup, each bot instance opens up a web socket connection with Slack, and not your regular HTTP connection, so whenever Slack sends an event, it is forwarded to all the bot instances.

I know about scaling applications that communicate over HTTP, but I am no expert when it comes to WebSocket scaling. Could you have a read of these: https://deepstreamhub.com/blog/load-balancing-websocket-connections/ and https://stackoverflow.com/questions/12526265/loadbalancing-web-sockets and let me know if they make sense?

I will update the ticket if I find a better solution.

mmartinadanx commented 5 years ago

Thanks for your prompt reply, @rampatra.

I think this should be managed by the application, preferably by Slack. Running many instances of the same service is a typical approach for high availability and load balancing, but my expectation was a logical partitioning of the events so that ordering guarantees are kept (i.e. messages from the same conversation are always received by the same instance). That can't be done by a LB.
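To make the idea concrete, here is a rough sketch of what such a partition could look like if it were done on the bot side instead. This is only an assumption about one possible workaround, not something JBot or Slack provides: `ChannelPartitioner`, `instanceIndex` and `instanceCount` are made-up names, and each replica is assumed to know its own index and the total replica count. Every instance still receives every event, but only the one that owns the channel acts on it.

```java
/**
 * Hypothetical helper, not part of JBot: every instance still receives every
 * RTM event, but only the instance that "owns" the channel acts on it.
 */
final class ChannelPartitioner {
    private final int instanceIndex; // this replica's index (assumed to be known, e.g. from config)
    private final int instanceCount; // total number of replicas (assumed to be fixed)

    ChannelPartitioner(int instanceIndex, int instanceCount) {
        if (instanceCount <= 0 || instanceIndex < 0 || instanceIndex >= instanceCount) {
            throw new IllegalArgumentException("instanceIndex must be in [0, instanceCount)");
        }
        this.instanceIndex = instanceIndex;
        this.instanceCount = instanceCount;
    }

    /** Maps a channel id to exactly one replica; floorMod keeps negative hashes in range. */
    static int ownerOf(String channelId, int instanceCount) {
        return Math.floorMod(channelId.hashCode(), instanceCount);
    }

    /** True if this replica should handle events coming from the given channel. */
    boolean shouldHandle(String channelId) {
        return ownerOf(channelId, instanceCount) == instanceIndex;
    }
}
```

Because the owner depends only on the channel id, all messages of one conversation end up on the same instance. The obvious drawbacks are that ownership shifts whenever the replica count changes, and that a crashed instance leaves its channels unanswered until it comes back.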

I will update this as well if I find something. Thanks again.

rampatra commented 5 years ago

Hey, now that I am thinking again, I realized that each bot instance will first make a call to https://slack.com/api/rtm.connect?token={token} with the token and each of them will get a different socket url to connect to. So, as you said, it would make sense to me if Slack sent events only to the instance where the request came from, but I am not sure why Slack doesn't work this way. Am I making sense?
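For reference, the handshake I am describing looks roughly like this. It is a minimal sketch using the JDK's `java.net.http` client (Java 11+), not JBot's actual code; the class and method names are made up, and the regex is a naive stand-in for a real JSON parser. The only thing it relies on is that the response carries the socket address in a `url` field.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only; class and method names are made up for this sketch. */
final class RtmConnectSketch {
    // Naive extraction of the "url" field; a real bot would use a JSON library.
    private static final Pattern URL_FIELD = Pattern.compile("\"url\"\\s*:\\s*\"([^\"]+)\"");

    /** Builds the rtm.connect request URI in the form quoted above. */
    static URI connectUri(String token) {
        String encoded = URLEncoder.encode(token, StandardCharsets.UTF_8);
        return URI.create("https://slack.com/api/rtm.connect?token=" + encoded);
    }

    /** Pulls the WebSocket URL out of an rtm.connect response body, if present. */
    static Optional<String> socketUrl(String responseBody) {
        Matcher m = URL_FIELD.matcher(responseBody);
        // Some JSON encoders escape forward slashes as "\/", so undo that if present.
        return m.find() ? Optional.of(m.group(1).replace("\\/", "/")) : Optional.empty();
    }

    /** Performs the call; every replica that runs this receives its own socket URL. */
    static Optional<String> fetchSocketUrl(String token) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(connectUri(token)).GET().build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return socketUrl(response.body());
    }
}
```

So with three replicas there are three independent calls and three open sockets for the same bot token, and that is the situation in which every instance ends up seeing every event.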

And regarding this message of yours:

Is there any feature that allows JBot instances to compete for messages, so that an event is only received by a single instance?

I am not sure that I follow. Each individual instance should work as if there are no other instances. They shouldn't have any knowledge of (or dependency on) the other instances. If I am not wrong, one way to achieve what you're describing is sticky sessions. If that's the case, it has to be done in a software-based load balancer that keeps track of where each request came from. Having said that, I have no idea how I would do that with web sockets.

mmartinadanx commented 5 years ago

As you say, each individual instance should work as if there are no other instances. This is why I think this feature should be provided by Slack. I couldn't find it in the Slack API and was hoping that you had already faced this issue.

Some event stream platforms use the concept of partitioning and consumer groups. This allows consumers to subscribe to event channels competing with each other (the event is only sent to one consumer in the group) or in a pub-sub fashion (events are sent to all consumers).
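In the absence of such a feature, the competing mode could perhaps be approximated inside the bot: every instance receives the event, but only the first one to claim it in a shared store replies. The sketch below is just that idea under stated assumptions. `ClaimStore`, `InMemoryClaimStore` and `CompetingHandler` are hypothetical names, the in-memory set stands in for a store that all replicas can reach (for example a database table with a unique key), and the channel id plus the message `ts` are assumed to identify a message.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Stand-in for a store shared by all replicas; hypothetical, not a JBot or Slack API. */
interface ClaimStore {
    /** Returns true only for the first caller that claims the given key. */
    boolean claim(String key);
}

/** In-memory version for illustration; it only works inside a single JVM. */
final class InMemoryClaimStore implements ClaimStore {
    private final Set<String> claimed = ConcurrentHashMap.newKeySet();

    @Override
    public boolean claim(String key) {
        // add() is atomic and returns false if the key was already present.
        return claimed.add(key);
    }
}

/** Wraps message handling so that only one replica replies to each message. */
final class CompetingHandler {
    private final ClaimStore store;
    private final String instanceName;

    CompetingHandler(ClaimStore store, String instanceName) {
        this.store = store;
        this.instanceName = instanceName;
    }

    /** Returns the reply this instance would send, or null if another instance claimed the message. */
    String onMessage(String channel, String ts, String text) {
        if (!store.claim(channel + ":" + ts)) {
            return null; // someone else got there first, stay silent
        }
        return instanceName + " handled: " + text;
    }
}
```

This gives the "only one consumer in the group" behaviour, but unlike partitioning it does not keep messages of the same conversation on the same instance, so ordering is not guaranteed.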

I'm not sure about how a LB can achieve this, but I'll investigate.

rampatra commented 5 years ago

Sorry, I run just one instance of JBot at this time. My load isn't big enough to require more than one instance, so I have not hit this myself, but I think it is an interesting problem to solve. Thanks for reporting this.

Yes, if Slack doesn't handle this on its side, we have to achieve it somehow with a load balancer that keeps track of where the messages came from. Or there may be a better way. Let's keep digging. We can post our findings here.

rajeshkarnena commented 4 years ago

Hi @rampatra @mmartinadanx I'm facing the same issue. Just bumping up this thread to know if you guys found any solution to this or an alternative to get this working? Any help is much appreciated. Many thanks.