centrifugal / centrifugo

Scalable real-time messaging server in a language-agnostic way. Self-hosted alternative to Pubnub, Pusher, Ably. Set up once and forever.
https://centrifugal.dev
Apache License 2.0

[question] Is there any way to collect events through webhooks or whatever? #195

Closed YOxan closed 3 years ago

YOxan commented 6 years ago

I would like to use the Centrifugo project, but I didn't find an endpoint for being notified about internal system events. There are 'channel info' and server stats, but polling them (making a request to the API every second, for example) is not a good way to collect metrics. Is there any way to collect connection/disconnection/etc. events with, let's say, Redis or another pub-sub broker? We need them to control channels from our backend.

FZambia commented 6 years ago

@YOxan hi! Unfortunately there is no way to do this at the moment. Could you describe your use case in detail? Why do you need to control channels from the backend, and how can events help with this?

YOxan commented 6 years ago

@FZambia thank you for your reply! We wanted to use it for communication between the participants of a room. The channels will be bound to an internal room model. A participant has one of two roles: roleA or roleB. We need to track connection/disconnection events of users with these roles, both to collect them for further analysis and to close the room if the participant with roleA is absent from the room (the channel) for more than 1 minute, for example. Maybe you know some way we could deal with that? Thank you!

FZambia commented 6 years ago

When a user with role A enters a channel, you can periodically send a request from the frontend to your backend indicating their presence in that room - you just update a time field in your main database. Then on the backend side you can have a cron job that deletes rooms that have had no activity for some time.
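A minimal sketch of the heartbeat-plus-cron idea above. The names `last_seen`, `record_heartbeat` and `close_stale_rooms` are hypothetical, and a plain dictionary stands in for the time field in the main database; none of this is part of Centrifugo.

```python
import time

# Hypothetical in-memory stand-in for the "time field in your main database":
# room_id -> unix timestamp of the latest heartbeat from the role A participant.
last_seen = {}


def record_heartbeat(room_id, now=None):
    """Called by the backend endpoint that the role A frontend pings periodically."""
    last_seen[room_id] = time.time() if now is None else now


def close_stale_rooms(timeout_seconds=60, now=None):
    """Cron-style sweep: remove rooms whose last heartbeat is older than the timeout."""
    now = time.time() if now is None else now
    stale = [room for room, seen in last_seen.items() if now - seen > timeout_seconds]
    for room in stale:
        del last_seen[room]
    return stale
```

Because the sweep only looks at timestamps, a lost heartbeat delays nothing permanently: the next heartbeat refreshes the room, and a room with no heartbeats at all is eventually closed.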

You can also periodically ask Centrifugo for presence in the room, but I personally don't like this idea - it does not look like an efficient or elegant way.

Relying on a connect/disconnect event stream seems like the most unreliable solution, because as soon as you miss one message you will get a room hanging in the open state. I.e. you will need some extra mechanism to check it anyway.

Hope I understood your case right.

YOxan commented 6 years ago

Ok, I see. That's an option, I just thought there was some predefined mechanism to notify a third-party service from Centrifugo. Thank you.

FZambia commented 6 years ago

@YOxan hi, what is the status of this? What did you finally end up with?

YOxan commented 6 years ago

Hello! We have found that we can't go without events from the message exchange part, so it's reasonable to look for something else or to reinvent the wheel ourselves :) Thank you!

cheddarwhizzy commented 6 years ago

I have a use case where I'd like to know on the backend if a user has connected or disconnected. Right here https://github.com/centrifugal/centrifugo/blob/80db50873c1b5478aac3eaf5dc012b1389c808c3/libcentrifugo/server/handlers.go#L166 is where I'd like to trigger a webhook when the client disconnects. I could also add a webhook when the client connects as well.

I'll get a PR going if this is something we can get into master. Otherwise I'll probably end up building a custom solution.

I'm basically using a single connection per client, and broadcasting the messages to connected clients. If a user has either lost connection or closed the app, I'd like my backend to be aware so I can take the necessary actions (send notifications of new messages to that user, set user status to "away" or "online", etc.)

FZambia commented 6 years ago

@cheddarwhizzy hello, I won't accept such a PR without a full proposal that takes some points into account:

There could be some other tricky points; at the moment I am not convinced that this must be added to Centrifugo.

cheddarwhizzy commented 6 years ago

In regards to a failing Centrifugo node, wouldn't the node fail and not trigger any handlers? Therefore no webhooks would be triggered. As for the massive reconnect after a failure, if there is a "state" for each client stored somewhere (Redis?), then Centrifugo technically could disregard this reconnect if the client ID hasn't changed. But if the outage lasted even 5 seconds, that's potential for lots of lost messages. So triggering the reconnect webhooks could be a good thing depending on the application.

As for the webhook not being received and handled properly in my application, I don't see that being a concern for Centrifugo. Just like if my Jenkins is down when a dev commits to GitHub, not GitHub's problem :)

Is this something we can enable via config option(s)?

FZambia commented 6 years ago

As soon as this is merged it becomes Centrifugo's problem and mine. So a real proposal with concrete solutions is required here.

You are right - when a node is killed there won't be any hooks at all. But on a controlled shutdown Centrifugo tries to gracefully disconnect users. Both cases must be handled somehow.

A client ID is issued by Centrifugo for every new session, and client state is mostly in process memory. A client can subscribe on another node after reconnecting.

cheddarwhizzy commented 6 years ago

Thanks for the feedback. Here is my attempt at a proposal. I used the golang proposal template. Hopefully this covers what you're looking for :)

Abstract

Implement webhooks on configured Centrifugo events, enabling other server-side services to react accordingly.

Background

Currently, Centrifugo is in its own bubble. Whatever happens in Centrifugo stays in Centrifugo. This can be an issue if you have a real-time application that needs to know when users connect, disconnect, or publish messages. For example, if a user connects, change his/her status to "online" and send all relevant data from the backend to the user via the websocket connection. If a user disconnects, send relevant data via push notification, email, etc. and set the user to "offline". Another example: if a user publishes a message directly to the Centrifugo API rather than POSTing to your API, which then pushes to Centrifugo, you may want to relay that message to a data warehouse for machine learning or other purposes.

Proposal

Add a "hooks" or "webhooks" package and add it to the relevant handlers. Config to enable this functionality could be

  "webhooks": true,
  "webhook_url": "https://some-service.domain.com/webhook",
  "enabled_webhooks": ["SockJsConnect", "SockJsDisconnect", "RawWsConnect", "RawWsDisconnect", "PublishMessage", "BroadcastMessage", "Unsubscribe", "Subscribe"]

Having separate webhook URLs per event is another option for higher performance. Omitting "url" could default to a single endpoint:

  "default_webhook_url": "https://default-webhook-service.domain.com/webhook",
  "webhooks": [
    {
      "enabled": true,
      "name": "SockJsConnect",
      "url": "https://connect-webhook-service.domain.com"
    },
    {
      "enabled": false,
      "name": "BroadcastMessage",
      "url": "https://message-webhook-service.domain.com"
    },
    {
      "enabled": true,
      "name": "Subscribe"
    }
  ]
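To illustrate how the per-event config with a default URL could be interpreted, here is a rough sketch. The function `resolve_webhook_url` and the constant `EXAMPLE_CONFIG` are hypothetical; the proposal does not specify lookup rules, so treating a missing or disabled entry as "no webhook" is an assumption.

```python
# The proposed config, expressed as a Python dict for illustration only.
EXAMPLE_CONFIG = {
    "default_webhook_url": "https://default-webhook-service.domain.com/webhook",
    "webhooks": [
        {"enabled": True, "name": "SockJsConnect",
         "url": "https://connect-webhook-service.domain.com"},
        {"enabled": False, "name": "BroadcastMessage",
         "url": "https://message-webhook-service.domain.com"},
        {"enabled": True, "name": "Subscribe"},
    ],
}


def resolve_webhook_url(config, event_name):
    """Return the URL to notify for event_name, or None if no hook should fire.

    Assumed rules: an event with no entry, or with "enabled" false, has no
    webhook; an enabled entry without "url" falls back to the default URL.
    """
    for hook in config.get("webhooks", []):
        if hook.get("name") == event_name:
            if not hook.get("enabled", False):
                return None
            return hook.get("url") or config.get("default_webhook_url")
    return None
```

With this config, "Subscribe" resolves to the default endpoint, while "BroadcastMessage" resolves to nothing because it is disabled.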

When decommissioning a node gracefully, the webhooks for connect/disconnect should be ignored because the clients will reconnect to another available node.

When a node drops out due to being unresponsive, the clients would reconnect to a new node, and the connect webhook should also be ignored since they were previously connected. The client has "isResubscribe" for this. If the app is using raw websocket, this will have to be handled by the webhook service or via connection parameters such as wss://centrifugo.domain.com/connection/websocket?notifyConnect=false
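As a rough sketch of the connection-parameter idea, the check below decides whether a connect webhook should fire. The function `should_notify_connect` is hypothetical, and both the `notifyConnect` parameter and the way a resubscribe flag would reach the server are assumptions taken from this proposal, not existing Centrifugo behaviour.

```python
from urllib.parse import parse_qs, urlparse


def should_notify_connect(connection_url, is_resubscribe=False):
    """Decide whether a connect webhook should be sent for this connection.

    A resubscribing client is treated as already known, and a client can
    opt out explicitly with ?notifyConnect=false in the connection URL.
    """
    if is_resubscribe:
        return False
    params = parse_qs(urlparse(connection_url).query)
    # Missing parameter means "notify"; the last value wins if it is repeated.
    flag = params.get("notifyConnect", ["true"])[-1]
    return flag.lower() != "false"
```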

Rationale

I can't think of any alternate approaches to this problem other than writing a custom websocket service that has this functionality. A proxy service that sits in front of Centrifugo for handling direct publishes/broadcasts could work, but is not ideal. Having a second websocket connection to another service for handling online/offline status is another option, but then you might as well build a custom solution. There aren't really any disadvantages to having this webhook functionality as a feature of Centrifugo that I can think of.

Compatibility

This change should not affect existing functionality. This means any users currently using Centrifugo should be able to upgrade to this version using their existing config without running into any issues. Webhooks would be configurable via the config.json file.

Implementation

I could begin this after the MVP release of the app I'm working on as this isn't a make or break feature. Development would begin about the start of Feb 2018.

cheddarwhizzy commented 6 years ago

After doing a bit more research, I came across a project called Melody which has functions for each of these events so it seems to be way less work to create a custom web socket api using Melody vs diving into Centrifugo code. If you like the proposal and think this would be a good feature for Centrifugo, I’d be happy to contribute cuz I really like this project. If not feel free to close this issue and thank you for your time :) cheers!

FZambia commented 6 years ago

@cheddarwhizzy thanks for your work! Your proposal misses some important parts about how an application should behave and adapt to this behaviour in order to work reliably with connections. For example, how are you personally going to deal with missed hooks? If you already have a plan and can describe it, I would really appreciate it.

Currently I am working on Centrifugo 2 in a separate c2 branch. There are some ideas for the new version release - one of which is to let Go developers use Centrifugo as a library. This is not a simple task, but I hope I'll succeed in this. If yes - it will look similar to Melody in some aspects, though different and more difficult because of lots of Centrifugo-specific background and features. This could theoretically give a Go developer the possibility to add custom logic to internal events. If you want to participate and help with the library API design, we can discuss it further (in some chat, on Gitter for example - just write me a message there).

cheddarwhizzy commented 6 years ago

I actually wasn't considering a failure mechanism in Centrifugo for a failed POST request to the webhook endpoint. I'd imagine if the webhook endpoint went down, there could be tons of failed requests and data that would need to be queued for retry, and that'd be a bit overkill for what I was thinking. An HA webhook endpoint that processes the data could handle retries by sending data straight to a bus (Kafka, RabbitMQ, or NSQ) and processing the messages accordingly using workers/consumers. With that said, webhooks were more of a quick and dirty way of getting data across services written in any language that speaks HTTP. And I was piggy-backing off this issue hehe.
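A minimal sketch of the "endpoint enqueues, workers retry" pattern described above. An in-process `queue.Queue` stands in for the bus (Kafka, RabbitMQ, or NSQ), and the names `receive_webhook` and `drain` are hypothetical; a real setup would use the bus client's own acknowledgement and redelivery features.

```python
import queue

# Stand-in for the message bus; each item carries its own attempt counter.
events = queue.Queue()


def receive_webhook(payload):
    """Webhook endpoint body: enqueue the event and acknowledge immediately."""
    events.put({"payload": payload, "attempts": 0})
    return 202  # HTTP "Accepted": the event is queued, not yet processed


def drain(handler, max_attempts=3):
    """Worker loop: process queued events, re-queueing failures up to max_attempts."""
    processed, dropped = [], []
    while not events.empty():
        item = events.get()
        item["attempts"] += 1
        try:
            handler(item["payload"])
            processed.append(item["payload"])
        except Exception:
            if item["attempts"] < max_attempts:
                events.put(item)  # retry later, behind the other queued events
            else:
                dropped.append(item["payload"])  # give up after max_attempts
    return processed, dropped
```

This keeps retry logic out of the sender entirely, which matches the point being made: the webhook call itself stays fire-and-forget, and durability is the receiving side's responsibility.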

The library idea would probably perform better. I'd imagine another websocket connection straight to the Centrifugo server subscribing to all or selected events, or a grpc implementation. The only downside to the library is that it'll need libs for all languages. Using any websocket client (from say a nodejs service) or having a RPC server to receive events is what immediately comes to mind, but I'm sure you've thought about it a lot more and have some solid ideas.

FZambia commented 5 years ago

With the Centrifugo v2 release this issue is not fixed. I am still not sure webhooks should be in Centrifugo. It is possible to implement a system similar to Centrifugo using the Centrifuge library - building webhooks on top of it looks like a simple task (at least in a simple MVP form). If someone comes up with a nice webhooks design on top of the Centrifuge library, we can discuss whether it is possible to backport it to Centrifugo. This is the best advice I can give at this moment.

tlof commented 5 years ago

I would like to add one more idea about this. Forget the webhooks. Only put events in a Redis queue.

Pro:

Events I am interested in: connect/disconnect (ip, userid, connectionid) and messages (userid, channel, connectionid, message content). If you put these in two separate queues and let me enable or disable writing each kind of event, then when I don't care about the logon/logoff events and only want the sent messages, I don't need to write a simple daemon that just drops messages from the connect/disconnect queue.
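A rough sketch of the two-queue idea with per-queue switches. The class `EventSink` is hypothetical, and `collections.deque` objects stand in for two Redis lists; with a real Redis client the append would be a list push instead.

```python
import json
from collections import deque


class EventSink:
    """Writes events to one of two queues, each of which can be switched off."""

    def __init__(self, connection_events=True, message_events=True):
        self.enabled = {"connection": connection_events, "message": message_events}
        # Stand-ins for two separate Redis lists.
        self.queues = {"connection": deque(), "message": deque()}

    def emit(self, kind, **fields):
        """Serialize and enqueue an event; return False if that queue is disabled."""
        if not self.enabled.get(kind, False):
            return False
        self.queues[kind].append(json.dumps(fields, sort_keys=True))
        return True
```

A consumer that only cares about messages would construct `EventSink(connection_events=False)`, so connect/disconnect events are never written and nothing has to drain them.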

FZambia commented 3 years ago

Closing.

We now have some hooks to control client connections - see the proxy feature.

I don't want to add hooks that will fail fast and can eventually lead to incorrect application behaviour without a straightforward way to deal with it. For example, disconnect hooks.

I also don't see a simple way to reliably send events to any queue. This can be useful for some sort of analytics – but not for business logic. There are some ideas to add a sort of real-time analytics with ClickHouse (see Centrifugo Pro). But this is a somewhat unrelated story.

Also, the Centrifuge library for Go can be used to build a server with the required hooks.