Slack Connector improvements

elastic / kibana

Why? Slack is a common communication tool across organizations, and our customers' teams use Slack channels for alert IR, escalations, investigations, and more.

What? To support these customers' use cases and streamline communication and alert resolution over Slack, several improvements are needed. Currently, our Slack connector is dedicated to a single channel, so customers have to set up and manage several connector instances for their Slack channels (asking admins for credentials, etc.), which becomes inconvenient when there are dozens of channels.

How? The following use cases should be supported:

Slack setup (Connector page & Action config):

- A single Slack connector is defined for all Slack channels. This means the target channel should be part of the action config.
- The webhook URL disappears after saving the connector config. We should keep it there rather than clearing this input field.
- The Slack action config should provide the full list of existing channels for the Slack connector instance, to help users find the right Slack channel. A text filter on this list should be supported as well.
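The text filter on the channel list could be sketched as below. This is a minimal illustration, not a proposed implementation: the `SlackChannel` shape and the `filterChannels` name are hypothetical, and it assumes the channel list has already been fetched elsewhere (for example with the Web API's `conversations.list` method).

```typescript
// Hypothetical shape for a channel entry shown in the action config UI.
interface SlackChannel {
  id: string;
  name: string;
}

// Case-insensitive substring filter over channel names.
// An empty or whitespace-only query returns the full list unchanged.
function filterChannels(channels: SlackChannel[], query: string): SlackChannel[] {
  const needle = query.trim().toLowerCase();
  if (needle === "") {
    return channels;
  }
  return channels.filter(
    (channel) => channel.name.toLowerCase().indexOf(needle) !== -1
  );
}
```

Filtering on the client keeps the action form responsive, at the cost of loading every channel up front; a workspace with a very large number of channels might need pagination instead.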

Slack message behavior:
- When the associated Kibana alert is recovered, another Slack action should be able to fire in order to mark the original Slack message as recovered.
- We should support better message visualization (see some examples below) based on Slack templates:
  - Colors for active/recovered alerts
  - Buttons: link to Kibana alert, Recover
  - Field name: value
  - Message structuring

The following fields should be available for the send payload:
- Source name
- Rule metadata (detect time, rule name, id)
- Raw alert fields

[Nice to have] The Slack action should support a new option to create the channel (as part of the action trigger) if it doesn't exist, in order to support dynamic channels. The name input should support alert field params. For example: when a rule detects an alert for Host X, the Slack action is defined with a dynamic name, e.g. "Alert.host- investigating", and if this channel doesn't exist, the action creates it. We should have a checkbox in the connector setup on the Connector page that enables this feature, to reduce cases where users do this by mistake as part of the action config.
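A rough sketch of how a dynamic channel name might be rendered from alert fields follows. Everything here is an assumption for illustration: the `{{field}}` placeholder syntax, the `renderChannelName` name, and the flat field map are hypothetical, and the sanitizing step reflects Slack's channel naming rules as I understand them (lowercase letters, digits, hyphens and underscores, at most 80 characters), which should be checked against the Slack docs.

```typescript
// Hypothetical: alert fields as a flat map keyed by dotted path.
type AlertFields = Record<string, string | undefined>;

// Render a channel-name template such as "{{alert.host}}-investigating",
// then normalize the result into something Slack is likely to accept.
function renderChannelName(template: string, fields: AlertFields): string {
  const rendered = template.replace(
    /\{\{\s*([\w.]+)\s*\}\}/g,
    (_match: string, key: string): string => {
      const value = fields[key];
      return value === undefined ? "" : value; // unknown fields render as empty
    }
  );
  return rendered
    .toLowerCase()
    .replace(/[^a-z0-9_-]+/g, "-") // replace disallowed characters
    .replace(/-+/g, "-") // collapse runs of hyphens
    .replace(/^-+|-+$/g, "") // trim leading/trailing hyphens
    .slice(0, 80); // assumed maximum length
}
```

The rendered name would then be looked up in the channel list, and created (for example via the Web API's `conversations.create`) only when the connector-level checkbox is enabled.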

A couple of notes:

> A single Slack connector is defined for all Slack channels. This means the target channel should be part of the action config.

We are currently using the "Incoming Webhooks API" to post Slack messages, via the @slack/webhook package at https://www.npmjs.com/package/@slack/webhook . This API forces you to pick the channel when you register the webhook, so we'll need to use a newer API - I believe that's referred to as the "Web API".
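To make the difference concrete, here is a minimal sketch of what a Web API request could look like, with the channel supplied per action rather than baked into a webhook. It only builds the request and does not send it. The `SlackConnectorSecrets` and `SlackActionParams` shapes and the `buildPostMessageRequest` name are hypothetical; `chat.postMessage` and bearer-token auth are from the Slack Web API.

```typescript
// Hypothetical connector-level secret: one token for all channels.
interface SlackConnectorSecrets {
  token: string;
}

// Hypothetical action-level params: the channel is chosen per action.
interface SlackActionParams {
  channel: string;
  text: string;
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

const SLACK_API_BASE = "https://slack.com/api";

// Build a chat.postMessage request. apiBase is overridable only as a
// placeholder for the "customized URL" question; that need is unconfirmed.
function buildPostMessageRequest(
  secrets: SlackConnectorSecrets,
  params: SlackActionParams,
  apiBase: string = SLACK_API_BASE
): HttpRequest {
  return {
    url: `${apiBase}/chat.postMessage`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${secrets.token}`,
      "Content-Type": "application/json; charset=utf-8",
    },
    body: JSON.stringify({ channel: params.channel, text: params.text }),
  };
}
```

With this split, the token lives in the connector's secrets and the channel lives in the action config, which is what lets one connector serve many channels.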

> The webhook URL disappears after saving the connector config. We should keep it there rather than clearing this input field.

The reason for this is that the Webhook URL contains a secret, so we have to "hide" it during connector editing. It appears the newer Web API supports explicit auth (OAuth bearer tokens), so we likely won't have to hide the URL anymore - in fact, presumably it won't even be needed, though it's not clear if perhaps there are "on prem" Slack servers that would need a customized non-slack.com URL.

> When the associated Kibana alert is recovered, another Slack action should be able to fire in order to mark the original Slack message as recovered.

This would be nice, but seems super-hard; presumably we'd get some id back from the "active" message, which we could then use when posting the "recovered" message. That means we'd need to store that message id for every Slack action for every alert, persisted with the rule. Making it even harder, we currently "throw away" the results of running actions, so we'd need some way to "capture" the results, so they could be persisted for later use.
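As a sketch of the bookkeeping this would involve: a successful `chat.postMessage` response includes the `channel` and `ts` of the posted message, and `chat.update` takes that same pair to edit it. The in-memory `MessageRefStore` below is purely hypothetical and stands in for whatever persistence would actually live with the rule.

```typescript
// The (channel, ts) pair identifies a posted Slack message.
interface PostedMessageRef {
  channel: string;
  ts: string;
}

// Hypothetical in-memory stand-in for state persisted with the rule.
class MessageRefStore {
  private refs: Record<string, PostedMessageRef> = {};

  private keyFor(ruleId: string, alertId: string): string {
    return JSON.stringify([ruleId, alertId]);
  }

  // Called after the "active" message is posted.
  save(ruleId: string, alertId: string, ref: PostedMessageRef): void {
    this.refs[this.keyFor(ruleId, alertId)] = ref;
  }

  // Called on recovery; returns and forgets the stored reference.
  take(ruleId: string, alertId: string): PostedMessageRef | undefined {
    const key = this.keyFor(ruleId, alertId);
    const ref = this.refs[key];
    delete this.refs[key];
    return ref;
  }
}

// Build the JSON body for a chat.update call that marks the message recovered.
function buildRecoveredUpdate(
  ref: PostedMessageRef,
  text: string
): { channel: string; ts: string; text: string } {
  return { channel: ref.channel, ts: ref.ts, text };
}
```

The hard part noted above is unchanged: something has to capture the action result in the first place so that `save` has a `ts` to store.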

I wonder if we could instead "search" for those messages - it would be awesome if we could add "metadata" to a message, like the rule id, alert id, etc, and then search based on that. Otherwise perhaps the existing search API could be used to do that; presumably we'd have a ruleId/alertId key somewhere in the content we can search on.
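For what it's worth, my understanding is that the Web API does let `chat.postMessage` carry a `metadata` object (`event_type` plus `event_payload`), and that `conversations.history` can return it when `include_all_metadata` is set; this should be verified against the current Slack docs. Under that assumption, the lookup could be sketched as below. The event type string and the payload keys are hypothetical.

```typescript
// Hypothetical event type used to tag alert messages.
const ALERT_EVENT_TYPE = "kibana_alert";

interface AlertMessageMetadata {
  event_type: string;
  event_payload: { rule_id: string; alert_id: string };
}

// Minimal view of a message as returned in channel history.
interface HistoryMessage {
  ts: string;
  metadata?: { event_type: string; event_payload: Record<string, unknown> };
}

// Metadata to attach when posting the "active" message.
function buildAlertMetadata(ruleId: string, alertId: string): AlertMessageMetadata {
  return {
    event_type: ALERT_EVENT_TYPE,
    event_payload: { rule_id: ruleId, alert_id: alertId },
  };
}

// Find the previously posted message for a rule/alert pair, if any.
function findAlertMessage(
  messages: HistoryMessage[],
  ruleId: string,
  alertId: string
): HistoryMessage | undefined {
  for (const message of messages) {
    const meta = message.metadata;
    if (
      meta !== undefined &&
      meta.event_type === ALERT_EVENT_TYPE &&
      meta.event_payload["rule_id"] === ruleId &&
      meta.event_payload["alert_id"] === alertId
    ) {
      return message;
    }
  }
  return undefined;
}
```

This trades persisted state for an extra history read on recovery, and it only works while the original message is still within the history window that gets fetched.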

> We should support better message visualization (see some examples below) based on Slack templates

Yeah, using BlockKit. But are we expecting users are going to be typing in BlockKit JSON? In any case, this feels like a separable piece from the rest of the requirements here.
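One way to avoid users typing BlockKit JSON would be to generate it from the alert. The sketch below is only an illustration under assumptions: the `AlertSummary` shape, the function name, and the color values are hypothetical. It uses a message attachment for the color bar, a section block with `fields` for the "field name: value" pairs, and a button with a `url` for the link to Kibana.

```typescript
type AlertStatus = "active" | "recovered";

// Hypothetical summary of what the rule hands to the action.
interface AlertSummary {
  status: AlertStatus;
  ruleName: string;
  ruleId: string;
  detectedAt: string;
  kibanaUrl: string;
  fields: Record<string, string>;
}

interface SlackAttachment {
  color: string;
  blocks: Array<Record<string, unknown>>;
}

// Arbitrary example colors for the attachment's side bar.
const STATUS_COLORS: Record<AlertStatus, string> = {
  active: "#BD271E",
  recovered: "#00BFB3",
};

function buildAlertAttachment(alert: AlertSummary): SlackAttachment {
  // A section block accepts at most 10 field entries.
  const fieldBlocks = Object.keys(alert.fields)
    .slice(0, 10)
    .map((name) => ({ type: "mrkdwn", text: `*${name}:* ${alert.fields[name]}` }));

  const summary: Record<string, unknown> = {
    type: "section",
    text: {
      type: "mrkdwn",
      text: `*[${alert.status}] ${alert.ruleName}* (rule id ${alert.ruleId}, detected ${alert.detectedAt})`,
    },
  };
  if (fieldBlocks.length > 0) {
    summary["fields"] = fieldBlocks;
  }

  const actions: Record<string, unknown> = {
    type: "actions",
    elements: [
      {
        type: "button",
        text: { type: "plain_text", text: "View in Kibana" },
        url: alert.kibanaUrl,
      },
    ],
  };

  return { color: STATUS_COLORS[alert.status], blocks: [summary, actions] };
}
```

The result would be sent as one entry in the `attachments` array of the message. A "Recover" button is left out here because it would need an interactivity endpoint on the Kibana side, which is a separate design question.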

And in general, I'm wondering if we're going to need ANOTHER connector, as this one is going to be very different from the existing one.

Slack Connector improvements #147842