grafana / oncall

Notifications when Alert is Resolved #3057

Open mattparkes opened 1 year ago

mattparkes commented 1 year ago

What would you like to see?

I would like to see a way to receive a notification when an alert is resolved. This is a standard feature offered by the vast majority of competitors in this space.

Whilst it is technically possible to throw something together with an Outgoing Webhook with a Trigger Type of Resolved, I think this should be a first-party feature.

I would expect this to be configured under Alerts IRM > On-Call > Users > View My Profile (e.g. /a/grafana-oncall-app/users/me), with a section below Default Notifications and Important Notifications called something like Resolution Notifications, containing a single Notify By dropdown (where I can select all the typical things like Slack/SMS/Email/Phone etc.).

Presumably there would either be a "None" option here for people who don't want resolution notifications, or a toggle box to turn this feature on for that user.

Product Area

Alert Flow & Configuration

github-actions[bot] commented 1 year ago

The current version of Grafana OnCall, at the time this issue was opened, is v1.3.38. If your issue pertains to an older version of Grafana OnCall, please be sure to list it in the PR description. Thank you :smile:!

roock commented 1 year ago

Related to #2926

ravishankar15 commented 10 months ago

I checked the code. The separation of default vs. important notifications is implemented as a boolean, which makes it tricky to add another notification type to the user settings. So I feel some groundwork needs to be done before we raise a PR for the new implementation. The way I am thinking is:

  1. Add a category field to UserNotificationPolicy that holds the default vs. important distinction. For any new user, both the important and category fields will be populated correctly.
  2. For existing users the category field will be null, so we need to run a migration that populates it from the important field (category=0 if important is false, category=1 if important is true). (Up to this point we are not making use of the category field.)
  3. Update the code to deprecate the important flag in favor of the category field.
  4. Raise a PR implementing Resolution Notifications as a new category in UserNotificationPolicy.
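
To make the backfill in step 2 concrete, here is a minimal standalone sketch of the mapping. It is not the code from the PR: `PolicyRow`, `backfill_category`, and the two constants are hypothetical names used only for illustration, and a real change would presumably be written as a Django data migration against the actual model.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical constants mirroring the mapping described in step 2.
CATEGORY_DEFAULT = 0
CATEGORY_IMPORTANT = 1


@dataclass
class PolicyRow:
    """Illustrative stand-in for a UserNotificationPolicy row (assumed shape)."""
    important: bool
    category: Optional[int] = None  # null for rows created before the new field


def backfill_category(rows: List[PolicyRow]) -> int:
    """Fill category from important for rows where it is still null.

    Returns the number of rows that were updated. Rows that already have a
    category are left untouched, so running the backfill twice is harmless.
    """
    updated = 0
    for row in rows:
        if row.category is None:
            row.category = CATEGORY_IMPORTANT if row.important else CATEGORY_DEFAULT
            updated += 1
    return updated
```

Skipping rows that already have a category keeps the backfill idempotent, which matters if new users are being created with both fields populated while the migration runs.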

I have raised a PR for (1 & 2) https://github.com/grafana/oncall/pull/3591

I can raise the follow-up PRs once we decide to move forward with this approach.

Konstantinov-Innokentii commented 7 months ago

Hi! Here is how you can make OnCall post messages to a Slack thread when an AlertGroup is resolved, using an Outgoing webhook.

  1. Install the Incoming WebHooks Slack app to get a URL to post messages to.
  2. Create an Outgoing webhook in OnCall, set its trigger type to "Resolved", and use the URL you obtained in the first step.
  3. I used this webhook data template to form the message payload:
    {%- set payload = {} -%}
    {%- set payload = dict(payload, **{"text": "Resolved "+alert_group.title}) -%}
    {# encode payload dict to json #}
    {{ payload | tojson }}

    This workaround doesn't mean that the feature described here is irrelevant; it's just meant to provide an alternative solution.
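
For anyone who wants to check the Slack side before wiring up OnCall, the request the template above produces can be reproduced by hand. This is only a rough sketch: `build_resolved_payload` and `post_to_slack` are made-up helper names, the webhook URL is whatever the Incoming WebHooks app gave you, and it assumes the only field sent is `text`, as in the template.

```python
import json
import urllib.request


def build_resolved_payload(alert_group_title: str) -> bytes:
    """Build the same JSON body as the data template: {"text": "Resolved <title>"}."""
    return json.dumps({"text": "Resolved " + alert_group_title}).encode("utf-8")


def post_to_slack(webhook_url: str, alert_group_title: str) -> int:
    """POST the payload to a Slack incoming webhook URL and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=build_resolved_payload(alert_group_title),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Calling `post_to_slack(url, "High CPU")` with your own webhook URL should post a "Resolved High CPU" message to the channel the webhook is bound to.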

6fears7 commented 2 months ago

Having a Resolved message sent through would be a huge boost to Outgoing Webhooks.

Currently, the logic only allows webhooks of type "Escalation step" to be used.

https://github.com/grafana/oncall/blob/22cd4b86fc59261e7db98ebd7d6f8a0a0a0e7420/grafana-plugin/src/components/Policy/EscalationPolicy.tsx#L478

We found a bug that lets us select alternate options such as "Alert Group Status Change" as the webhook type in the integration, which lets us receive both a firing alert and a resolved alert in our escalation chain.

Having this functionality as the default in the webhook integration setup would go a long way toward improving the experience.