TwiN / gatus

⛑ Automated developer-oriented status page
https://gatus.io
Apache License 2.0

Repeating notifications #379

Open gouku opened 1 year ago

gouku commented 1 year ago

Describe the feature request

Ability to send repeated notifications every x min/hr if the endpoint is still offline/dead.

Why do you personally want this feature to be implemented?

We now have over 300 nodes monitored via Gatus, and we receive notifications via Slack and Telegram. One scenario is when some nodes go offline during the weekend: we always forget to fix them on the next workday.

I've tried some other services, and it seems to be a common feature to alert repeatedly if something is still wrong.

How long have you been using this project?

No response

Additional information

No response

TwiN commented 1 year ago

Sounds like a good idea!

Something like this:

    alerts:
      - type: slack
        enabled: true
        description: "healthcheck failed 3 times in a row"
        send-on-resolved: true
        repeating: true

That said, not all alerting providers should allow this. For instance, PagerDuty incidents need an acknowledgement on resolve, so we probably don't want to trigger multiple alerts simultaneously for PagerDuty. For Slack, Discord, Teams, Telegram, etc. it shouldn't be an issue.
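
A minimal sketch of the reminder decision described above, written in Python for brevity (Gatus itself is written in Go). Everything in it is an assumption for illustration: the `Alert` class, the `should_notify` function, the `repeat_interval` and `last_sent` fields, and the `NON_REPEATING_PROVIDERS` set are hypothetical names, not Gatus's actual implementation or configuration. It also assumes the failure threshold has already been reached whenever `still_failing` is true.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical: providers that should never receive reminders,
# e.g. because each incident needs an acknowledgement.
NON_REPEATING_PROVIDERS = {"pagerduty"}


@dataclass
class Alert:
    type: str                      # provider name, e.g. "slack"
    repeating: bool = False        # mirrors the proposed `repeating` flag
    # Hypothetical: the "every x min/hr" part of the request.
    repeat_interval: timedelta = timedelta(hours=1)
    # Hypothetical: when the last notification for this outage was sent.
    last_sent: Optional[datetime] = None


def should_notify(alert: Alert, still_failing: bool, now: datetime) -> bool:
    """Decide whether to send a notification for an ongoing failure."""
    if not still_failing:
        return False
    if alert.last_sent is None:
        # Nothing sent yet for this outage: send the initial alert.
        return True
    if not alert.repeating or alert.type in NON_REPEATING_PROVIDERS:
        # Reminders are disabled, or the provider should not get them.
        return False
    # Send a reminder only once the interval has fully elapsed.
    return now - alert.last_sent >= alert.repeat_interval
```

In this sketch the caller would set `last_sent` after each notification and reset it to `None` once the endpoint recovers. Keeping the provider exclusion separate from the `repeating` flag means a misconfigured PagerDuty alert would still not produce reminders.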

diamonddelt commented 1 year ago

Bumping this because we'd also be interested in having this feature, particularly for the MS Teams alerting hooks. We already use send-on-resolved, but if a service was down for 6+ days and the small team missed the initial message sent to a noisy MS Teams channel, it's easy to miss or forget unless it periodically reminds you it is still failing.

mxcd commented 6 months ago

From what I can see, #614 implements this functionality and is ready for review.