medic / cht-watchdog

Configuration for deploying a monitoring/alerting stack for CHT
GNU Affero General Public License v3.0

Clean up default format of alerts (if possible ;) #46

Open mrjones-plip opened 1 year ago

mrjones-plip commented 1 year ago

Now that we've enabled alerts (#14), it turns out they're very verbose.

This is what we see right now:

[FIRING:28] DB Fragmentation CHT (cht)
Firing
Value: B=27.055714740528494, C=1
Labels:
 - alertname = DB Fragmentation
 - db = _users
 - grafana_folder = CHT
 - instance = INSTANCE-HERE.medicmobile.org
 - job = cht
Annotations:
 - description = The [_users] database for the CHT Server [INSTANCE-HERE.medicmobile.org] is highly fragmented.
Source: https://allies-monitoring-alerting.dev.medicmobile.org/alerting/grafana/ot6lYCYVz/view?orgId=1
Silence: https://allies-monitoring-alerting.dev.medicmobile.org/alerting/silence/new?alertmanager=grafana&matcher=alertname%3DDB+Fragmentation&matcher=db%3D_users&matcher=grafana_folder%3DCHT&matcher=instance%3DINSTANCE-HERE.medicmobile.org&matcher=job%3Dcht
Dashboard: https://allies-monitoring-alerting.dev.medicmobile.org/d/oa2OfL-Vk?orgId=1
Panel: https://allies-monitoring-alerting.dev.medicmobile.org/d/oa2OfL-Vk?orgId=1&viewPanel=13

It'd be great if we could condense it considerably and remove the unused info, like so:

[FIRING:28] DB Fragmentation: [_users] database on INSTANCE-HERE.dev.medicmobile.org.

Value: 27.055714740528494, Alert threshold: 1

Links: Source - Silence - Dashboard - Panel
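
To make the target layout concrete, here is a minimal Python sketch that builds the condensed message from the fields in the verbose alert. It is only an illustration of the desired output, not Grafana's template syntax or the eventual implementation. The `condense_alert` function, the dictionary shape, the `example.org` URLs, and the Markdown-style link rendering are all assumptions made for this example. It follows this issue's reading of `B` as the measured value and `C` as the alert threshold.

```python
def condense_alert(alert: dict, firing_count: int) -> str:
    """Render one alert in the condensed three-part layout proposed in this issue.

    `alert` is a hypothetical dict with "labels", "values" and "links" keys;
    it is not a structure that Grafana itself provides.
    """
    labels = alert["labels"]
    values = alert["values"]

    # Headline: status, alert name, database and instance on a single line.
    headline = (
        f"[FIRING:{firing_count}] {labels['alertname']}: "
        f"[{labels['db']}] database on {labels['instance']}."
    )

    # Reading: B is treated as the value and C as the threshold (assumption).
    reading = f"Value: {values['B']}, Alert threshold: {values['C']}"

    # Links: short names instead of full URLs, rendered Markdown-style (assumption).
    links = "Links: " + " - ".join(
        f"[{name}]({url})" for name, url in alert["links"].items()
    )

    return "\n\n".join([headline, reading, links])


# Sample data taken from the alert above; the URLs are placeholders.
example = {
    "labels": {
        "alertname": "DB Fragmentation",
        "db": "_users",
        "instance": "INSTANCE-HERE.medicmobile.org",
    },
    "values": {"B": 27.055714740528494, "C": 1},
    "links": {
        "Source": "https://example.org/source",
        "Silence": "https://example.org/silence",
        "Dashboard": "https://example.org/dashboard",
        "Panel": "https://example.org/panel",
    },
}

print(condense_alert(example, firing_count=28))
```

Labels such as `grafana_folder` and `job` are deliberately left out, since they are the unused info this issue asks to remove. The link markup would likely need adjusting for whichever notification channel receives the message.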

mrjones-plip commented 1 year ago

This post might help.

eljhkrr commented 4 months ago

Notifications reduced to single-line messages: image

TODO: Default message formatting needs to be upstreamed