SigNoz / signoz

SigNoz is an open-source observability platform native to OpenTelemetry with logs, traces and metrics in a single application. An open-source alternative to DataDog, NewRelic, etc. 🔥 🖥. 👉 Open source Application Performance Monitoring (APM) & Observability tool
https://signoz.io
Other
18.44k stars, 1.17k forks

Different alert recovery threshold #2811

Open flaviut opened 1 year ago

flaviut commented 1 year ago

Is your feature request related to a problem?

Often, an alert will rapidly toggle between firing and resolved. For example, the disk space usage here is hovering right around the threshold, so the alert kept firing on and off:

image

Describe the solution you'd like

A separate recovery threshold would add some hysteresis to the alert: once it fires, it stays active until the value crosses the recovery threshold, i.e. until the underlying problem is actually solved. Datadog uses the same technique.
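To illustrate the idea, here is a minimal Python sketch of the hysteresis logic. It is not SigNoz code: the `HysteresisAlert` class, its field names, the 90%/85% thresholds, and the sample values are all made up for this example.

```python
from dataclasses import dataclass


@dataclass
class HysteresisAlert:
    """Hypothetical alert with separate trigger and recovery thresholds."""
    trigger_threshold: float   # start firing when the value rises above this
    recovery_threshold: float  # resolve only when the value drops below this
    firing: bool = False

    def update(self, value: float) -> bool:
        """Feed one sample and return whether the alert is firing afterwards."""
        if not self.firing and value > self.trigger_threshold:
            self.firing = True
        elif self.firing and value < self.recovery_threshold:
            self.firing = False
        # Between the two thresholds the previous state is kept.
        return self.firing


def count_transitions(states: list[bool]) -> int:
    """Count how often the alert state flips between consecutive samples."""
    return sum(1 for a, b in zip(states, states[1:]) if a != b)


# Made-up disk usage samples (percent) hovering around a 90% threshold.
samples = [88, 91, 89, 92, 89.5, 91, 84, 86]

# Single threshold: firing exactly when the value is above 90.
single_threshold = [v > 90 for v in samples]

# Hysteresis: fire above 90, resolve only below 85.
alert = HysteresisAlert(trigger_threshold=90, recovery_threshold=85)
with_hysteresis = [alert.update(v) for v in samples]

print(count_transitions(single_threshold), count_transitions(with_hysteresis))
```

With these samples the single-threshold alert changes state six times, while the version with a recovery threshold fires once and resolves once, after usage has dropped clearly below the limit.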

Describe alternatives you've considered

Requiring the alert condition to hold for a specific amount of time is not enough in this case, since this alert already had a time period of 1hr set.

welcome[bot] commented 1 year ago

Thanks for opening this issue. A team member should give feedback soon. In the meantime, feel free to check out the contributing guidelines.