guardian / mobile-n10n

n10n for nOTIFICATIOn
Apache License 2.0

Increase evaluation period for http code alarms #1183

tkgnm closed 11 months ago

tkgnm commented 11 months ago

What does this change?

A couple of our HTTP error alarms often fire and then return to OK almost immediately, which causes noise in our channel. Increasing the evaluation period should reduce this noise.

It appears they sometimes fire because an instance becomes unhealthy, which results in a connection error, and a new instance then needs to boot up. Increasing the evaluation period should give the new instance time to boot. If it can't, we should still see this alarm after 10 minutes of there being an unhealthy instance.
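To illustrate the reasoning, here is a minimal sketch of how a longer evaluation window filters out short error bursts. It is a simplified model, not the alarm configuration from this PR: it assumes the alarm only fires when every datapoint in the window breaches the threshold, and the one-minute period, the threshold of 0, and the error counts are all hypothetical.

```python
from typing import List


def alarm_states(error_counts: List[int], threshold: int, evaluation_periods: int) -> List[str]:
    """Return "ALARM" or "OK" for each period.

    Simplified model (an assumption, not the real alarm definition): the alarm
    is in ALARM only when the most recent `evaluation_periods` datapoints all
    exceed `threshold`.
    """
    states = []
    for i in range(len(error_counts)):
        window = error_counts[max(0, i - evaluation_periods + 1): i + 1]
        breaching = len(window) == evaluation_periods and all(c > threshold for c in window)
        states.append("ALARM" if breaching else "OK")
    return states


# Hypothetical per-minute HTTP error counts: a 3-minute blip while a new instance boots.
blip = [0, 0, 7, 9, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# Hypothetical sustained failure: errors start at minute 2 and never stop.
sustained = [0, 0] + [8] * 12

# With a single evaluation period, the blip raises the alarm for 3 minutes.
short_window = alarm_states(blip, threshold=0, evaluation_periods=1)
# With ten evaluation periods, the same blip never raises the alarm.
long_window = alarm_states(blip, threshold=0, evaluation_periods=10)
# A sustained failure still raises the alarm once ten consecutive minutes breach.
sustained_window = alarm_states(sustained, threshold=0, evaluation_periods=10)

print(short_window.count("ALARM"))
print(long_window.count("ALARM"))
print(sustained_window.index("ALARM"))
```

In this model the trade-off is explicit: the blip no longer pages anyone, but a genuine outage is reported only after the full window has breached.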

Screenshots

Note below how the times correlate between us seeing errors and there being an unhealthy instance.

Screenshot 2023-11-10 at 14 26 53
Screenshot 2023-11-10 at 14 35 56