brotandgames / ciao

HTTP checks & tests (private & public) monitoring - check the status of your URL
https://brotandgames.com/ciao/
MIT License
1.86k stars 99 forks

Threshold for consecutive check failures #155

Closed sebastianfischer closed 11 months ago

sebastianfischer commented 11 months ago

Is your feature request related to a problem? Please describe.
We get a lot of "false positive" change alarms because of brief DNS resolution failures (getaddrinfo: Try again), configuration reloads, etc. While most people would want to be notified if their service is unreachable for even one second, in our case we would prefer to trigger alarms only when a service is offline for a longer period of time.

Considered solution: threshold of x consecutive failures
If we could configure a threshold for how many consecutive checks must fail before a notification is sent, this would help us immensely.

Describe alternatives you've considered
Some of the false positives (getaddrinfo) seem to be related to Docker and DNS, which we are also looking into: https://github.com/moby/moby/issues/32106
We have also set up a time window via cron during which ciao checks are disabled while we do our configuration reloads.

Additional context
We use Docker and ciao 1.9.4.

brotandgames commented 11 months ago

> While most people would want to be notified if their service is unreachable for even one second, in our case we would prefer to trigger alarms only when a service is offline for a longer period of time.

You can adjust the period of time via cron.

sebastianfischer commented 11 months ago

> > While most people would want to be notified if their service is unreachable for even one second, in our case we would prefer to trigger alarms only when a service is offline for a longer period of time.
>
> You can adjust the period of time via cron.

Sorry if I phrased it badly: this is not about the time period, but about consecutive check failures. For example, I would like to get an alarm only after 3 checks have failed in a row (because the first two might be a fluke).
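The rule described here can be sketched in a few lines. This is only an illustration of the requested behaviour and not part of ciao, which does not offer this feature; the names `FailureThreshold` and `record`, the event strings, and the threshold of 3 are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailureThreshold:
    """Hypothetical helper: alert only after `threshold` consecutive failures."""

    threshold: int = 3
    consecutive_failures: int = 0
    alerted: bool = False

    def record(self, healthy: bool) -> Optional[str]:
        """Record one check result; return "down", "recovered", or None."""
        if healthy:
            # Any successful check resets the streak.
            was_alerted = self.alerted
            self.consecutive_failures = 0
            self.alerted = False
            return "recovered" if was_alerted else None

        self.consecutive_failures += 1
        # Notify once, at the moment the streak reaches the threshold.
        if self.consecutive_failures >= self.threshold and not self.alerted:
            self.alerted = True
            return "down"
        return None


# A single failure (a "fluke") is ignored; three in a row raise one alarm.
tracker = FailureThreshold(threshold=3)
results = [True, False, True, False, False, False, False, True]
events = [tracker.record(ok) for ok in results]
```

With this sequence, the isolated failure produces no event, the third consecutive failure produces one "down" event, and the next successful check produces "recovered". Counting checks rather than elapsed time keeps the rule independent of how often the check is scheduled.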

brotandgames commented 11 months ago

This is not in the scope of this project.

sebastianfischer commented 11 months ago

OK. I can understand that. Thanks for replying and for your work on ciao. 🫶