mbecker20 / komodo

🦎 a tool to build and deploy software on many servers 🦎
GNU General Public License v3.0

[Feature] Alerts - Ignore short burst of system load #65

Open TomKauffeld opened 1 week ago

TomKauffeld commented 1 week ago

Summary

Add a way to skip alerts for high system usage (CPU, for example) when the usage is only high for a short amount of time.

Situation

During normal operation of certain servers, the CPU load can reach 100% for a short period, for example while indexing resources or during build operations.
However, at the moment Komodo sends an alert as soon as usage reaches the configured threshold.
This means the alert list fills up with unhelpful alerts, which can make us numb to real ones.

Suggestion

Add a counter / timer so that the alert is only sent if the threshold is exceeded for a specific amount of time (for example, CPU usage above 80% for 5 minutes).
This way short bursts of system load are ignored, but if the load remains high, an alert is sent.
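
To make the idea concrete, here is a minimal sketch of such a timer in Python. It is not Komodo's code: the class name `SustainedThreshold`, its fields, and the sample values are all made up for illustration, and it assumes samples arrive as (timestamp in seconds, usage in percent) pairs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SustainedThreshold:
    """Hypothetical helper: alert only when a metric stays at or above
    `threshold` for at least `hold_seconds`."""

    threshold: float      # e.g. 80.0 (% CPU)
    hold_seconds: float   # e.g. 300.0 (5 minutes)
    _breach_started: Optional[float] = None  # time of first sample in the current breach

    def observe(self, timestamp: float, value: float) -> bool:
        """Record one sample and return True if an alert should be active."""
        if value < self.threshold:
            # Usage dropped below the threshold, so the timer resets.
            self._breach_started = None
            return False
        if self._breach_started is None:
            # First sample of a new breach: start the timer.
            self._breach_started = timestamp
        return timestamp - self._breach_started >= self.hold_seconds


# Example: a short burst at t=0 is ignored, a sustained load from t=120 alerts at t=420.
check = SustainedThreshold(threshold=80.0, hold_seconds=300.0)
samples = [(0, 95.0), (60, 40.0), (120, 90.0), (240, 92.0), (420, 91.0)]
results = [check.observe(t, v) for t, v in samples]
print(results)
```

Resetting the timer on any sample below the threshold is the simplest choice; a real implementation might instead tolerate a brief dip, or require a fraction of samples in a window to be above the threshold.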

LawMixer commented 1 week ago

This is already a feature - if you go into the "Servers" tab and then select the server you want to edit, and then go to "Alerts" tab, you can edit and disable/enable how you want the alerts to go out.

TomKauffeld commented 1 week ago

At the moment you can enable/disable the alert and set the thresholds for a warning and a critical alert, but you cannot require that the threshold be exceeded for a period of time (unless I'm missing something).
This feature would make it possible to skip the alert when the CPU reaches 90% for only a few seconds, and to send it only when the load is sustained for longer than a specified time period.

LawMixer commented 1 week ago

This is normal - the alert will go off when the server is under the thresholds.

mbecker20 commented 1 week ago

It's a good point: sustained high CPU usage can be a better indicator than a single high CPU measurement. But it is a bit more complex to implement, so I will have to consider the best way to do this.

Also note, you can set the CPU warning and critical thresholds each to 100, which effectively disables alert generation for high CPU. I know this is different from what you want, but I thought I would note it.