User Story:
As a Sensu operator,
I want a dynamic threshold update handler integration so that my alerts become more contextual and relevant.
I aim to reduce alert noise and avoid the redundancy of receiving multiple alerts for the same entities from different checks measuring the same resource. This redundancy overwhelms me and diminishes the value of each alert.
I seek the benefits of automated adjustments, eliminating my need for time-consuming manual threshold modifications.
Background:
As an operator in a large enterprise, I often set memory and other resource utilization alerts based on generic thresholds that might not accurately reflect the needs and behaviors of every entity in my infrastructure. Setting a fixed critical alert threshold can flood me with alerts, especially when I use multiple checks with various thresholds for different entities. This often results in alert fatigue, where I become desensitized to alerts and possibly miss genuinely critical incidents.
Proposed Solution:
I propose the introduction of a Threshold Update Handler that:
Monitors the frequency of alerts for each entity.
Dynamically adjusts the critical alert thresholds based on an entity's past behavior.
Lets me set a base threshold and a maximum allowable threshold, and keeps every adjustment within that range.
With this handler in place, I expect more tailored alerting, less unnecessary noise, and a more efficient monitoring setup.
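To illustrate the behavior I have in mind, here is a minimal sketch in Python. It is not an existing Sensu API: the names `ThresholdPolicy` and `adjusted_threshold`, the per-window alert count, and the fixed-step adjustment rule are all assumptions made for illustration. A real handler might use a different policy.

```python
from dataclasses import dataclass


@dataclass
class ThresholdPolicy:
    """Hypothetical per-check settings; none of these names come from Sensu."""
    base: float            # starting critical threshold, e.g. 80.0 (% memory used)
    maximum: float         # the handler never raises the threshold above this
    tolerated_alerts: int  # alerts per window that are considered normal
    step: float            # amount added to the threshold per excess alert


def adjusted_threshold(policy: ThresholdPolicy, alerts_in_window: int) -> float:
    """Return the critical threshold for one entity, clamped to [base, maximum]."""
    # Only alerts beyond the tolerated count push the threshold up.
    excess = max(0, alerts_in_window - policy.tolerated_alerts)
    candidate = policy.base + excess * policy.step
    # Clamp so the result never leaves the operator-defined range.
    return min(policy.maximum, max(policy.base, candidate))


policy = ThresholdPolicy(base=80.0, maximum=95.0, tolerated_alerts=3, step=2.5)
for count in (0, 3, 5, 20):
    print(count, adjusted_threshold(policy, count))
```

In this sketch, an entity that alerts no more often than the tolerated count keeps the base threshold, a noisier entity has its threshold raised step by step, and the maximum caps the result so that a genuinely critical level still alerts.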