sensu / sensu-extensions-occurrences

The Sensu Core built-in occurrences filter extension
MIT License

Don't reset occurrences when a result is swapping between warning and critical #9

Open Tuxem opened 6 years ago

Tuxem commented 6 years ago

Hi,

Sometimes a check changes from WARNING to CRITICAL for only a short period of time. The problem is that if this happens just once and the check then goes back to WARNING, the occurrences count starts from scratch and a new alert is sent.

For instance, take a check like this:

{
    "checks": {
        "memory": {
            "command": "/etc/sensu/plugins/system/check-ram.py -w 10 -c 5",
            "handlers": ["mailer_clients"],
            "subscribers": ["system"],
            "occurrences": 5,
            "refresh": 86400,
            "interval": 60
        }
    }
}

And a handler like this:

{
    "handlers": {
        "mailer_clients": {
            "command": "/etc/sensu/handlers/mailer.rb -j mailer_clients",
            "filters": [
                "occurrences"
            ],
            "type": "pipe"
        }
    }
}

This check is executed every minute. So if it goes into warning at 7:00, the first alert is sent at 7:05. If it then goes into critical at 7:08 and back into warning at 7:09, a second alert is sent at 7:14. Apart from the fact that this can generate a lot of alerts, it is confusing, because we get two warning alerts in less than 20 minutes even though refresh is set to a day.
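The timeline above can be reproduced with a small simulation. This is only a simplified model of the behaviour described in this issue, not the extension's actual code: `alert_indices` and `reset_on_severity_change` are hypothetical names, `refresh` is ignored, and the first non-OK result is counted as occurrence 1, so the exact minute may differ by one from the times quoted above.

```python
# Simplified model (an assumption, not the real filter): statuses follow the
# usual convention 0 = OK, 1 = WARNING, 2 = CRITICAL, one result per minute.
def alert_indices(history, occurrences, reset_on_severity_change=True):
    """Return the positions in `history` where the counter hits `occurrences`."""
    alerts = []
    count = 0
    previous = 0
    for i, status in enumerate(history):
        if status == 0:
            count = 0  # back to OK: nothing to count
        elif status == previous or (previous != 0 and not reset_on_severity_change):
            count += 1  # same severity, or severity changes are tolerated
        else:
            count = 1  # new problem, or severity changed and we reset
        if count == occurrences:
            alerts.append(i)
        previous = status
    return alerts


# Eight warnings, one critical, then warnings again (minute 0 stands for 7:00).
history = [1] * 8 + [2] + [1] * 6

print(alert_indices(history, 5))                                  # two alerts
print(alert_indices(history, 5, reset_on_severity_change=False))  # one alert
```

With the reset, the model alerts twice, nine minutes apart; without it, only the first alert goes out.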

It might be easier to understand what is happening if the critical alert were sent right away when the state changes from warning to critical. We might still want to filter with occurrences, though, when the state changes directly from ok to critical.
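As a rough sketch of that suggestion (a hypothetical rule, not something the extension implements today; `should_alert` and its parameters are made up for illustration):

```python
# Hypothetical decision rule, assuming 0 = OK, 1 = WARNING, 2 = CRITICAL.
def should_alert(previous_status, status, count, occurrences):
    """Alert at once on warning -> critical; otherwise wait for the threshold."""
    escalated = previous_status == 1 and status == 2
    return escalated or (status != 0 and count == occurrences)


print(should_alert(1, 2, 1, 5))  # warning -> critical: alert immediately
print(should_alert(0, 2, 1, 5))  # ok -> critical: still filtered by occurrences
```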

Hope it's relevant for you :)

Evesy commented 6 years ago

Something to consider is the case where you have a handler that alerts on occurrences of 5, but only for critical severities.

If occurrences is not reset on a state change from warning to critical, the status history below would trigger an alert even though 5 criticals have not occurred: [1, 1, 1, 1, 2]
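A small sketch of that case, under the same simplified counting assumptions (0 = OK, 1 = WARNING, 2 = CRITICAL). The function names are hypothetical, and the severity check stands in for whatever restricts the handler to criticals:

```python
def occurrences_at_end(history, reset_on_severity_change):
    """Occurrences counter after the last result in `history` (simplified model)."""
    count = 0
    previous = 0
    for status in history:
        if status == 0:
            count = 0
        elif previous == 0 or (status != previous and reset_on_severity_change):
            count = 1  # first non-OK result, or severity changed and we reset
        else:
            count += 1
        previous = status
    return count


def critical_handler_fires(history, occurrences, reset_on_severity_change):
    """True if a critical-only handler would alert on the last result."""
    last_is_critical = bool(history) and history[-1] == 2
    return last_is_critical and occurrences_at_end(history, reset_on_severity_change) >= occurrences


history = [1, 1, 1, 1, 2]
print(critical_handler_fires(history, 5, reset_on_severity_change=True))   # counter is back at 1
print(critical_handler_fires(history, 5, reset_on_severity_change=False))  # counter reaches 5
```

Without the reset, the handler fires after a single critical result, which is the concern raised here.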