AcalephStorage / consul-alerts

A simple daemon to send notifications based on Consul health checks
GNU General Public License v2.0

Blacklist is not stopping reminders #198

Open djenriquez opened 7 years ago

djenriquez commented 7 years ago

Hello,

I have a health check that I want to mute/blacklist. The check I want to mute is on the host 'ip-10-2-9-82', and according to the docs I should be able to blacklist all checks from this host.

Here are the Slack notifications from consul-alerts: [screenshot, 2017-09-14 1:58:05 PM]

They are repeating every 10 minutes as expected.

At 1:55 PM, after the second alert showed up in Slack, I added a key at consul-alerts/config/checks/blacklist/nodes/ip-10-2-9-82, like so: [screenshot, 2017-09-14 1:57:27 PM]

I expected that this would be the last alert I'd get, but... [screenshot, 2017-09-14 2:18:30 PM] ...it continues on.

Did I misunderstand what's written in the README?
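For reference, here is a minimal sketch of how a node blacklist key like the one above could be written through Consul's KV HTTP API (`PUT /v1/kv/<key>`). The key prefix is the one quoted in this issue; the agent address `http://localhost:8500`, the empty value, and the helper names `blacklist_node_url` and `blacklist_node` are assumptions made for illustration, not something taken from consul-alerts itself.

```python
import urllib.request

# Key prefix as quoted in this issue.
BLACKLIST_NODES_PREFIX = "consul-alerts/config/checks/blacklist/nodes"


def blacklist_node_url(node, consul_addr="http://localhost:8500"):
    """Build the KV endpoint URL for a node's blacklist key.

    consul_addr is an assumed local agent address.
    """
    return f"{consul_addr}/v1/kv/{BLACKLIST_NODES_PREFIX}/{node}"


def blacklist_node(node, consul_addr="http://localhost:8500"):
    """PUT an empty value at the node's blacklist key and return the HTTP status.

    Writing an empty value is an assumption; the issue does not show
    which value (if any) was stored under the key.
    """
    req = urllib.request.Request(
        blacklist_node_url(node, consul_addr), data=b"", method="PUT"
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Calling `blacklist_node("ip-10-2-9-82")` against a reachable agent would create the same key that was added by hand in the screenshot.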

Here is the healthcheck as defined in Consul at the /health/node endpoint:

[{
    "Node": "ip-10-2-9-82",
    "CheckID": "flows-operational",
    "Name": "Flows Operational",
    "Status": "warning",
    "Notes": "",
    "Output": "Traceback (most recent call last):\n  File \"/etc/consul/health-checks/nifi-flow.py\", line 55, in check_nifi\n    raise NifiStoppedFlowException(controller_status['stoppedCount'])\nNifiStoppedFlowException: 3 stopped flows detected\n",
    "ServiceID": "nifi",
    "ServiceName": "nifi",
    "ServiceTags": ["reporting"],
    "CreateIndex": 3850794,
    "ModifyIndex": 3851102
}, {
    "Node": "ip-10-2-9-82",
    "CheckID": "node-connected",
    "Name": "Node Connected",
    "Status": "passing",
    "Notes": "",
    "Output": "All nodes connected\n",
    "ServiceID": "nifi",
    "ServiceName": "nifi",
    "ServiceTags": ["reporting"],
    "CreateIndex": 3850794,
    "ModifyIndex": 3851103
}, {
    "Node": "ip-10-2-9-82",
    "CheckID": "serfHealth",
    "Name": "Serf Health Status",
    "Status": "passing",
    "Notes": "",
    "Output": "Agent alive and reachable",
    "ServiceID": "",
    "ServiceName": "",
    "ServiceTags": [],
    "CreateIndex": 3850793,
    "ModifyIndex": 3850793
}]
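To make the expectation concrete, here is a small sketch of the behaviour I would expect from a node blacklist, applied to an abridged copy of the payload above. This is only an illustration of the expected outcome, not consul-alerts' actual implementation; the function name `alertable_checks` and the `blacklisted_nodes` parameter are hypothetical.

```python
import json

# Abridged copy of the /health/node payload from this issue
# (only the fields used below are kept).
SAMPLE = json.dumps([
    {"Node": "ip-10-2-9-82", "CheckID": "flows-operational", "Status": "warning"},
    {"Node": "ip-10-2-9-82", "CheckID": "node-connected", "Status": "passing"},
    {"Node": "ip-10-2-9-82", "CheckID": "serfHealth", "Status": "passing"},
])


def alertable_checks(payload, blacklisted_nodes=frozenset()):
    """Return CheckIDs that are not passing and whose node is not blacklisted."""
    checks = json.loads(payload)
    return [
        check["CheckID"]
        for check in checks
        if check["Status"] != "passing"
        and check["Node"] not in blacklisted_nodes
    ]


# Without a blacklist, only the warning check is reported.
print(alertable_checks(SAMPLE))
# With the node blacklisted, nothing should be reported.
print(alertable_checks(SAMPLE, {"ip-10-2-9-82"}))
```

With the node blacklisted, the expectation is that `flows-operational` no longer produces alerts or reminders.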

Thanks in advance!

djenriquez commented 7 years ago

Also:

/ # consul-alerts --version
Consul Alerts 0.5.0

Gerrrr commented 7 years ago

Hey @djenriquez

Which Consul version are you using?

djenriquez commented 7 years ago

@Gerrrr it's a mix for clients and servers actually, mostly 0.9.0. Of the servers running consul-alerts, two are running 0.9.2 and one is running 0.9.0.

Gerrrr commented 7 years ago

Thanks for the info! I will take a look at the issue over the weekend.