linki / chaoskube

chaoskube periodically kills random pods in your Kubernetes cluster.
MIT License

Adding slack notifications #101

Open shaikatz opened 6 years ago

shaikatz commented 6 years ago

Hi,

I'm looking for a way to notify my team every time the chaos bot starts performing actions. Since Slack is widely used, that would be my preference.

I'd like to start implementing that capability for chaoskube.

Any thoughts?

shaikatz commented 6 years ago

I'll add that with the current codebase there is no way to know whether we've "entered" a chaos period (a period with no exclusions) or "exited" one (a period that is excluded). As I see it, the only way to determine that is to check whether the previous interval fell within an exclusion time.

linki commented 6 years ago

Could we add a prometheus metric for that?

I've often seen "binary" metrics that you "query" via labels and that return either 0 or 1, e.g.

up{application="dnsmasq-node", ...} 1
up{instance="ip-172-31-8-124.eu-central-1.compute.internal", ...} 1
...

We could expose something similar for chaoskube, e.g.

chaoskube_active 1 # or 0

It's not comparable to a notification but at least you would have a structured way to find out whether chaoskube is in an active period right now.
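As a rough sketch of what exposing such a gauge could look like, the Go snippet below renders `chaoskube_active` in the Prometheus text exposition format using only the standard library. The helper names (`setActive`, `renderMetrics`, `metricsHandler`) are assumptions made for this example; a real implementation would more likely register a gauge with the Prometheus Go client library instead of formatting the output by hand.

```go
package main

import (
	"fmt"
	"net/http"
	"sync/atomic"
)

// active is 1 while chaoskube is in an active period and 0 otherwise.
// It is accessed atomically so the HTTP handler can read it concurrently.
var active int32

// setActive is a hypothetical hook that the main loop would call
// whenever it decides whether the current interval is excluded.
func setActive(on bool) {
	var v int32
	if on {
		v = 1
	}
	atomic.StoreInt32(&active, v)
}

// renderMetrics returns the gauge in the Prometheus text exposition format.
func renderMetrics() string {
	return fmt.Sprintf("# TYPE chaoskube_active gauge\nchaoskube_active %d\n",
		atomic.LoadInt32(&active))
}

// metricsHandler serves the gauge; it could be mounted on /metrics.
func metricsHandler(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	fmt.Fprint(w, renderMetrics())
}

func main() {
	setActive(true)
	fmt.Print(renderMetrics())
	// To actually serve the metric (port chosen arbitrarily for this sketch):
	//   http.HandleFunc("/metrics", metricsHandler)
	//   http.ListenAndServe(":8080", nil)
}
```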

shaikatz commented 6 years ago

We can definitely add a prometheus metric for that, but that would serve us for alerts and monitoring.

I would still like to send a Slack notification for that, so the team is actively aware when chaos is running and is prepared if action is required.

Would you prefer to avoid adding Slack capabilities to this tool?

linki commented 6 years ago

Not at all. I just wanted to provide an inferior alternative that might save us some effort.

I'm fine with adding Slack support. How about putting it in a separate package and hiding it behind a nice Notifier interface?

Let me know how I can help. I'm looking forward to seeing what you come up with.

shaikatz commented 6 years ago

Great, I'll try to work on that soon and I hope to return with a PR 👍

palmerabollo commented 6 years ago

I often use alertmanager to send notifications to different channels, including Slack. With this approach, a Prometheus metric is enough and there is a clear separation of responsibilities.