ndelitski / rancher-alarms

Will kick your ass if found unhealthy service in Rancher environment

feature request - slack plugin #5

Closed ozbillwang closed 8 years ago

ozbillwang commented 8 years ago

I'd like to request a feature: sending alarms to a Slack incoming webhook.

Second, it simplifies the settings, reducing them to one parameter.

from

    ALARM_EMAIL_ADDRESSES: arya@stark.com,john@snow.com
    ALARM_EMAIL_USER: alarm@nightwatch.com
    ALARM_EMAIL_PASS: nightWatch
    ALARM_EMAIL_SMTP_HOST: smtp.snow.com
    ALARM_EMAIL_FROM: 'Alarm of a Night Watch <alarm@nightwatch.com>'

to

    slack_webhook: https://hooks.slack.com/services/<KEY>

Third, I wouldn't need to set up an SMTP server at all, which requires extra resources.

ozbillwang commented 8 years ago

The slack-node package exists; let me see if I can help directly.

https://www.npmjs.com/package/slack-node

ndelitski commented 8 years ago

hi @SydOps! will work on these features soon!

flaccid commented 8 years ago

Taking up a ticket in our sprint which needs this feature. I will likely just submit a PR. Starting work on it now..

ndelitski commented 8 years ago

@flaccid please wait, I'll push the Slack integration in an hour and we can test then

flaccid commented 8 years ago

@ndelitski ok no worries!

ndelitski commented 8 years ago

Pushed to master; please check with a simple config in config.json:

    "notifications": {
        "*": {
            "targets": {
                "email": {
                    "recipients": ["ndelitski@gmail.com"],
                    "templateFile": "/tmp/alarm.html"
                },
                "slack": {
                    "channel": "#override-channel"
                }
            },
            "healthcheck": {
               ...
            }
        }
    },
    "targets": {
        "email": {
            "smtp": {
                ...
            }
        },
        "slack": {
            "webhookUrl": "https://hooks.slack.com/services/YOUR_SLACK_UUID",
            "botName": "rancher-alarm",
            "channel": "#devops"
        }
    }

or through environment variables:

ALARM_SLACK_WEBHOOK_URL
ALARM_SLACK_CHANNEL
ALARM_SLACK_BOTNAME
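
As an illustration of how these variables relate to the `slack` target in the config.json example, here is a minimal sketch that maps them onto the same keys. The function name `slackTargetFromEnv` is hypothetical and this is not the project's actual loader; only the variable names and the config keys come from the comment above.

```javascript
// Hypothetical helper (not part of rancher-alarms): maps the ALARM_SLACK_*
// variables onto the "slack" target keys shown in the config.json example.
function slackTargetFromEnv(env) {
  const target = {};
  if (env.ALARM_SLACK_WEBHOOK_URL) target.webhookUrl = env.ALARM_SLACK_WEBHOOK_URL;
  if (env.ALARM_SLACK_CHANNEL) target.channel = env.ALARM_SLACK_CHANNEL;
  if (env.ALARM_SLACK_BOTNAME) target.botName = env.ALARM_SLACK_BOTNAME;
  // Assumption: without a webhook URL there is nothing to send to,
  // so the sketch reports "no Slack target" as null.
  return target.webhookUrl ? target : null;
}
```

Calling `slackTargetFromEnv(process.env)` would then yield either a target object or `null`, which makes a missing variable easy to spot at startup.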

I also added email and Slack message templates, so you can now send your own letters and messages. More complicated examples will be shown in the main readme file; I'll do that a bit later.

ozbillwang commented 8 years ago

@ndelitski

That's a nice job. Let me test it.

Created a new slack team https://rancher-alarms.slack.com for testing.

ALARM_SLACK_WEBHOOK_URL=https://hooks.slack.com/services/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
ALARM_SLACK_CHANNEL=#alarms
ALARM_SLACK_BOTNAME=rancher-alarms

flaccid commented 8 years ago

Feedback 1: Is it possible to get a variable for the icon_emoji?

ozbillwang commented 8 years ago

@flaccid

Did you get alarms?

I started the container successfully but got no alarms. Checking the container log, I found that the channel name is missing, even though I did provide the channel name when starting the stack.

started with config:

"targets": {
       "slack": {
            "webhookUrl": "https://hooks.slack.com/services/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",
            "botName": "rancher-alarms"
        }

If I docker exec into the container and run env, I can't see the channel name there either.

When adding the stack from the catalog, I set the channel name to #alarms; maybe # is not allowed in Rancher.

ndelitski commented 8 years ago

@SydOps I forgot to say that I haven't uploaded the new image to Docker Hub yet, so you need to build it with the Dockerfile or test from sources. I checked the ALARM_SLACK_CHANNEL env variable and it is passed to the alarms config correctly.

ndelitski commented 8 years ago

@flaccid sure! we should send more emotional alarms!

ozbillwang commented 8 years ago

@ndelitski

I built the image myself. It seems there are some typos in my rancher-compose.yml file; fixing them.

ndelitski commented 8 years ago

@SydOps also, to see more details on what's going on, you can set the env variable LOG_LEVEL=trace|info|error

ozbillwang commented 8 years ago

@ndelitski

Can you push the new image to hub.docker.com, tagged as 0.2.0 but not linked to latest?

ndelitski commented 8 years ago

Yeah, but a little later; I'm having some trouble with my internet connection right now

ndelitski commented 8 years ago

You can send custom messages by specifying ALARM_SLACK_TEMPLATE="service <#{serviceUrl}|#{serviceName}/#{stackName}> become #{monitorState}". As you can see, variables are injected with the #{variable_name} syntax.

ndelitski commented 8 years ago

here is a list of variables you can use in templates:

target.notify({
  state, // rancher service state
  monitorState: newState, // rancher-alarms service state - always degraded
  serviceName,
  serviceUrl, // url to a running service in a rancher UI
  stackUrl, // url to stack in a rancher UI
  stack, // stack object with a full list of properties (see Rancher API)
  stackName: stack.name,
  service: this.service // service object with a full list of properties (see Rancher API)
})

But you can't use nested props in a template for now; that is still in development. I mean, #{service.id} is not currently possible.
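
To make the #{variable_name} substitution concrete, here is a minimal sketch of a renderer for flat variables only. It is not the project's implementation: `renderTemplate` is a hypothetical name, the URL is a placeholder, and leaving unknown placeholders untouched is an assumption of this sketch.

```javascript
// Hypothetical renderer for the #{variable_name} syntax described above.
// Only flat names are matched, mirroring the stated limitation that nested
// properties such as #{service.id} are not supported.
function renderTemplate(template, data) {
  return template.replace(/#\{(\w+)\}/g, (placeholder, name) =>
    // Assumption: unknown variables are left as-is rather than blanked out.
    Object.prototype.hasOwnProperty.call(data, name) ? String(data[name]) : placeholder
  );
}

const message = renderTemplate(
  'service <#{serviceUrl}|#{serviceName}/#{stackName}> become #{monitorState}',
  {
    serviceUrl: 'https://rancher.example.com/service', // placeholder URL
    serviceName: 'web',
    stackName: 'frontend',
    monitorState: 'degraded',
  }
);
console.log(message);
// service <https://rancher.example.com/service|web/frontend> become degraded
```

Because the pattern only accepts word characters, a template containing `#{service.id}` passes through unchanged in this sketch, which makes unsupported placeholders visible in the resulting message.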

ndelitski commented 8 years ago

A template can be loaded from a file mounted into the container (valuable if you want to provide rich formatting with HTML notifications). Just use ALARM_EMAIL_TEMPLATE_FILE=/etc/rancher-alarms/email-template.html and docker run with -v path-to/alarm-templates:/etc/rancher-alarms

ozbillwang commented 8 years ago

I fixed all the problems and the container now starts properly.

But I still can't get alarms. Checking the container log, I can see the service running fine, but there are no new log entries after that.

[INFO]   2016-8-11 10:24:47:249    rancher-alarms/rancher-alarms:
  targets: "[SlackTarget]"
  healthcheck: {
    "pollInterval": 15000,
  "healthyThreshold": 3,
 "unhealthyThreshold": 4
 }
[INFO]   2016-8-11 10:24:47:253    start polling rancher-alarms/rancher-alarms

I have started several containers and stopped/started them several times.

So how can I trigger an alarm?

Its rancher catalog is here:

https://github.com/BWITS/rancher-catalog/tree/master/templates/rancher-alarms/0

flaccid commented 8 years ago

@SydOps stop and start are valid actions, so they won't raise an alarm. Degraded/Unhealthy isn't a valid state, so you basically want to emulate that. "A service is in the yellow (or “degraded”) state if Rancher has detected that at least one of the containers is either in the red state or in the process of returning it to a green state." http://docs.rancher.com/rancher/v1.0/zh/rancher-services/health-checks/ A container may be exiting, so cattle spins up a new one, for example.
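
The healthcheck settings in the log above (`pollInterval`, `healthyThreshold`, `unhealthyThreshold`) suggest that an alarm depends on several consecutive unhealthy polls rather than on a single stop or start. The sketch below is only an assumption about how such thresholds might be interpreted, not the project's actual code; `ThresholdMonitor` and `observe` are hypothetical names.

```javascript
// Hypothetical threshold-based monitor: the state flips only after enough
// consecutive polls contradict the current state.
class ThresholdMonitor {
  constructor({ healthyThreshold, unhealthyThreshold }) {
    this.healthyThreshold = healthyThreshold;
    this.unhealthyThreshold = unhealthyThreshold;
    this.state = 'healthy';
    this.streak = 0; // consecutive polls that contradict the current state
  }

  // Feed one poll result; returns the new state if it changed, else null.
  observe(isHealthy) {
    const contradicts = (this.state === 'healthy') !== isHealthy;
    if (!contradicts) {
      this.streak = 0;
      return null;
    }
    this.streak += 1;
    const needed =
      this.state === 'healthy' ? this.unhealthyThreshold : this.healthyThreshold;
    if (this.streak < needed) return null;
    this.state = this.state === 'healthy' ? 'degraded' : 'healthy';
    this.streak = 0;
    return this.state;
  }
}

const monitor = new ThresholdMonitor({ healthyThreshold: 3, unhealthyThreshold: 4 });
const polls = [false, false, false, false, true, true, true];
const transitions = polls.map((ok) => monitor.observe(ok));
console.log(transitions);
// [ null, null, null, 'degraded', null, null, 'healthy' ]
```

Under this reading, with a 15000 ms poll interval and an unhealthy threshold of 4, a service would have to stay unhealthy for roughly 45 to 60 seconds before an alarm fires, which would explain why quick stop/start cycles produce nothing.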

flaccid commented 8 years ago

@ndelitski Feedback 2: looking at your https://github.com/ndelitski/rancher-alarms/issues/5#issuecomment-239113419 (thanks btw), is it possible to use the Rancher environment/project name? We would like to create a link to the env/project in the message.

ndelitski commented 8 years ago

@flaccid I extended the template data with environment/project variables, see https://github.com/ndelitski/rancher-alarms/commit/501adb842f701cd6609b047b6a26afbcccbc4673

flaccid commented 8 years ago

Considering

    this._textTemplate = textTemplate || EMAIL_TEMPLATE;
    this._htmlTemplate = template || htmlTemplate;

When we mainly just want an HTML template and we use env vars only, do we just set EMAIL_TEMPLATE to the HTML template? @JayHaoDing isn't sure how to do this or whether it's possible yet. Our use case here is to send an email where the HTML template is specified by an env var.

ndelitski commented 8 years ago

In your code snippet, EMAIL_TEMPLATE is a variable holding the default text template, used when you didn't provide a custom one in the config. You can define a custom HTML template in two ways using env variables:

ndelitski commented 8 years ago

A simple HTML template could look like:

<html>
<head>
    <title></title>
</head>
<body>
    #{serviceName} become #{monitorState}
</body>
</html>

flaccid commented 8 years ago

@ndelitski no worries, thanks. @JayHaoDing will re-test this in our AU morning tomorrow and report back result.

ndelitski commented 8 years ago

@flaccid nice! If all goes well we could do a new release with a latest tag then. we could even try to add our project to an official rancher catalog, what do you think? @SydOps have you succeeded in deploying service with a private rancher catalog?

flaccid commented 8 years ago

@ndelitski yep no problem, we'll just make sure in the next day or so that all our little bits here are solved (pretty much almost there, I raised a couple of minor pull requests too). Happy to do the catalog entry for the community catalog. I can then work with Rancher Labs too, if you like, to see if it's possible to get it into the official catalog.

ozbillwang commented 8 years ago

@ndelitski

I have pushed the image I built to bwits/rancher-alarms as a temporary solution for testing the Rancher catalog.

ozbillwang commented 8 years ago

Finally I emulated an unhealthy container, and got the alarms. 👍

It is already good enough for me.

ozbillwang commented 8 years ago

@flaccid

Quote: I can work with Rancher Labs then too if you like to see if its possible to get it in the official catalog.

@JayHaoDing and I made rancher-catalog entries for both email and Slack notifications. Could you help add them to the official community-catalog?

https://github.com/JayHaoDing/rancher-catalog/tree/master/templates/rancher-alarms
https://github.com/BWITS/rancher-catalog/tree/master/templates/rancher-alarms

I am thinking of having two catalogs:

rancher-alarms-email
rancher-alarms-slack

We could either merge the catalog code into ndelitski/rancher-alarms directly, or create a new organization that includes all of us for that repository, such as rancher-lovers/rancher-catalog, and then merge the PR into rancher/community-catalog. I have already created the organization (https://github.com/rancher-lovers) and invited all of us.

Your opinions, please.

flaccid commented 8 years ago

I'd prefer one catalog entry, and once it is in the community repo, contributions only need to be made by pull request. I am a Rancher lover but don't need an organisation for it 😉

flaccid commented 8 years ago

@SydOps as this issue is closed, I recommend creating new issue(s) for your last comment.