Use Slack to manage alarms in Icinga2
It can be used to interact with Icinga2 from your Slack client. It uses the Icinga2 API to get Host/Service status details. Simple status filters can be used to narrow down the returned status list.
On RedHat/CentOS you need to install Python 3.6 and virtualenv from EPEL first:
yum install python36-virtualenv
On these systems, set up the virtual environment with
virtualenv-3.6 .pyenv
instead of virtualenv .pyenv.
Here we assume the bot is installed in /opt:
cd /opt
git clone https://github.com/bb-Ricardo/icinga-slack-bot.git
cd icinga-slack-bot
virtualenv .pyenv
. .pyenv/bin/activate
pip install -r requirements.txt
Now you should be able to start the bot with
.pyenv/bin/python3 icinga-bot.py
Most likely the start will fail as the config is not fully set up.
It is recommended to create your own config
cp icinga-bot.ini.sample icinga-bot.ini
Change the config options according to your environment. After you have entered the Slack tokens, you should be able to start the bot.
sudo cp icinga-slack-bot.service /etc/systemd/system
sudo systemctl daemon-reload
sudo systemctl start icinga-slack-bot
sudo systemctl enable icinga-slack-bot
git clone https://github.com/bb-Ricardo/icinga-slack-bot.git
cd icinga-slack-bot
docker build -t icinga-bot .
Copy the config from the example to icinga-bot.ini and edit the settings.
Now you should be able to run the image with the following command:
docker run -d -v /PATH/TO/icinga-bot.ini:/app/icinga-bot.ini --name bot icinga-bot
The following is an example API user for the Icinga Slack bot:
# vim /etc/icinga2/conf.d/api-users.conf
object ApiUser "icinga-bot" {
password = "icinga"
permissions = [ "objects/query/*", "objects/modify/*", "status/query", "actions/*" ]
}
For further details check the Icinga2 API documentation
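To check that the API user works, a status query can be sent to the Icinga2 REST API. The following is a minimal sketch, not the bot's own code. It assumes Icinga2 listens on localhost at port 5665 and reuses the example credentials from above; the helper name build_status_request is hypothetical. The sketch only builds the request, sending it is left to the caller.

```python
import base64
import json
import urllib.request


def build_status_request(host, user, password, obj_type="services", icinga_filter=None):
    """Build (but do not send) an Icinga2 API object query."""
    url = f"https://{host}:5665/v1/objects/{obj_type}"
    body = {"attrs": ["name", "state"]}
    if icinga_filter:
        body["filter"] = icinga_filter
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        method="POST",
        headers={
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
            # Icinga2 accepts a request body for queries when this override is set
            "X-HTTP-Method-Override": "GET",
        },
    )


# Assumed host and the example credentials from api-users.conf
request = build_status_request(
    "localhost", "icinga-bot", "icinga", icinga_filter="service.state == 2"
)
print(request.full_url)
```

Passing the request to urllib.request.urlopen would run the query. With a self-signed Icinga2 certificate, an ssl context that trusts the Icinga2 CA is needed as well.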
You need to install a Classic Slack Bot app in order to use RTM.
You can also use this icon to represent the bot in Slack properly.
icinga-slack-bot comes with a default config file
usage: icinga-bot.py [-h] [-c icinga-bot.ini] [-l {DEBUG,INFO,WARNING,ERROR}]
[-d]
This is an Icinga2 Slack bot.
It can be used to interact with Icinga2 from your Slack client. It uses the
Icinga2 API to get Host/Service status details. Simple status filters can be
used to narrow down the returned status list.
Version: 1.0.0 (2020-05-17)
optional arguments:
-h, --help show this help message and exit
-c icinga-bot.ini, --config icinga-bot.ini
points to the config file to read config data from
which is not installed under the default path
'./icinga-bot.ini'
-l {DEBUG,INFO,WARNING,ERROR}, --log_level {DEBUG,INFO,WARNING,ERROR}
set log level (overrides config)
-d, --daemon define if the script is run as a systemd daemon
The following commands are currently implemented:
display the bot help
answers simply with pong if slack bot is running
request a host status (or short "hs") of any or all hosts
request a service status (or short "ss") of any or all services
display a summary of current host and service status numbers
acknowledge problematic hosts or services
set a downtime for hosts/services
add a comment to hosts/services
reschedule a host/service check
send a custom host/service notification
delay host/service notifications
show a comment/downtime/acknowledgement
remove a comment/downtime/acknowledgement
abort current action (ack/dt/ena/disa/sh/rm)
print current Icinga status details
enable an action
disable an action
Each command also provides detailed help via help <command>.
The following command filters are implemented
Command filters can be combined, like "warn crit", which returns all services in WARNING and CRITICAL state. "problems" can also be used to query all services with a certain name match in a NOT OK state.
Important:
You can add host names or service names to any status command. Parts of host and service names can also be used to search for objects.
Important:
ss crit test web
will be converted into a filter like:
(service.state == 2) && ( match("*test*", host.name) && match("*web*", service.name) ) || ( match("*web*", host.name) && match("*test*", service.name) )
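How two name fragments expand into host/service alternatives can be sketched as follows. This is an illustration under assumptions, not the bot's implementation: the function name build_service_filter and the STATE_CODES table are hypothetical, only the two-name case is handled, and the name alternatives are wrapped in one extra pair of parentheses so that the state check applies to both of them.

```python
from itertools import permutations

# Icinga2 service state codes
STATE_CODES = {"ok": 0, "warn": 1, "crit": 2, "unknown": 3}


def build_service_filter(states, names):
    """Combine state keywords and two name fragments into one filter string."""
    if len(names) != 2:
        raise ValueError("this sketch only handles exactly two names")
    checks = " || ".join(f"service.state == {STATE_CODES[s]}" for s in states)
    # Try each name once as host fragment and once as service fragment
    alternatives = [
        f'( match("*{h}*", host.name) && match("*{s}*", service.name) )'
        for h, s in permutations(names, 2)
    ]
    return f"({checks}) && ( " + " || ".join(alternatives) + " )"


print(build_service_filter(["crit"], ["test", "web"]))
```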
HINT:
You can also use quotes in your filter to find services containing spaces like:
ss server "network interfaces"
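Quoted filters behave like shell-style word splitting. As a small sketch (the bot may parse its input differently, and split_filter is a hypothetical helper), Python's standard shlex module keeps a quoted phrase together as one word:

```python
import shlex


def split_filter(command):
    """Split a command into words, keeping quoted phrases together."""
    return shlex.split(command)


words = split_filter('ss server "network interfaces"')
print(words)  # ['ss', 'server', 'network interfaces']
```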
Sometimes you try to tackle a problem and get tired of entering the filter every time. In this case you can use !!, which defaults to the last command's filter.
Example:
You are checking the service status with ss myserver ntp and then want to reschedule this service.
Instead of adding the filter again, you can write rs !!, which translates into rs myserver ntp.
This works for all commands using filters.
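The !! substitution can be pictured as remembering the filter of the previous command. The class below is a hypothetical sketch of that idea, not the bot's code; it assumes the first word is the command and everything after it is the filter.

```python
class FilterHistory:
    """Remember the last filter and expand '!!' to it."""

    def __init__(self):
        self.last_filter = ""

    def expand(self, command):
        keyword, _, rest = command.strip().partition(" ")
        rest = rest.strip()
        if rest == "!!":
            # Reuse the filter of the previous command
            rest = self.last_filter
        else:
            self.last_filter = rest
        return f"{keyword} {rest}".strip()


history = FilterHistory()
history.expand("ss myserver ntp")
print(history.expand("rs !!"))  # rs myserver ntp
```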
Actions have been added to perform certain operations on hosts or services.
Current actions are Acknowledgements and Downtimes.
This command will start a dialog to set an acknowledgement for an unhandled service or host. The bot will ask questions about the details in the following order:
Times can be given in a relative format such as tomorrow 3pm, friday noon or monday morning,
or more specifically like january 2nd, or even more specifically like 29.02.2020 13:00.
Just try and see what works best for you.
At the end the bot will ask you for a confirmation, which can be answered with yes (or just y) or no.
After that the bot will report if the action was successful or not.
It's also possible to skip the whole Q/A and issue the action in one command:
ack myserver ntp until tomorrow evening Wrong ntp config, needs update
This will acknowledge a problematic service ntp on myserver until 6pm the following day.
ack <host> <service> until <time> <comment>
or
ack <host> until <time> <comment>
or
ack <service> until <time> <comment>
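On the Icinga2 side, an acknowledgement corresponds to the acknowledge-problem API action. The sketch below shows what a request body for the example above could look like. It is an assumption about how such a command might be mapped, not the bot's code; build_ack_payload and the author value are hypothetical, and "tomorrow evening" is hard-coded as 18:00 on the following day.

```python
from datetime import datetime, time, timedelta


def build_ack_payload(host, service, comment, author, now):
    """Return a JSON body for POST /v1/actions/acknowledge-problem."""
    # "tomorrow evening" is interpreted as 18:00 on the following day
    expiry = datetime.combine(now.date() + timedelta(days=1), time(18, 0))
    return {
        "type": "Service",
        "filter": f'host.name == "{host}" && service.name == "{service}"',
        "author": author,
        "comment": comment,
        "expiry": expiry.timestamp(),
        "notify": True,
    }


payload = build_ack_payload(
    "myserver",
    "ntp",
    "Wrong ntp config, needs update",
    author="slack-user",  # hypothetical author name
    now=datetime(2020, 5, 17, 10, 30),
)
print(payload["filter"])
```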
This command works much like the acknowledgement command, except that the bot will also ask for a downtime start. A relative time format can be used here as well.
It's also possible to skip the whole Q/A and issue the action in one command:
dt myserver ntp from now until tomorrow evening NTP update
This will set a downtime for the service ntp on myserver until 6pm the following day.
dt <host> <service> from <time> until <time> <comment>
or
dt <host> from <time> until <time> <comment>
or
dt <service> from <time> until <time> <comment>
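A downtime corresponds to the schedule-downtime action of the Icinga2 API, which needs both a start and an end time. The following sketch builds such a request body from already parsed times. It is an illustration under assumptions, not the bot's code; build_downtime_payload and the author value are hypothetical.

```python
from datetime import datetime


def build_downtime_payload(host, service, start, end, comment, author):
    """Return a JSON body for POST /v1/actions/schedule-downtime."""
    if end <= start:
        raise ValueError("downtime must end after it starts")
    return {
        "type": "Service",
        "filter": f'host.name == "{host}" && service.name == "{service}"',
        "author": author,
        "comment": comment,
        "start_time": start.timestamp(),
        "end_time": end.timestamp(),
        "fixed": True,  # fixed window, no flexible duration
    }


downtime = build_downtime_payload(
    "myserver",
    "ntp",
    start=datetime(2020, 5, 17, 10, 30),  # "now" in this example
    end=datetime(2020, 5, 18, 18, 0),     # "tomorrow evening"
    comment="NTP update",
    author="slack-user",  # hypothetical author name
)
print(downtime["end_time"] - downtime["start_time"])
```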
hs down test
will display all hosts in DOWN state which match "test" as host name like "testserver" or "devtest"
hs all
will return all hosts and their status
hs
will display all hosts which currently have a problem
ss warn crit ntp
will display all services which match "ntp" and are in state CRITICAL or WARNING
ss
will display all services which currently have a problem
ack myserver ntp until tomorrow evening Wrong ntp config, will be updated tomorrow
will acknowledge a problematic service ntp on myserver until 6pm the following day
Important:
To get Slack notifications if something goes wrong you can check out the notification handlers in contrib.
Copy slack-notification.sh to /etc/icinga2/scripts/ and add icinga2_slack_notification_commands.conf to your Icinga configuration (where to put it depends on your setup).
More information here: Icinga2 Docs -> notifications.
Don't forget to set NOTIFICATION_CONFIG in icinga2_slack_notification_commands.conf to the full path to your bot config file.
In order to send alarms to Slack you need a Webhook URL.
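A Slack incoming webhook accepts a JSON document with a text field via HTTP POST. The sketch below builds such a request to illustrate what the Webhook URL is used for. It is not the notification script from contrib; build_webhook_request is a hypothetical helper and the URL is a placeholder, a real one is issued by Slack when the webhook is created.

```python
import json
import urllib.request


def build_webhook_request(webhook_url, text):
    """Build (but do not send) a Slack incoming-webhook message."""
    payload = json.dumps({"text": text}).encode()
    return urllib.request.Request(
        webhook_url,
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Placeholder URL, replace with the Webhook URL of your Slack app
alert = build_webhook_request(
    "https://hooks.slack.com/services/T000/B000/XXXX",
    "CRITICAL: ntp on myserver",
)
print(alert.data.decode())
```

Passing the request to urllib.request.urlopen would deliver the message to the channel the webhook is bound to.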
You can check out the full license here
This project is licensed under the terms of the MIT license.
Quite some inspiration came from mlabouardy and his Go implementation of a Slack bot.