Alignak-monitoring / alignak

Monitoring tool, highly flexible and new standard oriented
https://alignak-monitoring.github.io
GNU Affero General Public License v3.0

Item UNKNOWN (first initial)-status is not passed to event handler #846

Open spea1 opened 7 years ago

spea1 commented 7 years ago

The event handler should also receive the initial UNKNOWN state, even when that state does not change!

# -----------------------------------------------------------------
# grep localhost monitoring-logs.log | grep EVENT
# -----------------------------------------------------------------
[1496257510] INFO: SERVICE EVENT HANDLER: localhost;Cpu;OK;HARD;0;g_service_event_handler
[1496257531] INFO: SERVICE EVENT HANDLER: localhost;Memory;OK;HARD;0;g_service_event_handler
[1496257598] INFO: SERVICE EVENT HANDLER: localhost;Http;OK;HARD;0;g_service_event_handler
[1496257634] INFO: SERVICE EVENT HANDLER: localhost;Load;OK;HARD;0;g_service_event_handler
[1496257674] INFO: HOST EVENT HANDLER: localhost;UP;HARD;0;g_host_event_handler
[1496258276] ERROR: SERVICE EVENT HANDLER: localhost;Disks;CRITICAL;HARD;0;g_service_event_handler
# -----------------------------------------------------------------
# grep localhost monitoring-logs.log
# ----------------------------------------------------------------- 
[1496257472] WARNING: CURRENT HOST STATE: localhost;UNREACHABLE;HARD;0;
[1496257472] INFO: CURRENT SERVICE STATE: localhost;Disks;UNKNOWN;HARD;0;
[1496257472] INFO: CURRENT SERVICE STATE: localhost;Memory;UNKNOWN;HARD;0;
[1496257472] INFO: CURRENT SERVICE STATE: localhost;Disk /var;UNKNOWN;HARD;0;
[1496257472] INFO: CURRENT SERVICE STATE: localhost;Disk root;UNKNOWN;HARD;0;
[1496257472] INFO: CURRENT SERVICE STATE: localhost;Nrpe-status;UNKNOWN;HARD;0;
[1496257472] INFO: CURRENT SERVICE STATE: localhost;Http;UNKNOWN;HARD;0;
[1496257472] INFO: CURRENT SERVICE STATE: localhost;Disk /tmp;UNKNOWN;HARD;0;
[1496257472] INFO: CURRENT SERVICE STATE: localhost;Disk /usr;UNKNOWN;HARD;0;
[1496257472] INFO: CURRENT SERVICE STATE: localhost;Zombies;UNKNOWN;HARD;0;
[1496257472] INFO: CURRENT SERVICE STATE: localhost;NetworkUsage;UNKNOWN;HARD;0;
[1496257472] INFO: CURRENT SERVICE STATE: localhost;Load;UNKNOWN;HARD;0;
[1496257472] INFO: CURRENT SERVICE STATE: localhost;Cpu;UNKNOWN;HARD;0;
[1496257472] INFO: CURRENT SERVICE STATE: localhost;Processus;UNKNOWN;HARD;0;
[1496257472] INFO: CURRENT SERVICE STATE: localhost;Users;UNKNOWN;HARD;0;
[1496257510] INFO: SERVICE EVENT HANDLER: localhost;Cpu;OK;HARD;0;g_service_event_handler
[1496257510] INFO: SERVICE ALERT: localhost;Cpu;OK;HARD;0;2 CPU, average load 33.5% < 80% : OK
[1496257531] INFO: SERVICE ALERT: localhost;Memory;OK;HARD;0;Ram : 35%, Swap : 0% : ; OK
[1496257531] INFO: SERVICE EVENT HANDLER: localhost;Memory;OK;HARD;0;g_service_event_handler
[1496257598] INFO: SERVICE ALERT: localhost;Http;OK;HARD;0;HTTP OK: HTTP/1.1 200 OK - 854 bytes in 0,002 second response time
[1496257598] INFO: SERVICE EVENT HANDLER: localhost;Http;OK;HARD;0;g_service_event_handler
[1496257634] INFO: SERVICE ALERT: localhost;Load;OK;HARD;0;Load (CPUs: 2) : 0.58 0.71 0.79 : OK
[1496257634] INFO: SERVICE EVENT HANDLER: localhost;Load;OK;HARD;0;g_service_event_handler
[1496257674] INFO: HOST ALERT: localhost;UP;HARD;0;PING OK - Packet loss = 0%, RTA = 0.08 ms
[1496257674] INFO: HOST EVENT HANDLER: localhost;UP;HARD;0;g_host_event_handler
[1496257776] INFO: CURRENT SERVICE STATE: localhost;Users;UNKNOWN;HARD;0;
[1496257776] INFO: CURRENT SERVICE STATE: localhost;Disk /usr;UNKNOWN;HARD;0;
[1496257776] INFO: CURRENT SERVICE STATE: localhost;Load;OK;HARD;1;Load (CPUs: 2) : 0.58 0.71 0.79 : OK
[1496257776] INFO: CURRENT SERVICE STATE: localhost;NetworkUsage;UNKNOWN;HARD;0;ERROR : Unknown interface eth\d+
[1496257776] INFO: CURRENT SERVICE STATE: localhost;Disk /var;UNKNOWN;HARD;0;
[1496257776] INFO: CURRENT SERVICE STATE: localhost;Processus;UNKNOWN;HARD;0;
[1496257776] INFO: CURRENT SERVICE STATE: localhost;Disks;UNKNOWN;HARD;0;
[1496257776] INFO: CURRENT SERVICE STATE: localhost;Cpu;OK;HARD;1;2 CPU, average load 33.5% < 80% : OK
[1496257776] INFO: CURRENT HOST STATE: localhost;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.08 ms
[1496257776] INFO: CURRENT SERVICE STATE: localhost;Disk root;UNKNOWN;HARD;0;
[1496257776] INFO: CURRENT SERVICE STATE: localhost;Zombies;UNKNOWN;HARD;0;
[1496257776] INFO: CURRENT SERVICE STATE: localhost;Nrpe-status;UNKNOWN;HARD;0;
[1496257776] INFO: CURRENT SERVICE STATE: localhost;Disk /tmp;UNKNOWN;HARD;0;
[1496257776] INFO: CURRENT SERVICE STATE: localhost;Memory;OK;HARD;1;Ram : 35%, Swap : 0% : ; OK
[1496257776] INFO: CURRENT SERVICE STATE: localhost;Http;OK;HARD;1;HTTP OK: HTTP/1.1 200 OK - 854 bytes in 0,002 second response time
[1496258276] ERROR: SERVICE EVENT HANDLER: localhost;Disks;CRITICAL;HARD;0;g_service_event_handler
[1496258276] ERROR: SERVICE NOTIFICATION: imported_admin;localhost;Disks;CRITICAL;detailled-service-by-email;CRITICAL : (>95%) Cached memory: 100%used(656MB/656MB) Shared memory: 100%used(35MB/35MB)
[1496258276] ERROR: SERVICE NOTIFICATION: guest;localhost;Disks;CRITICAL;detailled-service-by-email;CRITICAL : (>95%) Cached memory: 100%used(656MB/656MB) Shared memory: 100%used(35MB/35MB)
[1496258276] ERROR: SERVICE ALERT: localhost;Disks;CRITICAL;HARD;0;CRITICAL : (>95%) Cached memory: 100%used(656MB/656MB) Shared memory: 100%used(35MB/35MB)
[1496258380] INFO: CURRENT SERVICE STATE: localhost;Disk /usr;UNKNOWN;HARD;0;CHECK_NRPE: Error receiving data from daemon.
[1496258381] INFO: CURRENT SERVICE STATE: localhost;Zombies;UNKNOWN;HARD;0;CHECK_NRPE: Error receiving data from daemon.
[1496258381] INFO: CURRENT SERVICE STATE: localhost;Load;OK;HARD;1;Load (CPUs: 2) : 0.19 0.53 0.70 : OK
[1496258381] INFO: CURRENT SERVICE STATE: localhost;Cpu;OK;HARD;1;2 CPU, average load 11.5% < 80% : OK
[1496258381] INFO: CURRENT SERVICE STATE: localhost;Users;UNKNOWN;HARD;0;CHECK_NRPE: Error receiving data from daemon.
[1496258381] INFO: CURRENT SERVICE STATE: localhost;Disk /var;UNKNOWN;HARD;0;CHECK_NRPE: Error receiving data from daemon.
[1496258381] INFO: CURRENT SERVICE STATE: localhost;Memory;OK;HARD;1;Ram : 36%, Swap : 0% : ; OK
[1496258381] INFO: CURRENT SERVICE STATE: localhost;Disk root;UNKNOWN;HARD;0;CHECK_NRPE: Error receiving data from daemon.
[1496258381] INFO: CURRENT SERVICE STATE: localhost;Processus;UNKNOWN;HARD;0;CHECK_NRPE: Error receiving data from daemon.
[1496258381] ERROR: CURRENT SERVICE STATE: localhost;Disks;CRITICAL;HARD;0;CRITICAL : (>95%) Cached memory: 100%used(656MB/656MB) Shared memory: 100%used(35MB/35MB)
[1496258381] INFO: CURRENT SERVICE STATE: localhost;NetworkUsage;UNKNOWN;HARD;0;ERROR : Unknown interface eth\d+
[1496258381] INFO: CURRENT SERVICE STATE: localhost;Nrpe-status;UNKNOWN;HARD;0;CHECK_NRPE: Error receiving data from daemon.
[1496258381] INFO: CURRENT SERVICE STATE: localhost;Http;OK;HARD;1;HTTP OK: HTTP/1.1 200 OK - 854 bytes in 0,001 second response time
[1496258381] INFO: CURRENT SERVICE STATE: localhost;Disk /tmp;UNKNOWN;HARD;0;CHECK_NRPE: Error receiving data from daemon.
[1496258381] INFO: CURRENT HOST STATE: localhost;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.09 ms
# -----------------------------------------------------------------
# /usr/local/etc/alignak/alignak.cfg
# -----------------------------------------------------------------
...
# Eventhandler
cfg_dir=arbiter/objects/eventhandler
...
# Event handlers configuration
# ---
# Event handlers are enabled/disabled
enable_event_handlers=1

# By default don't launch event handlers during downtime. Put 0 to
# get back the default nagios behavior
no_event_handlers_during_downtimes=0

# Global host/service event handlers
global_host_event_handler=g_host_event_handler
global_service_event_handler=g_service_event_handler

# After a timeout, launched plugins are killed
event_handler_timeout=30
...
# Environment macros configuration
# ---
# Disabling environment macros is good for performance. If you really
enable_environment_macros=1
...

# -----------------------------------------------------------------
# /usr/local/etc/alignak/arbiter/templates/generic-host.cfg  
# -----------------------------------------------------------------
...
event_handler_enabled   1
...

# -----------------------------------------------------------------
# /usr/local/etc/alignak/arbiter/templates/generic-service.cfg
# -----------------------------------------------------------------
event_handler_enabled           1                       ; Service event handler is enabled

# -----------------------------------------------------------------
# /usr/local/etc/alignak/arbiter/objects/eventhandler/eventhandler.cfg
# -----------------------------------------------------------------
define command{
       command_name g_host_event_handler
       command_line /usr/local/var/libexec/alignak/eventhandler.sh "ALIGNAK-EVENT HOST hHOSTNAME $HOSTNAME$ hHOSTADDRESS $HOSTADDRESS$ hSTATE $HOSTSTATE$ hEVENTID $HOSTEVENTID$ hSTATEID $HOSTSTATEID$ hSTATETYPE $HOSTSTATETYPE$ hATTEMPT $HOSTATTEMPT$ hTIMET $TIMET$ hOUTPUT $HOSTOUTPUT$"
       }

define command{
       command_name g_service_event_handler
       command_line /usr/local/var/libexec/alignak/eventhandler.sh "ALIGNAK-EVENT SERVICE sHOSTNAME $HOSTNAME$ sHOSTADDRESS $HOSTADDRESS$ sDESC $SERVICEDESC$ sSTATEID $SERVICESTATEID$ sSTATE $SERVICESTATE$ sEVENTID $SERVICEEVENTID$ sSTATETYPE $SERVICESTATETYPE$ sATTEMPT $SERVICEATTEMPT$ sTIMET $TIMET$ sOUTPUT $SERVICEOUTPUT$"
       }

# -----------------------------------------------------------------
# /usr/local/var/libexec/alignak/eventhandler.sh
# -----------------------------------------------------------------
#!/bin/sh
# Append the payload passed by Alignak (first argument) to a log file.
LOG_DIR=/usr/local/var/log/alignak
echo "# -----------------------------------------------------------------" >> "$LOG_DIR/eventhandler.log"
echo "# $1" >> "$LOG_DIR/eventhandler.log"

mohierf commented 7 years ago

Is the initial state really of interest to event handlers? @spea1, please explain what it would be used for.

spea1 commented 7 years ago

I would like to use it to forward host and service status to an external system.

For example, I could then react from the outside, e.g. by acknowledging a problem.
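A minimal sketch of how a handler script might pull individual fields out of the payload string built by the `command_line` definitions in the report, before forwarding them elsewhere. The helper name `field_between` and the sample values (address, state ID, timestamp) are assumptions for illustration only. It relies on the key order used in `g_service_event_handler` and would break if a plugin output happened to contain one of the keys.

```sh
#!/bin/sh
# Hypothetical helper, not part of Alignak.
# Prints the text found between two marker keys of the payload.
# Usage: field_between "<payload>" <key> <next_key>
field_between() {
    # Keys are plain words such as sDESC or sSTATE, so they are safe inside the sed pattern.
    printf '%s\n' "$1" | sed -n "s/^.* $2 \(.*\) $3 .*\$/\1/p"
}

# Illustrative payload in the layout defined by g_service_event_handler (values invented).
payload='ALIGNAK-EVENT SERVICE sHOSTNAME localhost sHOSTADDRESS 127.0.0.1 sDESC Disk /tmp sSTATEID 3 sSTATE UNKNOWN sEVENTID 0 sSTATETYPE HARD sATTEMPT 0 sTIMET 1496257472 sOUTPUT '

service=$(field_between "$payload" sDESC sSTATEID)
state=$(field_between "$payload" sSTATE sEVENTID)
echo "service=$service state=$state"
```

Because each value is delimited by the following key rather than by whitespace, service descriptions containing spaces (such as `Disk /tmp`) are kept intact.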

mohierf commented 7 years ago

The initial state defined in the configuration is only used if no other state is known for a host/service. Indeed, it is used when the very first check has not yet been executed... Are you sure this is what you need?
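To check which services in a log like the one in the report were never passed to an event handler, something along these lines may help. It is a rough sketch: the function name `missing_handlers` is hypothetical, and it assumes the `CURRENT SERVICE STATE` and `SERVICE EVENT HANDLER` line layout shown in the pasted `monitoring-logs.log` excerpt (semicolon-separated payload starting with host and service).

```sh
#!/bin/sh
# Hypothetical helper, not part of Alignak.
# Lists services of a host that were logged with a state
# but never appear in a SERVICE EVENT HANDLER line.
# Usage: missing_handlers <log_file> <host>
missing_handlers() {
    awk -v host="$2" '
        / CURRENT SERVICE STATE: / || / SERVICE EVENT HANDLER: / {
            line = $0
            # Drop everything up to the "KIND: " prefix, leaving host;service;state;...
            sub(/^.*(CURRENT SERVICE STATE|SERVICE EVENT HANDLER): /, "", line)
            split(line, f, ";")
            if (f[1] != host) next
            if ($0 ~ / SERVICE EVENT HANDLER: /) handled[f[2]] = 1
            else seen[f[2]] = 1
        }
        END {
            # Report services that were seen but never handled.
            for (s in seen) if (!(s in handled)) print s
        }
    ' "$1" | LC_ALL=C sort
}

# Example invocation (log file name taken from the report):
# missing_handlers monitoring-logs.log localhost
```

Services that only ever stayed in their initial UNKNOWN state would show up in this list, which is the situation described in this issue.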