Icinga / icinga2

The core of our monitoring platform with a powerful configuration language and REST API.
https://icinga.com/docs/icinga2/latest
GNU General Public License v2.0

receiving outdated notification information #8762

Open davixd opened 3 years ago

davixd commented 3 years ago

Describe the bug

When a notification reminder is generated outside the notification time period, the reminder should check the service's latest state once the notification period becomes active again. Otherwise it delivers outdated notification information.

To Reproduce

timeperiods.conf:

object TimePeriod "admin-sms-receive" {
  display_name = "Check Period for admin sms received"
  excludes = [ "feiertage-de-fix", "feiertage-de-dynamic" ]
  ranges = {
    "monday"    = "12:00-12:01"
    "tuesday"   = "12:00-12:01"
    "wednesday" = "12:00-12:01"
    "thursday"  = "12:00-12:01"
    "friday"    = "12:00-12:01"
  }
}

object TimePeriod "admin-sms-receive-check-time" {
  display_name = "Check Period for admin sms received"
  ranges = {
    "monday"    = "12:00-12:02:10"
    "tuesday"   = "12:00-12:02:10"
    "wednesday" = "12:00-12:02:10"
    "thursday"  = "12:00-12:02:10"
    "friday"    = "12:00-12:02:10"
  }
}

services.conf:

object Service "check_admin_sms" {
  import "notification-sms-bereitschaft-service"

  check_command = "check_admin_sms"
  vars.notification.sms_custom_period = "admin-sms-receive"

  max_check_attempts = 1
  check_interval = 1s
  retry_interval = 2m
  check_period = "admin-sms-receive-check-time"
  enable_perfdata = false

  host_name = "dummy.admin-sms-prod"
}

templates_notification_services.conf:

template Service "notification-sms-bereitschaft-service" {
  enable_flapping = true
  vars.notification["sms_contacts"].users += [ "xxx-admin-sms" ]
}

apply_notification_services.conf:

apply Notification "sms-service" to Service {
  import "sms-service-notification-template"

  users += service.vars.notification.sms_contacts.users

  if (service.vars.notification.sms_custom_period) {
    period = service.vars.notification.sms_custom_period
  }

  assign where service.vars.notification.sms_contacts.users
}

templates_notifications.conf:

template Notification "sms-service-notification-template" {
  command = "sms-service-notification"

  states = [ OK, Critical ]
  types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart, FlappingEnd ]

  period = "24x7"
  interval = 24h
}

commands.conf:

object CheckCommand "check_admin_sms" {
  command = [ PluginDir + "/check_admin_sms" ]
}

/usr/lib64/nagios/plugins/check_admin_sms:

#!/bin/bash
# Version 1.0

# Current time as HHMM, e.g. 1200
date=$(date +"%H%M")

if [[ $date == 1200 || $date == 1201 ]]; then
  # Critical during the minutes 12:00 and 12:01
  echo "Check if sms PROD Icinga2 received on admin handy"
  exit 2
else
  echo "OK: Its not 12:00 p.m. yet! There is no need for a admin check SMS."
  exit 0
fi
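
The interplay of the plugin and the two time periods above can be sketched as a small timeline model. This is only an illustration, not Icinga 2 code: it assumes ranges behave as half-open intervals, ignores weekdays and the `excludes` list, and the helper names (`in_period`, `plugin_state`, `describe`) are hypothetical.

```python
from datetime import time

# Assumption: each range is a half-open interval [start, end).
NOTIFICATION_PERIOD = (time(12, 0), time(12, 1))   # "admin-sms-receive"
CHECK_PERIOD = (time(12, 0), time(12, 2, 10))      # "admin-sms-receive-check-time"


def in_period(now, period):
    """Return True if `now` falls inside the half-open period."""
    start, end = period
    return start <= now < end


def plugin_state(now):
    """Mirror check_admin_sms: CRITICAL during 12:00 and 12:01, otherwise OK."""
    return "CRITICAL" if now.strftime("%H%M") in ("1200", "1201") else "OK"


def describe(now):
    """Return (plugin state, inside check period, inside notification period)."""
    return (
        plugin_state(now),
        in_period(now, CHECK_PERIOD),
        in_period(now, NOTIFICATION_PERIOD),
    )


for t in (time(12, 0, 30), time(12, 1, 30), time(12, 2, 5)):
    print(t, describe(t))
```

Under these assumptions, the OK result at 12:02:05 is still checked but falls outside the notification period, which is the situation in which the recovery notification cannot be delivered immediately.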

Expected behavior

The Icinga 2 service check changes its state to OK at 12:02 p.m., outside the notification time period. When the notification time period becomes active again, the reminder notification should compare the state it is about to report with the service's latest state. Because the service check becomes critical again at 12:00 p.m., the reminder notification generated earlier outside the notification time period (with state OK) should be dropped. Otherwise it delivers outdated notification information.
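
The requested behavior can be sketched as a simple staleness check. This is a minimal illustration under assumptions, not how Icinga 2 implements stashed notifications; `StashedNotification` and `should_send` are hypothetical names, and timestamps are plain numbers.

```python
from dataclasses import dataclass


@dataclass
class StashedNotification:
    notification_type: str  # e.g. "Recovery" or "Problem"
    state: str              # service state the notification reports
    stashed_at: float       # time at which the notification was held back


def should_send(stashed, current_state, last_state_change):
    """Send only if the service has not changed state since the stash.

    current_state and last_state_change describe the service at the moment
    the notification time period becomes active again.
    """
    same_state = stashed.state == current_state
    unchanged_since_stash = last_state_change <= stashed.stashed_at
    return same_state and unchanged_since_stash


# The case from this report: an OK recovery stashed at 12:02, while the
# service is CRITICAL again when the period reopens the next day.
recovery = StashedNotification("Recovery", "OK", stashed_at=100.0)
print(should_send(recovery, current_state="CRITICAL", last_state_change=200.0))
```

With this rule the outdated OK notification is dropped, and only the current problem notification would reach the recipient.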

Your Environment

rpm -qa |grep release
centos-release-7-9.2009.1.el7.centos.x86_64
rpm -qa |grep icinga
icingaweb2-vendor-JShrink-2.8.2-1.el7.icinga.noarch
icingaweb2-2.8.2-1.el7.icinga.noarch
icingaweb2-common-2.8.2-1.el7.icinga.noarch
icingaweb2-vendor-zf1-2.8.2-1.el7.icinga.noarch
icingaweb2-vendor-dompdf-2.8.2-1.el7.icinga.noarch
php-Icinga-2.8.2-1.el7.icinga.noarch
icingacli-2.8.2-1.el7.icinga.noarch
icinga2-bin-2.12.3-1.el7.icinga.x86_64
icingaweb2-vendor-lessphp-2.8.2-1.el7.icinga.noarch
icingaweb2-vendor-HTMLPurifier-2.8.2-1.el7.icinga.noarch
icinga-l10n-1.0.0-1.el7.icinga.noarch
icingaweb2-vendor-Parsedown-2.8.2-1.el7.icinga.noarch
icinga2-common-2.12.3-1.el7.icinga.x86_64
icinga2-2.12.3-1.el7.icinga.x86_64
icinga2-ido-mysql-2.12.3-1.el7.icinga.x86_64
vim-icinga2-2.12.3-1.el7.icinga.x86_64
rpm -qa |grep php
rh-php73-runtime-1-1.el7.x86_64
rh-php73-php-ldap-7.3.20-1.el7.x86_64
php-5.4.16-48.el7.x86_64
rh-php73-php-mysqlnd-7.3.20-1.el7.x86_64
rh-php73-php-cli-7.3.20-1.el7.x86_64
php-cli-5.4.16-48.el7.x86_64
rh-php73-php-json-7.3.20-1.el7.x86_64
rh-php73-php-pgsql-7.3.20-1.el7.x86_64
rh-php73-php-gd-7.3.20-1.el7.x86_64
rh-php73-php-mbstring-7.3.20-1.el7.x86_64
php-common-5.4.16-48.el7.x86_64
rh-php73-php-common-7.3.20-1.el7.x86_64
rh-php73-php-fpm-7.3.20-1.el7.x86_64
rh-php73-php-intl-7.3.20-1.el7.x86_64
php-Icinga-2.8.2-1.el7.icinga.noarch
php-gd-5.4.16-48.el7.x86_64
rh-php73-php-pdo-7.3.20-1.el7.x86_64
icingaweb2-vendor-lessphp-2.8.2-1.el7.icinga.noarch
sclo-php73-php-pecl-imagick-3.4.4-3.el7.x86_64
rh-php73-php-zip-7.3.20-1.el7.x86_64
rh-php73-php-xml-7.3.20-1.el7.x86_64
rpm -qa |grep maria
mariadb-libs-5.5.68-1.el7.x86_64
mariadb-5.5.68-1.el7.x86_64
mariadb-server-5.5.68-1.el7.x86_64
rpm -qa |grep httpd
httpd-tools-2.4.6-97.el7.centos.x86_64
httpd-2.4.6-97.el7.centos.x86_64
icinga2 feature list
Disabled features: command compatlog debuglog elasticsearch gelf graphite influxdb livestatus opentsdb statusdata syslog
Enabled features: api checker ido-mysql mainlog notification perfdata
Al2Klimov commented 1 year ago

Note

It's likely the stashed notification feature.

davixd commented 1 year ago

To avoid misleading notification information, the stashed notification feature should check whether the state changed after the notification was stashed and, if so, drop the stashed notification, or at least ensure that notifications are sent in chronological order.
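
The fallback mentioned here, sending notifications in chronological order, could look roughly like the following sketch. It is an assumption-laden illustration rather than Icinga 2 behavior; `NotificationQueue` and its methods are hypothetical.

```python
import heapq


class NotificationQueue:
    """Hold notifications and release them ordered by creation time."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker that keeps insertion order stable

    def push(self, created_at, message):
        heapq.heappush(self._heap, (created_at, self._counter, message))
        self._counter += 1

    def flush(self):
        """Return all held messages, oldest first, and empty the queue."""
        ordered = []
        while self._heap:
            _, _, message = heapq.heappop(self._heap)
            ordered.append(message)
        return ordered


queue = NotificationQueue()
queue.push(200.0, "PROBLEM: check_admin_sms is CRITICAL")
queue.push(100.0, "RECOVERY: check_admin_sms is OK")
print(queue.flush())
```

Ordering alone still delivers the outdated recovery, only before the newer problem, so it is the weaker of the two options.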

Al2Klimov commented 1 year ago

> Note
>
> It's likely the stashed notification feature.

No, this seems to be a race condition: a suppressed recovery notification is fired in the 1/2 second before the service becomes critical again.