Icinga / icinga2

The core of our monitoring platform with a powerful configuration language and REST API.
https://icinga.com/docs/icinga2/latest
GNU General Public License v2.0

Downtimes scheduled via API (sometimes) get synced/re-created with delay and are doubled #10078

Open log1-c opened 3 weeks ago

log1-c commented 3 weeks ago

Hello team,

Sorry for the confusing title; hopefully the problem becomes clear from the description below. I've attached the log output and IDO history as a file because GitHub rejected the issue with a "body too long" error.

We have Windows session hosts that start and stop based on the customer's session demand (scaling). When a host is shut down, it creates a downtime for itself and all of its services via the API (currently mostly against the secondary master). Once the host is started again, a script triggers a recheck of all checks (to make sure they are all OK and don't notify on a stale state) and then removes the downtime.
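For readers who want to reproduce the workflow, here is a minimal sketch of the three API calls it involves, not our actual script. It only builds the request paths and JSON bodies for the `schedule-downtime`, `reschedule-check` and `remove-downtime` actions; the host name, author, comment and the function names are placeholders made up for illustration, and sending the requests (with authentication and an `Accept: application/json` header) is left out.

```python
import json


def schedule_downtime(host, start, end,
                      author="scaling-script",          # placeholder author
                      comment="Host stopped by scaling"):  # placeholder comment
    """Downtime for the host and, via all_services, for all of its services."""
    body = {
        "type": "Host",
        "filter": "host.name == host_name",
        "filter_vars": {"host_name": host},
        "all_services": True,
        "author": author,
        "comment": comment,
        "start_time": start,   # Unix timestamps
        "end_time": end,
        "fixed": True,
    }
    return "/v1/actions/schedule-downtime", body


def reschedule_checks(host):
    """Force an immediate recheck of every service on the host."""
    body = {
        "type": "Service",
        "filter": "host.name == host_name",
        "filter_vars": {"host_name": host},
        "force": True,
    }
    return "/v1/actions/reschedule-check", body


def remove_downtimes(host, object_type):
    """Remove downtimes of the host ("Host") or of its services ("Service")."""
    body = {
        "type": object_type,
        "filter": "host.name == host_name",
        "filter_vars": {"host_name": host},
    }
    return "/v1/actions/remove-downtime", body


if __name__ == "__main__":
    # Hypothetical host name; print the calls in the order the workflow makes them.
    calls = [
        schedule_downtime("win-sh-01", start=1717783380, end=1717869780),
        reschedule_checks("win-sh-01"),
        remove_downtimes("win-sh-01", "Host"),
        remove_downtimes("win-sh-01", "Service"),
    ]
    for path, body in calls:
        print("POST", path, json.dumps(body))
```

Each body would be sent as a POST to the API listener of the chosen master. Removing by both `Host` and `Service` type is a conservative assumption in this sketch, not a statement about what the host-level removal covers.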

Sometimes something strange happens, which I'll try to explain with the following logs. The downtime was created at 2024-06-07 20:03, but it does not show up in the web interface history, although it is present in the database. However, its actual_start_time differs from its scheduled_start_time.

On host start the downtime gets deleted (2024-06-09 23:01), but it is then immediately recreated (without another API call) [see the log from the secondary master on 2024-06-09 at the DOWNTIME actual_start_time].

The host has a total of 30 services, so 31 downtimes would be expected (the host plus its services), but the IDO lists 31+31 downtimes, i.e. double the amount for the same scheduled_start_time.
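To make the doubling easier to spot, a small sketch like the following could group downtime records by host, service and start time and report every key that occurs more than once. It assumes the records are plain dictionaries carrying the `host_name`, `service_name` and `start_time` attributes of Downtime objects (for example as extracted from `/v1/objects/downtimes`); the function name and sample data are made up for illustration.

```python
from collections import Counter


def find_duplicate_downtimes(downtimes):
    """Return {(host, service, start_time): count} for keys seen more than once.

    An empty service name stands for the host downtime itself.
    """
    counts = Counter(
        (d["host_name"], d.get("service_name", ""), d["start_time"])
        for d in downtimes
    )
    return {key: n for key, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Hypothetical data: one host downtime plus 30 service downtimes, each stored twice.
    start = 1717783380
    once = [{"host_name": "win-sh-01", "service_name": "", "start_time": start}]
    once += [
        {"host_name": "win-sh-01", "service_name": f"svc-{i}", "start_time": start}
        for i in range(30)
    ]
    duplicates = find_duplicate_downtimes(once + once)
    print(len(duplicates), "objects have more than one downtime for the same start time")
```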

downtime-issue_log_history.md

Your Environment

Additional context

We talked about this problem with Eric during Icinga Summit. He mentioned the active-stage files/folders under /var/lib/icinga2/api/packages, so I'm adding those here as well, although I didn't spot anything "out of line".

config master

# cd /var/lib/icinga2/api/packages/
[root@msp-ic-ma02 packages]# ls -la
total 0
drwx------. 4 icinga icinga  34 Oct 17  2023 .
drwxr-x---. 6 icinga icinga 155 Jun 10 09:21 ..
drwx------. 3 icinga icinga 109 Nov 10  2022 _api
drwx------. 3 icinga icinga 109 Jun 10 09:22 director

# ls -la _api/
total 16
drwx------. 3 icinga icinga  109 Nov 10  2022 .
drwx------. 4 icinga icinga   34 Oct 17  2023 ..
-rw-r--r--. 1 icinga icinga  453 Nov 10  2022 active.conf
-rw-r--r--. 1 icinga icinga   36 Nov 10  2022 active-stage
drwx------. 4 icinga icinga 4096 Nov 10  2022 e0b939bd-8e46-45a2-bacd-0609f639d271
-rw-r--r--. 1 icinga icinga   25 Nov 10  2022 include.conf
[root@msp-ic-ma02 packages]# cat _api/active-stage
e0b939bd-8e46-45a2-bacd-0609f639d271[root@msp-ic-ma02 packages]# cat _api/active.conf
if (!globals.contains("ActiveStages")) {
  globals.ActiveStages = {}
}

if (globals.contains("ActiveStageOverride")) {
  var arr = ActiveStageOverride.split(":")
  if (arr[0] == "_api") {
    if (arr.len() < 2) {
      log(LogCritical, "Config", "Invalid value for ActiveStageOverride")
    } else {
      ActiveStages["_api"] = arr[1]
    }
  }
}

if (!ActiveStages.contains("_api")) {
  ActiveStages["_api"] = "e0b939bd-8e46-45a2-bacd-0609f639d271"
}

# cat _api/include.conf
include "*/include.conf"

secondary master

# cd /var/lib/icinga2/api/packages/
[root@msp-ic-ma01 packages]# ls -la
total 0
drwx------. 3 icinga icinga  18 Jun 23  2023 .
drwxr-x---. 6 icinga icinga 155 Jun 10 09:21 ..
drwx------. 3 icinga icinga 109 Jun 23  2023 _api

# ls -la _api/
total 12
drwx------. 3 icinga icinga 109 Jun 23  2023 .
drwx------. 3 icinga icinga  18 Jun 23  2023 ..
drwx------. 4 icinga icinga  55 Jun 23  2023 8de75f7d-91bc-4599-8f75-51a4fb95586d
-rw-r--r--. 1 icinga icinga 453 Jun 23  2023 active.conf
-rw-r--r--. 1 icinga icinga  36 Jun 23  2023 active-stage
-rw-r--r--. 1 icinga icinga  25 Jun 23  2023 include.conf
[root@msp-ic-ma01 packages]# cat _api/active-stage
8de75f7d-91bc-4599-8f75-51a4fb95586d[root@msp-ic-ma01 packages]# cat _api/active.conf
if (!globals.contains("ActiveStages")) {
  globals.ActiveStages = {}
}

if (globals.contains("ActiveStageOverride")) {
  var arr = ActiveStageOverride.split(":")
  if (arr[0] == "_api") {
    if (arr.len() < 2) {
      log(LogCritical, "Config", "Invalid value for ActiveStageOverride")
    } else {
      ActiveStages["_api"] = arr[1]
    }
  }
}

if (!ActiveStages.contains("_api")) {
  ActiveStages["_api"] = "8de75f7d-91bc-4599-8f75-51a4fb95586d"
}

# cat _api/include.conf
include "*/include.conf"

customer satellite

# cd /var/lib/icinga2/api/packages/
[root@orgad-mgmt02 packages]# ls -la
total 0
drwx------. 3 icinga icinga  18 Dec 20  2022 .
drwxr-x---. 6 icinga icinga 122 Jun 10 09:21 ..
drwx------. 3 icinga icinga 109 Dec 20  2022 _api

# ls -la _api/
total 12
drwx------. 3 icinga icinga 109 Dec 20  2022 .
drwx------. 3 icinga icinga  18 Dec 20  2022 ..
-rw-r--r--. 1 icinga icinga 453 Dec 20  2022 active.conf
-rw-r--r--. 1 icinga icinga  36 Dec 20  2022 active-stage
drwx------. 4 icinga icinga  55 Dec 20  2022 fc9095a4-6b5c-4d8c-96b5-d21682c1e4fd
-rw-r--r--. 1 icinga icinga  25 Dec 20  2022 include.conf
[root@orgad-mgmt02 packages]# cat _api/active-stage
fc9095a4-6b5c-4d8c-96b5-d21682c1e4fd[root@orgad-mgmt02 packages]# cat _api/active.conf
if (!globals.contains("ActiveStages")) {
  globals.ActiveStages = {}
}

if (globals.contains("ActiveStageOverride")) {
  var arr = ActiveStageOverride.split(":")
  if (arr[0] == "_api") {
    if (arr.len() < 2) {
      log(LogCritical, "Config", "Invalid value for ActiveStageOverride")
    } else {
      ActiveStages["_api"] = arr[1]
    }
  }
}

if (!ActiveStages.contains("_api")) {
  ActiveStages["_api"] = "fc9095a4-6b5c-4d8c-96b5-d21682c1e4fd"
}

# cat _api/include.conf
include "*/include.conf"
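The manual comparison above (active-stage content vs. stage directory vs. the default in active.conf) could also be scripted. The following is a minimal sketch under the assumption that a package directory looks like the listings above; the function name and the regular expression are my own and only match the quoted default assignment in active.conf.

```python
import re
from pathlib import Path


def check_active_stage(package_dir):
    """Return a list of inconsistencies found in one package directory."""
    pkg = Path(package_dir)
    problems = []

    # active-stage has no trailing newline in the listings above; strip anyway.
    stage = (pkg / "active-stage").read_text().strip()
    if not stage:
        return ["active-stage is empty"]
    if not (pkg / stage).is_dir():
        problems.append(f"stage directory {stage} is missing")

    # Only the quoted default, e.g. ActiveStages["_api"] = "<uuid>", is matched.
    conf = (pkg / "active.conf").read_text()
    match = re.search(r'ActiveStages\["[^"]+"\]\s*=\s*"([^"]+)"', conf)
    if match is None:
        problems.append("no default stage found in active.conf")
    elif match.group(1) != stage:
        problems.append(
            f"active.conf defaults to {match.group(1)}, active-stage says {stage}"
        )
    return problems


if __name__ == "__main__":
    for name in ("_api", "director"):
        path = Path("/var/lib/icinga2/api/packages") / name
        if path.is_dir():
            print(name, check_active_stage(path) or "consistent")
```

An empty result for a package only means the three places agree with each other, which matches what I saw by eye on all three nodes.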