Icinga / icinga2

The core of our monitoring platform with a powerful configuration language and REST API.
https://icinga.com/docs/icinga2/latest
GNU General Public License v2.0

icinga2 object list not iso on ha config #8489

Closed AurelienFo closed 3 years ago

AurelienFo commented 4 years ago

I'm running an Icinga 2 instance in HA (2 masters) with many zones. Everything works fine, but I have a strange problem with the "icinga2 object list" command.

On my master01:

icinga2 object list | grep Object | wc -l
4710

On my master02:

icinga2 object list | grep Object | wc -l
16593

What is strange is that master01 is the "primary":

Active Endpoint | master01
Active Icinga Web 2 Endpoint | master01

If I look in /var/lib/icinga2/api/packages/_api/API_KEY/conf.d/hosts on each master, I find the same number of hosts.

Environment (same on both masters):

icinga2 --version
icinga2 - The Icinga 2 network monitoring daemon (version: r2.11.3-1)

icinga2 feature list
Disabled features: command compatlog debuglog elasticsearch gelf graphite influxdb livestatus opentsdb statusdata syslog
Enabled features: api checker ido-pgsql mainlog notification perfdata

These two commands return the same number of elements on both masters:

icinga2 object list --type Zone | wc -l
icinga2 object list --type Endpoint | wc -l

Only difference is that on my master01:

icinga2 object list --type Host | grep api

returns zero elements, as if it doesn't look at the API hosts.

/etc/icinga2/zones.conf is the same on both masters
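To see which object types account for the gap, the output can be tallied per type instead of counted as a whole. This is a minimal sketch, assuming the header lines of icinga2 object list have the form Object 'name' of type 'Type': (as in the CheckCommand output quoted later in this thread). The function name count_by_type is hypothetical and not part of Icinga 2.

```shell
#!/bin/sh
# count_by_type: hypothetical helper. Reads `icinga2 object list` output on
# stdin and prints "<count> <type>" for each object type, sorted by type name.
count_by_type() {
  # Header lines are assumed to look like: Object 'name' of type 'Type':
  sed -n "s/^Object '.*' of type '\([^']*\)':\$/\1/p" \
    | sort | uniq -c | awk '{print $1, $2}'
}

# Usage on each master (needs icinga2 installed there):
#   icinga2 object list | count_by_type > types.txt
```

Comparing the two resulting files side by side shows which types differ between the masters.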

Thanks for any help

Al2Klimov commented 4 years ago

Hello @AurelienFo and thank you for reporting!

Please provide comparisons like the one at the top, per --type, for all types showing up in icinga2 daemon -C.

Best, AK

AurelienFo commented 4 years ago

Thanks for your answer.

master01:

icinga2 daemon  -C
[2020-11-26 08:39:20 +0100] information/cli: Icinga application loader (version: r2.11.3-1)
[2020-11-26 08:39:20 +0100] information/cli: Loading configuration file(s).
[2020-11-26 08:39:21 +0100] information/ConfigItem: Committing config item(s).
[2020-11-26 08:39:21 +0100] information/ApiListener: My API identity: master01.XXXX
[2020-11-26 08:39:21 +0100] warning/ApplyRule: Apply rule 'mail-icingaadmin' (in /etc/icinga2/conf.d/notifications.conf: 11:1-11:45) for type 'Notification' does not match anywhere!
[2020-11-26 08:39:21 +0100] warning/ApplyRule: Apply rule 'mail-icingaadmin' (in /etc/icinga2/conf.d/notifications.conf: 23:1-23:48) for type 'Notification' does not match anywhere!
[2020-11-26 08:39:21 +0100] warning/ApplyRule: Apply rule 'opsgenie-host-notification-' (in /etc/icinga2/zones.d/global-templates/common/notifications.conf: 70:9-70:197) for type 'Notification' does not match anywhere!
[2020-11-26 08:39:21 +0100] warning/ApplyRule: Apply rule 'backup-downtime' (in /etc/icinga2/conf.d/downtimes.conf: 5:1-5:52) for type 'ScheduledDowntime' does not match anywhere!
[2020-11-26 08:39:21 +0100] warning/ApplyRule: Apply rule 'ssh' (in /etc/icinga2/zones.d/global-templates/packs/ssh/services/ssh.conf: 1:0-1:18) for type 'Service' does not match anywhere!
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 682 HostGroups.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 1 EventCommand.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 1 FileLogger.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 6 NotificationCommands.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 1 NotificationComponent.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 1147 Notifications.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 1 IcingaApplication.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 1031 Hosts.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 1 ApiListener.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 1 PerfdataWriter.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 1 CheckerComponent.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 29 Zones.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 56 Endpoints.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 10 ApiUsers.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 243 CheckCommands.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 1 IdoPgsqlConnection.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 4 TimePeriods.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 55 UserGroups.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 4 Users.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 1433 Services.
[2020-11-26 08:39:21 +0100] information/ConfigItem: Instantiated 3 ServiceGroups.
[2020-11-26 08:39:22 +0100] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2020-11-26 08:39:22 +0100] information/cli: Finished validating the configuration file(s).

master02:

icinga2 daemon -C
[2020-11-26 08:41:45 +0100] information/cli: Icinga application loader (version: r2.11.3-1)
[2020-11-26 08:41:45 +0100] information/cli: Loading configuration file(s).
[2020-11-26 08:41:46 +0100] information/ConfigItem: Committing config item(s).
[2020-11-26 08:41:46 +0100] information/ApiListener: My API identity: master02.XXXX
[2020-11-26 08:41:54 +0100] warning/ApplyRule: Apply rule 'mail-icingaadmin' (in /etc/icinga2/conf.d/notifications.conf: 11:1-11:45) for type 'Notification' does not match anywhere!
[2020-11-26 08:41:54 +0100] warning/ApplyRule: Apply rule 'mail-icingaadmin' (in /etc/icinga2/conf.d/notifications.conf: 23:1-23:48) for type 'Notification' does not match anywhere!
[2020-11-26 08:41:54 +0100] warning/ApplyRule: Apply rule 'opsgenie-host-notification-' (in /etc/icinga2/zones.d/global-templates/common/notifications.conf: 70:9-70:197) for type 'Notification' does not match anywhere!
[2020-11-26 08:41:54 +0100] warning/ApplyRule: Apply rule 'backup-downtime' (in /etc/icinga2/conf.d/downtimes.conf: 5:1-5:52) for type 'ScheduledDowntime' does not match anywhere!
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 682 HostGroups.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 1 EventCommand.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 1 FileLogger.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 6 NotificationCommands.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 1 NotificationComponent.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 14807 Notifications.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 1 IcingaApplication.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 17851 Hosts.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 51 Downtimes.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 1 ApiListener.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 21 Comments.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 1 PerfdataWriter.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 1 CheckerComponent.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 29 Zones.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 56 Endpoints.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 10 ApiUsers.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 243 CheckCommands.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 1 IdoPgsqlConnection.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 3 TimePeriods.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 55 UserGroups.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 4 Users.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 35121 Services.
[2020-11-26 08:41:54 +0100] information/ConfigItem: Instantiated 3 ServiceGroups.
[2020-11-26 08:41:56 +0100] information/WorkQueue: #4 (DaemonUtility::LoadConfigFiles) items: 0, rate: 32.4/s (1944/min 1944/5min 1944/15min);
[2020-11-26 08:41:56 +0100] information/WorkQueue: #5 (ApiListener, RelayQueue) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2020-11-26 08:41:56 +0100] information/WorkQueue: #6 (ApiListener, SyncQueue) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2020-11-26 08:41:56 +0100] information/WorkQueue: #7 (IdoPgsqlConnection, ido-pgsql) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2020-11-26 08:41:57 +0100] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2020-11-26 08:41:57 +0100] information/cli: Finished validating the configuration file(s).

So the differences here are in Notifications / Hosts / Services.
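Picking out the differing counts by eye is error-prone with logs this long. A minimal sketch, assuming both icinga2 daemon -C outputs were saved to files and that the relevant lines end in "Instantiated N Type." as above. The function names instantiated_counts and diff_counts are hypothetical.

```shell
#!/bin/sh
# instantiated_counts: hypothetical helper. Reads `icinga2 daemon -C` output
# on stdin and prints "<Type> <count>" per "Instantiated N Type." line.
instantiated_counts() {
  sed -n 's/.*Instantiated \([0-9][0-9]*\) \([A-Za-z]*\)\.$/\2 \1/p' | sort
}

# diff_counts FILE_A FILE_B: print "<Type> <count in A> <count in B>" for
# every type whose counts differ (0 if a type is not reported on one side).
diff_counts() {
  _ta=$(mktemp) && _tb=$(mktemp) || return 1
  instantiated_counts < "$1" > "$_ta"
  instantiated_counts < "$2" > "$_tb"
  # -a 1 -a 2 keeps unpaired types; -e 0 fills the missing count with 0.
  join -a 1 -a 2 -e 0 -o 0,1.2,2.2 "$_ta" "$_tb" | awk '$2 != $3'
  rm -f "$_ta" "$_tb"
}

# Usage, with the two logs saved as master01.log and master02.log:
#   diff_counts master01.log master02.log
```

Types are matched by the exact word in the log, so a singular form on one side (for example "1 EventCommand.") and a plural form on the other would be listed as two separate types.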

Notifications, master01:

icinga2 object list --type notification | grep "* type = \"Notification\"" | wc -l
1147

master02:

icinga2 object list --type notification | grep "* type = \"Notification\"" | wc -l
14807

Hosts, master01:

icinga2 object list --type host | grep "* type" | wc -l
1031

master02:

icinga2 object list --type host | grep "* type" | wc -l
17851

Services, master01:

icinga2 object list --type service | grep "type = \"Service\"" | wc -l
1433

master02:

icinga2 object list --type service | grep "type = \"Service\"" | wc -l
35121

Also, master01 shows this warning:

[2020-11-26 08:39:21 +0100] warning/ApplyRule: Apply rule 'ssh' (in /etc/icinga2/zones.d/global-templates/packs/ssh/services/ssh.conf: 1:0-1:18) for type 'Service' does not match anywhere!

But: master01:

/var/lib/icinga2/api/packages/_api/XXXX-XXX/conf.d/hosts # grep -r "import \"ssh\"" * | wc -l
48

master02:

/var/lib/icinga2/api/packages/_api/YYYY-YYYY/conf.d/hosts # grep -r "import \"ssh\"" * | wc -l
48

What is strange is that master01 does its job correctly: all hosts and services are monitored properly.

Thanks for help.

Al2Klimov commented 4 years ago

So... you’ve figured out the problem's origin?

AurelienFo commented 4 years ago

Not at all. I can see the problem with icinga2 object list, but I don't understand why it happens... In /var/lib/icinga2/api/packages/_api/XXXX-XXX/conf.d/hosts I have the same number of hosts, and the config in /etc/icinga2/zones.d is the same... I'm still searching, but with no success so far.

Al2Klimov commented 4 years ago

Please have a look at the host lists w/o wc -l and compare them. Then you'll know where the other hosts come from.
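One way to do that comparison is to extract the object names on each master and diff the sorted lists. A minimal sketch, assuming the header lines look like Object 'name' of type 'Type': and that both name lists end up on the same machine. The function names object_names and only_in_second and the file names in the usage comment are hypothetical.

```shell
#!/bin/sh
# object_names: hypothetical helper. Reads `icinga2 object list` output on
# stdin and prints one object name per line, sorted and de-duplicated.
object_names() {
  sed -n "s/^Object '\(.*\)' of type '[^']*':\$/\1/p" | sort -u
}

# only_in_second FILE_A FILE_B: print names present in FILE_B but not in
# FILE_A. Both files must be sorted the way object_names sorts them.
only_in_second() {
  # comm -13 hides lines unique to FILE_A and lines common to both.
  comm -13 "$1" "$2"
}

# Usage:
#   on master01: icinga2 object list --type Host | object_names > master01.hosts
#   on master02: icinga2 object list --type Host | object_names > master02.hosts
#   then:        only_in_second master01.hosts master02.hosts
```

Running icinga2 object list --name on one of the extra hosts then shows the "declared in" path it was loaded from.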

AurelienFo commented 4 years ago

So:

On my master02:

icinga2 object list | grep "_api" | wc -l
166117

I can see all my objects that were created through the API:

  % declared in '/var/lib/icinga2/api/packages/_api/XXXX-XXX-XXX-XXX-XXXX/conf.d/hosts/server1.conf', lines 1:0-1:51
    % = modified in '/var/lib/icinga2/api/packages/_api/XXX-XXX-XXX-XXX/conf.d/hosts/server1.conf', lines 5:2-5:24

But on my master01, no:

icinga2 object list | grep "_api"  | wc -l
12
icinga2 object list | grep "_api"
      * value = "$cloudera_api_version$"
Object 'nscp_api' of type 'CheckCommand':
  * __name = "nscp_api"
      * value = "$nscp_api_password$"
      * value = "$nscp_api_host$"
      * value = "$nscp_api_port$"
      * value = "$nscp_api_arguments$"
      * value = "$nscp_api_query$"
  * command = [ "/usr/lib/nagios/plugins/check_nscp_api" ]
  * name = "nscp_api"
  * templates = [ "nscp_api", "plugin-check-command", "ipv4-or-ipv6" ]
    * nscp_api_host = "$check_address$"

So it seems to be due to the API objects. What is strange is that master01 is the "primary" and all my objects (static or API) are monitored. As we can see, on master01 the objects are present on disk:

cd /var/lib/icinga2/api/packages/_api/XXX-XXX-XXX-XXX/conf.d/hosts/
ls | wc -l
16773

And all objects are visible and monitored in Icinga Web. So I don't understand where the problem could be...

Thanks for your answers.

Al2Klimov commented 4 years ago

Please share /etc/icinga2/features-available/api.conf of both nodes.

AurelienFo commented 4 years ago

On master01 and 02, this file is the same:

cat /etc/icinga2/features-available/api.conf
/**
 * The API listener is used for distributed monitoring setups.
 */
object ApiListener "api" {
  accept_config = true
  accept_commands = true
  ticket_salt = TicketSalt

  bind_host = "*"
  bind_port = 5665
}

object ApiUser "icingaweb2" {
  password = "XXXXXXXXXX"

  permissions = [
    {
      permission = "status/query"
    },     {
      permission = "actions/*"
    },     {
      permission = "objects/modify/*"
    },     {
      permission = "objects/query/*"
    }
  ]
}

object ApiUser "admin" {
  password = "XXXXXXXXXX"

  permissions = [
    {
      permission = "*"
    }
  ]
}

...

and 6 other users in the file with the same type of config

Al2Klimov commented 4 years ago

Please try to copy missing folders like /var/lib/icinga2/api/packages/_api/XXXX-XXX-XXX-XXX-XXXX manually.

AurelienFo commented 4 years ago

Hmm, following your last answer, I can see:

On my master02:

cd /var/lib/icinga2/api/packages/_api/XXXXXXXXXXXX
ls -la
drwx------ 5 nagios nagios 4096 Aug  7 09:11 conf.d
-rw-r--r-- 1 nagios nagios  160 Aug  7 09:11 include.conf
drwx------ 2 nagios nagios 4096 Aug  7 09:11 zones.d

There's nothing in zones.d, but include.conf contains:

cat include.conf 
include "../active.conf"
if (ActiveStages["_api"] == "XXXXXXXXXXXXX") {
  include_recursive "conf.d"
  include_zones "_api", "zones.d"
}

But on my master01:

cd /var/lib/icinga2/api/packages/_api/YYYYYYYYYY
ls -la
drwx------ 5 nagios nagios 4096 Aug  7 09:10 conf.d

And there is no include.conf there, so no include_recursive "conf.d".

That should be the reason, don't you think?
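The same check can be scripted for every stage directory of a package. A minimal sketch, assuming the layout shown above (a stage directory containing conf.d and, normally, include.conf). The function name stages_missing_include is hypothetical.

```shell
#!/bin/sh
# stages_missing_include PACKAGE_DIR: hypothetical helper. Prints every stage
# directory under PACKAGE_DIR that has a conf.d directory but no include.conf.
stages_missing_include() {
  for stage in "$1"/*/; do
    [ -d "$stage" ] || continue   # the glob matched nothing
    stage=${stage%/}              # drop the trailing slash
    if [ -d "$stage/conf.d" ] && [ ! -f "$stage/include.conf" ]; then
      printf '%s\n' "$stage"
    fi
  done
}

# Usage (path as used in this thread; run as a user allowed to read it):
#   stages_missing_include /var/lib/icinga2/api/packages/_api
```

An empty result means every stage that has a conf.d directory also has its include.conf.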

Al2Klimov commented 4 years ago

Yes. As I said:

Please try to copy missing folders like /var/lib/icinga2/api/packages/_api/XXXX-XXX-XXX-XXX-XXXX manually.

AurelienFo commented 3 years ago

Sorry for the delay. I've just done it and everything is OK now (the objects are identical on both masters). I also had a problem where every restart on master01 took a long time because it recreated all the API objects; with this fix, that is OK now too. Great! Thanks for the help!