Icinga / icinga2

The core of our monitoring platform with a powerful configuration language and REST API.
https://icinga.com/docs/icinga2/latest
GNU General Public License v2.0

Config sync still works after changing the zone name to one that does not exist on the agent #9684

Closed willfurnell closed 1 year ago

willfurnell commented 1 year ago

Describe the bug

If the zone name is changed on the agent, configuration is still synced without issues from the master.

I have a test master-master setup with a couple of agents. I originally set everything up with the zone name icinga2-ha-zone and I am testing what happens when I change this. I have renamed the zone to tier1-zone, and changed references to this on the masters, in the zones.d folder and on one of the agents.

On one of the agents, I left the zones.conf as-is to see what would happen. For some reason, it kept working fine. I then changed the zone name in the zones.conf on this agent to a random string fhfodfbhdjdjk, and it still kept working!

To Reproduce

Have the following configuration on the nodes.

zones.conf on the config master:

object Endpoint "icinga-test-server1.example.ac.uk" {
}
object Endpoint "icinga-test-server2.example.ac.uk" {
        host = "icinga-test-server2.example.ac.uk"
        port = "5665"
}
object Zone "tier1-zone" {
        endpoints = [ "icinga-test-server1.example.ac.uk", "icinga-test-server2.example.ac.uk" ]
}
object Zone "global-templates" {
        global = true
}
object Zone "director-global" {
        global = true
}

agents.conf in zones.d/zone-name/ on the config master:

object Endpoint "icinga-test-agent1.example.ac.uk" {
  host = "icinga-test-agent1.example.ac.uk" // The master actively tries to connect to the agent
  log_duration = 0 // Disable the replay log for command endpoint agents
}
object Zone "icinga-test-agent1.example.ac.uk" {
  endpoints = ["icinga-test-agent1.example.ac.uk"]
  parent = "tier1-zone"
}
object Endpoint "icinga-test-agent2.example.ac.uk" {
  host = "icinga-test-agent2.example.ac.uk" // The master actively tries to connect to the agent
  log_duration = 0 // Disable the replay log for command endpoint agents
}
object Zone "icinga-test-agent2.example.ac.uk" {
  endpoints = ["icinga-test-agent2.example.ac.uk"]
  parent = "tier1-zone"
}

zones.conf on the agent that still works after changing the zone name to a random string:

object Endpoint "icinga-test-server1.example.ac.uk" {
}

object Endpoint "icinga-test-server2.example.ac.uk" {
}

object Zone "fhfodfbhdjdjk" {
        endpoints = [ "icinga-test-server1.example.ac.uk", "icinga-test-server2.example.ac.uk" ]
}
object Endpoint "icinga-test-agent2.example.ac.uk" {
}
object Zone "icinga-test-agent2.example.ac.uk" {
        endpoints = [ "icinga-test-agent2.example.ac.uk" ]
        parent = "fhfodfbhdjdjk"
}
object Zone "global-templates" {
        global = true
}
object Zone "director-global" {
        global = true
}

Check the logs on the agent - it will show that it is working fine and configuration is somehow being synced with a zone that does not exist:

[2023-02-09 09:24:12 +0000] information/ApiListener: Sending config updates for endpoint 'icinga-test-server1.example.ac.uk' in zone 'fhfodfbhdjdjk'.
[2023-02-09 09:24:12 +0000] information/ApiListener: Finished sending config file updates for endpoint 'icinga-test-server1.example.ac.uk' in zone 'fhfodfbhdjdjk'.
[2023-02-09 09:24:12 +0000] information/ApiListener: Syncing runtime objects to endpoint 'icinga-test-server1.example.ac.uk'.
[2023-02-09 09:24:12 +0000] information/ApiListener: Applying config update from endpoint 'icinga-test-server1.example.ac.uk' of zone 'fhfodfbhdjdjk'.
[2023-02-09 09:24:12 +0000] information/ApiListener: Finished syncing runtime objects to endpoint 'icinga-test-server1.example.ac.uk'.
[2023-02-09 09:24:12 +0000] information/ApiListener: Finished sending runtime config updates for endpoint 'icinga-test-server1.example.ac.uk' in zone 'fhfodfbhdjdjk'.
[2023-02-09 09:24:12 +0000] information/ApiListener: Sending replay log for endpoint 'icinga-test-server1.example.ac.uk' in zone 'fhfodfbhdjdjk'.
[2023-02-09 09:24:12 +0000] information/ApiListener: Received configuration updates (0) from endpoint 'icinga-test-server1.example.ac.uk' are equal to production, skipping validation and reload.
[2023-02-09 09:24:12 +0000] information/ApiListener: Finished sending replay log for endpoint 'icinga-test-server1.example.ac.uk in zone 'fhfodfbhdjdjk'.
[2023-02-09 09:24:12 +0000] information/ApiListener: Finished syncing endpoint 'icinga-test-server1.example.ac.uk' in zone 'fhfodfbhdjdjk'.

Expected behavior

Changing the zone name to one that does not exist on the agent breaks configuration syncing - or a message is shown in the logs to note that it doesn't matter if the zone name on the agent is changed.

Al2Klimov commented 1 year ago

Hello Will!

Is your agent a pure command endpoint (which doesn’t schedule checks by itself)?

willfurnell commented 1 year ago

Yes that is the case - checks are scheduled by the master/secondary

Al2Klimov commented 1 year ago

Then the only thing that matters is the commands in the global zones. The remote zone that doesn’t exist locally doesn’t matter because it has no checkables, and the local zone that doesn’t exist remotely just enables trusting the parent node.

willfurnell commented 1 year ago

Ah, so if I'm doing top-down config sync, I only actually need the global zones on the agents (and the agent's own zone)?

Al2Klimov commented 1 year ago

... and the parent one for trusting the parent node in case of a pure command endpoint.
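A minimal sketch of an agent-side zones.conf for that case, reusing the host names and zone names from the configuration earlier in this thread. Whether a given setup needs both global zones is an assumption carried over from that configuration, and the comments only restate the roles described in this discussion.

```icinga2
// zones.conf on a pure command endpoint agent (sketch, not a definitive layout)

// Parent endpoints: the nodes that send command execution requests.
object Endpoint "icinga-test-server1.example.ac.uk" {
}
object Endpoint "icinga-test-server2.example.ac.uk" {
}

// Parent zone: present so that the agent trusts the endpoints listed in it.
object Zone "tier1-zone" {
        endpoints = [ "icinga-test-server1.example.ac.uk", "icinga-test-server2.example.ac.uk" ]
}

// The agent's own endpoint and zone, pointing at the parent zone.
object Endpoint "icinga-test-agent2.example.ac.uk" {
}
object Zone "icinga-test-agent2.example.ac.uk" {
        endpoints = [ "icinga-test-agent2.example.ac.uk" ]
        parent = "tier1-zone"
}

// Global zones: these carry the synced commands that the agent executes.
object Zone "global-templates" {
        global = true
}
object Zone "director-global" {
        global = true
}
```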

willfurnell commented 1 year ago

But then I don't understand, @Al2Klimov. As the example shows, I changed the name of the parent zone to something random and it still works?

object Zone "fhfodfbhdjdjk" {
        endpoints = [ "icinga-test-server1.example.ac.uk", "icinga-test-server2.example.ac.uk" ]
}
object Endpoint "icinga-test-agent2.example.ac.uk" {
}
object Zone "icinga-test-agent2.example.ac.uk" {
        endpoints = [ "icinga-test-agent2.example.ac.uk" ]
        parent = "fhfodfbhdjdjk"
}

whereas on the server the parent zone is actually named "icinga2-ha-zone".

Al2Klimov commented 1 year ago

PoV: you're a pure agent.

  1. Oh, a new connection. Who's there?
  2. Identified peer as "master1". Who's this again... ?
  3. A member of zone "a4e45008aa463becdc3503c57d226dce", ok...
  4. Oh, that's our parent zone!
  5. So we trust "master1" and accept command execution requests.
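The steps above can be mapped onto the agent's zones.conf roughly as follows. This is an annotated sketch of the lookup as described in this thread, not a statement about Icinga 2 internals. The zone name "any-local-name" is a hypothetical placeholder chosen for illustration.

```icinga2
// Step 2: the peer's identity is matched against a locally defined Endpoint.
object Endpoint "icinga-test-server1.example.ac.uk" {
}

// Step 3: the peer's zone is whichever local Zone lists that endpoint.
// The zone name is only a local label (hypothetical here); per this thread
// it does not have to match the name used on the master.
object Zone "any-local-name" {
        endpoints = [ "icinga-test-server1.example.ac.uk" ]
}

object Endpoint "icinga-test-agent2.example.ac.uk" {
}

// Step 4: the agent's own zone names that same Zone as its parent,
// so the peer is a member of the parent zone and is trusted (step 5).
object Zone "icinga-test-agent2.example.ac.uk" {
        endpoints = [ "icinga-test-agent2.example.ac.uk" ]
        parent = "any-local-name"
}
```

In this sketch the link between agent and master is the endpoint name listed in the zone, while the zone name itself is only referenced from the agent's own `parent` attribute.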
willfurnell commented 1 year ago

Thank you, but:

  1. How does the agent know what its parent zone is if that's not specified in the zones.conf?
  2. How does it decide what its parent zone is in step 4, please?
  3. Or does the name not matter, as long as the master servers are specified there?
  4. The parent zone specified in the client config does not need to match the zone name specified in the configuration on the master, as long as the master endpoints are specified like this?

        endpoints = [ "icinga-test-server1.example.ac.uk", "icinga-test-server2.example.ac.uk" ]

Sorry for all the questions...

Al2Klimov commented 1 year ago

  1. It doesn’t.
  2. According to zones.conf.
  3. Yes.
  4. Yes.

willfurnell commented 1 year ago

Amazing, thank you!