Open mschroeder21 opened 1 year ago
If you run multiple Icinga 2 instances, the zones.conf file (or icinga2 object list --type Endpoint and icinga2 object list --type Zone) from all affected nodes: HA Master + Agents (Details can not be shared publicly)
Can you please share at least some more details on the structure, in particular:
command_endpoint?
Yes, the agents are directly connected to the masters. We don't have any satellites in this environment.
Yes, the check is executed using command_endpoint.
Any news or ideas what's happening here?
This might be unrelated to this issue, but we are seeing issues with notifications being sent when a host should be in downtime after a config change is made. If a host or service problem is acknowledged, or put into scheduled downtime, and then a configuration change is made via Icinga Director, those acks and downtimes are purged - this occurs regardless of what zone the Director change is made in.
E.g. I ack a host problem for host-a in zone-a, create a scheduled downtime for host-b in zone-b, and then push a Director config change for host-c in zone-c: the acks for host-a and the downtime for host-b are removed, but remain in the history.
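One way to confirm the purge described above might be to snapshot the downtime objects before and after a Director deployment and compare the names. The sketch below assumes the two payloads are parsed JSON responses from GET /v1/objects/downtimes; the function names and the sample object names are hypothetical, chosen only for illustration.

```python
from typing import Any, Dict, Set


def downtime_names(payload: Dict[str, Any]) -> Set[str]:
    """Collect object names from a parsed /v1/objects/downtimes response."""
    return {entry["name"] for entry in payload.get("results", [])}


def purged_downtimes(before: Dict[str, Any], after: Dict[str, Any]) -> Set[str]:
    """Return names present before the deployment but missing afterwards."""
    return downtime_names(before) - downtime_names(after)


# Hypothetical payloads, trimmed to the fields used here.
before = {"results": [{"name": "host-b!dt-1", "type": "Downtime"},
                      {"name": "host-d!dt-2", "type": "Downtime"}]}
after = {"results": [{"name": "host-d!dt-2", "type": "Downtime"}]}

print(sorted(purged_downtimes(before, after)))
```

A downtime that simply expired between the two snapshots would also show up as missing, so the snapshots should be taken close to the deployment.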
Additionally, when the Icinga master reloads config after a Director deployment, we are seeing a race condition that causes hosts to send down notifications, and a few minutes later, enter downtime:
Icinga Web 2 version: 2.11.4
Icinga 2 version: r2.13.7-1
Is there a possibility to get feedback on this topic?
My first guess would be that there could be some inconsistency between both masters. While inside the downtime, you can request https://localhost:5665/v1/objects/services/affected-host-name!affected-service-name from both masters and compare what you get. downtime_depth would be of particular interest, as this shows if both masters agree on whether the service is in a downtime.
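The comparison suggested above could be scripted roughly as follows. This is a minimal sketch, not a definitive tool: the function names, the API user and password, and the master URLs in the usage note are assumptions. Only the endpoint path and the downtime_depth attribute come from the thread.

```python
import base64
import json
import ssl
import urllib.parse
import urllib.request
from typing import Any, Dict


def fetch_service(base_url: str, host: str, service: str,
                  user: str, password: str, verify_tls: bool = True) -> Dict[str, Any]:
    """GET /v1/objects/services/<host>!<service> from one master."""
    name = urllib.parse.quote(f"{host}!{service}", safe="")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"{base_url}/v1/objects/services/{name}",
        headers={"Accept": "application/json", "Authorization": f"Basic {token}"},
    )
    context = ssl.create_default_context()
    if not verify_tls:
        # Only for test setups with self-signed certificates.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)


def downtime_depth(payload: Dict[str, Any]) -> int:
    """Extract downtime_depth from a parsed service object response."""
    return int(payload["results"][0]["attrs"]["downtime_depth"])


def masters_agree(first: Dict[str, Any], second: Dict[str, Any]) -> bool:
    """True if both masters report the same downtime_depth."""
    return downtime_depth(first) == downtime_depth(second)
```

With two hypothetical masters, usage might look like `masters_agree(fetch_service("https://master1:5665", host, service, user, password), fetch_service("https://master2:5665", host, service, user, password))`, run while the service is supposed to be in downtime.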
Thanks for your answer. Both masters are in sync (downtime_depth is 1 if a service is in downtime). I have also already cleaned /var/lib/icinga2/api/zones/ several times on the second master to get a fresh sync from the config master.
Something like what @0xliam describes happens in our setup from time to time as well.
The config deployment by the Director was triggered at 18:00.
At 18:01:45, configs were synced between the masters (with the config master ignoring the updates from the secondary master) and into the zones. That was all finished at 18:02:10.
Between 18:02:10 and 18:02:40, multiple downtimes were created, and those successfully suppressed notifications.
At 18:02:41, a whole lot of "Syncing configuration files for xyz to " messages (re)appear in the log without a config deployment being triggered, only for the masters.
The downtime from the screenshot was entered at 18:02:43.
Log lines for the host that was notified during the supposed downtime:
### DOWNTIME ENTERED VIA API ###
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!memory-toplist!13960866-f66f-4804-b527-625b31b85818' for checkable 'xyz-p1-ts2004!memory-toplist'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!memory-toplist!13960866-f66f-4804-b527-625b31b85818' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!memory_free_VD' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!memory_free_VD!bfa5736a-5b2d-4061-8c2b-6ab3290d508c' for checkable 'xyz-p1-ts2004!memory_free_VD'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!memory_free_VD!bfa5736a-5b2d-4061-8c2b-6ab3290d508c' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!pending_updates' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!pending_updates!849fa58a-fbf2-43cf-97c7-cd6cd5c83a5a' for checkable 'xyz-p1-ts2004!pending_updates'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!pending_updates!849fa58a-fbf2-43cf-97c7-cd6cd5c83a5a' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!pending_updates_security-only' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!pending_updates_security-only!8f643388-a8e4-40dc-80f1-c57c97fdbce3' for checkable 'xyz-p1-ts2004!pending_updates_security-only'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!pending_updates_security-only!8f643388-a8e4-40dc-80f1-c57c97fdbce3' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!rdp-x224-status' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!rdp-x224-status!024afb09-1141-47a4-8398-de52c761102d' for checkable 'xyz-p1-ts2004!rdp-x224-status'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!rdp-x224-status!024afb09-1141-47a4-8398-de52c761102d' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!sentinelone-agent-status' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!sentinelone-agent-status!d1fd675e-8976-41cd-855e-14d557cf7cd6' for checkable 'xyz-p1-ts2004!sentinelone-agent-status'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!sentinelone-agent-status!d1fd675e-8976-41cd-855e-14d557cf7cd6' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!sentinelone_application_security!8ed4fb6c-989c-4040-8c72-d0e30f2e73a6' for checkable 'xyz-p1-ts2004!sentinelone_application_security'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!sentinelone_application_security!8ed4fb6c-989c-4040-8c72-d0e30f2e73a6' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!sentinelone_threats' has 2 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!sentinelone_threats!f3f5f9fe-2b39-41d1-8dbc-43601c96fba0' for checkable 'xyz-p1-ts2004!sentinelone_threats'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!sentinelone_threats!f3f5f9fe-2b39-41d1-8dbc-43601c96fba0' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-dcomlaunch' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-dcomlaunch!3d0ef4ab-a08d-4b44-961a-c6b512c13bd8' for checkable 'xyz-p1-ts2004!service-dcomlaunch'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-dcomlaunch!3d0ef4ab-a08d-4b44-961a-c6b512c13bd8' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-eventlog' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-eventlog!06e304df-196f-479f-8f1e-7a577bb46b01' for checkable 'xyz-p1-ts2004!service-eventlog'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-eventlog!06e304df-196f-479f-8f1e-7a577bb46b01' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-frxsvc' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-frxsvc!a321407e-ef41-43fb-a890-16e1d46d6a0f' for checkable 'xyz-p1-ts2004!service-frxsvc'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-frxsvc!a321407e-ef41-43fb-a890-16e1d46d6a0f' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-gpsvc' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-gpsvc!3dd5276b-a26c-43c0-96ea-fd3149409c6a' for checkable 'xyz-p1-ts2004!service-gpsvc'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-gpsvc!3dd5276b-a26c-43c0-96ea-fd3149409c6a' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-lanmanserver' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-lanmanserver!c0edc4d2-5989-4378-8e2f-d2c81d63ba84' for checkable 'xyz-p1-ts2004!service-lanmanserver'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-lanmanserver!c0edc4d2-5989-4378-8e2f-d2c81d63ba84' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-lanmanworkstation' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-lanmanworkstation!ba0dccab-c0a3-48ca-9bca-b4fe80d63b6d' for checkable 'xyz-p1-ts2004!service-lanmanworkstation'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-lanmanworkstation!ba0dccab-c0a3-48ca-9bca-b4fe80d63b6d' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-logprocessorservice' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-logprocessorservice!7f79485e-4410-4780-b444-67d4a1ed4200' for checkable 'xyz-p1-ts2004!service-logprocessorservice'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-logprocessorservice!7f79485e-4410-4780-b444-67d4a1ed4200' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-mpssvc' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-mpssvc!ffaf3b73-dda1-440c-93fb-8296341594a6' for checkable 'xyz-p1-ts2004!service-mpssvc'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-mpssvc!ffaf3b73-dda1-440c-93fb-8296341594a6' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-rdagentbootloader' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-rdagentbootloader!87c04bc6-c807-4bdd-ab85-e07d2878d1dc' for checkable 'xyz-p1-ts2004!service-rdagentbootloader'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-rdagentbootloader!87c04bc6-c807-4bdd-ab85-e07d2878d1dc' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-rpcss' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-rpcss!f14e490e-9a7e-4270-932b-d8fd7df81396' for checkable 'xyz-p1-ts2004!service-rpcss'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-rpcss!f14e490e-9a7e-4270-932b-d8fd7df81396' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-schedule' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-schedule!ca162fec-1260-43de-a524-cc17e2d2d869' for checkable 'xyz-p1-ts2004!service-schedule'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-schedule!ca162fec-1260-43de-a524-cc17e2d2d869' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-sentinelagent' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-sentinelagent!9fd29be4-2d24-41b1-83e0-3bf3063ecad4' for checkable 'xyz-p1-ts2004!service-sentinelagent'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-sentinelagent!9fd29be4-2d24-41b1-83e0-3bf3063ecad4' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-sentinelstaticengine' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-sentinelstaticengine!70b79481-44e0-4e8c-85f0-dd49b5c4de3c' for checkable 'xyz-p1-ts2004!service-sentinelstaticengine'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-sentinelstaticengine!70b79481-44e0-4e8c-85f0-dd49b5c4de3c' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-winmgmt' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-winmgmt!3b77eab9-b3f0-46f6-b5fd-462848b0c4ce' for checkable 'xyz-p1-ts2004!service-winmgmt'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-winmgmt!3b77eab9-b3f0-46f6-b5fd-462848b0c4ce' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-winrm' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-winrm!fc171c9b-2109-401c-a196-51844220b4f8' for checkable 'xyz-p1-ts2004!service-winrm'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-winrm!fc171c9b-2109-401c-a196-51844220b4f8' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!software_inventory!3334eae7-e6a7-4186-8345-890e5a069cc5' for checkable 'xyz-p1-ts2004!software_inventory'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!software_inventory!3334eae7-e6a7-4186-8345-890e5a069cc5' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!userprofile-containers' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!userprofile-containers!422d1ce1-620f-45fd-b868-b996867d2610' for checkable 'xyz-p1-ts2004!userprofile-containers'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!userprofile-containers!422d1ce1-620f-45fd-b868-b996867d2610' of type 'Downtime'.
### NOTIFICATION WAS SENT ###
[2024-08-07 18:08:23 +0200] information/Checkable: Checkable 'xyz-p1-ts2004' has 1 notification(s). Checking filters for type 'Problem', sends will be logged.
[2024-08-07 18:08:25 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!90d985c3-0bc3-40f4-8cfc-00fbb741fdbc' of type 'Comment'.
[2024-08-07 18:08:25 +0200] information/Checkable: Acknowledgement set for checkable 'xyz-p1-ts2004'.
### DOWNTIME WAS STARTED WITH A DELAY of ~ 50 minutes ###
[2024-08-07 18:51:11 +0200] information/Checkable: Checkable 'xyz-p1-ts2004' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:51:11 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!ef6293dc-916f-4db8-8e4b-aff0c37744ef' for checkable 'xyz-p1-ts2004'.
[2024-08-07 18:51:11 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!ef6293dc-916f-4db8-8e4b-aff0c37744ef' of type 'Downtime'.
[2024-08-07 18:51:11 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!cpu!b98242bc-a9ec-4561-ac64-3292c0779221' for checkable 'xyz-p1-ts2004!cpu'.
[2024-08-07 18:51:11 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!cpu!b98242bc-a9ec-4561-ac64-3292c0779221' of type 'Downtime'.
[2024-08-07 18:51:11 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!cpu-toplist!9efb9f31-2a70-4b6f-a454-92ba04977230' for checkable 'xyz-p1-ts2004!cpu-toplist'.
[2024-08-07 18:51:11 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!cpu-toplist!9efb9f31-2a70-4b6f-a454-92ba04977230' of type 'Downtime'.
[2024-08-07 18:51:11 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!disk' has 2 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:51:11 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!disk!1619bb9b-96ed-432b-89f5-29774260bfd0' for checkable 'xyz-p1-ts2004!disk'.
[2024-08-07 18:51:11 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!disk!1619bb9b-96ed-432b-89f5-29774260bfd0' of type 'Downtime'.
[2024-08-07 18:51:11 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!icinga-agent-parent-service!e646196c-02dd-4ca6-a76c-28c87e3aa1cc' for checkable 'xyz-p1-ts2004!icinga-agent-parent-service'.
[2024-08-07 18:51:11 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!icinga-agent-parent-service!e646196c-02dd-4ca6-a76c-28c87e3aa1cc' of type 'Downtime'.
[2024-08-07 18:51:11 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!icinga-agent-version!bd76f856-586e-47ab-9d9e-880893090628' for checkable 'xyz-p1-ts2004!icinga-agent-version'.
[2024-08-07 18:51:11 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!icinga-agent-version!bd76f856-586e-47ab-9d9e-880893090628' of type 'Downtime'.
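To measure the gap between the Problem notification and the host downtime actually being triggered, a log excerpt like the one above could be parsed. The following is a rough sketch: the regular expressions are derived only from the line shapes shown in this thread, and the helper name find_first is hypothetical.

```python
import re
from datetime import datetime
from typing import Iterable, Optional, Pattern

LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] \w+/\w+: (?P<message>.*)$")
NOTIFY_RE = re.compile(
    r"Checkable '(?P<checkable>[^']+)' has \d+ notification\(s\)\. "
    r"Checking filters for type '(?P<type>\w+)'"
)
TRIGGER_RE = re.compile(
    r"Triggering downtime '[^']+' for checkable '(?P<checkable>[^']+)'"
)


def find_first(lines: Iterable[str], pattern: Pattern[str], checkable: str,
               notif_type: Optional[str] = None) -> Optional[datetime]:
    """Timestamp of the first line matching pattern for exactly this checkable."""
    for line in lines:
        parsed = LINE_RE.match(line)
        if not parsed:
            continue
        hit = pattern.search(parsed.group("message"))
        if not hit or hit.group("checkable") != checkable:
            continue
        if notif_type is not None and hit.group("type") != notif_type:
            continue
        return datetime.strptime(parsed.group("ts"), "%Y-%m-%d %H:%M:%S %z")
    return None


# Two lines copied from the excerpt above.
log = [
    "[2024-08-07 18:08:23 +0200] information/Checkable: Checkable 'xyz-p1-ts2004' has 1 notification(s). Checking filters for type 'Problem', sends will be logged.",
    "[2024-08-07 18:51:11 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!ef6293dc-916f-4db8-8e4b-aff0c37744ef' for checkable 'xyz-p1-ts2004'.",
]

problem_at = find_first(log, NOTIFY_RE, "xyz-p1-ts2004", notif_type="Problem")
triggered_at = find_first(log, TRIGGER_RE, "xyz-p1-ts2004")
delay = triggered_at - problem_at
print(delay.total_seconds())  # 2568.0 seconds, i.e. 42 min 48 s
```

Matching the checkable name exactly keeps service entries such as 'xyz-p1-ts2004!cpu' out of the host-level result.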
More occurrences of this. "Light mode" is a downtime set via API on host shutdown. "Dark mode" is a scheduled downtime.
Debug logs can be provided if helpful!
Describe the bug
Notifications are not suppressed during (scheduled) Downtimes.
To Reproduce
Expected behavior
Notification will be suppressed.
Screenshots
Your Environment
Include as many relevant details about the environment you experienced the problem in
icinga2 --version: r2.13.6-1
icinga2 feature list: api checker ido-mysql influxdb mainlog notification
icinga2 daemon -C: completes without errors.
If you run multiple Icinga 2 instances, the zones.conf file (or icinga2 object list --type Endpoint and icinga2 object list --type Zone) from all affected nodes: HA Master + Agents (Details can not be shared publicly)
Additional context
Maybe helpful: max_check_attempts is set to 1 for this check.