Closed icinga-migration closed 7 years ago
Updated by mfriedrich on 2015-09-07 15:32:42 +00:00
From a quick read, the feature request is to delay recovery notifications for some reason. I'm not really sure I get the problem itself, how would a soft recovery requiring additional steps in SOFT-OK then result in a HARD-OK triggering the recovery notification? That sounds pretty weird to me.
Probably you should come up with some drawing boards to illustrate the timing and intervals including all involved configuration attributes influencing the state machine.
Note: I would consider this for Icinga 2 only. We won't implement such (breaking) changes in 1.x.
Updated by leo9641 on 2015-10-06 11:11:54 +00:00
dnsmichi wrote:
From a quick read, the feature request is to delay recovery notifications for some reason. I'm not really sure I get the problem itself, how would a soft recovery requiring additional steps in SOFT-OK then result in a HARD-OK triggering the recovery notification? That sounds pretty weird to me.
Probably you should come up with some drawing boards to illustrate the timing and intervals including all involved configuration attributes influencing the state machine.
Note: I would consider this for Icinga 2 only. We won't implement such (breaking) changes in 1.x.
I wrote a simple wrapper for this feature. It applies only to the gw-host check and sets the gw-host to UP only after maxhostattempts successful retries in a row: https://gist.github.com/lvasiliev/6c847511e53509c8db51
The transition to UP has to be slowed down for gw hosts because child hosts depend on them (parents -> gw-host).
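The idea behind such a wrapper can be sketched roughly as follows. This is only an illustration of the "N successful checks in a row before reporting UP" logic, not the code from the gist linked above. The class name RecoveryDamper, its fields, and the required parameter are hypothetical; the exit codes 0 and 2 are assumed to map to UP and DOWN for a host check.

```python
from dataclasses import dataclass

OK = 0        # plugin exit code assumed to mean UP
CRITICAL = 2  # plugin exit code assumed to mean DOWN


@dataclass
class RecoveryDamper:
    """Hypothetical helper: after a failure, keep reporting DOWN until
    `required` consecutive OK results have been seen."""
    required: int = 5
    streak: int = 0           # consecutive OK results since the last failure
    recovering: bool = False  # True while a recovery is still unconfirmed

    def filter(self, raw_code: int) -> int:
        """Take the raw exit code of the real check, return the code to report."""
        if raw_code != OK:
            # Any failure resets the streak and starts a new recovery phase.
            self.streak = 0
            self.recovering = True
            return raw_code
        self.streak += 1
        if self.recovering and self.streak < self.required:
            # Recovery not yet confirmed: hold the host in DOWN.
            return CRITICAL
        self.recovering = False
        return OK


damper = RecoveryDamper(required=3)
raw_results = [0, 2, 0, 0, 2, 0, 0, 0, 0]
reported = [damper.filter(code) for code in raw_results]
print(reported)  # [0, 2, 2, 2, 2, 2, 2, 0, 0]
```

In a real wrapper each check runs as a separate process, so the streak would have to be persisted between runs (for example in a small state file per host); the sketch keeps it in memory for brevity.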
Template for gw-host (only timing):
define host{
        name                    generic-router-extadm
        check_interval          2       ; Switches are checked every 2 minutes
        retry_interval          1       ; Schedule host check retries at 1 minute intervals
        max_check_attempts      5       ; Check each switch 5 times (max)
        check_command           check-gw-alive-extadm   ; Default command to check if routers are "alive"
        }
Template for hosts (only timing):
define host{
        name                    freebsd-server-extadm   ; The name of this host template
        check_interval          4       ; Actively check the host every 4 minutes
        retry_interval          1       ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10      ; Check each FreeBSD host 10 times (max)
        }
I want the transition from DOWN to UP to be slower for gw hosts (for cases where the network is unstable or packets are lost). During this period the child hosts are in the UNREACHABLE state and don't send notifications. Sometimes a gw-host goes from DOWN to UP quickly, but the checks of its child hosts still return non-OK states (WARNING, CRITICAL), and after max_check_attempts the child host goes into the DOWN state. Then the gw-host goes DOWN again...
I use the option soft_state_dependencies=1.
Updated by mfriedrich on 2015-10-26 08:22:14 +00:00
Ok. If someone comes up with a patch which does not break the existing behaviour we might have a look into it.
Updated by mfriedrich on 2015-10-26 08:22:21 +00:00
This issue has been migrated from Redmine: https://dev.icinga.com/issues/10114
Created by leo9641 on 2015-09-07 15:10:23 +00:00
Assignee: (none) Status: New Target Version: Backlog Last Update: 2015-10-26 08:22:20 +00:00 (in Redmine)
Hi icinga team!
Please consider this request too:
https://github.com/NagiosEnterprises/nagioscore/issues/46