Closed jonaschl closed 4 years ago
What does the master's debug log show about writing the replayed check results to the IDO database backend?
Hi,
Thanks for your answer. I am not sure which lines are useful for debugging, so I attached the server's debug log from 10:00 am to 11:00 am: debug-server.log. Node and server use the same NTP server, so their clocks are in sync.
Hmm, can you give me a hint where
object Service "Network-Status-2" {
  import "generic-service"
  check_command = "ping"
  host_name = "sirius.mittelerde.local"
  vars.ping_address = "192.168.141.2"
}
is physically located on disk? Running
icinga2 object list --type Service --name *Network-Status*
on the master is sufficient.
Hi,
here is the output of the command you asked for:
Object 'sirius.mittelerde.local!Network-Status-2' of type 'Service':
% declared in '/etc/icinga2/zones.d/sirius.mittelerde.local/services.conf', lines 77:1-77:33
* __name = "sirius.mittelerde.local!Network-Status-2"
* action_url = ""
* check_command = "ping"
% = modified in '/etc/icinga2/zones.d/sirius.mittelerde.local/services.conf', lines 79:1-79:22
* check_interval = 60
% = modified in '/etc/icinga2/zones.d/global-templates/templates.conf', lines 28:3-28:21
* check_period = ""
* check_timeout = null
* command_endpoint = ""
* display_name = "Network-Status-2"
* enable_active_checks = true
* enable_event_handler = true
* enable_flapping = false
* enable_notifications = true
* enable_passive_checks = true
* enable_perfdata = true
* event_command = ""
* flapping_threshold = 0
* flapping_threshold_high = 30
* flapping_threshold_low = 25
* groups = [ ]
* host_name = "sirius.mittelerde.local"
% = modified in '/etc/icinga2/zones.d/sirius.mittelerde.local/services.conf', lines 80:1-80:37
* icon_image = ""
* icon_image_alt = ""
* max_check_attempts = 5
% = modified in '/etc/icinga2/zones.d/global-templates/templates.conf', lines 27:3-27:24
* name = "Network-Status-2"
* notes = ""
* notes_url = ""
* package = "_etc"
* retry_interval = 30
% = modified in '/etc/icinga2/zones.d/global-templates/templates.conf', lines 29:3-29:22
* source_location
* first_column = 1
* first_line = 77
* last_column = 33
* last_line = 77
* path = "/etc/icinga2/zones.d/sirius.mittelerde.local/services.conf"
* templates = [ "Network-Status-2", "generic-service" ]
% = modified in '/etc/icinga2/zones.d/sirius.mittelerde.local/services.conf', lines 77:1-77:33
% = modified in '/etc/icinga2/zones.d/global-templates/templates.conf', lines 26:1-26:34
* type = "Service"
* vars
* ping_address = "192.168.141.2"
% = modified in '/etc/icinga2/zones.d/sirius.mittelerde.local/services.conf', lines 81:1-81:35
* volatile = false
* zone = "sirius.mittelerde.local"
Object 'sirius.mittelerde.local!Network-Status' of type 'Service':
% declared in '/etc/icinga2/zones.d/sirius.mittelerde.local/services.conf', lines 67:1-67:31
* __name = "sirius.mittelerde.local!Network-Status"
* action_url = ""
* check_command = "ping-windows"
% = modified in '/etc/icinga2/zones.d/sirius.mittelerde.local/services.conf', lines 69:1-69:30
* check_interval = 60
% = modified in '/etc/icinga2/zones.d/global-templates/templates.conf', lines 28:3-28:21
* check_period = ""
* check_timeout = null
* command_endpoint = ""
* display_name = "Network-Status"
* enable_active_checks = true
* enable_event_handler = true
* enable_flapping = false
* enable_notifications = true
* enable_passive_checks = true
* enable_perfdata = true
* event_command = ""
* flapping_threshold = 0
* flapping_threshold_high = 30
* flapping_threshold_low = 25
* groups = [ ]
* host_name = "sirius.mittelerde.local"
% = modified in '/etc/icinga2/zones.d/sirius.mittelerde.local/services.conf', lines 70:1-70:37
* icon_image = ""
* icon_image_alt = ""
* max_check_attempts = 5
% = modified in '/etc/icinga2/zones.d/global-templates/templates.conf', lines 27:3-27:24
* name = "Network-Status"
* notes = ""
* notes_url = ""
* package = "_etc"
* retry_interval = 30
% = modified in '/etc/icinga2/zones.d/global-templates/templates.conf', lines 29:3-29:22
* source_location
* first_column = 1
* first_line = 67
* last_column = 31
* last_line = 67
* path = "/etc/icinga2/zones.d/sirius.mittelerde.local/services.conf"
* templates = [ "Network-Status", "generic-service" ]
% = modified in '/etc/icinga2/zones.d/sirius.mittelerde.local/services.conf', lines 67:1-67:31
% = modified in '/etc/icinga2/zones.d/global-templates/templates.conf', lines 26:1-26:34
* type = "Service"
* vars
* max_check_attempts = 1
% = modified in '/etc/icinga2/zones.d/sirius.mittelerde.local/services.conf', lines 72:1-72:27
* ping_win_address = "192.168.141.4"
% = modified in '/etc/icinga2/zones.d/sirius.mittelerde.local/services.conf', lines 71:1-71:39
* ping_win_crit = [ 1000, 100 ]
% = modified in '/etc/icinga2/zones.d/sirius.mittelerde.local/services.conf', lines 73:1-73:34
* volatile = false
* zone = "sirius.mittelerde.local"
Hm, I am out of ideas here. Maybe the check results are considered too old and are dropped for that reason when the replay log is processed. Or you have hit a replay log bug which will be fixed in 2.11. It would be worth testing the snapshot packages: https://icinga.com/docs/icinga2/snapshot/doc/21-development/#snapshot-packages-nightly-builds
Either 2.11 fixed this already, or the new IcingaDB backend will do so.
Expected Behavior
When a client node loses network connectivity and the replay log feature is enabled, state changes (from OK to CRITICAL, for example) should appear in the history tab of the service in Icinga Web 2 once the connection is restored.
Current Behavior
The replay log seems to work in my environment: the file C:\ProgramData\icinga2\var\lib\icinga2\api\log\current gets filled with messages (see the attached file current.txt) and is empty after the connection is restored. The log on the Windows 10 Education client also states that messages have been replayed.
debug-node.log
State changes that definitely happen (some services checked from this node depend on the network) do not appear in the history tab of Icinga Web 2.
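To see which messages the replay log file mentioned above actually holds before it is emptied, a small script can list them. This is only a minimal sketch: it assumes the file is a sequence of netstrings ("<length>:<payload>,"), each wrapping a JSON object with a "timestamp" field and a "message" field that is itself a JSON-encoded cluster message. The function names `iter_netstrings` and `summarize_replay_log` are hypothetical helpers, not part of Icinga 2.

```python
import json
from typing import Iterator, List, Tuple


def iter_netstrings(data: bytes) -> Iterator[bytes]:
    """Yield the payload of every netstring ("<length>:<payload>,") in data."""
    pos = 0
    while pos < len(data):
        colon = data.index(b":", pos)      # end of the decimal length prefix
        length = int(data[pos:colon])      # payload length in bytes
        start = colon + 1
        end = start + length
        if data[end:end + 1] != b",":      # every netstring ends with a comma
            raise ValueError(f"malformed netstring at offset {pos}")
        yield data[start:end]
        pos = end + 1


def summarize_replay_log(data: bytes) -> List[Tuple[float, str]]:
    """Return (timestamp, method) for each entry of a replay log dump.

    Assumption: each entry is {"timestamp": ..., "message": "<JSON string>"}.
    """
    summary = []
    for payload in iter_netstrings(data):
        entry = json.loads(payload)
        inner = json.loads(entry["message"])  # assumed to be a nested JSON string
        summary.append((entry["timestamp"], inner.get("method", "?")))
    return summary
```

Reading a copy of the current file in binary mode and passing its bytes to `summarize_replay_log` gives one (timestamp, method) pair per stored message. Comparing those timestamps with the state history shown for the service may help to tell whether the check results were never recorded or were dropped after the replay.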
Steps to Reproduce (for bugs)
Context
I have a couple of machines where I cannot guarantee a network connection, so I thought about using the replay log feature to see state changes at least once the connection comes up again. This does not seem to work.
Your Environment
icinga2 --version:
Server: icinga2 - The Icinga 2 network monitoring daemon (version: r2.10.2-1)
Node: icinga2.exe - The Icinga 2 network monitoring daemon (version: v2.10.2)
icinga2 feature list:
Server:
Disabled features: compatlog elasticsearch gelf graphite influxdb livestatus opentsdb perfdata statusdata syslog
Enabled features: api checker command debuglog ido-mysql mainlog notification
Node:
Disabled features: command compatlog elasticsearch gelf graphite ido-mysql ido-pgsql influxdb livestatus notification opentsdb perfdata statusdata
Enabled features: api checker debuglog mainlog
Icinga Web 2 version and modules (System - About): Icinga Web 2 Version 2.6.2 Modules: doc, monitoring
Config validation (icinga2 daemon -C): Server (Config Sync is enabled)
If you run multiple Icinga 2 instances, the zones.conf file (or icinga2 object list --type Endpoint and icinga2 object list --type Zone) from all affected nodes.
Zones:
Thanks for your support.
Jonatan