Closed jschanz closed 5 years ago
I think it has something to do with name resolution. If no entry is set in /etc/hosts, getaddrinfo fails without network. If an entry is set in /etc/hosts, the FQDN is resolved, even without network. Maybe it only needs a documentation update saying to set an entry in /etc/hosts, which I could also do later.
I also get messages like these:
2018-11-08T03:05:51.734071+01:00 icinga-01 icinga2[694]: [2018-11-08 03:05:51 +0100] critical/TcpSocket: getaddrinfo() failed with error code -2, "Name or service not known"
2018-11-08T03:05:51.747005+01:00 icinga-01 icinga2[694]: [2018-11-08 03:05:51 +0100] critical/TcpSocket: getaddrinfo() failed with error code -2, "Name or service not known"
I looked this up yesterday: at startup Icinga calls getaddrinfo to get the FQDN; if that fails it falls back to the hostname, and if that also fails it uses 'localhost'.
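The fallback chain described above can be sketched in a few lines. This is a Python illustration only, not Icinga 2's actual C++ code; the function name resolve_fqdn, the injectable parameters, and the exact order of the calls are assumptions made for the example.

```python
import socket


def resolve_fqdn(gethostname=socket.gethostname, getaddrinfo=socket.getaddrinfo):
    """Return the FQDN if resolvable, else the short hostname, else 'localhost'.

    Hypothetical helper: the two callables are injectable only so the
    fallback behaviour can be exercised without touching the real resolver.
    """
    try:
        hostname = gethostname()
    except OSError:
        return "localhost"
    if not hostname:
        return "localhost"

    try:
        # AI_CANONNAME asks the resolver for the canonical (fully qualified) name.
        results = getaddrinfo(hostname, None, flags=socket.AI_CANONNAME)
    except socket.gaierror:
        # Corresponds to the "Name or service not known" case from the log above.
        return hostname

    for _family, _type, _proto, canonname, _sockaddr in results:
        if canonname:
            return canonname
    return hostname


if __name__ == "__main__":
    print(resolve_fqdn())
```

On a host without network and without a matching /etc/hosts entry, a sketch like this would return only the short hostname, which matches the situation where certificates named after the FQDN cannot be found.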
I don't think there is anything we can do about this either, except document it :woman_shrugging:
Just to be sure, @jschanz can you show the content of the systemd icinga2.service unit?
It should contain After=... network-online.target ..., which should be enough. If it is not enough, as in your case, ensure that the wait daemon corresponding to the network-managing daemon is enabled (systemctl is-enabled NetworkManager-wait-online.service systemd-networkd-wait-online.service). If that is still not enough, I would say it is a problem with that daemon rather than with Icinga 2.
Have a look for further details at https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/
@dgoetz
[Unit]
Description=Icinga host/service/network monitoring system
After=syslog.target network-online.target postgresql.service mariadb.service carbon-cache.service carbon-relay.service
[Service]
Type=notify
EnvironmentFile=/etc/sysconfig/icinga2
ExecStartPre=/usr/lib/icinga2/prepare-dirs /etc/sysconfig/icinga2
ExecStart=/usr/sbin/icinga2 daemon -e /var/log/icinga2/error.log
PIDFile=/var/run/icinga2/icinga2.pid
ExecReload=/usr/lib/icinga2/safe-reload /etc/sysconfig/icinga2
TimeoutStartSec=30m
# Systemd >228 enforces a lower process number for services.
# Depending on the distribution and Systemd version, this must
# be explicitly raised. Packages will set the needed values
# into /etc/systemd/system/icinga2.service.d/limits.conf
#
# Please check the troubleshooting documentation for further details.
# The values below can be used as examples for customized service files.
#TasksMax=infinity
#LimitNPROC=62883
[Install]
WantedBy=multi-user.target
Target "Network" is reached only after icinga2 has already tried to start:
2018-11-07T17:08:46.507845+01:00 icinga-01 systemd[1]: Reached target Network.
but
2018-11-07T17:08:45.269265+01:00 icinga-01 systemd[1]: Failed to start Icinga host/service/network monitoring system.
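To make the ordering concrete, the two journal timestamps above can be compared directly. A small Python sketch (the variable names are made up for illustration):

```python
from datetime import datetime

# Timestamps copied from the two journal lines above.
icinga_failed = datetime.fromisoformat("2018-11-07T17:08:45.269265+01:00")
network_reached = datetime.fromisoformat("2018-11-07T17:08:46.507845+01:00")

# Positive gap means icinga2 gave up before the Network target was reached.
gap = (network_reached - icinga_failed).total_seconds()
print(f"icinga2 failed {gap:.3f} s before the Network target was reached")
```

So in this boot the service failed a bit more than one second before systemd reported the network target.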
I can reproduce this now ... Please unplug the network cable and try the following /etc/hosts:
#
# hosts This file describes a number of hostname-to-address
# mappings for the TCP/IP subsystem. It is mostly
# used at boot time, when no name servers are running.
# On small systems, this file can be used instead of a
# "named" name server.
# Syntax:
#
# IP-Address Full-Qualified-Hostname Short-Hostname
#
127.0.0.1 localhost.localdomain localhost
So no address resolution (local, DNS, etc.) is possible. Icinga2 is unable to determine the FQDN with getaddrinfo, fails while looking up the certs in /var/lib/icinga2/certs/, and won't start because of that.
Tested on SLES and openSUSE. Needs more testing in other environments.
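A quick way to see whether a given hosts file can supply the FQDN without any network is to parse it. The following Python sketch is only an illustration: the helper name fqdn_from_hosts is made up, and the 192.0.2.10 entry is a hypothetical example using a documentation address, not a value taken from the affected system.

```python
def fqdn_from_hosts(hosts_text, hostname):
    """Return the canonical name for hostname from hosts-file text, or None.

    Hypothetical helper: a line counts as a match if hostname appears among
    its names and the first name on the line looks fully qualified.
    """
    for line in hosts_text.splitlines():
        # Drop comments, then split into "address name [alias ...]".
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        names = fields[1:]
        if hostname in names and "." in names[0]:
            return names[0]
    return None


# The failing case: only the localhost line from the file above.
failing = "127.0.0.1 localhost.localdomain localhost\n"
print(fqdn_from_hosts(failing, "icinga-01"))

# Hypothetical fix: an extra entry for the host itself (address is an assumption).
fixed = failing + "192.0.2.10 icinga-01.localdomain.local icinga-01\n"
print(fqdn_from_hosts(fixed, "icinga-01"))
```

With only the localhost line there is nothing to map icinga-01 to its FQDN, while the extra entry provides it locally.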
Remove entry with
I tried to reproduce this on CentOS 7. With NetworkManager.service and NetworkManager-wait-online.service enabled, Icinga 2 is always started after networking. Enabling the old network.service and disabling NetworkManager.service and NetworkManager-wait-online.service gave me the same problem. Disabling network.service and enabling only NetworkManager.service also did not cause a problem. So it depends entirely on the network-managing service.
With an additional Requires
it also works for network.service only. While the Requires can delay startup of the system, I would say let's add it.
Icinga2 startup fails if the network stack is not fully loaded. I'm not sure whether this is a systemd or an icinga2 related problem.
Icinga2 can't determine the FQDN of the host if the startup of the network stack takes longer than usual (e.g. if you use a bridge and several network interfaces).
Icinga2 then falls back, or can only get the hostname but not the domain of the host, and fails while loading the certs at startup.
hostname is "icinga-01" domain is "localdomain.local" fqdn is "icinga-01.localdomain.local"
certs are stored with fqdn naming scheme
full log of initialization ...
If you do a restart after system is fully started, everything works as expected and the service is started.
Expected Behavior
Shouldn't fail
Current Behavior
Fails sometimes, if initialization of network stack is slow
Possible Solution
Steps to Reproduce (for bugs)
Not reproducible every time: sometimes it works, sometimes it doesn't.
Your Environment
icinga2 --version:
icinga2 - The Icinga 2 network monitoring daemon (version: r2.10.1-1)
Copyright (c) 2012-2018 Icinga Development Team (https://icinga.com/)
License GPLv2+: GNU GPL version 2 or later http://gnu.org/licenses/gpl2.html
This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.
System information:
  Platform: openSUSE
  Platform version: 13.1 (Bottle)
  Kernel: Linux
  Kernel version: 3.11.10-29-desktop
  Architecture: i686
Build information:
  Compiler: GNU 4.8.1
  Build host: server342vmx
Application information:
General paths:
  Config directory: /etc/icinga2
  Data directory: /var/lib/icinga2
  Log directory: /var/log/icinga2
  Cache directory: /var/cache/icinga2
  Spool directory: /var/spool/icinga2
  Run directory: /var/run/icinga2
Old paths (deprecated):
  Installation root: /usr
  Sysconf directory: /etc
  Run directory (base): /var/run
  Local state directory: /var
Internal paths:
  Package data directory: /usr/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /var/run/icinga2/icinga2.pid
openSUSE 13.1 (i586)
VERSION = 13.1
CODENAME = Bottle