Closed akqopensystems closed 6 years ago
You can set these limits in the sysconfig file. See the "advanced" table in this chapter: https://www.icinga.com/docs/icinga2/latest/doc/17-language-reference/#constants
Our way of doing this may not be standard. For RHEL-specific changes to init scripts and the like, please see https://github.com/Icinga/rpm-icinga2
Thanks for the clarification! I think these options would be better documented in the configuration chapter, maybe under a topic "Advanced configuration": https://www.icinga.com/docs/icinga2/latest/doc/04-configuring-icinga-2/
Unfortunately, this does not seem to work as expected. On a test system with only slight differences to production:
[root@ossztlvmo12 icinga2]# cat /etc/sysconfig/icinga2
DAEMON=/usr/sbin/icinga2
ICINGA2_CONFIG_FILE=/etc/icinga2/icinga2.conf
ICINGA2_RUN_DIR=/run
ICINGA2_STATE_DIR=/var
ICINGA2_PID_FILE=$ICINGA2_RUN_DIR/icinga2/icinga2.pid
ICINGA2_LOG_DIR=/var/log/icinga2
ICINGA2_ERROR_LOG=$ICINGA2_LOG_DIR/error.log
ICINGA2_STARTUP_LOG=$ICINGA2_LOG_DIR/startup.log
ICINGA2_LOG=$ICINGA2_LOG_DIR/icinga2.log
ICINGA2_CACHE_DIR=$ICINGA2_STATE_DIR/cache/icinga2
ICINGA2_USER=icinga
ICINGA2_GROUP=icinga
ICINGA2_COMMAND_GROUP=icingacmd
ICINGA2_RLIMIT_FILES=50000
ICINGA2_RLIMIT_PROCESSES=62883
[root@ossztlvmo12 icinga2]# systemctl cat icinga2
# /usr/lib/systemd/system/icinga2.service
[Unit]
Description=Icinga host/service/network monitoring system
After=syslog.target network-online.target postgresql.service mariadb.service carbon-cache.service carbon-relay.service
[Service]
Type=forking
EnvironmentFile=/etc/sysconfig/icinga2
ExecStartPre=/usr/lib/icinga2/prepare-dirs /etc/sysconfig/icinga2
ExecStart=/usr/sbin/icinga2 daemon -d -e ${ICINGA2_ERROR_LOG}
PIDFile=/run/icinga2/icinga2.pid
ExecReload=/usr/lib/icinga2/safe-reload /etc/sysconfig/icinga2
TimeoutStartSec=30m
# Systemd >228 enforces a lower process number for services.
# Depending on the distribution and Systemd version, this must
# be explicitly raised. Packages will set the needed values
# into /etc/systemd/system/icinga2.service.d/limits.conf
#
# Please check the troubleshooting documentation for further details.
# The values below can be used as examples for customized service files.
#TasksMax=infinity
#LimitNPROC=62883
[Install]
WantedBy=multi-user.target
[root@ossztlvmo12 icinga2]# systemctl show icinga2
[...]
LimitCPU=18446744073709551615
LimitFSIZE=18446744073709551615
LimitDATA=18446744073709551615
LimitSTACK=18446744073709551615
LimitCORE=18446744073709551615
LimitRSS=18446744073709551615
LimitNOFILE=4096
[...]
[root@ossztlvmo12 icinga2]# systemctl stop icinga2
[root@ossztlvmo12 icinga2]# systemctl start icinga2
[root@ossztlvmo12 icinga2]# ps -ef |grep icinga2|grep -v plugin
icinga 14980 1 0 12:51 ? 00:00:00 /usr/lib64/icinga2/sbin/icinga2 --no-stack-rlimit daemon -d -e $ICINGA2_LOG_DIR/error.log
icinga 14985 1 66 12:51 ? 00:00:04 /usr/lib64/icinga2/sbin/icinga2 --no-stack-rlimit daemon -d -e $ICINGA2_LOG_DIR/error.log
[root@ossztlvmo12 icinga2]# cat /proc/14980/limits |grep "Max open"
Max open files 16384 16384 files
[root@ossztlvmo12 icinga2]# cat /proc/14985/limits |grep "Max open"
Max open files 16384 16384 files
We are using the Icinga2 rpm repository.
You cannot go lower than the default of 16k open files. That is a sane default and the minimum Icinga 2 requires to run and work.
Read-write. Defines the resource limit for RLIMIT_NOFILE that should be set at start-up. Value cannot be set lower than the default 16 * 1024. 0 disables the setting. Set in Icinga 2 sysconfig.
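To make the rule quoted above concrete, here is a minimal shell sketch of how a configured value would map to the limit one expects to see. The function name effective_rlimit_files is hypothetical and this is not Icinga 2's code; it only restates the documentation, and it assumes that a value below the default falls back to the default.

```shell
# Hypothetical helper, not part of Icinga 2: restates the documented rule
# for RLimitFiles / ICINGA2_RLIMIT_FILES.
DEFAULT_RLIMIT_FILES=$((16 * 1024))

effective_rlimit_files() {
    requested=$1
    if [ "$requested" -eq 0 ]; then
        # 0 disables the setting, so the inherited limit is left untouched.
        echo "unchanged"
    elif [ "$requested" -lt "$DEFAULT_RLIMIT_FILES" ]; then
        # Assumption: a value below the default falls back to the default.
        echo "$DEFAULT_RLIMIT_FILES"
    else
        echo "$requested"
    fi
}
```

By this reading, ICINGA2_RLIMIT_FILES=50000 should result in 50000 and a value such as 4096 should result in 16384.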
He is referring to ICINGA2_RLIMIT_FILES=50000 set in /etc/sysconfig/icinga2. It should raise the max open files to 50k, but it is still at 16k, so icinga2 is ignoring it. I have the same problem on SLES: icinga2 ignores the settings. Even in older versions (2.7.2), setting RLimitFiles in init.conf does not work.
Thanks, @Mikesch-mp . Yes, the problem is that we want to increase the maximum number of open files to 50000, but the icinga2 processes ignore this change and stay at the default of 16 * 1024.
Ah ok, thanks. I shouldn't comment here when I am tired after giving a training. In that case I am out of ideas, and someone needs to reproduce the problem.
At the moment, our checks are running later and later due to the fixed ICINGA2_RLIMIT_FILES. On all checking systems, we run into the file limit from time to time, with service checks delayed by as much as 10 minutes (on a 5-minute schedule). The check above was eventually executed with a 9-minute delay. Also, the icinga2 Graphite writer module is not able to send the performance metrics to Graphite in this situation:
cat /var/log/icinga2/icinga2.log
[2018-04-12 16:54:50 +0200] critical/GraphiteWriter: Cannot write to TCP socket on host '127.0.0.1' port '2013'.
[2018-04-12 16:55:00 +0200] critical/GraphiteWriter: Cannot write to TCP socket on host '127.0.0.1' port '2013'.
[2018-04-12 16:55:09 +0200] critical/GraphiteWriter: Cannot write to TCP socket on host '127.0.0.1' port '2013'.
[2018-04-12 16:55:20 +0200] critical/GraphiteWriter: Cannot write to TCP socket on host '127.0.0.1' port '2013'.
This leads to large gaps in the Graphite performance graphs:
Here's a sample number of open files from an icinga2 satellite at the time the checks are late:
[root@xxxxxmo03 ~]# lsof |grep -c icinga
17771
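A side note on the count above: lsof also lists memory-mapped files and may repeat entries per thread, so its total is not directly comparable to "Max open files". A rough sketch for counting the descriptors of a single process instead, using a hypothetical helper name (count_open_fds) and assuming /proc/PID/fd is readable by the calling user:

```shell
# Hypothetical helper: count the entries in a directory such as /proc/<pid>/fd.
count_open_fds() {
    ls -A "$1" | wc -l | tr -d ' '
}

# Usage on a live system (not executed here, needs root or the icinga user):
#   for p in $(pidof icinga2); do
#       echo "PID $p: $(count_open_fds "/proc/$p/fd") open descriptors"
#   done
```

The per-process number is the one that is checked against the limit shown in /proc/PID/limits.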
Confirmed, it is a bug. Tested inside the standalone Icinga Vagrant box.
[root@icinga2 ~]# grep -ri files /etc/sysconfig/icinga2
ICINGA2_RLIMIT_FILES=50000
[root@icinga2 ~]# systemctl restart icinga2
[root@icinga2 ~]# for p in $(pidof icinga2); do cat /proc/$p/limits | grep "Max open"; done
Max open files 16384 16384 files
Max open files 16384 16384 files
[root@icinga2 ~]# icinga2 console --connect 'https://root:icinga@localhost:5665/' --eval 'RLimitFiles'
16384.0
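For anyone who wants to repeat this check on their own nodes, the comparison can be scripted. This is only a sketch: the helper names soft_nofile_limit and configured_nofile_limit are made up for illustration, and it assumes the sysconfig file contains a plain ICINGA2_RLIMIT_FILES=<number> line.

```shell
# Hypothetical helpers for repeating the check above.

# Print the soft "Max open files" limit from /proc/<pid>/limits-style input on stdin.
soft_nofile_limit() {
    awk '/^Max open files/ { print $4 }'
}

# Print the value of ICINGA2_RLIMIT_FILES from sysconfig-style input on stdin.
configured_nofile_limit() {
    sed -n 's/^ICINGA2_RLIMIT_FILES=//p'
}

# Usage on a live system (not executed here):
#   want=$(configured_nofile_limit < /etc/sysconfig/icinga2)
#   for p in $(pidof icinga2); do
#       have=$(soft_nofile_limit < "/proc/$p/limits")
#       [ "$have" = "$want" ] || echo "PID $p: expected $want, got $have"
#   done
```

Reading from stdin keeps both helpers easy to try against pasted output, for example the lines shown in this thread.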
It seems that none of the changes made in /etc/sysconfig/icinga2 take effect. Even if I put random characters in there, nothing changes.
[root@icinga2 ]# grep -i user /etc/sysconfig/icinga2
ICINGA2_USER=ici12345nga
[root@icinga2 ]# systemctl restart icinga2
[root@icinga2 ]# icinga2 variable get RunAsUser
icinga
[root@icinga2 ]# ps -ef | grep icinga2
icinga 13628 1 0 17:47 ? 00:00:00 /usr/lib64/icinga2/sbin/icinga2 --no-stack-rlimit daemon -d -e $ICINGA2_LOG_DIR/error.log
icinga 13633 1 0 17:47 ? 00:00:00 /usr/lib64/icinga2/sbin/icinga2 --no-stack-rlimit daemon -d -e $ICINGA2_LOG_DIR/error.log
We narrowed the bug down to the file not being read at all, i.e. in some instances the path is not set.
The path is compiled into the binary, and under specific circumstances it is an empty string. We've observed this with builds inside Docker and other variants. The patch requires a new tagged release including proper tests.
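To illustrate the failure mode, here is a small shell sketch of "warn and fall back to defaults when the path is empty or unreadable". It is not the actual patch (the real logic lives in the Icinga 2 binary); load_sysconfig is a hypothetical name used only for this example.

```shell
# Hypothetical sketch, not Icinga 2 code: load a sysconfig-style file,
# or warn and keep the defaults if the path is empty or unreadable.
load_sysconfig() {
    file=$1
    if [ -z "$file" ] || [ ! -r "$file" ]; then
        echo "warning: Sysconfig file '$file' cannot be read. Using default values." >&2
        return 1
    fi
    # The file consists of plain VAR=value lines, so it can be sourced.
    . "$file"
}
```

With an empty path the function returns 1 and nothing is loaded, which matches the symptom seen in this thread: every setting silently stays at its default.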
The referenced package fixes are not part of this ticket; they are only topic-related for 2.8.3.
@Crunsher I've cherry-picked 11853cb36339920729bcaa5fcd461b5a288ba4cb for better logging into the upcoming PR for this issue. feature/rlimit-errno can be deleted.
mbmif /usr/local/icinga2 (master *) # icinga2 daemon
[2018-04-19 10:07:12 +0200] warning/icinga-app: Sysconfig file '/usr/local/icinga2/etc/sysconfig/icinga2' cannot be read. Using default values.
[2018-04-19 10:07:12 +0200] information/cli: Icinga application loader (version: v2.8.2-637-g081988a0d; debug)
[root@icinga2-elastic ~]# vim /etc/sysconfig/icinga2
[root@icinga2-elastic ~]# systemctl restart icinga2
[root@icinga2-elastic ~]# for p in $(pidof icinga2); do cat /proc/$p/limits | grep "Max open"; done
Max open files 50000 50000 files
Max open files 50000 50000 files
[root@icinga2-elastic ~]# icinga2 console --connect 'https://root:icinga@localhost:5665/' --eval 'RLimitFiles'
50000.0
/etc/sysconfig is not applicable for the Debian family (which uses /etc/default for init script variables), and the warnings cause users to file bugs like Debian Bug #898703.
Ideally the sysconfig directory is not checked for the Debian family, or /etc/default is checked instead.
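One way the second suggestion could look, sketched in shell. The function name init_vars_file and the root-prefix argument are hypothetical, and detecting the Debian family via /etc/debian_version is an assumption of this sketch; the actual fix may well be made at build time instead.

```shell
# Hypothetical helper: pick the init-variable file for the current distro family.
# The optional first argument is a root prefix, only there to make testing easy.
init_vars_file() {
    root=${1:-}
    if [ -f "$root/etc/debian_version" ]; then
        # Debian family keeps init script variables in /etc/default.
        echo "$root/etc/default/icinga2"
    else
        echo "$root/etc/sysconfig/icinga2"
    fi
}
```

Called without an argument, it inspects the running system and prints the path that would be checked.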
We're dealing with this in #6255 scheduled for CW 21.
Expected Behavior
On RHEL/CentOS7, process limits as seen by systemctl show and cat /proc/PID/limits should provide consistent information.
Current Behavior
On RHEL7:
This is confusing, as the system administrator can't easily determine the limits in force for Icinga2. We've already had a discussion with RH support about this, and they think that this may be related to the "--no-stack-rlimit" option passed on the command line.
Possible Solution
If Icinga2 sets its own limits (to 16384), this should be explicitly documented. Better still, this setting should be configurable by the user.
Steps to Reproduce (for bugs)
Context
At the moment we are looking into an issue with checks that sometimes forward no performance metrics for Graphite. In order to rule out resource exhaustion, we are checking the configured limits of the Icinga2 processes.
Your Environment
icinga2 --version:
icinga2 - The Icinga 2 network monitoring daemon (version: r2.8.2-1)
Copyright (c) 2012-2017 Icinga Development Team (https://www.icinga.com/)
License GPLv2+: GNU GPL version 2 or later http://gnu.org/licenses/gpl2.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Application information:
  Installation root: /usr
  Sysconf directory: /etc
  Run directory: /run
  Local state directory: /var
  Package data directory: /usr/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /run/icinga2/icinga2.pid
System information:
  Platform: Red Hat Enterprise Linux Server
  Platform version: 7.4 (Maipo)
  Kernel: Linux
  Kernel version: 3.10.0-693.17.1.el7.x86_64
  Architecture: x86_64
Build information:
  Compiler: GNU 4.8.5
  Build host: unknown
Enabled features (icinga2 feature list):
Disabled features: command compatlog debuglog elasticsearch gelf graphite influxdb livestatus notification opentsdb perfdata statusdata syslog
Enabled features: api checker mainlog
Config validation (icinga2 daemon -C):
information/cli: Icinga application loader (version: r2.8.2-1)
information/cli: Loading configuration file(s).
information/ConfigItem: Committing config item(s).
information/ApiListener: My API identity: osshplpmo06.xxxx.de
warning/ApplyRule: Apply rule 'uptime' (in /var/lib/icinga2/api/zones/director-global/director/service_apply.conf: 1:0-1:21) for type 'Service' does not match anywhere!
warning/ApplyRule: Apply rule 'bacula_file_daemon' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 139:1-139:34) for type 'Service' does not match anywhere!
warning/ApplyRule: Apply rule 'agent_icinga-core' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 148:1-148:33) for type 'Service' does not match anywhere!
warning/ApplyRule: Apply rule 'available_volume_space_all' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 156:1-156:42) for type 'Service' does not match anywhere!
warning/ApplyRule: Apply rule 'load' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 164:1-164:20) for type 'Service' does not match anywhere!
warning/ApplyRule: Apply rule 'cluster-zone' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 172:1-172:28) for type 'Service' does not match anywhere!
warning/ApplyRule: Apply rule 'icinga2_ido' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 180:1-180:27) for type 'Service' does not match anywhere!
warning/ApplyRule: Apply rule 'cpu' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 253:1-253:19) for type 'Service' does not match anywhere!
warning/ApplyRule: Apply rule 'ram' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 262:1-262:19) for type 'Service' does not match anywhere!
warning/ApplyRule: Apply rule 'hardware-health' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 329:1-329:31) for type 'Service' does not match anywhere!
warning/ApplyRule: Apply rule 'uptime' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 338:1-338:22) for type 'Service' does not match anywhere!
warning/ApplyRule: Apply rule 'cpu' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 349:1-349:19) for type 'Service' does not match anywhere!
warning/ApplyRule: Apply rule 'ram' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 360:1-360:19) for type 'Service' does not match anywhere!
warning/ApplyRule: Apply rule 'interface_ethernet0/0_usage' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 371:1-371:43) for type 'Service' does not match anywhere!
warning/ApplyRule: Apply rule 'interface_ethernet0/0_errors' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 384:1-384:44) for type 'Service' does not match anywhere!
warning/ApplyRule: Apply rule 'interface_ethernet0/0_status' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 397:1-397:44) for type 'Service' does not match anywhere!
information/ConfigItem: Instantiated 1 ApiListener.
information/ConfigItem: Instantiated 4 Zones.
information/ConfigItem: Instantiated 3 Endpoints.
information/ConfigItem: Instantiated 1 FileLogger.
information/ConfigItem: Instantiated 223 CheckCommands.
information/ConfigItem: Instantiated 1 IcingaApplication.
information/ConfigItem: Instantiated 1353 Hosts.
information/ConfigItem: Instantiated 600 HostGroups.
information/ConfigItem: Instantiated 2 Downtimes.
information/ConfigItem: Instantiated 2 ServiceGroups.
information/ConfigItem: Instantiated 5681 Services.
information/ConfigItem: Instantiated 1 CheckerComponent.
information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
information/cli: Finished validating the configuration file(s).
If you run multiple Icinga 2 instances, the zones.conf file (or icinga2 object list --type Endpoint and icinga2 object list --type Zone) from all affected nodes:
Object 'osshplpmo06.xxxx.de' of type 'Endpoint':
  % declared in '/etc/icinga2/zones.conf', lines 28:1-28:41
Object 'ossnplpmo03.xxxx.de' of type 'Endpoint':
  % declared in '/etc/icinga2/zones.conf', lines 11:1-11:41
Object 'osszplpmo02.xxxx.de' of type 'Endpoint':
  % declared in '/etc/icinga2/zones.conf', lines 6:1-6:41
Object 'osszplpmo02.xxxx.de' of type 'Endpoint': % declared in '/etc/icinga2/zones.conf', lines 6:1-6:41