Closed jeka-r closed 6 years ago
There is no nagios process here, both instances run Icinga 2. The client runs 2.10.1 and the master 2.9.0?
We have two servers with Icinga 2. Both are master, first r2.9.0-1 for test and second r2.10.1-1 as production. Same issue for both.
Api client starts on Windows 10 PC and connects to one of servers and collects StateChange
Please run curl with the most verbose mode you can get, and also share the location on disk. I'd like to know whether this originates from PS, native or is an extra installed binary. This likely creates a different request than the Linux counterparts.
Good day.
I have performed some tests and collected info.
curl version
C:\Users\Evgeniy>curl -V
curl 7.55.1 (Windows) libcurl/7.55.1 WinSSL
Release-Date: [unreleased]
Protocols: dict file ftp ftps http https imap imaps pop3 pop3s smtp smtps telnet tftp
Features: AsynchDNS IPv6 Largefile SSPI Kerberos SPNEGO NTLM SSL
curl in verbose mode, tested 31 sec and interrupted by ctrl+C
C:\Users\Evgeniy>curl -v -k -u ******:******* -H "Accept: application/json" -X POST "https://*******.com:5665/v1/events?queue=TestCurl&types=StateChange" > test-mm
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying ***.***.***.***...
* TCP_NODELAY set
* Connected to ims.is-profit.com (***.***.***.***) port 5665 (#0)
* schannel: SSL/TLS connection with *******.com port 5665 (step 1/3)
* schannel: disabled server certificate revocation checks
* schannel: verifyhost setting prevents Schannel from comparing the supplied target name with the subject names in server certificates.
* schannel: sending initial handshake data: sending 177 bytes...
* schannel: sent initial handshake data: sent 177 bytes
* schannel: SSL/TLS connection with *******.com port 5665 (step 2/3)
* schannel: failed to receive handshake, need more data
* schannel: SSL/TLS connection with *******.com port 5665 (step 2/3)
* schannel: encrypted data got 2645
* schannel: encrypted data buffer: offset 2645 length 4096
* schannel: a client certificate has been requested
* schannel: SSL/TLS connection with *******.com port 5665 (step 2/3)
* schannel: encrypted data buffer: offset 2645 length 4096
* schannel: sending next handshake data: sending 581 bytes...
* schannel: SSL/TLS connection with *******.com port 5665 (step 2/3)
* schannel: encrypted data got 258
* schannel: encrypted data buffer: offset 258 length 4096
* schannel: SSL/TLS handshake complete
* schannel: SSL/TLS connection with *******.com port 5665 (step 3/3)
* schannel: stored credential handle in session cache
* Server auth using Basic with user '*****************'
> POST /v1/events?queue=TestCurl&types=StateChange HTTP/1.1
> Host: *******.com:5665
> Authorization: Basic *********************************************************=
> User-Agent: curl/7.55.1
> Accept: application/json
>
* schannel: client wants to read 102400 bytes
* schannel: encdata_buffer resized 103424
* schannel: encrypted data buffer: offset 0 length 103424
* schannel: encrypted data got 46
* schannel: encrypted data buffer: offset 46 length 103424
* schannel: decrypted data length: 17
* schannel: decrypted data added: 17
* schannel: decrypted data cached: offset 17 length 102400
* schannel: encrypted data buffer: offset 0 length 103424
* schannel: decrypted data buffer: offset 17 length 102400
* schannel: schannel_recv cleanup
* schannel: decrypted data returned 17
* schannel: decrypted data buffer: offset 0 length 102400
< HTTP/1.1 200 OK
0 0 0 0 0 0 0 0 --:--:-- 0:00:31 --:--:-- 0^C
outputed test-mm is empty in current test, but some info from one of previous test
{"check_result":{"active":true,"check_source":"***********.com","command":["/usr/lib/nagios/plugins/check_apt"],"execution_end":1542100510.6113131046,"execution_start":1542100509.4333910942,"exit_status":0.0,"output":"APT OK: 0 packages available for upgrade (0 critical updates). ","performance_data":["available_upgrades=0;;;0","critical_updates=0;;;0"],"schedule_end":1542100510.6114239693,"schedule_start":1542100510.6114239693,"state":0.0,"ttl":0.0,"type":"CheckResult","vars_after":{"attempt":1.0,"reachable":true,"state":0.0,"state_type":1.0},"vars_before":{"attempt":1.0,"reachable":true,"state":2.0,"state_type":1.0}},"host":"*************.com","service":"apt","state":0.0,"state_type":1.0,"timestamp":1542100510.6188690662,"type":"StateChange"}
info from icinga log file during the test
[2018-11-13 13:48:13 +0300] information/ApiListener: New client connection from [***.***.***.***]:23902 (no client certificate)
[2018-11-13 13:48:13 +0300] information/HttpServerConnection: Request: POST /v1/events?queue=TestCurl&types=StateChange (from [***.***.***.***]:23902, user: ***************)
[2018-11-13 13:48:23 +0300] information/WorkQueue: #11 (HttpServerConnection) items: 1, rate: 0/s (0/min 0/5min 0/15min); empty in 17848 days, 10 hours, 48 minutes and 23 seconds
[2018-11-13 13:48:25 +0300] information/ExternalCommandListener: Executing external command: [1542106105] SCHEDULE_FORCED_SVC_CHECK;***********.com;apt;1542106105
[2018-11-13 13:48:33 +0300] information/WorkQueue: #11 (HttpServerConnection) items: 1, rate: 0/s (0/min 0/5min 0/15min); empty in -2147483648 days
[2018-11-13 13:48:34 +0300] information/ExternalCommandListener: Executing external command: [1542106114] SCHEDULE_FORCED_SVC_CHECK;***********.com;load;1542106114
[2018-11-13 13:48:38 +0300] information/ExternalCommandListener: Executing external command: [1542106118] SCHEDULE_FORCED_SVC_CHECK;***********.com;procs;1542106118
[2018-11-13 13:48:43 +0300] information/WorkQueue: #11 (HttpServerConnection) items: 1, rate: 0/s (0/min 0/5min 0/15min); empty in -2147483648 days
[2018-11-13 13:48:52 +0300] information/WorkQueue: #7 (IdoMysqlConnection, ido-mysql) items: 6, rate: 3.01667/s (181/min 881/5min 2241/15min);
[2018-11-13 13:48:53 +0300] information/WorkQueue: #11 (HttpServerConnection) items: 1, rate: 0/s (0/min 0/5min 0/15min); empty in -2147483648 days
[2018-11-13 13:48:58 +0300] information/Application: Received request to shut down.
[2018-11-13 13:48:59 +0300] information/Application: Shutting down...
[2018-11-13 13:48:59 +0300] information/ApiListener: 'api' stopped.
[2018-11-13 13:48:59 +0300] information/CheckerComponent: 'checker' stopped.
[2018-11-13 13:48:59 +0300] information/ExternalCommandListener: 'command' stopped.
[2018-11-13 13:48:59 +0300] information/ApiListener: 'api' started.
please inform if any test/info required
Yesterday, Icinga2 has been updated to r2.10.2-1 version in one of our servers. Problem no longer exists.
Second server, with version r2.9.0-1, problem is present.
Have any changes been made in the new version (r2.10.2-1) to the related issue?
Yes, definitely. Check the release blog posts and changelog. I assume this issue is now solved.
I want to note, a week later, the issue is no longer repeated.
Perhaps the problem was solved with the installation of the version r2.10.2-1, but I did not find mention of this issue in the changelog.
Many thanks for the support and help.
API using – disconnected API connection (from Windows 10 OS) causing high CPU usage by Icinga process (epoll_ctl/epoll_wait loop)
Expected Behavior
Normal CPU usage by Icinga process.
Current Behavior
Icinga process loads CPU high after API connection has been disconnected (from Windows 10 OS).
Possible Solution
n/a
Steps to Reproduce (for bugs)
curl -k -u ******:****** -H "Accept: application/json" -X POST "https://***************:5665/v1/events?queue=TestCurl&types=StateChange"
Context
We use windows application as API client and found that when we use the application, after 10 – 120 minutes the server CPU shows high load by Nagios process
Example from "top -u nagios -H"
Example strace from PID 4756:
There is no CPU load if curl command started and terminated from Ubuntu PC
Your Environment
Version used: r2.9.0-1 (and r2.10.1-1)
Operating System and version: Platform: Ubuntu Platform version: 16.04.5 LTS (Xenial Xerus) Kernel: Linux Kernel version: 4.4.0-138-generic Architecture: x86_64
Enabled features: Disabled features: compatlog debuglog elasticsearch gelf graphite influxdb livestatus opentsdb perfdata statusdata syslog Enabled features: api checker command ido-mysql mainlog notification
Icinga Web 2 version and modules: 2.5.3
Config validation (
icinga2 daemon -C
): [2018-11-09 17:05:23 +0300] information/cli: Icinga application loader (version: r2.9.0-1) [2018-11-09 17:05:23 +0300] information/cli: Loading configuration file(s). [2018-11-09 17:05:23 +0300] information/ConfigItem: Committing config item(s). [2018-11-09 17:05:23 +0300] information/ApiListener: My API identity: ** [2018-11-09 17:05:23 +0300] information/ConfigItem: Instantiated 27 Services. [2018-11-09 17:05:23 +0300] information/ConfigItem: Instantiated 1 FileLogger. [2018-11-09 17:05:23 +0300] information/ConfigItem: Instantiated 1 NotificationComponent. [2018-11-09 17:05:23 +0300] information/ConfigItem: Instantiated 1 IcingaApplication. [2018-11-09 17:05:23 +0300] information/ConfigItem: Instantiated 3 Hosts. [2018-11-09 17:05:23 +0300] information/ConfigItem: Instantiated 1 ApiListener. [2018-11-09 17:05:23 +0300] information/ConfigItem: Instantiated 1 CheckerComponent. [2018-11-09 17:05:23 +0300] information/ConfigItem: Instantiated 5 Zones. [2018-11-09 17:05:23 +0300] information/ConfigItem: Instantiated 1 ExternalCommandListener. [2018-11-09 17:05:23 +0300] information/ConfigItem: Instantiated 3 Endpoints. [2018-11-09 17:05:23 +0300] information/ConfigItem: Instantiated 4 ApiUsers. [2018-11-09 17:05:23 +0300] information/ConfigItem: Instantiated 1 IdoMysqlConnection. [2018-11-09 17:05:23 +0300] information/ConfigItem: Instantiated 213 CheckCommands. [2018-11-09 17:05:23 +0300] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars' [2018-11-09 17:05:23 +0300] information/cli: Finished validating the configuration file(s).