Open andy-e-payne opened 1 year ago
Hey @andy-e-payne: I'm taking a look at your issue. I have a RHEL 8 VM running NCPA v2.4.1, and I'm querying it using check_ncpa.py with your arguments. To expedite things, I'm checking once every 2 seconds instead of every 5 minutes. I'm not seeing a noticeable change in memory usage for ncpa_listener.
Does this sound like the correct setup?
I'm using version 1.2.4 of check_ncpa.py. What version are you using?
Thanks, Phred
I did not get a notification, or I missed it at the time! Yes, I am using v1.2.4 of check_ncpa.py.
I have tried to replicate this on RHEL 8 with both NCPA 3.1.0 and NCPA 2.3.1 and have been unable to do so.
Is this occurring on just one machine, or is it happening on multiple machines?
Have you tried NCPA 3?
Can you try running top to see what is running and taking up so much memory?
If I don't get a response in the next few weeks, I will close this issue as solved.
Hi,
It's the ncpa_listener process itself that was consuming the most memory.
At every poll the increase was 70-100 KB, so it took a lot of polls to show up as a significant change. It was first highlighted when a system got into problems and ncpa_listener had over 6 GB.
The increase also showed up when using the API via a connection using http://
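To put the reported per-poll increase in perspective, here is a rough back-of-the-envelope sketch. It assumes exactly one check every 5 minutes and a constant 70-100 KB increase per poll; the function names are made up for illustration, and a host that receives several checks per interval would grow proportionally faster.

```python
# Rough leak-rate arithmetic. Assumption: one check every 5 minutes,
# each adding a constant amount of memory that is never released.
POLL_INTERVAL_MIN = 5
POLLS_PER_DAY = 24 * 60 // POLL_INTERVAL_MIN  # 288 polls per day


def daily_growth_mb(per_poll_kb, polls_per_day=POLLS_PER_DAY):
    """Memory added per day, in MB, for a given per-poll increase in KB."""
    return per_poll_kb * polls_per_day / 1024


def days_to_reach(target_gb, per_poll_kb, polls_per_day=POLLS_PER_DAY):
    """Days until the accumulated growth alone reaches target_gb."""
    target_kb = target_gb * 1024 * 1024
    return target_kb / (per_poll_kb * polls_per_day)


for per_poll_kb in (70, 100):
    print(
        f"{per_poll_kb} KB/poll: "
        f"{daily_growth_mb(per_poll_kb):.1f} MB/day, "
        f"about {days_to_reach(6, per_poll_kb):.0f} days to reach 6 GB"
    )
```

With a single check the growth is slow, which may be why a short test run does not show it clearly.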
Can you set loglevel = debug in ncpa.cfg, restart NCPA and see if there's anything notable in the listener log (/usr/local/ncpa/var/log/ncpa_listener.log)? I am still unable to see any difference in memory usage after running check_ncpa.py 10,000 times.
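For anyone trying to reproduce this, one way to measure the listener's memory independently of NCPA is to sample its resident set size from /proc while the checks run. This is only a sketch for Linux: the helper names are hypothetical, and the PID of ncpa_listener has to be looked up separately (for example from top).

```python
import re
import time


def parse_vmrss_kb(status_text):
    """Extract the VmRSS value (in kB) from /proc/<pid>/status content."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid):
    """Read the current resident set size of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vmrss_kb(status_file.read())


def watch(pid, samples=10, interval_s=2.0):
    """Collect RSS readings at a fixed interval while checks are running."""
    readings = []
    for _ in range(samples):
        readings.append(read_rss_kb(pid))
        time.sleep(interval_s)
    return readings


def growth_per_sample_kb(readings):
    """Average RSS change between consecutive samples, in kB."""
    if len(readings) < 2:
        return 0.0
    return (readings[-1] - readings[0]) / (len(readings) - 1)
```

Calling `growth_per_sample_kb(watch(pid))` with the listener's PID gives a number that can be compared against the 70-100 KB per poll reported above.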
Is there anything unusual about your NCPA or network configuration?
I get an increase at every check, even if I use the web API access. It is visible there, so I'd expect it's not the network in the real sense.
Nothing unusual in the network that I'm aware of, but I can only see the source and destination Red Hat servers. The source of the check is a RHEL 8 system; the destination is RHEL 7.
106228: ncpa: nagios: 0.58 % (VMS 283680.77 KB, RSS 47452.16 KB): 0.03 %
106232: ncpa: nagios: 0.52 % (VMS 286384.13 KB, RSS 42352.64 KB): 0.00 %
106233: ncpa: nagios: 0.9 % (VMS 389746.69 KB, RSS 73674.75 KB): 0.23 %
Total Memory: 2.00 % (VMS 959811.59 KB, RSS 163479.55 KB)
Total CPU: 0.26 %
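The per-process figures in that output can be tracked over time by parsing them. The sketch below assumes the fields are PID, process name, user, memory percent, VMS, RSS and CPU percent; that reading is inferred from the sample above and not taken from NCPA documentation, and the regular expression is only fitted to this one sample.

```python
import re

# Assumed layout: "<pid>: <name>: <user>: <mem> % (VMS <n> KB, RSS <n> KB): <cpu> %"
PROC_RE = re.compile(
    r"\b(?P<pid>\d+): (?P<name>\S+): (?P<user>\S+): (?P<mem>[\d.]+) % "
    r"\(VMS (?P<vms>[\d.]+) KB, RSS (?P<rss>[\d.]+) KB\): (?P<cpu>[\d.]+) %"
)

SAMPLE = (
    "106228: ncpa: nagios: 0.58 % (VMS 283680.77 KB, RSS 47452.16 KB): 0.03 % "
    "106232: ncpa: nagios: 0.52 % (VMS 286384.13 KB, RSS 42352.64 KB): 0.00 % "
    "106233: ncpa: nagios: 0.9 % (VMS 389746.69 KB, RSS 73674.75 KB): 0.23 % "
    "Total Memory: 2.00 % (VMS 959811.59 KB, RSS 163479.55 KB) Total CPU: 0.26 %"
)


def parse_processes(text):
    """Return one dict per process entry found in the check output."""
    return [
        {
            "pid": int(m.group("pid")),
            "name": m.group("name"),
            "rss_kb": float(m.group("rss")),
            "vms_kb": float(m.group("vms")),
        }
        for m in PROC_RE.finditer(text)
    ]


processes = parse_processes(SAMPLE)
total_rss_kb = sum(p["rss_kb"] for p in processes)
for p in processes:
    print(p["pid"], p["name"], p["rss_kb"])
print("sum of per-process RSS (KB):", round(total_rss_kb, 2))
```

Logging the per-PID RSS after each check would show which of the ncpa processes is actually growing.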
$ cat /usr/local/ncpa/var/log/ncpa_listener.log
2024-08-22 12:54:39,128 listener INFO before_request() - request.url: https://ncpadevice:5693/api/interface/ens192/bytes_recv/?token=********&units=M&warning=10&critical=100&delta=True&check=1
2024-08-22 12:54:39,137 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:54:39] "GET /api/interface/ens192/bytes_recv/?token=****&units=M&warning=10&critical=100&delta=True&check=1 HTTP/1.1" 200 416 0.009796
2024-08-22 12:54:45,653 listener INFO before_request() - request.url: https://ncpadevice:5693/api/services/?token=********&check=1&service=ncpa_listener&status=running
2024-08-22 12:54:45,692 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:54:45] "GET /api/services/?token=****&check=1&service=ncpa_listener&status=running HTTP/1.1" 200 419 0.039198
2024-08-22 12:54:49,394 listener INFO before_request() - request.url: https://ncpadevice:5693/api/disk/logical/|var/?token=********&warning=95&critical=95&check=1
2024-08-22 12:54:49,417 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:54:49] "GET /api/disk/logical/%7Cvar/?token=****&warning=95&critical=95&check=1 HTTP/1.1" 200 499 0.023542
2024-08-22 12:54:51,985 listener INFO before_request() - request.url: https://ncpadevice:5693/api/services/?token=********&check=1&service=ncpa_listener&status=running
2024-08-22 12:54:51,994 listener INFO before_request() - request.url: https://ncpadevice:5693/api/disk/logical/|opt/?token=********&warning=95&critical=95&check=1
2024-08-22 12:54:52,006 geventwebsocket.handler INFO ::ffff:yyy.yyy.yyy.yyy - - [2024-08-22 12:54:52] "GET /api/disk/logical/%7Copt/?token=****&warning=95&critical=95&check=1 HTTP/1.1" 200 499 0.013294
2024-08-22 12:54:52,021 listener INFO before_request() - request.url: https://ncpadevice:5693/api/disk/logical/|var/?token=********&warning=95&critical=95&check=1
2024-08-22 12:54:52,032 geventwebsocket.handler INFO ::ffff:yyy.yyy.yyy.yyy - - [2024-08-22 12:54:52] "GET /api/disk/logical/%7Cvar/?token=****&warning=95&critical=95&check=1 HTTP/1.1" 200 499 0.012471
2024-08-22 12:54:52,040 geventwebsocket.handler INFO ::ffff:yyy.yyy.yyy.yyy - - [2024-08-22 12:54:52] "GET /api/services/?token=****&check=1&service=ncpa_listener&status=running HTTP/1.1" 200 419 0.055737
2024-08-22 12:55:05,342 listener INFO before_request() - request.url: https://ncpadevice:5693/api/cpu/percent/?token=********&critical=100&check=1&aggregate=avg
2024-08-22 12:55:05,857 geventwebsocket.handler INFO ::ffff:zzz.zzz.zzz.zzz - - [2024-08-22 12:55:05] "GET /api/cpu/percent/?token=****&critical=100&check=1&aggregate=avg HTTP/1.1" 200 406 0.514557
2024-08-22 12:55:17,152 listener INFO before_request() - request.url: https://ncpadevice:5693/api/processes/?token=********&units=K&check=1&name=ncpa&status=running&memory=5
2024-08-22 12:55:17,296 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:55:17] "GET /api/processes/?token=****&units=K&check=1&name=ncpa&status=running&memory=5 HTTP/1.1" 200 917 0.143598
2024-08-22 12:55:19,599 listener INFO before_request() - request.url: https://ncpadevice:5693/api/interface/ens192/bytes_recv/?token=********&units=K&warning=10000&critical=100000&delta=True&check=1
2024-08-22 12:55:19,610 geventwebsocket.handler INFO ::ffff:yyy.yyy.yyy.yyy - - [2024-08-22 12:55:19] "GET /api/interface/ens192/bytes_recv/?token=****&units=K&warning=10000&critical=100000&delta=True&check=1 HTTP/1.1" 200 422 0.011429
2024-08-22 12:55:20,094 listener INFO before_request() - request.url: https://ncpadevice:5693/api/interface/ens192/?token=********&units=M&warning=10&critical=100&delta=True&check=1
2024-08-22 12:55:20,106 geventwebsocket.handler INFO ::ffff:yyy.yyy.yyy.yyy - - [2024-08-22 12:55:20] "GET /api/interface/ens192/?token=****&units=M&warning=10&critical=100&delta=True&check=1 HTTP/1.1" 200 734 0.012141
2024-08-22 12:55:32,882 listener INFO before_request() - request.url: https://ncpadevice:5693/api/processes/?token=********&units=K&check=1&name=ncpa&status=running&memory=5
2024-08-22 12:55:33,017 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:55:33] "GET /api/processes/?token=****&units=K&check=1&name=ncpa&status=running&memory=5 HTTP/1.1" 200 926 0.134640
2024-08-22 12:55:48,468 listener INFO before_request() - request.url: https://ncpadevice:5693/api/disk/logical/|tmp/?token=********&warning=95&critical=95&check=1
2024-08-22 12:55:48,479 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:55:48] "GET /api/disk/logical/%7Ctmp/?token=****&warning=95&critical=95&check=1 HTTP/1.1" 200 498 0.011936
2024-08-22 12:56:01,055 listener INFO before_request() - request.url: https://ncpadevice:5693/api/disk/logical/|var|log|audit/?token=********&warning=95&critical=95&check=1
2024-08-22 12:56:01,067 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:56:01] "GET /api/disk/logical/%7Cvar%7Clog%7Caudit/?token=****&warning=95&critical=95&check=1 HTTP/1.1" 200 498 0.012172
2024-08-22 12:56:06,568 listener INFO before_request() - request.url: https://ncpadevice:5693/api/disk/logical/|/?token=********&warning=95&critical=95&check=1
2024-08-22 12:56:06,578 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:56:06] "GET /api/disk/logical/%7C/?token=****&warning=95&critical=95&check=1 HTTP/1.1" 200 503 0.010657
2024-08-22 12:56:13,799 listener INFO before_request() - request.url: https://ncpadevice:5693/api/disk/logical/|tmp/?token=********&warning=95&critical=95&check=1
2024-08-22 12:56:13,809 geventwebsocket.handler INFO ::ffff:zzz.zzz.zzz.zzz - - [2024-08-22 12:56:13] "GET /api/disk/logical/%7Ctmp/?token=****&warning=95&critical=95&check=1 HTTP/1.1" 200 498 0.010536
2024-08-22 12:56:39,391 listener INFO before_request() - request.url: https://ncpadevice:5693/api/interface/ens192/?token=********&units=M&warning=10&critical=100&delta=True&check=1
2024-08-22 12:56:39,402 geventwebsocket.handler INFO ::ffff:zzz.zzz.zzz.zzz - - [2024-08-22 12:56:39] "GET /api/interface/ens192/?token=****&units=M&warning=10&critical=100&delta=True&check=1 HTTP/1.1" 200 682 0.011011
2024-08-22 12:56:41,219 listener INFO before_request() - request.url: https://ncpadevice:5693/api/disk/logical/|home/?token=********&critical=95&check=1
2024-08-22 12:56:41,232 geventwebsocket.handler INFO ::ffff:zzz.zzz.zzz.zzz - - [2024-08-22 12:56:41] "GET /api/disk/logical/%7Chome/?token=****&critical=95&check=1 HTTP/1.1" 200 498 0.012673
2024-08-22 12:56:58,428 listener INFO before_request() - request.url: https://ncpadevice:5693/api/disk/logical/|tmp/?token=********&warning=95&critical=95&check=1
2024-08-22 12:56:58,440 geventwebsocket.handler INFO ::ffff:yyy.yyy.yyy.yyy - - [2024-08-22 12:56:58] "GET /api/disk/logical/%7Ctmp/?token=****&warning=95&critical=95&check=1 HTTP/1.1" 200 498 0.012614
2024-08-22 12:57:12,560 listener INFO before_request() - request.url: https://ncpadevice:5693/api/services/?token=********&check=1&service=ncpa_listener&status=running
2024-08-22 12:57:12,605 geventwebsocket.handler INFO ::ffff:zzz.zzz.zzz.zzz - - [2024-08-22 12:57:12] "GET /api/services/?token=****&check=1&service=ncpa_listener&status=running HTTP/1.1" 200 419 0.045173
2024-08-22 12:57:12,857 listener INFO before_request() - request.url: https://ncpadevice:5693/api/interface/ens192/?token=********&units=M&warning=10&critical=100&delta=True&check=1
2024-08-22 12:57:12,871 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:57:12] "GET /api/interface/ens192/?token=****&units=M&warning=10&critical=100&delta=True&check=1 HTTP/1.1" 200 734 0.014087
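Since the host is being polled by several sources, it may help to count how often each endpoint is hit when estimating the growth rate. The sketch below tallies the access-log entries by endpoint; it assumes the three trailing numbers are HTTP status, response size and request duration in seconds, and uses three entries copied from the log above as sample input.

```python
import re
from collections import Counter, defaultdict
from urllib.parse import unquote, urlsplit

# Matches the quoted request line plus the three trailing fields
# (assumed to be status, response bytes, duration in seconds).
ACCESS_RE = re.compile(r'"GET (\S+) HTTP/1\.1" (\d{3}) (\d+) ([\d.]+)')

SAMPLE_LOG = """\
2024-08-22 12:54:49,417 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:54:49] "GET /api/disk/logical/%7Cvar/?token=****&warning=95&critical=95&check=1 HTTP/1.1" 200 499 0.023542
2024-08-22 12:55:17,296 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:55:17] "GET /api/processes/?token=****&units=K&check=1&name=ncpa&status=running&memory=5 HTTP/1.1" 200 917 0.143598
2024-08-22 12:55:33,017 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:55:33] "GET /api/processes/?token=****&units=K&check=1&name=ncpa&status=running&memory=5 HTTP/1.1" 200 926 0.134640
"""


def tally_requests(log_text):
    """Count requests and sum durations per endpoint path."""
    counts = Counter()
    seconds = defaultdict(float)
    for match in ACCESS_RE.finditer(log_text):
        # Drop the query string and decode %7C back to "|".
        path = unquote(urlsplit(match.group(1)).path)
        counts[path] += 1
        seconds[path] += float(match.group(4))
    return counts, seconds


counts, seconds = tally_requests(SAMPLE_LOG)
for path, count in counts.most_common():
    print(f"{path}: {count} requests, {seconds[path]:.3f} s total")
```

Running it against the full ncpa_listener.log would give the real number of checks per day to multiply by the per-poll increase.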
$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.9 (Maipo)
The following check causes the memory of the ncpa_listener process to increase, and it does not drop after completion:
/usr/local/nagios/libexec/check_ncpa.py -H REDACTED -t TOKEN -P 5693 -M 'processes' -q 'name=ncpa_listener,status=running' -c 1: -u K
With 5-minute polls (288 per day), this adds up to half a GB of memory each day (.7*288).
So far it has only come to light on RHEL 6, 7, and 8 systems; Windows and Solaris seem unaffected. It first came to light on v2.3.1, but the first thing I did was upgrade to the latest, v2.4.1, and the problem remains. The issue also arises if you use the API in a browser - https://redacted:5693/gui/api as the attached