NagiosEnterprises / ncpa

Nagios Cross-Platform Agent

ncpa_listener process on RHEL memory foot print increases by 700KB-1000KB every process poll #947

Open andy-e-payne opened 1 year ago

andy-e-payne commented 1 year ago

The following check causes the memory of the ncpa_listener process to increase, and it does not drop after the check completes:

/usr/local/nagios/libexec/check_ncpa.py -H REDACTED -t TOKEN -P 5693 -M 'processes' -q 'name=ncpa_listener,status=running' -c 1: -u K

With 5-minute polls, this adds up to half a GB of memory each day (0.7 × 288).

So far it has only come to light on RHEL 6, 7 and 8 systems; Windows and Solaris seem unaffected. I first noticed it on v2.3.1, but the first thing I did was upgrade to the latest, v2.4.1, and the problem remains. The issue also arises if you use the API in a browser (https://redacted:5693/gui/api), as shown in the attachments: nagiosAPI Screenshot 2023-05-10, ncpa_listener.
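For reference, the daily-growth arithmetic in the report can be sketched as follows. This is only an illustration: the helper names are hypothetical, the per-poll figures (700 to 1000 KB) are taken from the issue title, and 1 MB is assumed to be 1000 KB.

```python
def polls_per_day(interval_minutes: float) -> float:
    """Number of checks in 24 hours at a fixed polling interval."""
    return 24 * 60 / interval_minutes


def daily_growth_mb(per_poll_kb: float, interval_minutes: float) -> float:
    """Memory added per day, in MB (assuming 1 MB = 1000 KB)."""
    return per_poll_kb * polls_per_day(interval_minutes) / 1000


if __name__ == "__main__":
    # Per-poll growth figures from the issue title, with 5-minute polls.
    for per_poll_kb in (700, 1000):
        growth = daily_growth_mb(per_poll_kb, 5)
        print(f"{per_poll_kb} KB per poll -> {growth:.1f} MB per day")
```

At 288 polls per day, even a sub-megabyte increase per check accumulates to hundreds of megabytes daily if it is never released.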

phreditorNG commented 1 year ago

Hey @andy-e-payne: I'm taking a look at your issue. I have a RHEL 8 VM running NCPA v2.4.1, and I'm querying it using check_ncpa.py with your arguments. To expedite things, I'm checking once every 2 seconds, vs 5 minutes. I'm not seeing a noticeable change in memory usage for ncpa_listener.

Does this sound like the correct set-up?

I'm using version 1.2.4 of check_ncpa.py. What version are you using?

Thanks, Phred
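One way to watch the listener's memory while such a polling loop runs is to sample its resident set size from /proc. The sketch below is an assumption about how this could be done, not the method actually used in this thread: it is Linux-only, the function names are hypothetical, and the PID of ncpa_listener has to be supplied by hand.

```python
import re
import sys
import time
from pathlib import Path


def parse_vmrss_kb(status_text):
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def sample_rss(pid, samples, interval_s):
    """Read VmRSS for the given PID several times, pausing between reads."""
    readings = []
    for i in range(samples):
        text = Path(f"/proc/{pid}/status").read_text()
        readings.append(parse_vmrss_kb(text))
        if i < samples - 1:
            time.sleep(interval_s)
    return readings


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python sample_rss.py <pid of ncpa_listener>
    # Samples every 2 seconds, matching the polling interval mentioned above.
    for rss_kb in sample_rss(int(sys.argv[1]), samples=5, interval_s=2):
        print(f"RSS: {rss_kb} kB")
```

A steadily rising series of readings while checks are running would point to the growth described in the report; flat readings would match the behaviour seen on the test VM.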

andy-e-payne commented 2 months ago

I did not get a notification, or I missed it at the time! Yes, I am using v1.2.4 of check_ncpa.py.

ne-bbahn commented 2 months ago

I have tried to replicate this using RHEL 8 with both NCPA 3.1.0 and NCPA 2.3.1 and have been unable to replicate this. Is this just occurring for you on your one machine or is it happening on multiple machines? Have you tried NCPA 3? Can you try running top to see what is running and taking up so much memory?

If I don't get a response in the next few weeks, I will close this issue as solved.

andy-e-payne commented 2 months ago

Hi, it's the ncpa_listener process itself that was consuming the most memory. At every poll the increase was 70-100 KB, so it needed a lot of polls to show up as a significant change, but it was first highlighted when a system got into problems and ncpa_listener had over 6 GB. The increase also showed when using the API via a connection to http://:5693/gui/api

ne-bbahn commented 2 months ago

Can you set loglevel = debug in ncpa.cfg, restart NCPA and see if there's anything notable in the listener log (/usr/local/ncpa/var/log/ncpa_listener.log)? I am still unable to see any difference in memory usage after running check_ncpa.py 10,000 times. Is there anything unusual about your NCPA or network configuration?

andy-e-payne commented 1 month ago

I get an increase at every check, even if I use the web API access, and it is visible there too, so I'd expect it's not the network in the real sense.

There is nothing unusual in the network that I'm aware of, but I can only see the source and destination Red Hat servers. The source of the check is a RHEL 8 system; the destination is RHEL 7.

$ /usr/local/nagios/libexec/check_ncpa.py -T 20 -H ncpadevice -t **** -P 5693 -M 'processes' -q 'name=ncpa,status=running,memory=5' -u K
OK: Process count for processes named ncpa was 3 | 'process_count'=3;;; 'cpu'=4.01%;;; 'memory'=1.9700000000000002%;;; 'memory_vms'=957341.7KB;;; 'memory_rss'=161017.86KB;;;
Processes Matched
PID: Name: Username: Exe: Memory: CPU
106228: ncpa: nagios: 0.58 % (VMS 283680.77 KB, RSS 47452.16 KB): 0.00 %
106232: ncpa: nagios: 0.52 % (VMS 286384.13 KB, RSS 42352.64 KB): 0.28 %
106233: ncpa: nagios: 0.87 % (VMS 387276.80 KB, RSS 71213.06 KB): 3.73 %
Total Memory: 1.97 % (VMS 957341.70 KB, RSS 161017.86 KB)
Total CPU: 4.01 %

$ /usr/local/nagios/libexec/check_ncpa.py -T 20 -H ncpadevice -t **** -P 5693 -M 'processes' -q 'name=ncpa,status=running,memory=5' -u K
OK: Process count for processes named ncpa was 3 | 'process_count'=3;;; 'cpu'=0.1%;;; 'memory'=2.0%;;; 'memory_vms'=959377.41KB;;; 'memory_rss'=163151.87KB;;;
Processes Matched
PID: Name: Username: Exe: Memory: CPU
106228: ncpa: nagios: 0.58 % (VMS 283680.77 KB, RSS 47452.16 KB): 0.00 %
106232: ncpa: nagios: 0.52 % (VMS 286384.13 KB, RSS 42352.64 KB): 0.00 %
106233: ncpa: nagios: 0.9 % (VMS 389312.51 KB, RSS 73347.07 KB): 0.10 %
Total Memory: 2.00 % (VMS 959377.41 KB, RSS 163151.87 KB)
Total CPU: 0.10 %

$ /usr/local/nagios/libexec/check_ncpa.py -T 20 -H ncpadevice -t **** -P 5693 -M 'processes' -q 'name=ncpa,status=running,memory=5' -u K
OK: Process count for processes named ncpa was 3 | 'process_count'=3;;; 'cpu'=0.26%;;; 'memory'=2.0%;;; 'memory_vms'=959811.5900000001KB;;; 'memory_rss'=163479.55KB;;;
Processes Matched
PID: Name: Username: Exe: Memory: CPU
106228: ncpa: nagios: 0.58 % (VMS 283680.77 KB, RSS 47452.16 KB): 0.03 %
106232: ncpa: nagios: 0.52 % (VMS 286384.13 KB, RSS 42352.64 KB): 0.00 %
106233: ncpa: nagios: 0.9 % (VMS 389746.69 KB, RSS 73674.75 KB): 0.23 %
Total Memory: 2.00 % (VMS 959811.59 KB, RSS 163479.55 KB)
Total CPU: 0.26 %
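The growth between the three runs above can be quantified by pulling the 'memory_rss' performance-data value out of each check result and differencing consecutive values. The sketch below is illustrative only: the helper names are hypothetical, and the sample strings are trimmed copies of the perfdata shown above.

```python
import re

# Matches the total RSS value in the check_ncpa.py performance data.
RSS_PATTERN = re.compile(r"'memory_rss'=([0-9.]+)KB")


def rss_values_kb(output):
    """Return every 'memory_rss' value (in KB) found in the given text."""
    return [float(value) for value in RSS_PATTERN.findall(output)]


def deltas(values):
    """Differences between consecutive readings, rounded to 2 decimals."""
    return [round(b - a, 2) for a, b in zip(values, values[1:])]


# Perfdata fragments copied from the three check results in this comment.
SAMPLE = """
'memory_vms'=957341.7KB;;; 'memory_rss'=161017.86KB;;;
'memory_vms'=959377.41KB;;; 'memory_rss'=163151.87KB;;;
'memory_vms'=959811.5900000001KB;;; 'memory_rss'=163479.55KB;;;
"""

if __name__ == "__main__":
    readings = rss_values_kb(SAMPLE)
    for delta in deltas(readings):
        print(f"RSS grew by {delta} KB")
```

In the per-process listing above, PIDs 106228 and 106232 report identical VMS and RSS in all three runs, so the whole increase in the totals comes from PID 106233.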

$ cat /usr/local/ncpa/var/log/ncpa_listener.log

2024-08-22 12:54:39,128 listener INFO before_request() - request.url: https://ncpadevice:5693/api/interface/ens192/bytes_recv/?token=********&units=M&warning=10&critical=100&delta=True&check=1
2024-08-22 12:54:39,137 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:54:39] "GET /api/interface/ens192/bytes_recv/?token=****&units=M&warning=10&critical=100&delta=True&check=1 HTTP/1.1" 200 416 0.009796
2024-08-22 12:54:45,653 listener INFO before_request() - request.url: https://ncpadevice:5693/api/services/?token=********&check=1&service=ncpa_listener&status=running
2024-08-22 12:54:45,692 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:54:45] "GET /api/services/?token=****&check=1&service=ncpa_listener&status=running HTTP/1.1" 200 419 0.039198
2024-08-22 12:54:49,394 listener INFO before_request() - request.url: https://ncpadevice:5693/api/disk/logical/|var/?token=********&warning=95&critical=95&check=1
2024-08-22 12:54:49,417 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:54:49] "GET /api/disk/logical/%7Cvar/?token=****&warning=95&critical=95&check=1 HTTP/1.1" 200 499 0.023542
2024-08-22 12:54:51,985 listener INFO before_request() - request.url: https://ncpadevice:5693/api/services/?token=********&check=1&service=ncpa_listener&status=running
2024-08-22 12:54:51,994 listener INFO before_request() - request.url: https://ncpadevice:5693/api/disk/logical/|opt/?token=********&warning=95&critical=95&check=1
2024-08-22 12:54:52,006 geventwebsocket.handler INFO ::ffff:yyy.yyy.yyy.yyy - - [2024-08-22 12:54:52] "GET /api/disk/logical/%7Copt/?token=****&warning=95&critical=95&check=1 HTTP/1.1" 200 499 0.013294
2024-08-22 12:54:52,021 listener INFO before_request() - request.url: https://ncpadevice:5693/api/disk/logical/|var/?token=********&warning=95&critical=95&check=1
2024-08-22 12:54:52,032 geventwebsocket.handler INFO ::ffff:yyy.yyy.yyy.yyy - - [2024-08-22 12:54:52] "GET /api/disk/logical/%7Cvar/?token=****&warning=95&critical=95&check=1 HTTP/1.1" 200 499 0.012471
2024-08-22 12:54:52,040 geventwebsocket.handler INFO ::ffff:yyy.yyy.yyy.yyy - - [2024-08-22 12:54:52] "GET /api/services/?token=****&check=1&service=ncpa_listener&status=running HTTP/1.1" 200 419 0.055737
2024-08-22 12:55:05,342 listener INFO before_request() - request.url: https://ncpadevice:5693/api/cpu/percent/?token=********&critical=100&check=1&aggregate=avg
2024-08-22 12:55:05,857 geventwebsocket.handler INFO ::ffff:zzz.zzz.zzz.zzz - - [2024-08-22 12:55:05] "GET /api/cpu/percent/?token=****&critical=100&check=1&aggregate=avg HTTP/1.1" 200 406 0.514557
2024-08-22 12:55:17,152 listener INFO before_request() - request.url: https://ncpadevice:5693/api/processes/?token=********&units=K&check=1&name=ncpa&status=running&memory=5
2024-08-22 12:55:17,296 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:55:17] "GET /api/processes/?token=****&units=K&check=1&name=ncpa&status=running&memory=5 HTTP/1.1" 200 917 0.143598
2024-08-22 12:55:19,599 listener INFO before_request() - request.url: https://ncpadevice:5693/api/interface/ens192/bytes_recv/?token=********&units=K&warning=10000&critical=100000&delta=True&check=1
2024-08-22 12:55:19,610 geventwebsocket.handler INFO ::ffff:yyy.yyy.yyy.yyy - - [2024-08-22 12:55:19] "GET /api/interface/ens192/bytes_recv/?token=****&units=K&warning=10000&critical=100000&delta=True&check=1 HTTP/1.1" 200 422 0.011429
2024-08-22 12:55:20,094 listener INFO before_request() - request.url: https://ncpadevice:5693/api/interface/ens192/?token=********&units=M&warning=10&critical=100&delta=True&check=1
2024-08-22 12:55:20,106 geventwebsocket.handler INFO ::ffff:yyy.yyy.yyy.yyy - - [2024-08-22 12:55:20] "GET /api/interface/ens192/?token=****&units=M&warning=10&critical=100&delta=True&check=1 HTTP/1.1" 200 734 0.012141
2024-08-22 12:55:32,882 listener INFO before_request() - request.url: https://ncpadevice:5693/api/processes/?token=********&units=K&check=1&name=ncpa&status=running&memory=5
2024-08-22 12:55:33,017 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:55:33] "GET /api/processes/?token=****&units=K&check=1&name=ncpa&status=running&memory=5 HTTP/1.1" 200 926 0.134640
2024-08-22 12:55:48,468 listener INFO before_request() - request.url: https://ncpadevice:5693/api/disk/logical/|tmp/?token=********&warning=95&critical=95&check=1
2024-08-22 12:55:48,479 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:55:48] "GET /api/disk/logical/%7Ctmp/?token=****&warning=95&critical=95&check=1 HTTP/1.1" 200 498 0.011936
2024-08-22 12:56:01,055 listener INFO before_request() - request.url: https://ncpadevice:5693/api/disk/logical/|var|log|audit/?token=********&warning=95&critical=95&check=1
2024-08-22 12:56:01,067 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:56:01] "GET /api/disk/logical/%7Cvar%7Clog%7Caudit/?token=****&warning=95&critical=95&check=1 HTTP/1.1" 200 498 0.012172
2024-08-22 12:56:06,568 listener INFO before_request() - request.url: https://ncpadevice:5693/api/disk/logical/|/?token=********&warning=95&critical=95&check=1
2024-08-22 12:56:06,578 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:56:06] "GET /api/disk/logical/%7C/?token=****&warning=95&critical=95&check=1 HTTP/1.1" 200 503 0.010657
2024-08-22 12:56:13,799 listener INFO before_request() - request.url: https://ncpadevice:5693/api/disk/logical/|tmp/?token=********&warning=95&critical=95&check=1
2024-08-22 12:56:13,809 geventwebsocket.handler INFO ::ffff:zzz.zzz.zzz.zzz - - [2024-08-22 12:56:13] "GET /api/disk/logical/%7Ctmp/?token=****&warning=95&critical=95&check=1 HTTP/1.1" 200 498 0.010536
2024-08-22 12:56:39,391 listener INFO before_request() - request.url: https://ncpadevice:5693/api/interface/ens192/?token=********&units=M&warning=10&critical=100&delta=True&check=1
2024-08-22 12:56:39,402 geventwebsocket.handler INFO ::ffff:zzz.zzz.zzz.zzz - - [2024-08-22 12:56:39] "GET /api/interface/ens192/?token=****&units=M&warning=10&critical=100&delta=True&check=1 HTTP/1.1" 200 682 0.011011
2024-08-22 12:56:41,219 listener INFO before_request() - request.url: https://ncpadevice:5693/api/disk/logical/|home/?token=********&critical=95&check=1
2024-08-22 12:56:41,232 geventwebsocket.handler INFO ::ffff:zzz.zzz.zzz.zzz - - [2024-08-22 12:56:41] "GET /api/disk/logical/%7Chome/?token=****&critical=95&check=1 HTTP/1.1" 200 498 0.012673
2024-08-22 12:56:58,428 listener INFO before_request() - request.url: https://ncpadevice:5693/api/disk/logical/|tmp/?token=********&warning=95&critical=95&check=1
2024-08-22 12:56:58,440 geventwebsocket.handler INFO ::ffff:yyy.yyy.yyy.yyy - - [2024-08-22 12:56:58] "GET /api/disk/logical/%7Ctmp/?token=****&warning=95&critical=95&check=1 HTTP/1.1" 200 498 0.012614
2024-08-22 12:57:12,560 listener INFO before_request() - request.url: https://ncpadevice:5693/api/services/?token=********&check=1&service=ncpa_listener&status=running
2024-08-22 12:57:12,605 geventwebsocket.handler INFO ::ffff:zzz.zzz.zzz.zzz - - [2024-08-22 12:57:12] "GET /api/services/?token=****&check=1&service=ncpa_listener&status=running HTTP/1.1" 200 419 0.045173
2024-08-22 12:57:12,857 listener INFO before_request() - request.url: https://ncpadevice:5693/api/interface/ens192/?token=********&units=M&warning=10&critical=100&delta=True&check=1
2024-08-22 12:57:12,871 geventwebsocket.handler INFO ::ffff:xxx.xxx.xxx.xxx - - [2024-08-22 12:57:12] "GET /api/interface/ens192/?token=****&units=M&warning=10&critical=100&delta=True&check=1 HTTP/1.1" 200 734 0.014087

andy-e-payne commented 1 month ago

$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.9 (Maipo)

andy-e-payne commented 1 month ago

[Attached image]