NagiosEnterprises / ncpa

Nagios Cross-Platform Agent

NCPA 2.3.1 Solaris Sparc (your OS version details, etc) processes with capital letters not able to be found) #826

Open Elcom02 opened 2 years ago

Elcom02 commented 2 years ago

I downloaded ncpa-2.3.1.sol11.sparc.pkg and installed it on a SunOS sc1dvdb02 5.11 11.4.34.94.4 sun4v sparc sun4v server. The server is running an Oracle 19c database. I am trying to monitor the processes "asmpmon+ASM" and "ora_pmon_SCDVPE01". Here are the tests from the XI server:

[nagios@sc1psnxi01.elcom.com ~]$ /usr/local/nagios/libexec/check_ncpa.py -H sc1dvdb02.elcom.com -t 'XXXX' -P 5693 -M 'processes' -q 'name=asmpmon+ASM' -w 1 -c 2
OK: Process count for processes named asmpmon+asm was 0 | 'process_count'=0;1;2;

and

[nagios@sc1psnxi01 t981910]$ /usr/local/nagios/libexec/check_ncpa.py -H sc1dvdb02.elcom.com -t 'XXXX' -P 5693 -M 'processes' -q 'name=ora_pmon_SCDVPE01' -w 1 -c 2
OK: Process count for processes named ora_pmon_scdvpe01 was 0 | 'process_count'=0;1;2;

The process count should be 1, but it is reported as 0. As a result, even when the real Oracle DB is shut down, the above NCPA check still says OK.

MrPippin66 commented 2 years ago

I don't think you understand the range values.

In your case, the check goes critical if the number of matching processes is > 2 and warns if it is > 1. A count of 0 falls inside both ranges, so it reports OK.

https://nagios-plugins.org/doc/guidelines.html#THRESHOLDFORMAT
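
To make the range behaviour concrete, here is a minimal Python sketch of the threshold format described in the guidelines linked above. It is not taken from check_ncpa.py or NCPA; the function names (parse_range, alerts, status) are made up for illustration, and it assumes NCPA evaluates -w/-c the way the guidelines describe. Under that assumption, "-w 1 -c 2" treats a count of 0 as OK, while a range such as "1:" would alert when the count drops below 1.

```python
import math


def parse_range(spec):
    """Parse a Nagios-style threshold string into (low, high, inside_alerts).

    Supported forms: "10", "10:", "~:10", "10:20", "@10:20".
    """
    inside = spec.startswith("@")
    if inside:
        spec = spec[1:]
    if ":" in spec:
        low_s, high_s = spec.split(":", 1)
        low = -math.inf if low_s == "~" else float(low_s or 0)
        high = math.inf if high_s == "" else float(high_s)
    else:
        # A bare number N means the range 0..N.
        low, high = 0.0, float(spec)
    return low, high, inside


def alerts(value, spec):
    """Return True if value should raise an alert for this threshold."""
    low, high, inside = parse_range(spec)
    in_range = low <= value <= high
    # Without "@", alert when the value is OUTSIDE the range.
    return in_range if inside else not in_range


def status(value, warn, crit):
    """Combine warning and critical thresholds into a status string."""
    if alerts(value, crit):
        return "CRITICAL"
    if alerts(value, warn):
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    print(status(0, "1", "2"))    # thresholds from the original check
    print(status(0, "1:", "1:"))  # alert when fewer than 1 process is found
```

If the goal is to be alerted when the process is missing, a lower-bound range such as "1:" is the shape to try, subject to confirming that the agent accepts it.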

ccztux commented 2 years ago

I think the issue is that NCPA reports zero processes found, and that this is caused by the capital letters in the process names. Right?
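
The check output above echoes the query in lower case ("asmpmon+asm"), which is consistent with this hypothesis. The following Python sketch only illustrates the suspected failure mode; it is not NCPA's actual matching code, and count_matches and the process list are hypothetical. It shows that lower-casing the query while comparing against unmodified process names yields zero matches for any name containing capitals.

```python
def count_matches(process_names, query, fold_query=False, fold_names=False):
    """Count exact name matches, optionally lower-casing either side."""
    q = query.lower() if fold_query else query
    total = 0
    for name in process_names:
        candidate = name.lower() if fold_names else name
        if candidate == q:
            total += 1
    return total


if __name__ == "__main__":
    # Hypothetical process list using the names from the report.
    running = ["asmpmon+ASM", "ora_pmon_SCDVPE01", "sshd"]

    # Only the query is folded: mixed-case names can never match.
    print(count_matches(running, "ora_pmon_SCDVPE01", fold_query=True))

    # Both sides folded: the comparison is consistently case-insensitive.
    print(count_matches(running, "ora_pmon_SCDVPE01",
                        fold_query=True, fold_names=True))
```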

MrPippin66 commented 2 years ago

@ccztux @Elcom02 I've encountered a similar problem, which I submitted an issue for and which I think will result in a feature request.

I think the core issue (other than what I originally stated regarding the warn/crit filters) is that on Linux, the "name" may not be what "ps" is showing. I suggest using "cmd" instead of "name".

If you're depending on the full output from the check, note that it doesn't include the "cmd" value, just the "name" value.

I'd validate the same situation isn't occurring here, despite this being Solaris.
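
A small Python sketch of the name-versus-cmd distinction, under stated assumptions: the Proc records are invented, the "name" is assumed to be the executable name while "cmd" is the command line as "ps" shows it, and the substring match on "cmd" is an assumption rather than a statement of how NCPA filters. The point is only that a query which matches the "ps" output can still return zero when compared against the process name.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    name: str  # assumed: executable name as seen by the agent
    cmd: str   # assumed: command line as shown by "ps"


def count_by(procs, field, needle):
    """Count processes by exact name or by substring of the command line."""
    if field == "name":
        return sum(1 for p in procs if p.name == needle)
    if field == "cmd":
        return sum(1 for p in procs if needle in p.cmd)
    raise ValueError(f"unsupported field: {field}")


if __name__ == "__main__":
    # Hypothetical data: the name differs from what "ps" displays.
    procs = [
        Proc(name="oracle", cmd="ora_pmon_SCDVPE01"),
        Proc(name="sshd", cmd="/usr/lib/ssh/sshd"),
    ]
    print(count_by(procs, "name", "ora_pmon_SCDVPE01"))
    print(count_by(procs, "cmd", "ora_pmon_SCDVPE01"))
```

Comparing the agent's reported "name" and "cmd" values for the Oracle processes on the Solaris host would show whether this applies here.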