Open Daniel-Beardsmore opened 6 years ago
We are suffering from the same issue, and it's actually a real problem for us.
When our application starts up as a Windows Service, the processes will start with a specific case in the process name. We have defined the process names with these exact cases in the Nagios server check.
If an administrator decides to restart a subtask manually, the same process will start with a different case, falsely triggering an alert.
In other words, the process is running, and check_process falsely detects that it is not running.
Example of a Nagios server-side check (OK):
check_nrpe -H '$HOSTADDRESS$' --command check_process --args process=nserver.exe process=nAdminp.EXE process=namgr.EXE process=nReplica.EXE
OK: all processes are ok.|'nserver.exe state'=1;0;0 'nReplica.EXE state'=1;0;0 'nAdminp.EXE state'=1;0;0 'namgr.EXE state'=1;0;0 'namgr.EXE state'=1;0;0 'count'=5;0;0
Example of a Nagios server-side check (false-negative):
check_nrpe -H '$HOSTADDRESS$' --command check_process --args process=nserver.exe process=nAdminp.EXE process=namgr.EXE process=nReplica.EXE
CRITICAL: nAdminp.EXE=stopped, nReplica.EXE=stopped|'nserver.exe state'=1;0;0 'namgr.EXE state'=1;0;0 'namgr.EXE state'=1;0;0 'namgr.EXE state'=1;0;0 'nAdminp.EXE state'=0;0;0 'nReplica.EXE state'=0;0;0 'count'=6;0;0
Please note that in the latter case (false-negative), the processes are in fact running, but their names are reported with different case:
`C:\Program Files\NSClient++>tasklist /FI "IMAGENAME eq nAdminp.EXE"
Image Name                     PID Session Name        Session#    Mem Usage
========================= ======== ================ =========== ============
nadminp.exe                  10720 Services                   0     85'848 K
C:\Program Files\NSClient++>tasklist /FI "IMAGENAME eq nReplica.EXE"
Image Name                     PID Session Name        Session#    Mem Usage
========================= ======== ================ =========== ============
nreplica.exe                  9324 Services                   0    143'052 K`
So check_process falsely claims that the process is not running when it actually is. I believe check_process should match process names case-insensitively.
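To make the requested behavior concrete, here is a minimal Python sketch of case-insensitive name matching. It is not NSClient++ code; the function name `find_stopped` is hypothetical, and the two lists simply reuse the process names from the output above.

```python
def find_stopped(expected, running):
    """Return the expected names that have no case-insensitive match in running."""
    # casefold() is a stricter form of lower(); for ASCII image names
    # the two give the same result.
    running_folded = {name.casefold() for name in running}
    return [name for name in expected if name.casefold() not in running_folded]


# Names as configured in the Nagios check.
expected = ["nserver.exe", "nAdminp.EXE", "namgr.EXE", "nReplica.EXE"]
# Names as reported by the OS after a manual restart of two subtasks.
running = ["nserver.exe", "nadminp.exe", "namgr.EXE", "nreplica.exe"]

# Exact (case-sensitive) comparison flags the two restarted processes.
case_sensitive_stopped = [name for name in expected if name not in running]
print(case_sensitive_stopped)

# Case-insensitive comparison flags nothing.
print(find_stopped(expected, running))
```

With the exact comparison, `nAdminp.EXE` and `nReplica.EXE` are reported as stopped; with the folded comparison, the list of stopped processes is empty.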
I wrote this Nagios check module to temporarily work around the issue: https://github.com/tferic/nagios-check_process2
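For anyone who wants a similar stopgap, the general idea can be sketched in Python as follows. This is not the code from the linked repository, only an illustration under stated assumptions: it parses the output of `tasklist /FO CSV /NH` and compares image names case-insensitively. The helper names `parse_tasklist_csv`, `check` and `read_tasklist` are hypothetical, and the status strings merely imitate the check_process output quoted above.

```python
import csv
import io
import subprocess


def parse_tasklist_csv(text):
    """Return the lower-cased image names found in `tasklist /FO CSV /NH` output."""
    names = set()
    for row in csv.reader(io.StringIO(text)):
        if row:  # skip blank lines
            names.add(row[0].lower())
    return names


def check(expected, tasklist_text):
    """Return (exit_code, message) using Nagios conventions: 0 = OK, 2 = CRITICAL."""
    running = parse_tasklist_csv(tasklist_text)
    stopped = [name for name in expected if name.lower() not in running]
    if stopped:
        return 2, "CRITICAL: " + ", ".join(f"{name}=stopped" for name in stopped)
    return 0, "OK: all processes are ok."


def read_tasklist():
    """Run tasklist on Windows; assumes tasklist.exe is on PATH."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Sample data in the CSV shape tasklist produces, so the sketch runs anywhere.
sample = """\
"nadminp.exe","10720","Services","0","85'848 K"
"nreplica.exe","9324","Services","0","143'052 K"
"""
print(check(["nAdminp.EXE", "nReplica.EXE"], sample))
```

On a real host, `check(expected, read_tasklist())` would replace the sample data, and the script would exit with the returned code so that NRPE can relay it.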
I would also note that the case varies by OS version. For example, explorer.exe works on Windows 10 but not on Windows 7, which needs Explorer.EXE for the check to succeed.
It would be great to have an option to disable case sensitivity in check_process.
When using check_process with process=… it requires the process name to use the same case as the filter.
For example, try process=notepad.exe — this works if you opened notepad with Start → Run → notepad.exe, but if you opened a .txt file and Notepad was started via file association, Notepad's command line contains NOTEPAD.EXE (in capitals) and the process doesn't get picked up. Use process=NOTEPAD.EXE and NSClient++ now finds it. Or compare process=explorer.exe vs process=Explorer.exe to see the same effect.
(Notepad is an unlikely real-world use case, but I was using it as an easy test when investigating csrss.exe not being picked up, and got very confused as to how notepad.exe was being missed! Notepad's open verb uses "%SystemRoot%\system32\NOTEPAD.EXE %1".)
Tested in 0.5.2.39, Server 2012 R2.