mickem / nscp

NSClient++
http://nsclient.org
GNU General Public License v2.0

check_process is unexpectedly case-sensitive #587

Open Daniel-Beardsmore opened 6 years ago

Daniel-Beardsmore commented 6 years ago

When using check_process with process=… it requires the process name to use the same case as the filter.

For example, try process=notepad.exe — this works if you opened notepad with Start → Run → notepad.exe, but if you opened a .txt file and Notepad was started via file association, Notepad's command line contains NOTEPAD.EXE (in capitals) and the process doesn't get picked up. Use process=NOTEPAD.EXE and NSClient++ now finds it. Or compare process=explorer.exe vs process=Explorer.exe to see the same effect.

(Notepad is an unlikely real-world use case, but I was using it as an easy test when investigating csrss.exe not being picked up, and got very confused as to how notepad.exe was being missed! Notepad's open verb uses "%SystemRoot%\system32\NOTEPAD.EXE %1".)

Tested in 0.5.2.39, Server 2012 R2.

tferic commented 5 years ago

We are suffering from the same issue, and it's actually a real problem for us.
When our application starts up as a Windows Service, the processes will start with a specific case in the process name. We have defined the process names with these exact cases in the Nagios server check.
If an administrator decides to restart a subtask manually, the same process will start with a different case, falsely triggering an alert.
In other words, the process is running, and check_process falsely detects that it is not running.

Example for nagios server-side check (OK):
check_nrpe -H '$HOSTADDRESS$' --command check_process --args process=nserver.exe process=nAdminp.EXE process=namgr.EXE process=nReplica.EXE
OK: all processes are ok.|'nserver.exe state'=1;0;0 'nReplica.EXE state'=1;0;0 'nAdminp.EXE state'=1;0;0 'namgr.EXE state'=1;0;0 'namgr.EXE state'=1;0;0 'count'=5;0;0

Example for nagios server-side check (false-negative):
check_nrpe -H '$HOSTADDRESS$' --command check_process --args process=nserver.exe process=nAdminp.EXE process=namgr.EXE process=nReplica.EXE
CRITICAL: nAdminp.EXE=stopped, nReplica.EXE=stopped|'nserver.exe state'=1;0;0 'namgr.EXE state'=1;0;0 'namgr.EXE state'=1;0;0 'namgr.EXE state'=1;0;0 'nAdminp.EXE state'=0;0;0 'nReplica.EXE state'=0;0;0 'count'=6;0;0

Please note that in the latter case (false-negative), the processes are indeed running, but their names are in a different case:
`C:\Program Files\NSClient++>tasklist /FI "IMAGENAME eq nAdminp.EXE"

Image Name                     PID Session Name        Session#    Mem Usage
========================= ======== ================ =========== ============
nadminp.exe                  10720 Services                   0     85'848 K

C:\Program Files\NSClient++>tasklist /FI "IMAGENAME eq nReplica.EXE"

Image Name                     PID Session Name        Session#    Mem Usage
========================= ======== ================ =========== ============
nreplica.exe                  9324 Services                   0    143'052 K`

So check_process falsely claims that the process is not running when it actually is. I believe that check_process should ignore case in process names.
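
To illustrate the behaviour being asked for, here is a minimal sketch in Python. It is not NSClient++'s code: `find_stopped` is a hypothetical helper, and the `running` list is assumed from the tasklist output above (only the two lowercased names are confirmed there). It compares the configured names against the running image names once exactly and once with case folded.

```python
def find_stopped(expected, running):
    """Return the expected names that have no case-insensitive match in running."""
    # Fold case once for the running names so each lookup is a set membership test.
    running_folded = {name.casefold() for name in running}
    return [name for name in expected if name.casefold() not in running_folded]


# Process names as configured in the check above.
expected = ["nserver.exe", "nAdminp.EXE", "namgr.EXE", "nReplica.EXE"]

# Image names after a manual restart (assumed; nadminp.exe and nreplica.exe
# are taken from the tasklist output, the other two are illustrative).
running = ["nserver.exe", "nadminp.exe", "namgr.EXE", "nreplica.exe"]

# Exact comparison, which is what the reported behaviour looks like.
exact_missing = [name for name in expected if name not in running]

# Case-insensitive comparison, which is what this issue requests.
folded_missing = find_stopped(expected, running)

print("exact match reports stopped:", exact_missing)
print("case-insensitive match reports stopped:", folded_missing)
```

With these inputs the exact comparison flags nAdminp.EXE and nReplica.EXE as stopped, matching the CRITICAL result above, while the case-folded comparison flags nothing.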

tferic commented 5 years ago

I wrote this nagios check module to temporarily work around the issue: https://github.com/tferic/nagios-check_process2

lunnonco commented 4 years ago

I would also note that the case varies by OS version. For example, explorer.exe works in Windows 10 but not in Windows 7, which needs Explorer.EXE for the check to succeed.

hecko commented 4 years ago

It would be absolutely great to have an option to disable case sensitivity in check_process.