sensu-plugins / sensu-plugins-process-checks

This plugin provides native process instrumentation for monitoring and metrics collection, including: process status, uptime, thread count, and others.
http://sensu-plugins.io
MIT License

Can't verify status of systemctl services, it's just used to verify processes? #37

Closed dancb10 closed 7 years ago

dancb10 commented 7 years ago

Hello, we have custom processes that can only be viewed with systemctl status process. In the process explorer the name changes, or the process is not shown at all, because it runs under Ruby or Python. How can this plugin be used to verify the status of a service?

eheydrick commented 7 years ago

Hi @dancb10, this plugin is agnostic to the init system that started the process. What you could do is get the full process name from ps -ef and then use check-process.rb's -p flag to match the arguments on the process. Another approach would be to have a check that runs systemctl is-active theservice. That returns 0 if the service is running and non-zero if it's not.
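The systemctl is-active approach could be sketched roughly as follows. This is not part of the plugin; the script, the function names (status_from_is_active, check_service) and the choice to report every non-zero result as CRITICAL are assumptions made for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical Sensu check built on `systemctl is-active` (not part of this plugin)."""
import subprocess
import sys

# Sensu exit statuses.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def status_from_is_active(returncode):
    """Map the exit code of `systemctl is-active` to a Sensu status.

    Assumption: any non-zero exit code is reported as CRITICAL.
    """
    return OK if returncode == 0 else CRITICAL


def check_service(unit, run=subprocess.run):
    """Return (sensu_status, message) for the given unit name."""
    try:
        result = run(
            ["systemctl", "is-active", unit],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError as exc:
        # systemctl is missing or could not be executed.
        return UNKNOWN, "cannot run systemctl: {}".format(exc)
    # is-active prints the unit state, e.g. "active" or "inactive".
    state = result.stdout.strip() or "unknown"
    return status_from_is_active(result.returncode), "{} is {}".format(unit, state)


# Only runs when a unit name is passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    status, message = check_service(sys.argv[1])
    print(message)
    sys.exit(status)
```

Saved under a name of your choosing and made executable, it would be used as the check command with the unit name as its only argument.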

dancb10 commented 7 years ago

Yes, but iptables, for example, runs inside the kernel as a networking module and doesn't show up in the process list.

eheydrick commented 7 years ago

Right, so for that case you'd want to check the loaded kernel modules (e.g. modprobe -q module).
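One way to check for a loaded kernel module is sketched below. It reads /proc/modules instead of calling modprobe, since modprobe with a module name attempts to load that module rather than only reporting on it. The function names are hypothetical, and the sketch assumes the module is built as a loadable module: code compiled into the kernel is not listed in /proc/modules. For iptables the relevant module name is typically ip_tables.

```python
#!/usr/bin/env python3
"""Hypothetical Sensu check: is a kernel module listed in /proc/modules?"""
import sys

OK, CRITICAL = 0, 2


def loaded_modules(proc_modules_text):
    """Return the set of module names from the text of /proc/modules.

    Each non-empty line starts with the module name, followed by
    size, reference count, dependants, state and address.
    """
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def check_module(name, proc_modules_text):
    """Return (sensu_status, message) for the given module name."""
    # The kernel lists module names with underscores, so normalise dashes.
    wanted = name.replace("-", "_")
    if wanted in loaded_modules(proc_modules_text):
        return OK, "module {} is loaded".format(wanted)
    return CRITICAL, "module {} is not loaded".format(wanted)


# Only runs when a module name is passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open("/proc/modules") as handle:
        status, message = check_module(sys.argv[1], handle.read())
    print(message)
    sys.exit(status)
```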

dancb10 commented 7 years ago

I would simply like to have the result of systemctl status iptables show up in Sensu. I managed to achieve that by taking a Nagios plugin and using it directly in Sensu.

GhostLyrics commented 7 years ago

Pretty sure you could actually just run systemctl status iptables as a check, no? Without a wrapper it seems to return 3 for "not running", which is "status unknown" in Sensu.

root@machine:~# systemctl stop puppet
root@machine:~# systemctl status puppet.service
● puppet.service - Puppet agent
   Loaded: loaded (/lib/systemd/system/puppet.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Thu 2017-02-09 16:45:14 CET; 5s ago
  Process: 798 ExecStart=/usr/bin/puppet agent $DAEMON_OPTS (code=exited, status=0/SUCCESS)
 Main PID: 1824 (code=exited, status=0/SUCCESS)

Feb 09 16:14:44 machine puppet-agent[7310]: Finished catalog run in 4.20 seconds
Feb 09 16:19:44 machine puppet-agent[9908]: Finished catalog run in 4.25 seconds
Feb 09 16:24:44 machine puppet-agent[12500]: Finished catalog run in 4.11 seconds
Feb 09 16:29:44 machine puppet-agent[15099]: Finished catalog run in 4.00 seconds
Feb 09 16:34:44 machine puppet-agent[17690]: Finished catalog run in 4.10 seconds
Feb 09 16:39:44 machine puppet-agent[20285]: Finished catalog run in 4.08 seconds
Feb 09 16:44:44 machine puppet-agent[22962]: Finished catalog run in 3.97 seconds
Feb 09 16:45:14 machine systemd[1]: Stopping Puppet agent...
Feb 09 16:45:14 machine puppet-agent[1824]: Caught TERM; exiting
Feb 09 16:45:14 machine systemd[1]: Stopped Puppet agent.
root@machine:~# echo $?
3
root@machine:~# systemctl start puppet.service
root@machine:~# echo $?
0

However, having a dedicated check for services (systemd, etc.) might be helpful.
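A thin wrapper around systemctl status could remap the exit code so that a stopped service is reported as critical instead of unknown, while still passing the status output through to Sensu. The sketch below is only an illustration: the function names and the mapping itself (0 to OK, 3 to CRITICAL, anything else to UNKNOWN) are assumptions, not behaviour of this plugin.

```python
#!/usr/bin/env python3
"""Hypothetical wrapper that remaps `systemctl status` exit codes for Sensu."""
import subprocess
import sys

SENSU_OK, SENSU_CRITICAL, SENSU_UNKNOWN = 0, 2, 3


def sensu_status(systemctl_code):
    """Translate a `systemctl status` exit code into a Sensu status.

    Assumed mapping: 0 (running) is OK, 3 (not running, as in the
    transcript above) is CRITICAL, everything else is UNKNOWN.
    """
    if systemctl_code == 0:
        return SENSU_OK
    if systemctl_code == 3:
        return SENSU_CRITICAL
    return SENSU_UNKNOWN


def run_status_check(unit, run=subprocess.run):
    """Run `systemctl status <unit>`; return (sensu_status, captured_output)."""
    result = run(
        ["systemctl", "status", unit],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return sensu_status(result.returncode), result.stdout


# Only runs when a unit name is passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    status, output = run_status_check(sys.argv[1])
    # Pass the systemctl output through so it shows up in the check result.
    print(output, end="")
    sys.exit(status)
```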

eheydrick commented 7 years ago

Glad you were able to solve it. It would be nice to have a check for systemd and other init systems.