joonty / systemd_mon

Monitor for systemd to alert failed services
MIT License

oneshot services incorrectly reported as `still failed` #13

Open ftc2 opened 6 years ago

ftc2 commented 6 years ago

i see that oneshot services are supported: https://github.com/joonty/systemd_mon/pull/3 but it's not working well for me.

for example, i have the certbot.service unit added to the config. this is a oneshot service running on a timer to manage ssl cert renewal.

$ sudo systemctl status certbot.service
● certbot.service - Certbot
   Loaded: loaded (/lib/systemd/system/certbot.service; static; vendor preset: enabled)
   Active: inactive (dead) since Mon 2018-02-12 12:45:13 CST; 50s ago
     Docs: file:///usr/share/doc/python-certbot-doc/html/index.html
           https://letsencrypt.readthedocs.io/en/latest/
  Process: 17261 ExecStart=/usr/bin/certbot -q renew (code=exited, status=0/SUCCESS)
 Main PID: 17261 (code=exited, status=0/SUCCESS)

Feb 12 12:45:12 example.com systemd[1]: Starting Certbot...
Feb 12 12:45:13 example.com systemd[1]: Started Certbot.
$ sudo systemctl --failed
0 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.

this service is running fine: it exited without error (status=0/SUCCESS), and `systemctl --failed` lists no units.

but using slack-notifier, i'm getting this every time certbot.service runs:

Alert: systemd unit certbot.service on example.com still failed
Hostname: example.com
Unit: certbot.service
Active: inactive
Status: dead
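
to illustrate the mismatch, here is a minimal Python sketch of how a monitor could tell a cleanly finished oneshot unit apart from a failed one. this is not systemd_mon's actual logic (systemd_mon is written in Ruby and i haven't traced its code); `UnitState`, `is_failure`, and `describe` are hypothetical names. the field values mirror systemd's `ActiveState`, `SubState`, and `Result` unit properties, and the sketch assumes the monitor has access to `Result`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitState:
    """Hypothetical snapshot of a unit, using systemd's property values."""
    active: str               # ActiveState, e.g. "active", "inactive", "failed"
    sub: str                  # SubState, e.g. "running", "dead", "exited", "failed"
    result: str = "success"   # Result; "success" means the last run ended cleanly


def is_failure(state: UnitState) -> bool:
    """Treat a unit as failed only if systemd itself says so.

    A oneshot unit that finished cleanly ends up inactive/dead with
    Result=success, so inactive/dead alone is not treated as a failure.
    """
    if state.active == "failed":
        return True
    return state.result != "success"


def describe(previous: UnitState, current: UnitState) -> str:
    """Label a state change the way an alert message might (assumed wording)."""
    if is_failure(current):
        return "still failed" if is_failure(previous) else "failed"
    if is_failure(previous):
        return "recovered"
    return "ok"


if __name__ == "__main__":
    # The certbot.service run from this report: started, then exited with status 0.
    starting = UnitState(active="activating", sub="start")
    finished = UnitState(active="inactive", sub="dead", result="success")
    print(describe(starting, finished))
```

with the states from the report above, the transition from `activating` to `inactive`/`dead` with `Result=success` is labelled `ok` rather than `still failed`. a run that really failed (`ActiveState=failed`, or `Result=exit-code`) would still be flagged.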