BytemarkHosting / symbiosis

A hosting environment that works with you, not against you.
GNU General Public License v2.0

Symbiosis monit failure emails in Stretch #129

Closed by andrewladlow 5 years ago

andrewladlow commented 6 years ago

The symbiosis-monit script returns an exit code of 75 in several cases: if it has been disabled, if the machine is still booting, if the load average is higher than the number of CPU cores, or if dpkg is running:

root@jessie:~# grep -c processor /proc/cpuinfo     
1
root@jessie:~# cat /proc/loadavg
4.00 4.00 3.87 5/130 5696
root@jessie:~# /usr/sbin/symbiosis-monit -t email /etc/symbiosis/monit.d -a
root@jessie:~# echo $?
75
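
For reference, the load condition shown above can be sketched roughly as follows. This is only an illustration of the comparison, not the code symbiosis-monit actually uses; the function name `load_exceeds_cores` is hypothetical. Exit code 75 corresponds to `EX_TEMPFAIL` in `sysexits.h`.

```shell
#!/bin/sh
# Hypothetical helper, not taken from symbiosis-monit: succeeds (returns 0)
# when the given load average is strictly greater than the CPU core count.
load_exceeds_cores() {
    load="$1"   # e.g. the first field of /proc/loadavg
    cores="$2"  # e.g. the output of: grep -c processor /proc/cpuinfo
    # awk does the floating-point comparison that plain sh cannot.
    awk -v l="$load" -v c="$cores" 'BEGIN { exit !(l > c) }'
}

# Possible usage on a Linux host (assumed, not from the original script):
#   load=$(cut -d' ' -f1 /proc/loadavg)
#   cores=$(grep -c processor /proc/cpuinfo)
#   load_exceeds_cores "$load" "$cores" && exit 75   # EX_TEMPFAIL
```

With the values from the transcript (load 4.00, one core) the check succeeds, which matches the exit code of 75 seen there.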

In Symbiosis Stretch, this will be printed to syslog:

upgrade2 systemd[1]: symbiosis-monit.service: Main process exited, code=exited, status=75/n/a
upgrade2 systemd[1]: symbiosis-monit.service: Unit entered failed state.
upgrade2 systemd[1]: symbiosis-monit.service: Failed with result 'exit-code'.

And also as an email:

Subject: Symbiosis monitor detected service failure
root : TTY=unknown ; PWD=/ ; USER=nobody ; COMMAND=/usr/bin/tee /var/tmp/symbiosis-monit.cursor
pam_unix(sudo:session): session opened for user nobody by (uid=0)
Started Symbiosis monitor.
symbiosis-monit.service: Main process exited, code=exited, status=75/n/a
symbiosis-monit.service: Unit entered failed state.
symbiosis-monit.service: Triggering OnFailure= dependencies.
symbiosis-monit.service: Failed with result 'exit-code'.

On busy servers the load will frequently rise above the number of CPU cores, generating a large number of emails. Logging to syslog is useful if there are problems with the symbiosis-monit service itself, but we should probably send a failure email only when an individual test (e.g. apache2) has failed, rather than whenever the service as a whole exits with a non-zero status.
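
One possible way to stop systemd treating a deferred run as a failure is to map exit status 75 to success before systemd sees it. The wrapper below is a minimal sketch under that assumption; `run_tolerating_tempfail` is a hypothetical name, and this is not necessarily how the problem was eventually fixed. systemd's `SuccessExitStatus=` directive in the unit file is another route to the same effect.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of Symbiosis: run a command and treat
# exit status 75 as success, passing every other status through unchanged.
run_tolerating_tempfail() {
    status=0
    "$@" || status=$?
    if [ "$status" -eq 75 ]; then
        # Deferred run (disabled, still booting, high load, dpkg running):
        # report success so that OnFailure= dependencies are not triggered.
        return 0
    fi
    return "$status"
}

# Possible usage (assumed):
#   run_tolerating_tempfail /usr/sbin/symbiosis-monit -t email /etc/symbiosis/monit.d -a
```

Genuine failures still propagate, so the unit would continue to enter the failed state and log to syslog for any status other than 75.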

andrewladlow commented 5 years ago

Fixed in https://github.com/BytemarkHosting/symbiosis/commit/f911398ef09d6a384321a3392f4e09890ceca0dc