Closed benarmston closed 1 year ago
Adding monit support for emma and mia is more difficult than for the other rails servers / daemons. Both mia and emma have multiple instances of their servers and we wish to monitor and potentially restart them individually. For instance, if the emma instance on port 9900 is consuming too much CPU or RAM, we want to restart only that instance of emma.
However, our current systemd unit files are auto-generated from the /etc/init.d/ files and don't support starting and restarting individual instances. We could have monit restart the services directly instead of via systemctl; however, doing that causes systemd to "lose track" of the services. For instance, in the output below, monit has restarted the emma 9900 instance. We can see from the output of pgrep that it is running, but the output of systemctl status emma shows that systemd has "lost track" of it: the CGroup listing contains only the 9901 and 9902 instances.
root@command1:/etc/monit/conf.d# pgrep -a thin
2657572 thin server (127.0.0.1:9901)
2657583 thin server (127.0.0.1:9902)
2657962 thin server (127.0.0.1:9900)
root@command1:/etc/monit/conf.d#
root@command1:/etc/monit/conf.d#
root@command1:/etc/monit/conf.d# systemctl status emma
● emma.service - LSB: EMMA - the enhanced monitoring and management architecture
Loaded: loaded (/etc/init.d/emma; generated)
Active: active (running) since Fri 2022-10-14 16:52:25 UTC; 2 days ago
Docs: man:systemd-sysv-generator(8)
Process: 2657546 ExecStart=/etc/init.d/emma start (code=exited, status=0/SUCCESS)
Tasks: 67 (limit: 3530)
Memory: 342.8M
CPU: 25min 31.166s
CGroup: /system.slice/emma.service
├─2657572 "thin server (127.0.0.1:9901)" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
└─2657583 "thin server (127.0.0.1:9902)" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
Oct 14 16:52:22 command1 systemd[1]: Starting LSB: EMMA - the enhanced monitoring and management architecture...
Oct 14 16:52:22 command1 emma[2657546]: * Starting Phoenix EMMA
Oct 14 16:52:23 command1 emma[2657551]: Starting server on 127.0.0.1:9900 ...
Oct 14 16:52:23 command1 emma[2657551]: Starting server on 127.0.0.1:9901 ...
Oct 14 16:52:24 command1 emma[2657551]: Starting server on 127.0.0.1:9902 ...
Oct 14 16:52:25 command1 emma[2657546]: ...done.
Oct 14 16:52:25 command1 systemd[1]: Started LSB: EMMA - the enhanced monitoring and management architecture.
root@command1:/etc/monit/conf.d#
I'm not sure what the consequences of this are. Running systemctl restart emma may or may not restart the correct services reliably. This may depend on the exact implementation of /etc/init.d/emma. I am sure that this is the wrong way to use systemd unit files.
A potential solution would be to have an individual unit file for each emma instance and either make them PartOf an "emma group" service or WantedBy an "emma group" target. This solution would also work for mia and is probably desirable for mongrel_rails (aka the phoenix modules) too.
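As a rough illustration of the per-instance idea, the script below stages a group target and a template unit in a local directory for review. It is a minimal sketch, not the project's actual configuration: the unit names emma.target and emma@.service, the /opt/emma working directory, the thin path and the production environment are all assumptions, and the instance name is assumed to be the port number.

```shell
#!/bin/sh
# Sketch: stage a hypothetical emma.target plus an emma@.service template
# (instance name = port) in a local directory for review before installing.
set -eu

OUT_DIR="${OUT_DIR:-./emma-units}"
mkdir -p "$OUT_DIR"

# Group target. Instances attach to it through WantedBy= when enabled.
cat > "$OUT_DIR/emma.target" <<'EOF'
[Unit]
Description=EMMA thin servers (all instances)

[Install]
WantedBy=multi-user.target
EOF

# Template unit: %i expands to the instance name, used here as the port.
# The working directory and the thin path are assumptions.
cat > "$OUT_DIR/emma@.service" <<'EOF'
[Unit]
Description=EMMA thin server on 127.0.0.1:%i
# Stopping or restarting emma.target is propagated to every instance.
PartOf=emma.target

[Service]
Type=simple
WorkingDirectory=/opt/emma
ExecStart=/usr/local/bin/thin start --address 127.0.0.1 --port %i --environment production
Restart=on-failure

[Install]
WantedBy=emma.target
EOF

echo "staged units in $OUT_DIR"
```

With units like these installed and each instance enabled (for example systemctl enable emma@9900.service), systemctl restart emma@9900.service would restart a single instance while systemctl restart emma.target would restart all of them, and systemd would keep tracking each process in its own cgroup.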
I think implementing that solution is outside the scope of this initial pass at getting monit running. The services most likely to fail are being monitored.
Add monit to monitor our daemons. The monit configurations have largely been taken from concertim without change. We monitor that certain processes exist, that models are published over DRb connections, and that log files are being written to. Some daemons have also been given resource constraints, which may or may not be set to appropriate values.
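For readers unfamiliar with monit, a process check with resource constraints might look roughly like the file staged by the script below. This is a sketch under stated assumptions rather than the configuration added here: the check name, the thresholds, and the per-instance unit emma@9900.service are hypothetical (no such unit exists with the current auto-generated unit files), and the match pattern relies on the process title shown by pgrep.

```shell
#!/bin/sh
# Sketch: stage a hypothetical monit check for a single emma instance.
set -eu

OUT_DIR="${OUT_DIR:-./monit-conf.d}"
PORT="${PORT:-9900}"
mkdir -p "$OUT_DIR"

# Unquoted heredoc so that $PORT is expanded into the generated file.
cat > "$OUT_DIR/emma-$PORT" <<EOF
# Match on the process title; "." stands in for "(" to avoid escaping.
check process emma-$PORT matching "thin server .127.0.0.1:$PORT"
  # Hypothetical per-instance unit; assumed, not part of the current setup.
  start program = "/bin/systemctl start emma@$PORT.service"
  stop program  = "/bin/systemctl stop emma@$PORT.service"
  # Illustrative thresholds only.
  if cpu > 80% for 5 cycles then restart
  if totalmem > 500 MB for 5 cycles then restart
EOF

echo "staged monit check in $OUT_DIR/emma-$PORT"
```

After copying such a file into /etc/monit/conf.d, monit -t can be used to check the control file syntax before reloading monit.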
Utility scripts to help with the monitoring and clobbering of processes have also been added.
monit is not started until MIA is fully configured. If we wish to build vanilla appliances ahead of their configuration, this may need rethinking.
TODO: