MetPX / sarracenia

https://MetPX.github.io/sarracenia
GNU General Public License v2.0

systemd integration limited: systemctl status does not work. #1318

Open petersilva opened 13 hours ago

petersilva commented 13 hours ago

On older operating systems (Red Hat 8 and Ubuntu 18.04, at least), the systemd versions do not report status correctly for sr3. There is some incompatibility, e.g.:

sarra@host:~$ systemctl status metpx-sr3
● metpx-sr3.service - Sarracenia File Copy Service
   Loaded: loaded (/lib/systemd/system/metpx-sr3.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Tue 2024-12-03 19:03:55 UTC; 21min ago
  Process: 36353 ExecStop=/usr/bin/sr3 stop (code=exited, status=0/SUCCESS)
  Process: 2889 ExecStart=/usr/bin/sr3 start (code=exited, status=0/SUCCESS)

Dec 03 12:24:43 host systemd[1]: Starting Sarracenia File Copy Service...
Dec 03 12:24:58 host sr3[2889]: starting:.........................( 248 ) Done
Dec 03 12:24:58 host systemd[1]: Started Sarracenia File Copy Service.
Dec 03 19:03:55 host sr3[36353]: Stopping: no procs running...already stopped
sarra@host:~$

So one must use sr3 status, rather than systemctl status, to check properly.

Also, monitoring is done by having sr3 sanity run periodically from cron (e.g. every 5 minutes), which will restart crashed processes and kill strays, should any be present.
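
For illustration, the periodic sanity check could be wired into cron roughly as follows. This is only a sketch: the /usr/bin/sr3 path is taken from the ExecStart line in the status output above, the 5-minute schedule is the example interval mentioned here, and the user whose crontab receives the entry is an assumption that depends on the deployment.

```shell
# Path assumed from the ExecStart line shown by systemctl status.
SR3=/usr/bin/sr3

# Crontab entry: run "sr3 sanity" every 5 minutes.
CRON_LINE="*/5 * * * * $SR3 sanity"

# Print the entry so it can be reviewed before installing.
printf '%s\n' "$CRON_LINE"

# To append it to the current user's crontab (commented out on purpose):
# ( crontab -l 2>/dev/null; printf '%s\n' "$CRON_LINE" ) | crontab -
```

The install step is left commented out so the entry can be checked first; it should go into the crontab of the same user that runs sr3.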

There is no known approach to fix this at the moment, but newer OS versions seem to understand sr3 better (e.g. Ubuntu 24.04), so this problem may go away with time.

petersilva commented 13 hours ago

The fundamental issue is that while we can use sr3 start and sr3 stop by just plugging them into systemd unit files, there is no equivalent place to put sr3 status, and sr3, being a thing that starts up many, many child processes, is not a "normal" service. Judging by Google searches, this is a known problem with earlier versions of systemd.