NagiosEnterprises / ncpa

Nagios Cross-Platform Agent

Adding ExecStopPost to service definition to remove pidfile #1077

Closed ne-bbahn closed 6 months ago

ne-bbahn commented 6 months ago

Sometimes NCPA did not delete its pid file when a server shut down or restarted. On the next start, NCPA then assumed another instance was already running and failed to start.
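To illustrate the failure mode, here is a minimal sketch of a stale pid-file check. It is not NCPA's actual code: the function name `pidfile_is_stale` is hypothetical, and the check is POSIX-only because it relies on signal 0.

```python
import os


def pidfile_is_stale(path):
    """Return True if the pid file exists but names no live process.

    Hypothetical helper for illustration; POSIX-only (uses signal 0).
    """
    try:
        with open(path) as fh:
            pid = int(fh.read().strip())
    except FileNotFoundError:
        return False  # no pid file at all, so nothing is stale
    except ValueError:
        return True  # contents are not a number, treat as stale
    if pid <= 0:
        return True  # never signal pid 0 or negative pids (process groups)
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without killing
    except ProcessLookupError:
        return True  # no such process: the file was left behind
    except PermissionError:
        return False  # process exists but belongs to another user
    return False
```

A check like this cannot tell whether a live PID really belongs to NCPA. After a reboot the recorded PID may have been reused by an unrelated process, which is one reason a leftover pid file can block startup.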

pittagurneyi commented 6 months ago

Would you mind taking a look at my last comment here https://github.com/NagiosEnterprises/ncpa/issues/1047?

If we could offload the task of the pid file handling to systemd, we'd have resolved the problem for good.

Also, if I remember correctly, systemd by default does not start a service unit twice. So unless the user starts NCPA both manually and via systemd, only one instance will ever be running.

ne-bbahn commented 6 months ago

I think the problem here was that not all machines that install NCPA use systemd. This behavior could be modified at some point to rely on systemd if systemd exists.

pittagurneyi commented 6 months ago

There seems to be some kind of misunderstanding ...

> I think the problem here was that not all machines that install NCPA use systemd.

That is obvious. There will never be a time when all machines that install NCPA use systemd. FreeBSD, which runs on many of my systems, is a good example: it will probably never have systemd.

What I meant is: can NCPA either detect by some logic that it was started by systemd, or accept a --systemd command-line option (passed via the systemd unit file) that disables the internal pid-file handling?

That way we don't have to hack around it on the (many) systems that start NCPA via systemd, which advises against this type of pid-file handling.

Or the other way around: since init systems keep evolving and finding new ways to handle pid files automatically, add an option that enables the internal pid-file handling and make the default it's-the-operating-system's-job. That would probably be a breaking change, though, and I don't know whether every installation method on every operating system always replaces the init scripts.
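The command-line variant could look roughly like the sketch below. NCPA has no `--systemd` flag as far as this thread shows; the flag and the names `build_parser` and `should_manage_pidfile` are assumptions made up for illustration.

```python
import argparse


def build_parser():
    """Build a parser with the hypothetical --systemd flag."""
    parser = argparse.ArgumentParser(prog="ncpa")
    parser.add_argument(
        "--systemd",
        action="store_true",
        help="hypothetical flag: leave pid-file handling to systemd",
    )
    return parser


def should_manage_pidfile(argv):
    """Return True when the process should manage its own pid file."""
    args = build_parser().parse_args(argv)
    return not args.systemd
```

With something like this, the systemd unit file would append the flag to its ExecStart command, and every other init system would keep the current behaviour unchanged.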

pittagurneyi commented 6 months ago

This would probably work, but I haven't tested it:

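import os
import psutil

# Parent of the current process; a system-wide systemd runs as PID 1.
ppid = psutil.Process(os.getpid()).ppid()
ppid_name = psutil.Process(ppid).name()
# Skip NCPA's own pid-file handling only when systemd launched us directly.
disable_pidfile_handling = ppid == 1 and ppid_name == "systemd"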

Note that this relies on NCPA being started directly by systemd; otherwise systemd is not the parent of the NCPA process.
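A different technique that does not depend on the parent PID is to look at the environment. systemd sets the `INVOCATION_ID` variable for processes it starts as part of a unit (documented in systemd.exec(5)). The sketch below is untested against NCPA, and `started_by_systemd` is a hypothetical name.

```python
import os


def started_by_systemd(environ=None):
    """Guess whether systemd started this process as part of a unit.

    Hypothetical helper: checks for INVOCATION_ID, which systemd exports
    to the processes of a unit. Child processes inherit it, so anything
    spawned from inside a service will also report True.
    """
    env = os.environ if environ is None else environ
    return bool(env.get("INVOCATION_ID"))
```

The trade-off is the reverse of the ppid check: this still works when NCPA forks or is wrapped by a launcher script, but it can report a false positive for a process started by hand from within another systemd service.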