naemon / naemon-core

Networks, Applications and Event Monitor
http://www.naemon.io/
GNU General Public License v2.0
151 stars 63 forks source link

launch command worker earlier #470

Open sni opened 2 months ago

sni commented 2 months ago

since the command worker forks from the main naemon process, it inherits all open files like ex.: pidfile, logfiles, etc... It will keep those references open, even if the main process rotates and reopens those files.

This patch closes query handler and pid file references after starting the command worker and also moves starting the command worker before initializing the neb modules, so it won't inherit open logfiles from neb modules.

references:

nook24 commented 2 months ago

The only issue I see with this is that the command file worker completely loses the ability to communicate. Currently, the command file worker is able to write to the naemon.log until this line: https://github.com/naemon/naemon-core/blob/22f0fb66fc0d7370ebbc5d2e6bcc66d8b9311392/src/naemon/commands.c#L392

After that, it can only write to stdout to log any errors. Not optimal, but better than nothing, I guess.

Due to the change, the command file worker can't even write to stdout anymore, so any error messages are completely gone. The static int command_file_worker method makes some calls to nm_log, which now get completely blackholed.

This change also introduces a sort of "breaking change". Currently, whenever Naemon logs the message Successfully launched command file worker with pid 12135, we know that Naemon has done all initial processing and is up and running.

I personal only use this when working or debugging Naemon directly, but it turns out that some scripts rely on this. For example, this one which is part of our repository: https://github.com/naemon/naemon-core/blob/22f0fb66fc0d7370ebbc5d2e6bcc66d8b9311392/features/steps/naemon.py#L138-L155

I think we should add a new message to the log that clearly indicates that Naemon is ready. Something like

Naemon initialization complete. Monitoring initiated.
sni commented 2 months ago

i am not to happy yet either, maybe we keep the order as it was but send a USR1 signal to the command worker so it can rotate/reopen the logfile as well.

sni commented 1 month ago

btw, the message Successfully launched command file worker with is logged from the main naemon process and should still be printed.

nook24 commented 1 month ago

maybe we keep the order as it was but send a USR1 signal to the command worker so it can rotate/reopen the logfile as well.

Currently the command file worker can only write to stdout. I guess this is due to it's a fork() and could otherwise break the log file if multiple processes wring to the log at the same time.

I think this is also the reason why the nm_core_worker is using printf instead of nm_log

btw, the message Successfully launched command file worker with is logged from the main naemon process and should still be printed.

Yes, this message gets still logged as before, but due due it is now before the broker modules get loaded, it does not indicate that Naemon is ready anymore. To be fair, this message should never indicate that Naemon is ready so I think we should not be too concerned about this. But I would still recommend to add a new log message to make it possible to detect if Naemon is ready.