troglobit / finit

Fast init for Linux. Cookies included
https://troglobit.com/projects/finit/
MIT License
621 stars, 61 forks

Document initctl, including rc.local and runlevel S, limitations #254

Closed hongkongkiwi closed 1 year ago

hongkongkiwi commented 2 years ago

There are two parts to my question.

First, I have a couple of commands that MUST run as the very last thing (they are notifications that boot is complete). They are fast but still take some time (I’ll measure the exact time later), and at this step I’m getting a timeout error about communicating with the Finit socket.

Secondly, could we add a feature for users to set a custom timeout for /etc/rc.local and runparts? E.g. rclocal_timeout_ms=3000 and runparts_timeout_ms=3000. This would prevent a runaway script from holding up boot. I guess the rc.local timeout could also be a compile-time directive. It would make sense for the runparts one to be in the config….. runparts

[]?
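Until something like that exists, one possible workaround is to bound the slow step inside the script itself. A minimal sketch, assuming a `timeout` utility (GNU coreutils or BusyBox) is installed on the target; the helper name `run_bounded` and the 3 second limit are made up for illustration and are not Finit options:

```shell
#!/bin/sh
# Hypothetical helper, not part of Finit: run a command and give up
# after LIMIT seconds. Relies on timeout(1) being available.
run_bounded() {
    limit="$1"; shift
    rc=0
    timeout "$limit" "$@" || rc=$?
    # GNU timeout reports an expired limit with exit status 124;
    # other implementations may use a different status.
    if [ "$rc" -eq 124 ]; then
        echo "rc.local: '$*' exceeded ${limit}s, giving up" >&2
    fi
    return "$rc"
}

# Example use inside /etc/rc.local:
#   run_bounded 3 send-mqtt-cmd "boot/complete"
```

This only limits how long one command may block boot; it does not change the fact that rc.local itself runs in the foreground.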

troglobit commented 2 years ago

This has been a recurring question actually. The runparts and rc.local are currently called in the last stages of bootstrap, they run in the foreground and block everything basically. Unfortunate but part of the legacy code base that has not been changed in a good long time. So not even signaling Finit with kill(1) would work here.

This stuff definitely needs to be updated and moved to a dedicated state in the big state machine, which would also make it possible to add timeout handling etc. As you can imagine, it's quite a bit of work, so nothing that goes into the 4.3 release, unfortunately.

hongkongkiwi commented 2 years ago

I’m sure there is a bug here, as my task is very fast (less than 1s), so it shouldn’t time out here. Let me get you the exact timing when I get home.

For future stuff, here’s a thought: what if we just had a flag on the task, such as: stage:last

That way any task or run command flagged with it runs last. Then the user can use runparts or just invoke rc.local themselves. And it means it doesn’t even need to be a special config flag.

That also leaves room for a stage:first in the future, which could make some plugins potentially obsolete, such as dbus (since you can run it yourself as a first stage task).

The stage:first would also mean I could easily run something like mdevd first without having to specifically set a running condition for every subsequent service.

hongkongkiwi commented 2 years ago

[ ⋯ ] Calling /etc/rc.localinitctl: Timed out waiting for reply from Finit.

Maybe this is just a warning? But it shows in my normal, non-debug log. Perhaps it's not related to rc.local and just shows up at that time. My processor is quite slow, around 800 MHz, so perhaps some calls are just a bit slow to return?

Running rc.local manually shows it only takes 750 ms, so that can't be what is timing out:

time /etc/rc.local
real    0m 0.75s
user    0m 0.07s
sys     0m 0.18s

Switching on the debug log, I can see that error message showing amongst these lines:

finit[1]: cond_set_path():/run/finit/cond/pid/mdevd <= 2
finit[1]: pidfile_handle_dir():path: /run/mosquitto
finit[1]: iwatch_add():adding new watcher for path /run/mosquitto
initctl: Timed out waiting for reply from Finit.
finit[1]: pidfile_handle_dir():path: /run/gpsd

Also I've seen it here:

[ ⋯ ] Starting Media daemonfinit[1]: service_start():Starting run-media-daemon
initctl: Timed out waiting for reply from Finit.
finit[1]: Starting media[355]
[ OK ]

And here:

finit[1]: service_step():network-eth0-ensure-mac( 0): ready enabled/clean cond:off
initctl: Timed out waiting for reply from Finit.
finit[1]: service_step(): ifup( 0): ready enabled/clean cond:off

Funnily enough, only the first one, around the time rc.local is called, shows up in the non-debug log.

troglobit commented 2 years ago

So sorry, I should have been much clearer in my first reply; it is not possible to call Finit via signals or initctl in any runparts or /etc/rc.local script. This is because Finit is single-threaded and calls these scripts in a blocking fashion (like run ...) at the end of runlevel S, at which point the event loop has not yet been started.

The event loop is the whole thing which Finit is built around, except for runlevel S, which remains a slow procession through a lot of set up in main(), with a few hooks, and blocking call outs to external scripts. Admittedly, this should be better described in the docs.

To fix this limitation a big refactor needs to be done (briefly described above). Having a stage:last and stage:first option to task could be useful in that refactor, definitely.

Not all initctl commands are prohibited, however. Supported commands are:

So you could set a usr/ condition in /etc/rc.local and have a service/task in runlevel 2 depend on it to execute.
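A rough sketch of that idea, for illustration only. It assumes `initctl cond set NAME` asserts the user condition `<usr/NAME>`; the condition name `boot-complete`, the helper `signal_boot_complete`, the `.conf` file name, and the task line in the comment are all made up for this example and should be checked against the Finit docs for your version:

```shell
#!/bin/sh
# /etc/rc.local (sketch): do not do the real work here. Just assert a user
# condition and let a runlevel 2 task, started once the event loop is
# running, react to it.
#
# Assumed matching stanza, e.g. in /etc/finit.d/boot-complete.conf:
#   task [2] <usr/boot-complete> send-mqtt-cmd boot/complete -- Boot complete notification
#
# INITCTL is overridable only so the sketch can be tried without Finit.
: "${INITCTL:=initctl}"

signal_boot_complete() {
    # Assumption: "cond set" is among the initctl commands usable in runlevel S
    "$INITCTL" cond set boot-complete
}

# When installed as /etc/rc.local, end the script with:
#   signal_boot_complete
```

The point of the split is that the notification command then runs as an ordinary task, rather than inside the blocking rc.local call at the end of runlevel S.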

hongkongkiwi commented 2 years ago

Maybe we have crossed wires here (although what you wrote above is actually useful to know). I may have confused the issue by adding a feature suggestion; this is actually a bug in my case that I can't figure out.

I'm not actually doing anything in my rc.local except running my own command, but it appears to be timing out. My command doesn't use any initctl command.

Here is the contents:

#!/bin/sh
mkdir -p "/run/sys"
touch "/run/sys/boot_complete"
send-mqtt-cmd "boot/complete"

It takes less than 1s, so it should not be giving me a timeout message.

time /etc/rc.local
real    0m 0.52s
user    0m 0.06s
sys     0m 0.10s

The bootup log looks like this:

[ OK ] Starting EarlyOOM daemon
[ OK ] Device Coldplug one-shot
[ ⋯  ] Calling /etc/rc.localinitctl: Timed out waiting for reply from Finit.
[ OK ]
[ OK ] Starting LTE QMI Proxy

Every boot shows the same message at the same place.

troglobit commented 2 years ago

Aha, I completely misunderstood. Thank you! I'll have a detailed look later tonight and write a dedicated test for rc.local, to see if I can reproduce it.

Curious. Maybe there's something else, started earlier, that calls initctl instead, maybe your "Device Coldplug one-shot"? Iirc you were doing a lot of task ... calls to ensure parallelism, which may be clouding the real culprit.

Side note: I briefly looked into how mdevd works this past weekend. One of the things that confused me was how to call mdevd-coldplug when mdevd doesn't create a pid file to tell the world it's ready to party.

troglobit commented 2 years ago

No issue here with initctl: Timed out ... when running something similar from rc.local. Only thing I can see is that something else is running in parallel in the background, trying to contact Finit in runlevel S using initctl, which will not work (see comment from April 30).
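One way to hunt for such a background caller is simply to search the boot scripts and configuration for the word initctl. A small sketch; the helper name `find_initctl_callers` is made up, and the paths in the usage comment are only the usual Finit locations, so adjust them to your system:

```shell
#!/bin/sh
# List files under the given paths that mention "initctl" as a whole word.
# These are only candidates; each hit still has to be checked by hand to
# see whether it actually runs during runlevel S.
find_initctl_callers() {
    grep -rlw -- initctl "$@" 2>/dev/null || true
}

# Example use:
#   find_initctl_callers /etc/finit.conf /etc/finit.d /etc/rc.local
```

Scripts started indirectly (for example from a task that calls another script) will only show up if their directories are included in the search.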

hongkongkiwi commented 2 years ago

Thanks, I think I’ve fixed my issue based on your feedback. Some of the info here could be useful in the official docs, especially about which initctl commands are allowed at which stage of the boot flow…. :) and some of the limitations of rc.local.

I’ll close the issue so it doesn’t clog up the issues.

troglobit commented 2 years ago

Yeah you're right. I'll see to it that we update the docs, thanks!