Open Crest opened 8 months ago
Very good points, thank you! Particularly the file descriptors - you might be up to something there. It's just doing a fork
and execve
and that means I'm leaking the open file descriptors. That certainly needs fixing...
I also just realize that it waits for completion of hook scripts, which in the end might cause vmstated
to hang if a hook script fails to complete for whatever reason...
It's just doing a
fork
andexecve
and that means I'm leaking the open file descriptors. Can you set close-on-exec by default on all file descriptors and only remove the clone-on-exec flag after forking on the handful file descriptors you want the child is supposed to keep afterexecve()
into the hook?
I also just realize that it waits for completion of hook scripts, which in the end might cause
vmstated
to hang if a hook script fails to complete for whatever reason...
You have to wait for the hooks to run to completion (either success or failure) anything else would be make it impossible to reliably setup or teardown state. Just starting a hook and hoping to never loose the race (condition) would be at most useable for best effort logging. IMO the correct way to handle is to have one state machine per guest waiting for the hooks, but the state machines get to run concurrently so long running hooks don't hang others.
If you want to add quick'n dirty timeouts the per process timers like those used by alarm() don't get cleared by execve()
. Assuming the hooks don't catch/ignore SIGARLM
or worse block in an unkillable state (e.g. the classic unresponsive NFS mountpoint). This should be good enough to implement a configurable timeout, but the ability request vmstated
to send signals to a specific guest hooks (race free assuming its either the parent (and thereby reaper) or uses process descriptors) through the control socket would also be useful to poke at hung hooks without having to match PIDs to guest hooks through unreliable hacks.
Completely agree with the points you're making.
I'll probably need a separate thread for each start and shutdown so a long running process in a hook script does not stop the overall daemon from working. Otherwise, a startup could lead to the affect that a vmstatedctl status
call would get either a timeout or a failure to connect to the socket.
Goes to show what happens when you start coding without fully thinking through the application design first :)
I've added man page updates and flags on open
in 068f1dca8a1a5ee720d17b0271e87e2b259b8d48. I have not thoroughly tested this yet, though.
As expected, already one commit more because of a startup bug... 4fde3d45203b746fd3edc65dd3a1736770ee3ce6
Right now the
vmstated
man page says "It supports running scripts whenever the virtual machine enters one of the following states" (source). It doesn't say ifvmstated
waits for the hooks to finish and how if it at all it reacts to their exit code e.g. will it just spawn the hook, continue and hope for the best or will it wait for the hook to run to completion and consider it a failure should a hook return a non-zero exit code? It's also silent about the process state setup for hooks e.g. what happens to their standard output and standard error output? Does it get forwarded to vmstated's log through pipes, discarded to/dev/null
, or even inheritvmstated
's file descriptors?