christian-moerz / vmstated

bhyve virtual machine manager
BSD 2-Clause "Simplified" License
9 stars 1 forks source link

The man page is unclear on the behaviour of hooks. #3

Open Crest opened 8 months ago

Crest commented 8 months ago

Right now the vmstated man page says "It supports running scripts whenever the virtual machine enters one of the following states" (source). It doesn't say if vmstated waits for the hooks to finish and how if it at all it reacts to their exit code e.g. will it just spawn the hook, continue and hope for the best or will it wait for the hook to run to completion and consider it a failure should a hook return a non-zero exit code? It's also silent about the process state setup for hooks e.g. what happens to their standard output and standard error output? Does it get forwarded to vmstated's log through pipes, discarded to /dev/null, or even inherit vmstated's file descriptors?

christian-moerz commented 8 months ago

Very good points, thank you! Particularly the file descriptors - you might be up to something there. It's just doing a fork and execve and that means I'm leaking the open file descriptors. That certainly needs fixing...

christian-moerz commented 8 months ago

I also just realize that it waits for completion of hook scripts, which in the end might cause vmstated to hang if a hook script fails to complete for whatever reason...

Crest commented 8 months ago

It's just doing a fork and execve and that means I'm leaking the open file descriptors. Can you set close-on-exec by default on all file descriptors and only remove the clone-on-exec flag after forking on the handful file descriptors you want the child is supposed to keep after execve() into the hook?

Crest commented 8 months ago

I also just realize that it waits for completion of hook scripts, which in the end might cause vmstated to hang if a hook script fails to complete for whatever reason...

You have to wait for the hooks to run to completion (either success or failure) anything else would be make it impossible to reliably setup or teardown state. Just starting a hook and hoping to never loose the race (condition) would be at most useable for best effort logging. IMO the correct way to handle is to have one state machine per guest waiting for the hooks, but the state machines get to run concurrently so long running hooks don't hang others.

If you want to add quick'n dirty timeouts the per process timers like those used by alarm() don't get cleared by execve(). Assuming the hooks don't catch/ignore SIGARLM or worse block in an unkillable state (e.g. the classic unresponsive NFS mountpoint). This should be good enough to implement a configurable timeout, but the ability request vmstated to send signals to a specific guest hooks (race free assuming its either the parent (and thereby reaper) or uses process descriptors) through the control socket would also be useful to poke at hung hooks without having to match PIDs to guest hooks through unreliable hacks.

christian-moerz commented 8 months ago

Completely agree with the points you're making.

I'll probably need a separate thread for each start and shutdown so a long running process in a hook script does not stop the overall daemon from working. Otherwise, a startup could lead to the affect that a vmstatedctl status call would get either a timeout or a failure to connect to the socket.

Goes to show what happens when you start coding without fully thinking through the application design first :)

christian-moerz commented 8 months ago

I've added man page updates and flags on open in 068f1dca8a1a5ee720d17b0271e87e2b259b8d48. I have not thoroughly tested this yet, though.

christian-moerz commented 8 months ago

As expected, already one commit more because of a startup bug... 4fde3d45203b746fd3edc65dd3a1736770ee3ce6