threefoldtech / zinit

An init replacement that feels like runit, written in Rust + tokio

process group support and more #59

Open muhamadazmy opened 6 months ago

muhamadazmy commented 6 months ago

Part 1 (process groups)

In zos, we sometimes need to put a set of processes into a logical group (unrelated to Linux process groups, although those may be needed for the implementation).

A process group in zos is a logical group of processes that can be monitored, started, and stopped in one go. Inside the group, processes can still have dependencies on other processes defined within the same group. This is beneficial on many levels as explained below:

What do we need to do?

Note: to implement this, maybe zinit should abstract the grouping of processes and treat all services defined at the host level as a nameless group (the host group), which has sub-groups. This means that, in general, groups can be nested. This will make groups easier to think about and to implement.
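
To make the nested-group idea a bit more concrete, here is a minimal sketch of how such a tree could be modelled. Everything in it is an assumption for illustration: `Group`, `find`, and `all_services` are hypothetical names and are not part of zinit, and the service names are made up.

```rust
/// Hypothetical model of nested service groups; none of these names exist in zinit.
#[derive(Debug)]
struct Group {
    /// `None` marks the nameless host group at the root.
    name: Option<String>,
    /// Names of the services defined directly in this group.
    services: Vec<String>,
    /// Nested sub-groups.
    groups: Vec<Group>,
}

impl Group {
    fn new(name: Option<&str>, services: &[&str]) -> Self {
        Group {
            name: name.map(str::to_string),
            services: services.iter().map(|s| s.to_string()).collect(),
            groups: Vec::new(),
        }
    }

    /// Look up a sub-group by a path of group names; an empty path is `self`.
    fn find(&self, path: &[&str]) -> Option<&Group> {
        match path.split_first() {
            None => Some(self),
            Some((head, rest)) => self
                .groups
                .iter()
                .find(|g| g.name.as_deref() == Some(*head))
                .and_then(|g| g.find(rest)),
        }
    }

    /// Every service in this group and all of its sub-groups, i.e. the set
    /// that a group-wide start, stop, or monitor would act on.
    fn all_services(&self) -> Vec<String> {
        let mut out = self.services.clone();
        for g in &self.groups {
            out.extend(g.all_services());
        }
        out
    }
}

fn main() {
    // The host itself is the nameless root group.
    let mut host = Group::new(None, &["networkd"]);
    host.groups.push(Group::new(Some("vm-1"), &["virtiofsd", "vm"]));

    // Acting on the "vm-1" group touches only its own services.
    let vm = host.find(&["vm-1"]).expect("group exists");
    println!("{:?}", vm.all_services());
    println!("{:?}", host.all_services());
}
```

With this shape, host-level services are just the root group's services, so the same code path would handle "stop everything" and "stop one group".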

Part 2 (restart on dependency death)

Right now a service dependency can only be specified with the after configuration flag. It basically tells zinit that service A can only be started after B, if A is configured like

after:
  - B

This is cool and all, and it covers like 95% of use cases. But once A is started, service B is never checked again, which means that if B dies, A is kept alive. It's up to A to "detect" the loss of connection and then try again, or even exit completely until B is started again.

In some situations it's required to assume that A is now in a bad state, and that we need to automatically stop A until B is started again.

This is exactly the case with VMs and virtiofsd: if a virtiofsd process dies, the VM won't die, but we start getting IO errors inside the VM. Starting virtiofsd again won't fix the issue on its own; we also need to restart the VM.

This is why I really think we need to introduce another dependency flag that can also take conditions, say requires. The only condition I can think of right now is an always-restart condition, as follows:

requires:
  always-restart:
    - B

Note: this is only an initial syntax and can be changed; IMHO it's confusing.

I think this way of configuring it is a little confusing, because what it actually means is: if B dies, A (the service being configured) must be restarted. Note that requires implies after.
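
To illustrate the direction of the relation (B dies, so A gets restarted), here is a rough sketch of how the set of affected services could be computed. It is not how zinit works today: the `Requires` map and `affected_by` are hypothetical, and following the relation transitively (a service that requires A is also affected) is my assumption, not something decided above.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical: maps each service to the services it lists under
/// `requires: always-restart`. This is not zinit's actual data model.
type Requires = BTreeMap<String, Vec<String>>;

/// Services that would have to be stopped (and restarted later) when `dead`
/// dies. Assumption: `requires` edges are followed transitively.
fn affected_by(requires: &Requires, dead: &str) -> BTreeSet<String> {
    let mut affected = BTreeSet::new();
    let mut queue = vec![dead.to_string()];
    while let Some(current) = queue.pop() {
        for (service, deps) in requires {
            // `insert` returns false for already-seen services, which also
            // keeps the loop from spinning on dependency cycles.
            if deps.iter().any(|d| *d == current) && affected.insert(service.clone()) {
                queue.push(service.clone());
            }
        }
    }
    affected
}

fn main() {
    let mut requires = Requires::new();
    // The VM requires virtiofsd; a made-up "agent" service requires the VM.
    requires.insert("vm".into(), vec!["virtiofsd".into()]);
    requires.insert("agent".into(), vec!["vm".into()]);

    // If virtiofsd dies, both the VM and the agent are affected.
    println!("{:?}", affected_by(&requires, "virtiofsd"));
}
```

The lookup goes from the dead service to its dependents, which is the reverse of how the config is written. That is probably part of why the syntax feels confusing.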

IMHO we need to focus on the first issue (the groups) first, and then see if we really need the requires flag.