skarnet / s6-rc

A service manager for s6.
https://skarnet.org/software/s6-rc/
ISC License

Dependency propagation on bundles #9

Closed BikyAlex closed 5 months ago

BikyAlex commented 5 months ago

s6-rc-compile seems to ignore dependencies declared on a type=bundle service.

The documentation doesn't mention it, so I thought it might be a missed entry, given that "flag-essential" on a bundle is propagated to all services present in the bundle's contents.d folder. https://skarnet.org/software/s6-rc/s6-rc-compile.html

If you make a dependencies.d inside of a folder that has a type=bundle and add files with the names of services in it, just like you'd do with atomic services, this dependency is currently just ignored and doesn't get propagated.

Is there a way to globally define a common dependency on a whole group? I'd like to avoid having to individually go through each atomic service in a bundle and adding the dependencies.d directory for all of them.

As a few examples (yes, I know you can just "s6-mount -a" in a single service, this is just an example):

Another one:

Imagine I had more services, none of them requiring many dependencies (if any), just making sure they are in the proper "runlevel" (for lack of a better term). These would be included in their respective bundles, but all of them would have a "common ancestor", which can be either a bundle or a service.

When setting up serialized dependencies, it's easy, as you only need to add the immediate parent and in very rare cases a bit more than 1 dependency. But when launching a massive amount of services in parallel, that's where it gets annoying to set up dependencies (particularly if the atomic services don't even require dependencies at all, other than early startup things like udev, fs mounting and networking).

As an example of that, setting up an s6-rc service for each virtual machine I'm running would imply this chain of parent dependencies: networking (oneshot or longrun) -> iscsid (longrun) -> iscsi-login-logout (oneshot) -> vm1, vm2, vm3 etc. (longrun). For all the VMs, the dependency has to be the iscsi-login-logout service, otherwise the VMs won't start (no attached disks). Let's say I put in the effort to write dependencies.d/iscsi-login-logout for 300 VMs. Then something changes (say I move everything to local volumes instead of iSCSI), so now I need to change 300 VMs' dependencies to something else (for instance, making sure a particular disk is mounted and not in a failed state, say mount-vmdir or zpool-import-vm-pool).

If I had known ahead of time that changes were inevitable, I'd probably have played it smart: make a bundle "vm-requirements", add iscsi-login-logout to it, and put only the vm-requirements bundle inside the 300 VMs' dependencies.d. That would make it trivial to change the contents.d of the bundle from iscsi-login-logout to mount-vmdir or zpool-import-vm-pool. But I'd still have had to put in the time to write 300 dependencies.d/vm-requirements entries (assuming I didn't script it, which is probably what I'll do for now).
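For what it's worth, the bundle indirection described above can be sketched as below. This is a minimal sketch in a throwaway temporary directory, not a working service set: run scripts and the definitions of iscsi-login-logout and zpool-import-vm-pool are omitted, so s6-rc-compile would not accept the tree as is, and vm1 to vm3 stand in for the 300 VMs.

```shell
#!/bin/sh
# Sketch only: builds a throwaway source tree under a temp dir, not /etc/s6-rc.
# Run scripts and the definitions of the required services are omitted.
SRC=$(mktemp -d)

# Bundle that names whatever the VMs currently need.
mkdir -p "$SRC/vm-requirements/contents.d"
echo bundle > "$SRC/vm-requirements/type"
touch "$SRC/vm-requirements/contents.d/iscsi-login-logout"

# Each VM depends only on the bundle, never on the concrete service.
for vm in vm1 vm2 vm3
do
  mkdir -p "$SRC/$vm/dependencies.d"
  echo longrun > "$SRC/$vm/type"
  touch "$SRC/$vm/dependencies.d/vm-requirements"
done

# Later: move from iSCSI to a local pool by editing the bundle only.
rm "$SRC/vm-requirements/contents.d/iscsi-login-logout"
touch "$SRC/vm-requirements/contents.d/zpool-import-vm-pool"

# The VM definitions are untouched; only the bundle contents changed.
ls "$SRC/vm-requirements/contents.d"
```

The switch at the end touches one directory regardless of how many VMs exist, which is the whole point of the indirection.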

BikyAlex commented 5 months ago

With the above examples out of the way, I think the easiest method would be to have dependencies.d propagate on a bundle. But if you know of a better way of doing it that already exists, I'm open to ideas.

Now, some background, which you can skip, as I finished all the technical review above. I'd still appreciate you (anyone) reading.

I'm trying to work on migrating systems to s6-linux-init and s6-rc and I wish to share that with the world, but for right now, this missing feature above is stopping me from doing a decent job of ensuring everything respects the dependency order.

How I've got things currently set up: all services are split into "runlevels", with a bundle for each runlevel. When booting, s6-linux-init just calls s6-rc -up change default ("$rl"). So I made a bundle called default and put all the "runlevel bundles" inside the "default bundle". This results in all "enabled" services (i.e. those added to the main bundles) getting started, but because the atomic services don't each have dependencies on the lower runlevel, they all start at the same time and in the wrong order.

I found out the hard way that things like sshd, chronyd and dhcpcd try to start before dev, sys, proc and root are even mounted. They're all in the bundle multi-user-target, which contains a dependencies.d, but "s6-rc-db -u dependencies sshd" shows that the only dependency it has is s6rc-fdholder. Obviously, I didn't know that bundles don't support dependencies. It sounded intuitive, but the manual clearly shows that's not the case.

I also want to take a moment to appreciate the good work you're doing here and thank you very much for your effort. You're an unsung hero.

skarnet commented 5 months ago

Thank you for your comments and your detailed analysis. Please note that the official place for bug-reports and feature requests for s6-rc is the supervision mailing-list 😉

So, the thing with bundles is that, in the current s6-rc incarnation, they are not real entities. A bundle is nothing more than an alias for a set of services; it's treated differently by s6-rc-compile, yes, but at the s6-rc level, it's only syntactic sugar. A bundle has no existence per se. And so, it is not possible to have a bundle depend on other services.

When I wrote s6-rc back then, I did think about propagating declared dependencies to all the services contained in a bundle. With my co-designer (heliocat), we spent a lot of time discussing it. And the final conclusion was that it caused more problems than not doing it. For instance, if you add bundles from another source directory, it can add invisible dependencies to a service that you cannot predict when just looking at the service's source directory! That is more unintuitive than not having dependencies propagate. And it's far from the only problem we had with it.

So, yeah, dependencies aren't propagated. You can have an atomic service depend on a bundle because it's an easy operation - depend on all the services in the bundle - but depending on something is a property of an atomic service, it cannot be a property of a bundle.
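To illustrate why depending on a bundle is the easy direction, here is a rough sketch of the expansion, working on a source directory. It is an illustration only, not how s6-rc-compile is implemented; the expand function and the mount-proc and mount-sys services are made up for the example.

```shell
#!/bin/sh
# Illustration only: not s6-rc-compile's code, just the idea that
# "depending on a bundle means depending on every atomic service in it".

# expand SRC NAME: print the atomic services that NAME stands for.
expand() {
  if [ "$(cat "$1/$2/type")" = bundle ]
  then
    for entry in "$1/$2/contents.d"/*
    do
      [ -e "$entry" ] || continue   # empty bundle: the glob did not match
      expand "$1" "${entry##*/}"    # a bundle may contain other bundles
    done
  else
    echo "$2"                       # atomic service: stands for itself
  fi
}

# Throwaway source tree (hypothetical services, run scripts omitted).
SRC=$(mktemp -d)
mkdir -p "$SRC/ok-init/contents.d" "$SRC/mount-proc" "$SRC/mount-sys" \
         "$SRC/sshd/dependencies.d"
echo bundle  > "$SRC/ok-init/type"
echo oneshot > "$SRC/mount-proc/type"
echo oneshot > "$SRC/mount-sys/type"
echo longrun > "$SRC/sshd/type"
touch "$SRC/ok-init/contents.d/mount-proc" "$SRC/ok-init/contents.d/mount-sys"
touch "$SRC/sshd/dependencies.d/ok-init"

# Resolve each declared dependency of sshd to atomic services.
for dep in "$SRC/sshd/dependencies.d"/*
do
  expand "$SRC" "${dep##*/}"
done
```

The reverse direction has no such mechanical answer: a dependency declared on ok-init itself would have to be attached to mount-proc, mount-sys, or both, and nothing in the bundle says which.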

And it makes sense. Dependencies are used for service ordering; if a bundle had dependencies, how would you order the services inside of the bundle? You have to have dependencies defined for individual services no matter what.

This is one of the reasons why I wanted the source directory format to be easily scriptable. So if you need to preprocess your service list, if you have automation that needs to output or edit source directories, I want to encourage you to do it. If you have 300 services you need to add a dependency to, then yes, scripting it is the right approach, and the format should not get in your way.

BikyAlex commented 5 months ago

Alright, I trust what you said. After posting the above, I was thinking about how dependencies would be propagated from a bundle to another bundle and then to the actual atomic services. I understood that bundles are just aliases for multiple services, and that's why I found it "intuitive" when initially messing with bundles and dependencies.

I now understand the approach to dependencies. I also view dependency handling in another light, now that you've pointed out some of the design principles. So the idea is to improve upon the sources through scripting. Here are some examples of what I've done.

The idea in my head was to split the bundles into source folders and number the folders to make some sense of the dependencies (the numbers aren't present in the s6-rc db, they're just for human readability). All the atomic services in a higher-numbered folder must have at least one dependency on the immediately preceding (lower-numbered) level, in addition to whatever other dependencies they have.

# cd /etc/s6-rc/source

# ls -1
00-default
01-ok-all
02-ok-init
03-ok-local
04-ok-multi-user

# ls -1 00-default/default/contents.d/
00
ok-all

# ls -1d 01-ok-all/*
01-ok-all/ok-init
01-ok-all/ok-local
01-ok-all/ok-multi-user

# ls -1 03-ok-local/*/dependencies.d
03-ok-local/multi-user/dependencies.d:
ok-init
net-lo
rc-local

03-ok-local/net-lo/dependencies.d:
ok-init

03-ok-local/rc-local/dependencies.d:
ok-init

# ls -1 04-ok-multi-user/*/dependencies.d
04-ok-multi-user/acpid-log/dependencies.d:
ok-local

04-ok-multi-user/acpid/dependencies.d:
ok-local

04-ok-multi-user/dhcpcd-eth0-log/dependencies.d:
ok-local

04-ok-multi-user/dhcpcd-eth0/dependencies.d:
ok-local

04-ok-multi-user/openssh-server/dependencies.d:
ok-local

04-ok-multi-user/sshd-log/dependencies.d:
ok-local

After the ok-local bundle is started, any other service in ok-multi-user can be launched, with whatever dependencies it needs. There's no need to worry about the lower-level dependencies, as long as all the ok-multi-user services have at least the dependencies.d/ok-local entry.

And to ensure that is the case, it's as easy as:

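#!/bin/sh
# Make every atomic service in SRCDIR depend on the ok-local bundle.
SRCDIR="/etc/s6-rc/source/04-ok-multi-user"
VERB=2

# Glob instead of parsing ls, and quote everything, so odd names are safe.
for SVCDIR in "${SRCDIR}"/*
do
  # Skip anything that is not a service definition directory.
  [ -f "${SVCDIR}/type" ] || continue
  SRCTYPE=$(cat "${SVCDIR}/type")
  if [ "${SRCTYPE}" = "longrun" ] || [ "${SRCTYPE}" = "oneshot" ]
  then
    mkdir -p "${SVCDIR}/dependencies.d"
    touch "${SVCDIR}/dependencies.d/ok-local"
  elif [ "${VERB}" = 2 ]
  then
    # Bundles cannot carry dependencies, so they are only reported.
    echo "${SVCDIR} is neither a oneshot nor a longrun service."
  fi
done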

Of course, the above could be improved by taking the path as a parameter and working out the dependencies automatically. It could probably even be written in execline, for better compatibility, but I just wrote something quick for my testing.
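A possible shape for that improvement, as a sketch: add_dep takes the source directory and the dependency name as parameters, and prev_level guesses the dependency from the numbered folder names. Both function names are made up here, and prev_level assumes that each NN-name folder has a matching bundle called name. That matches the dependencies shown above for 03-ok-local and 04-ok-multi-user, but would need checking for other levels.

```shell
#!/bin/sh
# Sketch under stated assumptions; add_dep and prev_level are hypothetical names.

# add_dep SRCDIR DEP: make every atomic service in SRCDIR depend on DEP.
add_dep() {
  for svc in "$1"/*
  do
    [ -f "$svc/type" ] || continue          # not a service definition
    case "$(cat "$svc/type")" in
      longrun|oneshot)
        mkdir -p "$svc/dependencies.d"
        touch "$svc/dependencies.d/$2"
        ;;
    esac                                    # bundles are left alone
  done
}

# prev_level SRCDIR: print the name (number prefix stripped) of the NN-name
# folder sorted just before SRCDIR. SRCDIR must not end with a slash.
prev_level() {
  prev=
  for d in "$(dirname "$1")"/[0-9][0-9]-*
  do
    [ -d "$d" ] || continue
    [ "${d##*/}" = "${1##*/}" ] && break    # reached SRCDIR itself
    prev=${d##*/}
  done
  echo "${prev#[0-9][0-9]-}"
}

# Example with the paths from the listing above (not executed here):
#   SRCDIR=/etc/s6-rc/source/04-ok-multi-user
#   add_dep "$SRCDIR" "$(prev_level "$SRCDIR")"
```

Keeping the dependency name as a plain parameter means the guess made by prev_level can always be overridden by hand.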

Closing the issue. My release of whatever I'm working on (probably won't even have a decent name) will probably be done on the Level1Techs Forum and maybe later released on github. My plan is to use an OS with an s6 base to handle some server dependencies (like the aforementioned iscsi-login-logout stuff) to host my own services (like a git instance).