cloux / aws-devuan

systemd-free GNU/Linux for AWS Cloud Environment
Do What The F*ck You Want To Public License

Help with devuan runit implementation #1

Closed Tank-Missile closed 4 years ago

Tank-Missile commented 5 years ago

I've been following the devuan mailing list and quite a few people are in agreement that the way debian handles runit isn't exactly up to par. I also reported a few problems with the way it was implemented to the debian runit maintainer, such as sulogin being started in stage 1. The getty-run package also had problems with no job control in the shell, but that was rectified. I'm still not 100% happy with the getty scripts though. I much prefer void's method. I mainly use Artix, and fell in love with runit for its simplicity and not doing as you said in the description, "Do everything, do it in PID1". Artix's implementation is a bit odd in its usage of "core-services" or "sv.d" scripts, however. According to one of the maintainers, this is so each phase in the boot process can be "monitored" if it completed successfully or not. Void takes a much simpler approach, and it seems your implementation essentially copies it with a few changes and additions. For running a desktop, Artix's implementation works perfectly fine, except for bootlogd. In fact, bootlogd is quite iffy on Artix, Debian, and even here. In layman's terms:

Debian: Uses sysvinit scripts to start system and bootlogd, and then kills bootlogd in rc2.d.

Devuan: Same problems.

Artix: Uses its own "rc" commands but does not generate the bootlog for some god forsaken reason.

I also wonder why bootlogd, udev, and other important start processes are not treated as services in a runsvdir in stage 1 before moving to another runsvdir in stage 2. According to the runit documentation, switching runsvdirs is supported. Then again, if I remember correctly, switching runsvdirs would kill any services in the previous runsvdir. I wonder if multiple runsvdirs at once is even possible then. In any case, this is a great project that shows that runit can work in a debian based distribution without the need for sysvinit scripts for booting. If we keep in contact with the Devuan maintainers, we could possibly get your changes in officially (don't quote me on that).

Now, why not try to get these changes into Debian as well? Well, debian strives to be backwards compatible according to the runit maintainer, but it barely functions on its own with its current implementation anyway. An overhaul shouldn't be out of the question.

One more thing is the service directory location. /etc/service is not the wisest place to store running services, since /etc can become readonly. In conclusion, I hope to gain more insight from you on this matter.

cloux commented 5 years ago

Hi, you made some very good points.

the way debian handles runit isn't exactly up to par

Debian uses systemd, and doesn't handle any other init equally. It's very hard to change the init there because of the fair amount of dependencies. That's the reason Devuan, Artix and many others came to being, to provide repos with packages that are not dependent on any specific init.

Sadly, Devuan doesn't handle runit up to par either: see the init metapackage, it pre-depends on (sysvinit-core | upstart). If you want to install runit-init, you have to remove conflicting sysvinit-core, which removes init, which you have to acknowledge by typing exactly "Yes, I am aware this is a very bad idea". Ugly.

On the other hand, runit-init is not handled as being up-to-par because it is not. It's missing runscripts. My runscripts are based on VoidLinux, but now almost completely rewritten and extended. No other distro except VoidLinux provides meaningful runscripts by default. Devuan has just some basic weird stuff, like you said, starting sulogin and a single terminal and that's it.

I much prefer void's method

Same here. VoidLinux makes a simple, elegant and stable impression. And while I appreciate its packaging system xbps, I mostly prefer the DEB base. That's why I tried to integrate the Void-runit approach into Devuan. It's solid.

I mainly use Artix

That seems like a good distro. If a systemd-free Arch base is required, definitely worth a try. However, I never used it so I can't comment on its implementation.

Artix's implementation is a bit odd in its usage of "core-services" or "sv.d" scripts, however. According to one of the maintainers, this is so each phase in the boot process can be "monitored" if it completed successfully or not.

First, "core-services" is a confusing misnomer. The word "core" does not contain much information: almost anything in a computer can be regarded as a "core" component. The word "services" usually describes programs that daemonize and run in the background. But there are NO DAEMONS in the /etc/runit/core-services path, just simple scripts executed sequentially in the first runit stage by /etc/runit/1, or as you correctly called it, the boot process. I renamed the path to /etc/runit/bootup in my repo, to make it more evident what's going on.
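A minimal sketch of what "simple scripts executed sequentially" can look like, not the actual code from the repository. The function name run_bootup_scripts and the NN-name.sh naming (modelled on the 02-udev.sh path mentioned later in this thread) are assumptions for illustration:

```shell
#!/bin/sh
# Hypothetical helper: run every *.sh file in a directory one after another.
# The shell expands the glob in sorted order, so 01-* runs before 02-*.
run_bootup_scripts() (
    dir=$1
    for script in "$dir"/*.sh; do
        [ -r "$script" ] || continue    # also skips an unmatched glob
        echo "bootup: $(basename "$script")"
        # Each script runs to completion before the next one starts.
        # A failure is reported, but the remaining scripts still run.
        sh "$script" || echo "bootup: $(basename "$script") failed" >&2
    done
)
```

A stage 1 script could then call something like run_bootup_scripts /etc/runit/bootup. Nothing is supervised and nothing runs in parallel here, which is the point being made about the directory name.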

Second, this stage can be completely and fully "monitored" by bootlogd service, as you suggested above. It took me a while to figure this one out, so let me explain how it works (my repo feature only):

The logfile of the bootup stage is /var/log/boot.log, and the logfile from the previous boot is /var/log/boot.log.1. The logfiles of the scripts from /etc/runit/autorun/ are saved as /var/log/autorun-SCRIPTNAME.log.
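How the previous boot log ends up as boot.log.1 is not spelled out here. One simple way to get that layout is a single-generation rotation done before logging starts. The helper below is an assumption for illustration, not the mechanism the repository uses:

```shell
#!/bin/sh
# Hypothetical helper: keep exactly one old generation of a log file.
# FILE is renamed to FILE.1 (replacing any older FILE.1) and a new,
# empty FILE is created.
rotate_log() (
    log=$1
    [ -f "$log" ] && mv -f "$log" "$log.1"
    : > "$log"
)
```

Called as rotate_log /var/log/boot.log early in stage 1, this would leave the last boot in /var/log/boot.log.1 and start the current one with an empty file.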

Void takes a much simpler approach, and it seems your implementation essentially copies it with a few changes and additions.

Yes!

For running a desktop, Artix's implementation works perfectly fine, except for bootlogd.

I was primarily targeting cloud deployments, but now I run the same thing on my laptop. It works perfectly fine, including bootlogd. I also included some desktop-only runscripts, like X11dm to start any display manager, or wicd for network management.

Artix... does not generate the bootlog for some god forsaken reason.

Usually there are 2 reasons: either bootlogd starts too late, or the /dev/pts pseudo-filesystem is missing when it starts. Both need to be taken care of very early on, see my stage 1 script. I don't know how to handle this in Artix, or when combined with OpenRC.
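The /dev/pts precondition can be checked before bootlogd is launched. The sketch below is an illustration under assumptions, not the stage 1 script from the repository. The function name is_mounted is made up, and the mount table is expected in /proc/mounts format:

```shell
#!/bin/sh
# Hypothetical helper: succeed if mount point $1 is listed in a mount table
# in /proc/mounts format ($2, defaults to /proc/mounts).
is_mounted() (
    target=$1
    table=${2:-/proc/mounts}
    # Field 2 of every line is the mount point.
    awk -v t="$target" '$2 == t { found = 1 } END { exit !found }' "$table"
)

# In stage 1 (as root) this could be used roughly like so:
#   is_mounted /dev/pts || mount -t devpts devpts /dev/pts
#   bootlogd -l /var/log/boot.log
```

The same check makes it possible to skip a mount that an initramfs has already done.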

I also wonder why bootlogd, udev, and other important start processes are not treated as services in a runsvdir in stage 1 before moving to another runsvdir in stage 2

That would be a bad idea and complicate things. Stage 1 is strictly sequential, runsvdir is for supervising things running in parallel. You want bootlogd started before anything else. You want to run udev after /dev is mounted but before you initialize filesystems or network. The bootup stuff usually has strictly sequential dependencies/precedence, it can't run in a parallel supervisor.

if I remember correctly, switching runsvdirs would kill any services in the previous runsvdir

No. Services present only in the old runsvdir will be closed, services present only in the new runsvdir will start. Services present in both runsvdirs will continue running as if nothing happened.
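The three cases can be made visible by comparing two runsvdir directories by entry name. This is only an illustration of the rule stated above. It does not talk to runit, and the function name compare_runsvdirs is an assumption:

```shell
#!/bin/sh
# Hypothetical helper: classify services for a switch from OLD to NEW.
# Prints "stop NAME", "start NAME" or "keep NAME", one line per service.
compare_runsvdirs() (
    old=$1 new=$2
    export LC_ALL=C                     # same sort order for sort and comm
    tmp=$(mktemp -d) || exit 1
    ls -1 "$old" | sort > "$tmp/old"
    ls -1 "$new" | sort > "$tmp/new"
    comm -23 "$tmp/old" "$tmp/new" | sed 's/^/stop /'     # only in OLD
    comm -13 "$tmp/old" "$tmp/new" | sed 's/^/start /'    # only in NEW
    comm -12 "$tmp/old" "$tmp/new" | sed 's/^/keep /'     # in both
    rm -rf "$tmp"
)
```

For example, compare_runsvdirs /etc/runit/runsvdir/default /etc/runit/runsvdir/single previews what a switch to a "single" runsvdir would do, provided both directories exist.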

I wonder if multiple runsvdirs at once is even possible

Unnecessary, single supervisor for all services is sufficient. Again, stage 1 doesn't need supervision. If required, you can start daemons in stage 1 unsupervised, like bootlogd or udevd. Works just fine. You can see udevd here at the bottom, running daemonized outside of the runsvdir process supervisor: screenshot

This is because it got started and daemonized early in the 1st stage by /etc/runit/bootup/02-udev.sh. If you enable supervised udevd in stage 2, the udevd runscript automatically takes care of closing the unsupervised instance. Simple, single line of code.

If we keep in contact with the Devuan maintainers, we could possibly get your changes in officially

That would be nice, maybe more people would find runit helpful. My work here is public domain, unlicense-type. Do whatever you want, literally, you don't need my blessing.

Now, why not try to get these changes into Debian as well? Well, debian strives to be backwards compatible according to the runit maintainer, but it barely functions on its own with its current implementation anyway.

There is a lot of discussion about this, for years now. I don't want to get into it.

/etc/service is not the wisest place to store running services, since /etc can become readonly

I totally agree. In Devuan, /etc/service/ is a symlink to /etc/runit/runsvdir/default which contains symlinks to supervised services. These symlinks point to service definitions in /etc/sv/. So far so good, it's all configuration, that fits in /etc. BUT: runit then puts its runtime information into the /etc/sv/SERVICENAME/supervise subdirectory!!! That is runtime-dependent variable information, it should be e.g. in /var/run and not in /etc. This was probably done to keep the codebase simple, but I see this as a bug or at least as bad practice.

Tank-Missile commented 5 years ago

Thanks for all the explanations! What's weird about bootlogd on Debian is that by default the daemon is stopped at runlevel 2 (rc2.d). Is there a reason to keep supervising bootlogd even after the initial bootup of the system has completed? Other than that, I may just try installing what you have here onto my own server, since I can't exactly get artix's implementation working on it 100%. Only problem is your implementation doesn't allow for custom runsvdirs, such as a runsvdir for single/sulogin, or other custom runsvdirs. Void uses runsvchdir to change a symlink from /etc/runit/runsvdir/current to default or whatever runsvdir I choose on the kernel parameters. I'm assuming you didn't implement it this way for backwards compatibility?

cloux commented 5 years ago

The bootlogd on Debian is stopped at runlevel 2 (rc2.d). Is there a reason to keep supervising bootlogd even after the initial bootup of the system has completed?

Good question. That depends on your use case. 'man bootlogd' says: "bootlogd copies all strings sent to the /dev/console to a logfile". That's it, that's all it does. The early init scripts can't do much logging, because the system is not ready yet. But /dev/console is available early, so it's the only place to print some messages. Later, as other logging facilities become available (syslog, socklog, svlog...), there is no need to use /dev/console for output, and bootlogd can be closed. After all, it's called "bootlogd" and not "consolelogd", it's meant to log the boot process only anyway.

Consider this scenario: You start bootlogd and keep it running (as I do). Then you activate incron service, but forget its dependency on socklog. So incron starts before socklog. On start it tries to connect to the syslog facility by openlog() function, which fails, and incron then falls back to logging everything to /dev/console. This fallback algorithm is hardcoded. If socklog becomes available later, incron ignores that because it's dumb that way and happily continues to flood the /dev/console.

If you're a normal user, you probably don't want to open your boot.log and see a bunch of completely unrelated weird service messages. That's why a conscious distro has bootlogd disabled shortly after boot. As on Debian: bootlogd is active only in rc.S (early initialization) and stopped everywhere else.

Me, I want to see everything that goes to the console, so I can fix it. Not just boot (stage 1) messages, I want to see services polluting the console during normal operation (stage 2), and also all messages during shutdown. Polluted /dev/console is more than just a cosmetic issue, the /dev/console is also what the user sees on /dev/tty1, on the real terminal. Have you ever tried to log into the real terminal (Ctrl+Alt+F1), or do maintenance on a server directly, not over SSH? You know the feeling when you try to type the username into the login prompt and the server is spitting up some log messages right into it? Even on Amazon EC2 it's a problem, when due to the polluted console you won't see the boot messages anymore because you can't scroll back enough...

Keeping bootlogd enabled all the time helps me to easily spot any console pollution and prevent this nasty behavior. On the other hand, in normal usage (production) you could disable it completely. As long as the system is booting properly, you don't really need any boot logs.

Only problem is your implementation doesn't allow for custom runsvdirs

Well, I've always used only one runsvdir for simplicity, so I checked and... Dammit! You found a serious error. I use the runit binaries and symlinks as provided by Devuan packages, and they're broken! They provide a symlink /etc/service to runit/runsvdir/default instead of /etc/runit/runsvdir/current (package runit_2.1.2-19_amd64.deb). The VoidLinux implementation is correct, but I only used the VoidLinux runscripts, not the filesystem structure. Somebody should report this to the Debian/Devuan package maintainers.
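To show why the link has to go through "current", here is a sketch that mimics the documented effect of runsvchdir(8) on a plain directory tree: "current" is repointed and the old target is remembered as "previous". The function name switch_runsvdir is an assumption, and the real tool replaces the link more carefully than this:

```shell
#!/bin/sh
# Hypothetical helper: point BASE/current at BASE/NAME and keep the old
# target as BASE/previous. Symlinks are relative, as in
# /etc/runit/runsvdir/current -> default.
switch_runsvdir() (
    base=$1 name=$2
    [ -d "$base/$name" ] || { echo "no such runsvdir: $name" >&2; exit 1; }
    if [ -L "$base/current" ]; then
        old=$(readlink "$base/current")
        rm -f "$base/previous"
        ln -s "$old" "$base/previous"
    fi
    rm -f "$base/current"               # removes the link, not its target
    ln -s "$name" "$base/current"
)
```

If /etc/service points straight at .../runsvdir/default, repointing "current" has no effect on what runsvdir supervises, which is the problem with the packaged symlink described above.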

Anyway, I am going to fix that in my repo right now... will post here when it's fixed...

I may just try installing what you have here onto my own server

Wait for the symlink fix. Then, if you experience any problems, feel free to open another issue.

cloux commented 5 years ago

Ok, the symlinks from the Devuan repository package are fixed. The installation should go as follows:

That's it, in short. Let me know if there is any issue.

Tank-Missile commented 5 years ago

I'm not really comfortable installing anything without the usage of a package manager, even if socklog is a better... actually why did you use socklog? Doesn't runit come with its own log implementation? Also, I noticed in stage 2 you made it so you needed a "runlevel=" before specifying the runlevel. I assume you made this change to avoid the need for a for loop? Either way it does look somewhat cleaner than how void did it. There has been more discussion on the runit package in debian, sparked by an issue I reported. If you're curious about what was changed, read here. I'm not 100% happy with the changes, but at this point I just started making my own modifications. I messed around with the symlinks, and modified stage 1 and 2. This works better in tandem with Debian's runit, but still relies on sysvinit scripts to start the system. I'm constantly arguing with myself whether I should avoid sysvinit entirely, even with the boot process, or just use it for booting and use runit to maintain services. Now, back to where we were. I don't actually use grub, but instead refind. It has a much cleaner UI and I find it easier to configure. I'm not using Amazon AWS, but just a personal machine. With that in mind I'll have to take a more manual approach to using your runit implementation. Once I have a disposable income, I would definitely look into Amazon AWS or another affordable option.

cloux commented 5 years ago

Sorry for the really late reply, I had to take some time off.

I also much prefer to use the official repositories, if possible. Unfortunately, not everything I need is there (socklog, hiawatha, gitahead, oomd, arduino-core libs ...). I also need software that will never make it to a main repo (slack, oracle-jre, teamviewer, rew...), or I require custom compilation parameters so I have to compile things from source (kernel, php-fpm). I will deal with this mess in the upcoming release, and provide a simple modular installer/updater interface for all these cases.

Yes, runit comes with its own logging implementation, and that's socklog, see http://smarden.org/socklog/ - "socklog, in cooperation with the runit package, is a small and secure replacement for syslogd". That being said, I am not a big fan of socklog (daemontools logging in general) either. The idea is right, but the implementation is a bit painful. Maybe I'll look into that someday.

To the "runlevel=" kernel parameter: you're right, it looks cleaner if I don't iterate over all parameters in a for loop. But the main reason to use a prefix is different: runit (daemontools in general) has a flexible runlevel concept: you can have as many as you want and name them as you want. So if you read 'quiet' on the command line, is it a kernel parameter or the name of your runlevel? In order to be deterministic it's a good idea to mark the runlevel with a prefix. Also, using a prefix it is more obvious to a third person what is going on here. The downside: this prefix is not portable, you have to use runit-init with my runit scripts to parse this, no other distro is doing it this way. That said, it should be a rare requirement to specify a runlevel explicitly. On all my servers, PC and laptop I use only one default runlevel, no need to complicate things.
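A sketch of prefix-based parsing without a for loop, under the assumption that the kernel command line is passed in as one string. The function name parse_runlevel and the fallback name "default" are illustrative, not taken from the repository's stage 2:

```shell
#!/bin/sh
# Hypothetical helper: print the name given as "runlevel=NAME" in a kernel
# command line, or "default" if no such word is present. Words without the
# prefix (e.g. "quiet" or "single") are never mistaken for a runlevel.
parse_runlevel() (
    level=$(printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^runlevel=//p' | tail -n 1)
    echo "${level:-default}"
)

# Possible use in stage 2 (as root, with runit installed):
#   runsvchdir "$(parse_runlevel "$(cat /proc/cmdline)")"
```

If the parameter appears more than once, the last occurrence wins. That is a design choice of this sketch.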

The bug as submitted by alecfeldman on the link you posted is spot on, and there are some more problems with the runit distribution packages (e.g. missing socklog, missing runscripts, reboot/shutdown doesn't work properly in cloud, system initialization not implemented, only 'single' runlevel provided, rather useless getty implementation...). Trying to push all these changes upstream seems a lost cause anyway, since Debian relies primarily on systemd for system initialization and service management. Runit is treated as an optional additional supervisor, nothing more. That is the reason for this repository, to be able to use runit as main init system and service supervisor (aka. VoidLinux for DEB base and then some).

Yes, it's very much possible and easier than it seems to avoid sysv-init and run everything from runit. System initialization, service supervision, shutdown. The early boot initialization is good enough to leave out initramfs completely, and just go straight Grub (or rEFInd) -> kernel -> runit! I am running it like this on my Laptop right now and it works perfectly, the boot is blazing fast. Note: to run GUI you will need to enable the X11dm service. I am working on a script that would automate the transition to a pure runit system. Preferably for any DEB-based OS, on PC or Laptop, not just in the cloud.

The usage of GRUB vs. rEFInd should make no difference. Kernel needs to know where the init is ("init=") and that's it. The major added value of my repository is that there are changes inspired by VoidLinux that allow runit to perform early initialization, so you don't even require initramfs. If the system still has initramfs, runit just skips the already initialized parts. Simple.

cloux commented 5 years ago

FYI: on my Laptop I completely uninstalled init, initscripts, sysv-rc, insserv, sysvinit and systemd-. Also I don't parse or run anything from /etc/rcS.d. That "mess" is re-implemented in a cleaner way in my runit boot stage.

Tank-Missile commented 5 years ago

The problem is if I attempt to remove initscripts, sysv-rc, or insserv, runit-init would be removed as well. God, what a mess. I hope Devuan decides to say "screw backwards compatibility" in this case and fixes runit along with other init systems.

cloux commented 5 years ago

Yes. I forgot to mention that: due to that mess, I also removed the package runit-init and use just the binaries in my system, outside of the package :(
This is currently the only way to get rid of initscripts and the useless getty-run package dependencies. What a mess indeed.

As far as I know, Devuan just pulls runit packages from Debian, there seems to be no difference in dependencies. Currently my runit-init says Origin: Devuan:3.0/testing, Maintainer: Dmitry Bogatov KAction@debian.org

From what I've seen, there seems to be no separate development for runit in Devuan (or anywhere else, except Void). And yes, these packages are not maintained in a way that would make it easy or possible to switch to runit at any level. Fixing and manual labor is expected.

I will provide a simple "installer" - scriptable system that will make it easy to make these system changes outside of the repository, in a maintainable and repeatable manner. Hopefully.