cloux / aws-devuan

systemd-free GNU/Linux for AWS Cloud Environment
Do What The F*ck You Want To Public License
20 stars 4 forks source link

/var/log subvolume #2

Closed Tank-Missile closed 4 years ago

Tank-Missile commented 5 years ago

For people like me who separate their /var/log directory into its own btrfs subvolume, bootlogd generates its log before the /var/log subvolume is mounted from /etc/fstab. I could avoid this by simply creating a subvolume at the directory, but I prefer the mount approach. The bootlogd sysvinit script solves this issue by first generating the log in /run and then copying it over to /var/log. I'm not so sure how this effects the rest of the software that uses the directory.

EDIT: I made an adjustment to stage 1 and 2 that involve generating the boot log in stage 1 in /run and moving it to /var/log in stage 2. A few tweaks here and there on your end could mean less tweaking on my end. For instance, add the option to not use the bootlogd service after the initialization of the system. I probably should put that suggestion in its own issue though.

cloux commented 5 years ago

Again, sorry for the late reply.

I am not sure that I understand your issue correctly. You have an entry in /etc/fstab like /dev/something /var/log ... and then the boot.log file gets dumped into the /var/log/ even before that gets mounted, is that correct?

In that case, sysv-initscript solution is not very good either. It adds unnecessary code complexity, and it only dumps the file to var/log when stopped in a later runlevel, hoping that it should/would be mounted by then. That might be enough for your use case. However, consider the bootlogd manpage: Bootlogd runs in the background and copies all strings sent to the /dev/console device to a logfile. If the logfile is not accessible, the messages will be kept in memory until it is.

There is a much simpler solution: in my /etc/runit/1, you can change the

/sbin/bootlogd -c -l /var/log/boot.log

to:

/sbin/bootlogd -l /var/log/boot.log

See the manpage: "-c: Attempt to write to the logfile even if it does not yet exist. Without this option, bootlogd will wait for the logfile to appear before attempting to write to it. This behavior prevents bootlogd from creating logfiles under mount points."

Then you can touch /var/log/boot.log on your mounted volume and you should be done.

Would that be acceptable solution to you? Let me know if that works. NOTE: I removed the "-c" in my dev branch and do the touch from the service runscript. This however still expects the /var/log/ to be mounted in the runit stage 2.

Tank-Missile commented 5 years ago

That doesn't seem to work for me unfortunatly. I modified bootlogd in stage 1 to this:

/sbin/bootlogd -p /run/bootlogd.pid -l /var/log/boot.log

In stage 2, I added these two lines:

[ ! -f /var/log/boot.log ] && touch /var/log/boot.log pkill -F /run/bootlogd.pid

However, the log is completely empty when the system has fully booted. I at first assumed this was due to bootlogd being killed too early, but even when commenting out pkill, the log is still empty.

cloux commented 5 years ago

Yes, bootlogd can be a little tricky to get right. When you do unexpected things like starting it with wrong parameters, unavailable mountpoints, or killing and restarting it between runlevels, it might simply not work. And surprise surprise, there are no logs about what is going wrong. It might be frustrating, but the solution is mostly very simple, just not very obvious.

Looking at bootlogd source, at the main loop at the bottom there is a call to select to read the console output. If there is nothing coming from the console for 5sec, the select call times out. And then bootlogd attempts to create the file. Writing something to /dev/console, or waiting 5 seconds before killing bootlogd might help.

So, in in /etc/runit/1 you can have /sbin/bootlogd -l /var/log/boot.log, and then in /etc/sv/bootlogd/run:

Option 1

touch /var/log/boot.log
sleep 5
killall -q -e bootlogd

Option 2

touch /var/log/boot.log
printf "Bootlogd takeover\n" >/dev/console
killall -q -e bootlogd

NOTE:

NOTE: The logfile on my system is created properly even without the sleep or printf calls, as long as I create the logfile before I kill bootlogd. This is consistent with the source code: bootlogd appears to handle the signals properly and tries to write the logfile before it terminates. Let me know if it worked for you.

cloux commented 5 years ago

In the case that the above options fail and you still end up with an empty logfile, there might be another problem preventing bootlogd from running in stage 1. To find the issue, you might try this:

sv stop bootlogd
touch /etc/sv/bootlogd/down
rm /var/log/boot.log

which will prevent bootlogd service to start in stage 2 after reboot, and remove the current log so the stage1 bootlogd will have no file to write to.

After reboot:

If this is not the case, the problem is with starting bootlogd in stage 1, and not with stage 2 as expected in my previous comment.

NOTE: don't forget to rm /etc/sv/bootlogd/down when you're done.

Tank-Missile commented 5 years ago

Well, I don't exactly want bootlogd running at all after the initialization of the system. It's more of a personal preference I suppose. Due to that fact, it would be much preferable to place touching the file and killing bootlogd in /etc/runit/2, unless you have a better idea.

cloux commented 5 years ago

Well, this use case is not expected or handled in my code. I expect that the user wants to debug/optimize the system; that includes boot, shutdown and reboot timing. Or the user does not really care as long as the system works. The first "tweaker" use case (me) needs to have bootlogd running continuously to see all potential issues at boot, at shutdown and even possible errors in the 2nd stage that would otherwise remain invisible. The second case "normal user" does not need to run bootlogd at all and can safely deactivate it.

If you only need to optimize the boot process (also what I mostly need), it is still safe to keep bootlogd running in the background. The code is extremely clean and it barely consumes any resources. If you still want to run bootlogd only at boot and then kill it (like sysvinit does) no problem. That's what runit does best - it's flexible. You'll have to change my 1st stage script like this:

Change line in /etc/runit/1 from

if [ -e /etc/service/bootlogd ] && [ -x /sbin/bootlogd ]; then

to

if [ -x /sbin/bootlogd ]; then

and run svdeactivate bootlogd

This ensures that bootlogd starts in stage 1, and disables its supervision in stage 2. NOTE: the task of stopping/killing a running program should not be done by a stage 2 runscript, since killing is a one-time task and does not require supervision. I would recommend my autorun feature for one-time tasks instead. You can create a script in /etc/runit/autorun/ directory, with this content:

#!/bin/sh
sleep 10
touch /var/log/boot.log
killall -q -e bootlogd

Just make the script executable, and you're done. You can make it more verbose than my example, add mountpoint checking instead of a sleep etc. The output of this script will be automatically logged into /var/log/autorun-YOURSCRIPT.log

Let me know if that worked for you.

cloux commented 5 years ago

NOTE: of course the autorun feature expects the /var/log to be mounted by the time the scripts are run. If your /var/log mount is not ready by then, you will have to change or disable the logging, see the autorun script line 22 here: /etc/runit/bootup/20-autorun.sh

Tank-Missile commented 5 years ago

I had a feeling that autorun feature would come in handy. it's essentially /etc/rc.local I assume, but specifically for your implementation. In fact, I could probably achieve the same effect using /etc/rc.local. In any case, I made the changes you provided, but boot.log is still empty. Luckily, all autorun.log files are being generated in the mount point, so that's not a problem.

cloux commented 5 years ago

Yes, the purpose of my autorun is very similar to rc.local, with a few differences in implementation: The rc.local is a file and will be run at the end of each multiuser runlevel. My autorun is a directory, all executable files inside will be run only at the end of the 1st stage, in parallel. This makes task automation and logging much easier. Autorun scripts are independent of each other, so it's easy to copy your autorun script to another system and be sure you didn't overwrite somebody elses customization. You can't be sure if you copy and overwrite a rc.local file.

To your empty boot.log: are you sure bootlogd is able to start in stage1? It is very simple program, all it needs is mounted /dev/pts and executable /sbin/bootlogd. See lines 12-15 in /etc/runit/1.

Try to add this at the beginning of your autorun script:

mountpoint /dev/pts
pgrep -a bootlogd

it should say in your /var/log/autorun-SCRIPT.log file:

/dev/pts is a mountpoint PID /sbin/bootlogd -l /var/log/boot.log

Is it the case?

Tank-Missile commented 5 years ago

It prints the following:

/dev/pts is a mountpoint 198 /sbin/bootlogd -l /var/log/boot.log 408 /bin/sh /etc/runit/autorun/stop-bootlogd

cloux commented 5 years ago

Ok, this seems to be an issue between the kernel and bootlogd. Runit mounts /dev/pts and starts bootlogd as it should. Please post the output of:

uname -a; cat /proc/cmdline; lsb_release -idr
cat /sys/class/tty/console/active
stat -c "%t %T" /dev/console

Then comment the killall-line in your stop-bootlogd script so the bootlogd keeps running. Then reboot and post the output of

ls /dev/pts/
printf "1\n" >/dev/console
printf "2\n" >/dev/tty0
cat /var/log/boot.log
Tank-Missile commented 5 years ago

uname -a; cat /proc/cmdline; lsb_release -idr output:

Linux Alec-Server 4.19.0-4-amd64 #1 SMP Debian 4.19.28-2 (2019-03-15) x86_64 GNU/Linux
\\@\boot\vmlinuz-4.19.0-4-amd64 root=PARTLABEL=Root rootflags=subvol=@ initrd=@\boot\initrd.img-4.19.0-4-amd64
-bash: lsb_release: command not found

cat /sys/class/tty/console/active output: tty0 stat -c "%t %T" /dev/console output: 5 1 ls /dev/pts/ output: 0 1 ptmx printf "1\n" >/dev/console, printf "2\n" >/dev/tty0, and cat /var/log/boot.log outputs nothing.

Sorry for taking so long to respond. I don't have the lsb-release package installed, although I don't think it's needed for anything important. When I installed devuan, I used debootstrap to create as minimal of a system as I could. Perhaps I missed a few things here and there.

cloux commented 5 years ago

This is really nasty. Your system seems completely OK, there is no obvious issue, at least I don't see why bootlogd is refusing to log the console. It could be that the /dev/console is not available when bootlogd starts. I addressed that in this commit, see etc/runit/1 on line 13. If this doesn't work, I'm out of ideas. Since I can't reproduce the issue it's hard for me to fix it. I will leave this ticket open, please let me know if you manage to solve this.

NOTE: I've split the aws-devuan project into several repositories:

For you, the most interesting will be probably the runit-base, since you're not using AWS. Please open any new issues you might have in this repository.