Antynea / grub-btrfs

Include btrfs snapshots at boot options. (Grub menu)
GNU General Public License v3.0

grub-btrfs-overlayfs hook fails if the systemd hooks are used (instead of udev) #199

Open JordanViknar opened 2 years ago

JordanViknar commented 2 years ago

Hello, first, I should probably state what my configuration is :

When trying to boot to one of the snapshots from the GRUB menu, GDM doesn't start because the filesystem is still locked in read-only mode, despite the proper grub-btrfs-overlayfs hook being used in mkinitcpio.conf. It's as if the hook wasn't there at all, which would be the expected behavior if /boot was part of the BTRFS partition... except it isn't.

Here's the contents of my mkinitcpio.conf :

    MODULES=(i915)
    HOOKS="base systemd sd-plymouth autodetect sd-vconsole modconf block filesystems grub-btrfs-overlayfs"
    COMPRESSION="lz4"
    COMPRESSION_OPTIONS="-9"

Could I please have some help for solving this issue ? I suspect the problem is related to the systemd hook, or the storage being an eMMC, or perhaps the zRAM's size.

JordanViknar commented 2 years ago

Forgot to mention, but (ignoring the snapshot subvolumes), the only subvolume on that BTRFS system is the root itself.

northfacts commented 2 years ago

just a quick question, did you rebuild the init after adding the hook grub-btrfs-overlayfs?
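
One way to double-check that a rebuilt image really contains the hook is to list its contents. This is a minimal sketch, assuming Arch's `lsinitcpio` tool and that runtime hooks show up under a `hooks/` path in its listing. The helper name `initramfs_has_hook` and the image path in the comment are illustrative only.

```sh
#!/usr/bin/env bash
# Hypothetical helper: read an initramfs file listing on stdin and succeed
# if a runtime hook with the given name appears under hooks/.
initramfs_has_hook() {
    local hook="$1"
    grep -Eq "(^|/)hooks/${hook}\$"
}

# Intended use on a real system (image path is an assumption):
#   lsinitcpio /boot/initramfs-linux.img | initramfs_has_hook grub-btrfs-overlayfs
# Demonstration with a canned listing:
printf '%s\n' 'hooks/udev' 'hooks/grub-btrfs-overlayfs' \
    | initramfs_has_hook grub-btrfs-overlayfs \
    && echo "hook is in the image"
```

Finding the file only shows that the image was rebuilt with the hook. It does not show that the init inside the image actually executes it.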

JordanViknar commented 2 years ago

> just a quick question, did you rebuild the init after adding the hook grub-btrfs-overlayfs?

Yes, of course.

JordanViknar commented 2 years ago

I found out the issue also happens on my other laptop, which is Manjaro based, and relies on an Intel RST/Optane RAID accidentally disguised as a normal drive (no idea why it works, but it does, when I should normally be using mdadm I suppose).

It also has the systemd hook, and the zRAM is also the size of the entire RAM.

JordanViknar commented 2 years ago

I think the issue is related to the systemd hook. That would make more sense than zRAM causing the issue, since zram-generator isn't part of the initramfs. I'll try again with the udev hooks instead of the systemd hooks.

JordanViknar commented 2 years ago

I can now confirm the issue is related to the systemd hook. I set up the udev hooks on my main laptop, and that fixed the problem. Unfortunately, I prefer using the systemd hooks, so I can't close this issue or mark it as solved. I'm going to change the title to reflect the problem, now that I know what's causing it.

JordanViknar commented 2 years ago

For more details about those systemd hooks, check this page from the Arch Wiki : https://wiki.archlinux.org/title/Mkinitcpio#Common_hooks

JordanViknar commented 2 years ago

Additionally, I noticed some weird behavior unrelated to this issue : all of my snapshots manage to boot with the systemd hook if GDM is disabled (via kernel parameter) despite being read-only, but some (especially older ones) don't with the udev hook despite actually being read-write with the overlay, I suppose because of some kind of kernel mismatch.

JordanViknar commented 2 years ago

Quote from the Arch Wiki, which explains why using the systemd hook causes this :

Runtime hooks are only used by busybox init. systemd hook triggers a systemd based init, which does not run any runtime hooks but uses systemd units instead.
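
The consequence of that quote can be expressed as a small check on a HOOKS value. This is a sketch only: the function name `overlay_hook_will_run` is made up for illustration, and it assumes the hooks are given as one space-separated string.

```sh
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the given HOOKS string contains
# grub-btrfs-overlayfs AND does not use the systemd hook, since a
# systemd-based init skips runtime hooks.
overlay_hook_will_run() {
    local hooks=" $1 "
    [[ "$hooks" == *" grub-btrfs-overlayfs "* && "$hooks" != *" systemd "* ]]
}

# HOOKS value from the original report (systemd-based):
if overlay_hook_will_run "base systemd sd-plymouth autodetect sd-vconsole modconf block filesystems grub-btrfs-overlayfs"; then
    echo "runtime hook will run"
else
    echo "runtime hook will be skipped"
fi
```

With the HOOKS line from the first post this prints "runtime hook will be skipped", which matches the behavior described in this issue.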

JordanViknar commented 2 years ago

Current workaround I've been using for now : I'm booting on snapshots using a separate initramfs preset that relies on the udev hook instead of the systemd hook. It is not a perfect solution though : I think the proper fix would be to implement a proper systemd unit, along with a separate sd-grub-btrfs-overlayfs hook like many programs (such as Plymouth) do out there.
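
For readers who want to try the same workaround, a separate preset could look roughly like this. It is a sketch under assumptions: the preset name `snapshot`, the second config path and the image name are invented for illustration and are not taken from this thread.

```sh
# /etc/mkinitcpio.d/linux.preset (sketch; the 'snapshot' preset and both
# snapshot_* paths are assumptions, not taken from this thread)
ALL_kver="/boot/vmlinuz-linux"
PRESETS=('default' 'fallback' 'snapshot')

default_image="/boot/initramfs-linux.img"

fallback_image="/boot/initramfs-linux-fallback.img"
fallback_options="-S autodetect"

# Extra image built from a second config whose HOOKS use udev instead of systemd
snapshot_config="/etc/mkinitcpio-snapshot.conf"
snapshot_image="/boot/initramfs-linux-snapshot.img"
```

The second config would be a copy of mkinitcpio.conf with `systemd` replaced by `udev` and any `sd-*` hooks replaced by their busybox counterparts. How the snapshot menu entries get pointed at this image is not described in the thread.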

Labaman commented 1 year ago

Similar problem. Are you expecting any progress on a solution?

mapleroyal commented 1 year ago

@JordanViknar Have you heard any updates on this?

@Antynea is this the same issue I'm experiencing here?

I don't know what udev has to do with any of this or why, if udev DOES have something to do with it, that isn't stated anywhere in the arch wiki docs, or the readme here, or in any online guide I've read/watched. I'm super confused and it's really frustrating. Is this just totally random? It's happening on 2 separate machines (bare metal) and a VM on each machine.

Antynea commented 1 year ago

No, it isn't.

Zesko commented 1 year ago

I can confirm that systemd hook is causing this issue. Switch to udev in HOOKS to fix the issue.

However, KDE's SDDM lets you boot into a read-only snapshot without overlayfs. The other display managers, GDM and LightDM, do not.

flobsh commented 5 months ago

Hi, thank you for posting the issue @JordanViknar, I encountered the same problem recently while reinstalling my Arch setup :grinning: I added a warning in ArchWiki to inform users about it: https://wiki.archlinux.org/index.php?title=Snapper&diff=prev&oldid=803868

Zesko commented 5 months ago

There is an alternative to mkinitcpio: dracut which should work with systemd module + grub-btrfs-overlayfs.

Two script files for Dracut must be created manually for this function:

1. Create a file: `/usr/lib/dracut/modules.d/91btrfs-snapshot-overlay/module-setup.sh`

    #!/usr/bin/bash

    # called by dracut
    check() {
        dracut_module_included btrfs || return 1
        return 0
    }

    # called by dracut
    depends() {
        return 0
    }

    # called by dracut
    install() {
        inst mktemp
        hostonly='' instmods overlay
        inst_hook pre-pivot 000 "$moddir/snapshot-overlay.sh"
    }

2. Create a file: `/usr/lib/dracut/modules.d/91btrfs-snapshot-overlay/snapshot-overlay.sh`

```sh
#!/usr/bin/bash
# Mount a tmpfs-backed overlay over the new root when it is a read-only Btrfs snapshot.
function mount_snapshot_overlay() {
    local root_mnt="$NEWROOT"
    if [[ "$(findmnt --mountpoint "$root_mnt" -o FSTYPE -n)" = "btrfs" ]] && [[ "$(btrfs property get "${root_mnt}" ro)" != "ro=false" ]]; then
        local ram_dir
        ram_dir=$(mktemp -d -p /)
        mount -t tmpfs cowspace "${ram_dir}"
        mkdir -p "${ram_dir}"/{upper,work}
        mount -t overlay -o "lowerdir=${root_mnt},upperdir=${ram_dir}/upper,workdir=${ram_dir}/work" rootfs "${root_mnt}"
    fi
}

mount_snapshot_overlay
```

otisdog8 commented 4 months ago

I was able to get this working without using dracut (and still using systemd).

It basically involves creating a systemd service that goes in the initrd and messes with sysroot. This might cause other issues depending on how your system is set up; it did cause other service failures on mine, but I was able to boot into my laptop fine.

This file is the script that gets run in the initrd:

/usr/local/bin/overlayfs-setup

#!/bin/bash

root_mnt="/sysroot"
current_dev=$(findmnt -n -o SOURCE /sysroot | sed 's@\[/.*@@g')

# Checking if /sysroot is a Btrfs filesystem and mounted read-only
#if findmnt -n -o FSTYPE /sysroot | grep -q 'btrfs' && findmnt -n -o OPTIONS /sysroot | grep -q 'ro,'; then
if [[ $(blkid "${current_dev}" -s TYPE -o value) = "btrfs" ]] && [[ $(btrfs property get ${root_mnt} ro) != "ro=false" ]]; then
    # Setting up directories for overlay
    echo "1"
    mkdir -p /mnt/overlay
    echo "2"
    mount -t tmpfs tmpfs /mnt/overlay
    echo "3"
    mkdir -p /mnt/overlay/upper
    echo "4"
    mkdir -p /mnt/overlay/work
    echo "5"

    # Mounting overlay
    mount -t overlay overlay -o lowerdir=/sysroot,upperdir=/mnt/overlay/upper,workdir=/mnt/overlay/work /sysroot
    echo "OverlayFS mounted on /sysroot."
else
    echo "/sysroot is not a read-only Btrfs filesystem."
fi
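
The `sed` expression near the top of that script is easy to misread, so here is what it does in isolation. For a Btrfs subvolume, `findmnt` reports the source as the device followed by the subvolume path in brackets, and the expression cuts that suffix off so `blkid` gets a plain device. The function name and the sample device and snapshot path below are hypothetical.

```sh
#!/usr/bin/env bash
# Strip the "[/subvolume]" suffix that findmnt appends to a Btrfs SOURCE,
# using the same sed expression as the script above.
strip_subvol_suffix() {
    printf '%s\n' "$1" | sed 's@\[/.*@@g'
}

# Sample value (hypothetical device and snapshot path).
strip_subvol_suffix '/dev/mmcblk0p2[/.snapshots/42/snapshot]'
# prints: /dev/mmcblk0p2
```

A source without a bracketed suffix passes through unchanged.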

You also have to create a service, not in /etc (maybe it would work if you put it there: I didn't try):

/usr/lib/systemd/system/overlayfs-setup.service

[Unit]
Description=Setup OverlayFS on Root Filesystem
DefaultDependencies=no
After=initrd-fs.target
Before=initrd.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/overlayfs-setup
RemainAfterExit=yes

[Install]
WantedBy=initrd.target

For some reason you can't enable this the normal way; you have to create the symlink by hand:

ln -s /usr/lib/systemd/system/overlayfs-setup.service /usr/lib/systemd/system/initrd.target.wants/overlayfs-setup.service

I think that if you enable it the other way, the symlink is not included in the initramfs.
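
The manual symlink step can be wrapped in a small helper that is safe to try against a scratch directory first. This is a sketch with assumptions: the function name `enable_initrd_unit` and the root-prefix argument are hypothetical, and it uses a relative link target rather than the absolute one shown above.

```sh
#!/usr/bin/env bash
# Hypothetical helper: create the initrd.target.wants symlink for a unit
# under an optional root prefix (empty prefix = the live system, needs root).
enable_initrd_unit() {
    local root="$1" unit="$2"
    local unit_dir="${root}/usr/lib/systemd/system"
    mkdir -p "${unit_dir}/initrd.target.wants"
    # Relative target, so the link stays valid wherever the tree is copied.
    ln -sfn "../${unit}" "${unit_dir}/initrd.target.wants/${unit}"
}

# Dry run against a throwaway directory instead of the real /usr:
staging=$(mktemp -d)
enable_initrd_unit "$staging" overlayfs-setup.service
readlink "${staging}/usr/lib/systemd/system/initrd.target.wants/overlayfs-setup.service"
# prints: ../overlayfs-setup.service
```

Calling it with an empty prefix, as in `enable_initrd_unit "" overlayfs-setup.service`, would write to the same location as the `ln -s` command above.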

You need to create the hooks: /etc/initcpio/hooks/sd-overlayfs

#!/usr/bin/ash
run_hook() {
    # Intentionally empty: the systemd-based init does not run runtime hooks.
    :
}

/etc/initcpio/install/sd-overlayfs


build() {
    add_module btrfs
    add_module overlay
    add_binary btrfs
    add_binary btrfsck
    add_binary blkid
    add_binary findmnt
    add_binary bash
    add_systemd_unit overlayfs-setup.service
    add_runscript
}

Finally, you can add it to your hooks:

HOOKS=(base systemd btrfs autodetect microcode modconf kms keyboard keymap sd-vconsole block sd-encrypt filesystems fsck sd-overlayfs plymouth)

This is what I use; I'm not sure whether the position of sd-overlayfs in the list matters, though.

If I have time I might try to get these fixes upstreamed, but they might be difficult to package.

bkmo commented 2 months ago

> I was able to get this working without using dracut (and still using systemd).

This worked for me, but you MUST include the "base" hook. The base hook is normally not needed when the systemd hook is used, but it must be present for these hooks to work.