Venom1991 / refind-btrfs

Generate rEFInd manual boot stanzas from Btrfs snapshots
GNU General Public License v3.0

Unexpected error after writing to `/root/.refind-btrfs` #3

Closed Th3Whit3Wolf closed 3 years ago

Th3Whit3Wolf commented 3 years ago

After running `sudo refind-btrfs` I get this output:

Initializing the block devices using lsblk.
Initializing the physical partition table for device '/dev/zram0' using lsblk.
Initializing the live partition table for device '/dev/zram0' using findmnt.
Initializing the physical partition table for device '/dev/zram1' using lsblk.
Initializing the live partition table for device '/dev/zram1' using findmnt.
Initializing the physical partition table for device '/dev/zram2' using lsblk.
Initializing the live partition table for device '/dev/zram2' using findmnt.
Initializing the physical partition table for device '/dev/zram3' using lsblk.
Initializing the live partition table for device '/dev/zram3' using findmnt.
Initializing the physical partition table for device '/dev/zram4' using lsblk.
Initializing the live partition table for device '/dev/zram4' using findmnt.
Initializing the physical partition table for device '/dev/zram5' using lsblk.
Initializing the live partition table for device '/dev/zram5' using findmnt.
Initializing the physical partition table for device '/dev/zram6' using lsblk.
Initializing the live partition table for device '/dev/zram6' using findmnt.
Initializing the physical partition table for device '/dev/zram7' using lsblk.
Initializing the live partition table for device '/dev/zram7' using findmnt.
Initializing the physical partition table for device '/dev/nvme0n1' using lsblk.
Initializing the live partition table for device '/dev/nvme0n1' using findmnt.
Found the ESP mounted at '/boot' on '/dev/nvme0n1p1'.
Found the root partition on '/dev/mapper/crypt'.
Found a separate boot partition on '/dev/nvme0n1p1'.
Searching for snapshots of the '@' subvolume in the '/.snapshots' directory.
Found subvolume '@' mounted as the root partition.
Found 4 snapshots of the '@' subvolume.
Searching for the 'refind.conf' file on '/dev/nvme0n1p1'.
Analyzing the 'refind.conf' file.
Found 2 boot stanzas matched with the root partition.
Found 4 snapshots for addition.
Creating the '/root/.refind-btrfs' destination directory with 750 permissions.
Creating a new writable snapshot from the read-only '@snapshots/9/snapshot' snapshot at '/root/.refind-btrfs/rwsnap_2021-01-21_00-00-05_ID316'.
Initializing the static partition table for subvolume '@/root/.refind-btrfs/rwsnap_2021-01-21_00-00-05_ID316' from the '/root/.refind-btrfs/rwsnap_2021-01-21_00-00-05_ID316/etc/fstab' file.
Writing to the '/root/.refind-btrfs/rwsnap_2021-01-21_00-00-05_ID316/etc/fstab' file.
ERROR (refind_btrfs/__init__.py/main): An unexpected error happened, exiting...
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/refind_btrfs/__init__.py", line 77, in main
    exit_code = runner.run()
  File "/usr/lib/python3.9/site-packages/refind_btrfs/console/cli_runner.py", line 48, in run
    if not machine.run():
  File "/usr/lib/python3.9/site-packages/refind_btrfs/state_management/refind_btrfs_machine.py", line 74, in run
    while model.next_state():
  File "/usr/lib/python3.9/site-packages/transitions/core.py", line 393, in trigger
    return self.machine._process(func)
  File "/usr/lib/python3.9/site-packages/transitions/core.py", line 1148, in _process
    return trigger()
  File "/usr/lib/python3.9/site-packages/transitions/core.py", line 411, in _trigger
    return self._process(event_data)
  File "/usr/lib/python3.9/site-packages/transitions/core.py", line 420, in _process
    if trans.execute(event_data):
  File "/usr/lib/python3.9/site-packages/transitions/core.py", line 272, in execute
    self._change_state(event_data)
  File "/usr/lib/python3.9/site-packages/transitions/core.py", line 282, in _change_state
    event_data.machine.get_state(self.dest).enter(event_data)
  File "/usr/lib/python3.9/site-packages/refind_btrfs/state_management/states.py", line 179, in enter
    bootable_snapshot.modify_partition_table_using(
  File "/usr/lib/python3.9/site-packages/refind_btrfs/device/subvolume.py", line 200, in modify_partition_table_using
    device_command.save_partition_table(replacement_partition_table)
  File "/usr/lib/python3.9/site-packages/refind_btrfs/system/fstab_command.py", line 98, in save_partition_table
    modified_mount_options = str(filesystem.mount_options)
  File "/usr/lib/python3.9/site-packages/refind_btrfs/device/mount_options.py", line 80, in __str__
    result[position] = constants.PARAMETERIZED_OPTION_SEPARATOR.join(
IndexError: list assignment index out of range

Venom1991 commented 3 years ago

OK, it's something with the fstab file. Are the fstab contents that you posted here the actual (unchanged) ones?

Unrelated, but having two boot stanzas matched is also a bit weird unless you have multiple kernels installed. Upload the refind.conf as well and I'll have a look.

Th3Whit3Wolf commented 3 years ago

> OK, it's something with the fstab file. Are the fstab contents that you posted here the actual (unchanged) ones?

I removed the /btrfs mountpoint, other than that yes.

> Unrelated, but having two boot stanzas matched is also a bit weird unless you have multiple kernels installed. Upload the refind.conf as well and I'll have a look.

I do have two kernels installed

menuentry "Arch Linux" {
    icon     /EFI/refind/themes/refind-dreary/icons/os_arch.png
    volume   Arch
    loader   /vmlinuz-linux
    initrd   /initramfs-linux.img
    options  "rd.luks.name=e4dca43a-21bd-4598-88fc-371dd20695a4=crypt root=/dev/mapper/crypt rootflags=subvol=@ rw quiet nmi_watchdog=0 kernel.unprivileged_userns_clone=0 net.core.bpf_jit_harden=2 apparmor=1 lsm=lockdown,yama,apparmor systemd.unified_cgroup_hierarchy=1 add_efi_memmap initrd=\amd-ucode.img"
    submenuentry "Boot - terminal" {
        add_options "systemd.unit=multi-user.target"
    }
}

menuentry "Arch Linux - Low Latency" {
    icon     /EFI/refind/themes/refind-dreary/icons/os_arch.png
    volume   Arch
    loader   /vmlinuz-linux-xanmod
    initrd   /initramfs-linux-xanmod.img
    options  "rd.luks.name=e4dca43a-21bd-4598-88fc-371dd20695a4=crypt root=/dev/mapper/crypt rootflags=subvol=@ rw quiet nmi_watchdog=0 kernel.unprivileged_userns_clone=0 net.core.bpf_jit_harden=2 apparmor=1 lsm=lockdown,yama,apparmor systemd.unified_cgroup_hierarchy=1 add_efi_memmap initrd=\amd-ucode.img"
    submenuentry "Boot - terminal" {
        add_options "systemd.unit=multi-user.target"
    }
}

include themes/refind-dreary/theme.conf

Should I put the second kernel as a submenuentry to the first entry?

Venom1991 commented 3 years ago

No, that's just fine - this tool will generate two corresponding boot stanzas. You shouldn't have to modify anything. I also plan to add another one once the LTS kernel reaches version 5.10.

You were probably aware of this but just bear in mind that with your setup the kernels themselves aren't versioned so the "loader" and "initrd" fields will remain the same.

If you wish, you can include sub-menus ("Boot - terminal", in your case) as well, the option is in the config file ("false" by default).

I'll try to figure this out and report back here. Sorry for the inconvenience.

Th3Whit3Wolf commented 3 years ago

> You were probably aware of this but just bear in mind that with your setup the kernels themselves aren't versioned so the "loader" and "initrd" fields will remain the same.

Yeah, unfortunately that's one of the downsides of using refind and luks encryption right now. Hopefully one day refind will support booting from an encrypted boot partition. Since grub right now has only limited ability to do it with LUKS2 I think best case I'm holding out for a long while.

> If you wish, you can include sub-menus ("Boot - terminal", in your case) as well, the option is in the config file ("false" by default).

Thanks for the info!

> I'll try to figure this out and report back here. Sorry for the inconvenience.

No trouble at all, I really appreciate you helping to get this working on my system.

Venom1991 commented 3 years ago

By inspecting your fstab I believe I see the problem now. The "subvol" option appears twice (and I expect only one such option) - with and without the "/" prefix (default subvolume's path). Same thing happened to me after using the genfstab script but I removed the redundant "subvol" option (doesn't matter which one, either should be fine) manually from every relevant fstab entry. You can keep the "subvolid" option if you want (I do too) but be aware of this situation.

I'll add a check for that and raise a meaningful exception but for now you can try modifying the fstab. The problem with this approach is that you have to repeat it for every created snapshot (problematic with Snapper's read-only snapshots) - or you can delete all of them and start fresh (after fixing the main fstab file found on the @ subvolume).

Snapper has commands for that (it's on the wiki) but you'll have to use btrfs progs to delete that one writable snapshot which was created in the "/root/.refind-btrfs" directory:

btrfs subvolume delete /root/.refind-btrfs/rwsnap_2021-01-21_00-00-05_ID316
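
The check for a repeated "subvol" option mentioned above could look roughly like the following. This is only a minimal sketch and not the code that refind-btrfs uses: the function names and the sample fstab entry are hypothetical, and it assumes the mount options are the fourth whitespace-separated field of an fstab line.

```python
from collections import Counter


def duplicated_option_names(mount_options: str) -> list[str]:
    """Return the names of 'name=value' options that occur more than once."""
    names = [
        option.split("=", 1)[0]
        for option in mount_options.split(",")
        if "=" in option
    ]
    counts = Counter(names)
    return [name for name, count in counts.items() if count > 1]


def check_fstab_line(line: str) -> None:
    """Raise ValueError if an fstab entry repeats an option such as 'subvol'."""
    fields = line.split()
    if len(fields) < 4 or line.lstrip().startswith("#"):
        return  # comment, blank or malformed line: nothing to check
    duplicates = duplicated_option_names(fields[3])
    if duplicates:
        raise ValueError(
            f"mount point {fields[1]!r} repeats option(s): {', '.join(duplicates)}"
        )


# Hypothetical entry with "subvol" given both with and without the "/" prefix.
sample = "/dev/mapper/crypt / btrfs rw,noatime,subvolid=256,subvol=/@,subvol=@ 0 0"
```

Calling `check_fstab_line(sample)` raises a `ValueError` naming "subvol", which is easier to act on than the `IndexError` in the traceback.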

Th3Whit3Wolf commented 3 years ago

> By inspecting your fstab I believe I see the problem now. The "subvol" option appears twice (and I expect only one such option) - with and without the "/" prefix (default subvolume's path). Same thing happened to me after using the genfstab script but I removed the redundant "subvol" option (doesn't matter which one, either should be fine) manually from every relevant fstab entry. You can keep the "subvolid" option if you want (I do too) but be aware of this situation.

Is there any benefit to having the subvolid there or is it just more verbose?

> I'll add a check for that and raise a meaningful exception but for now you can try modifying the fstab. The problem with this approach is that you have to repeat it for every created snapshot (problematic with Snapper's read-only snapshots) - or you can delete all of them and start fresh (after fixing the main fstab file found on the @ subvolume).
>
> Snapper has commands for that (it's on the wiki) but you'll have to use btrfs progs to delete that one writable snapshot which was created in the "/root/.refind-btrfs" directory:
>
> btrfs subvolume delete /root/.refind-btrfs/rwsnap_2021-01-21_00-00-05_ID316

In case anyone else ends up in this situation: after fixing the fstab, run this.

cd /.snapshots
# Make every Snapper snapshot writable
for i in *; do sudo btrfs property set "/.snapshots/$i/snapshot" ro false; done
# Print the properties to confirm the change
for i in *; do sudo btrfs property get "/.snapshots/$i/snapshot"; done
# Copy the fixed fstab into every snapshot
for i in *; do sudo cp -f /etc/fstab "/.snapshots/$i/snapshot/etc/fstab"; done

It works now! Thank you!!

Venom1991 commented 3 years ago

I really wouldn't know, perhaps the mounting process is a bit quicker when using IDs instead of subvolume names? I suppose that it is easier to identify a subvolume using its own ID but this is just a really wild guess.

Either way, having it defined along with "subvol" might just be completely useless and redundant.

Th3Whit3Wolf commented 3 years ago

Sorry about that, it turns out it's not quite working yet. refind-btrfs exits successfully but nothing is created inside /root/.refind-btrfs and there are no additional submenuentries.

Output of refind-btrfs

Initializing the block devices using lsblk.
Initializing the physical partition table for device '/dev/zram0' using lsblk.
Initializing the live partition table for device '/dev/zram0' using findmnt.
Initializing the physical partition table for device '/dev/zram1' using lsblk.
Initializing the live partition table for device '/dev/zram1' using findmnt.
Initializing the physical partition table for device '/dev/zram2' using lsblk.
Initializing the live partition table for device '/dev/zram2' using findmnt.
Initializing the physical partition table for device '/dev/zram3' using lsblk.
Initializing the live partition table for device '/dev/zram3' using findmnt.
Initializing the physical partition table for device '/dev/zram4' using lsblk.
Initializing the live partition table for device '/dev/zram4' using findmnt.
Initializing the physical partition table for device '/dev/zram5' using lsblk.
Initializing the live partition table for device '/dev/zram5' using findmnt.
Initializing the physical partition table for device '/dev/zram6' using lsblk.
Initializing the live partition table for device '/dev/zram6' using findmnt.
Initializing the physical partition table for device '/dev/zram7' using lsblk.
Initializing the live partition table for device '/dev/zram7' using findmnt.
Initializing the physical partition table for device '/dev/nvme0n1' using lsblk.
Initializing the live partition table for device '/dev/nvme0n1' using findmnt.
Found the ESP mounted at '/boot' on '/dev/nvme0n1p1'.
Found the root partition on '/dev/mapper/crypt'.
Found a separate boot partition on '/dev/nvme0n1p1'.
Searching for snapshots of the '@' subvolume in the '/.snapshots' directory.
Found subvolume '@' mounted as the root partition.
Found 5 snapshots of the '@' subvolume.
Searching for the 'refind.conf' file on '/dev/nvme0n1p1'.
Found 2 boot stanzas matched with the root partition.
WARNING: No changes were detected, aborting...

Here's the last 24 lines of refind.conf

menuentry "Arch Linux" {
    icon     /EFI/refind/themes/refind-dreary/icons/os_arch.png
    volume   Arch
    loader   /vmlinuz-linux
    initrd   /initramfs-linux.img
    options  "rd.luks.name=e4dca43a-21bd-4598-88fc-371dd20695a4=crypt root=/dev/mapper/crypt rootflags=subvol=@ rw quiet nmi_watchdog=0 kernel.unprivileged_userns_clone=0 net.core.bpf_jit_harden=2 apparmor=1 lsm=lockdown,yama,apparmor systemd.unified_cgroup_hierarchy=1 add_efi_memmap initrd=\amd-ucode.img"
    submenuentry "Boot - terminal" {
        add_options "systemd.unit=multi-user.target"
    }
}

menuentry "Arch Linux - Low Latency" {
    icon     /EFI/refind/themes/refind-dreary/icons/os_arch.png
    volume   Arch
    loader   /vmlinuz-linux-xanmod
    initrd   /initramfs-linux-xanmod.img
    options  "rd.luks.name=e4dca43a-21bd-4598-88fc-371dd20695a4=crypt root=/dev/mapper/crypt rootflags=subvol=@ rw quiet nmi_watchdog=0 kernel.unprivileged_userns_clone=0 net.core.bpf_jit_harden=2 apparmor=1 lsm=lockdown,yama,apparmor systemd.unified_cgroup_hierarchy=1 add_efi_memmap initrd=\amd-ucode.img"
    submenuentry "Boot - terminal" {
        add_options "systemd.unit=multi-user.target"
    }
}

include themes/refind-dreary/theme.conf
include btrfs-snapshot-stanzas/arch_vmlinuz-linux.confinclude btrfs-snapshot-stanzas/arch_vmlinuz-linux-xanmod.conf

Th3Whit3Wolf commented 3 years ago

Output of cat /boot/EFI/refind/btrfs-snapshot-stanzas/arch_vmlinuz-linux.conf

menuentry "Arch Linux (rwsnap_2021-01-21_20-00-01_ID332)" {
    icon /EFI/refind/themes/refind-dreary/icons/os_arch.png
    volume Arch
    loader /vmlinuz-linux
    initrd /initramfs-linux.img
    options "rd.luks.name=e4dca43a-21bd-4598-88fc-371dd20695a4=crypt root=/dev/mapper/crypt rootflags=subvol=@/root/.refind-btrfs/rwsnap_2021-01-21_20-00-01_ID332 rw quiet nmi_watchdog=0 kernel.unprivileged_userns_clone=0 net.core.bpf_jit_harden=2 apparmor=1 lsm=lockdown,yama,apparmor systemd.unified_cgroup_hierarchy=1 add_efi_memmap initrd=\amd-ucode.img"
    submenuentry "Arch Linux (rwsnap_2021-01-21_00-00-05_ID316)" {
        options "rd.luks.name=e4dca43a-21bd-4598-88fc-371dd20695a4=crypt root=/dev/mapper/crypt rootflags=subvol=@snapshots/9/snapshot rw quiet nmi_watchdog=0 kernel.unprivileged_userns_clone=0 net.core.bpf_jit_harden=2 apparmor=1 lsm=lockdown,yama,apparmor systemd.unified_cgroup_hierarchy=1 add_efi_memmap initrd=\amd-ucode.img"
    }
    submenuentry "Arch Linux (rwsnap_2021-01-20_01-00-13_ID287)" {
        options "rd.luks.name=e4dca43a-21bd-4598-88fc-371dd20695a4=crypt root=/dev/mapper/crypt rootflags=subvol=@snapshots/3/snapshot rw quiet nmi_watchdog=0 kernel.unprivileged_userns_clone=0 net.core.bpf_jit_harden=2 apparmor=1 lsm=lockdown,yama,apparmor systemd.unified_cgroup_hierarchy=1 add_efi_memmap initrd=\amd-ucode.img"
    }
    submenuentry "Arch Linux (rwsnap_2021-01-19_10-00-34_ID275)" {
        options "rd.luks.name=e4dca43a-21bd-4598-88fc-371dd20695a4=crypt root=/dev/mapper/crypt rootflags=subvol=@snapshots/2/snapshot rw quiet nmi_watchdog=0 kernel.unprivileged_userns_clone=0 net.core.bpf_jit_harden=2 apparmor=1 lsm=lockdown,yama,apparmor systemd.unified_cgroup_hierarchy=1 add_efi_memmap initrd=\amd-ucode.img"
    }
    submenuentry "Arch Linux (rwsnap_2021-01-19_09-44-47_ID274)" {
        options "rd.luks.name=e4dca43a-21bd-4598-88fc-371dd20695a4=crypt root=/dev/mapper/crypt rootflags=subvol=@snapshots/1/snapshot rw quiet nmi_watchdog=0 kernel.unprivileged_userns_clone=0 net.core.bpf_jit_harden=2 apparmor=1 lsm=lockdown,yama,apparmor systemd.unified_cgroup_hierarchy=1 add_efi_memmap initrd=\amd-ucode.img"
    }
}

Output of cat /boot/EFI/refind/btrfs-snapshot-stanzas/arch_vmlinuz-linux-xanmod.conf

menuentry "Arch Linux - Low Latency (rwsnap_2021-01-21_20-00-01_ID332)" {
    icon /EFI/refind/themes/refind-dreary/icons/os_arch.png
    volume Arch
    loader /vmlinuz-linux-xanmod
    initrd /initramfs-linux-xanmod.img
    options "rd.luks.name=e4dca43a-21bd-4598-88fc-371dd20695a4=crypt root=/dev/mapper/crypt rootflags=subvol=@/root/.refind-btrfs/rwsnap_2021-01-21_20-00-01_ID332 rw quiet nmi_watchdog=0 kernel.unprivileged_userns_clone=0 net.core.bpf_jit_harden=2 apparmor=1 lsm=lockdown,yama,apparmor systemd.unified_cgroup_hierarchy=1 add_efi_memmap initrd=\amd-ucode.img"
    submenuentry "Arch Linux - Low Latency (rwsnap_2021-01-21_00-00-05_ID316)" {
        options "rd.luks.name=e4dca43a-21bd-4598-88fc-371dd20695a4=crypt root=/dev/mapper/crypt rootflags=subvol=@snapshots/9/snapshot rw quiet nmi_watchdog=0 kernel.unprivileged_userns_clone=0 net.core.bpf_jit_harden=2 apparmor=1 lsm=lockdown,yama,apparmor systemd.unified_cgroup_hierarchy=1 add_efi_memmap initrd=\amd-ucode.img"
    }
    submenuentry "Arch Linux - Low Latency (rwsnap_2021-01-20_01-00-13_ID287)" {
        options "rd.luks.name=e4dca43a-21bd-4598-88fc-371dd20695a4=crypt root=/dev/mapper/crypt rootflags=subvol=@snapshots/3/snapshot rw quiet nmi_watchdog=0 kernel.unprivileged_userns_clone=0 net.core.bpf_jit_harden=2 apparmor=1 lsm=lockdown,yama,apparmor systemd.unified_cgroup_hierarchy=1 add_efi_memmap initrd=\amd-ucode.img"
    }
    submenuentry "Arch Linux - Low Latency (rwsnap_2021-01-19_10-00-34_ID275)" {
        options "rd.luks.name=e4dca43a-21bd-4598-88fc-371dd20695a4=crypt root=/dev/mapper/crypt rootflags=subvol=@snapshots/2/snapshot rw quiet nmi_watchdog=0 kernel.unprivileged_userns_clone=0 net.core.bpf_jit_harden=2 apparmor=1 lsm=lockdown,yama,apparmor systemd.unified_cgroup_hierarchy=1 add_efi_memmap initrd=\amd-ucode.img"
    }
    submenuentry "Arch Linux - Low Latency (rwsnap_2021-01-19_09-44-47_ID274)" {
        options "rd.luks.name=e4dca43a-21bd-4598-88fc-371dd20695a4=crypt root=/dev/mapper/crypt rootflags=subvol=@snapshots/1/snapshot rw quiet nmi_watchdog=0 kernel.unprivileged_userns_clone=0 net.core.bpf_jit_harden=2 apparmor=1 lsm=lockdown,yama,apparmor systemd.unified_cgroup_hierarchy=1 add_efi_memmap initrd=\amd-ucode.img"
    }
}
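
Comparing the generated stanzas above with the originals, the "loader" and "initrd" fields are unchanged and only the stanza name and the `subvol` value inside `rootflags` differ. A minimal sketch of that rewrite follows. The function names are hypothetical and this is not how refind-btrfs itself is implemented; it also assumes `subvol` is the first flag in `rootflags`, as it is in these stanzas.

```python
import re

# Matches the value that follows "rootflags=subvol=" in the kernel options.
SUBVOL_VALUE = re.compile(r"(?<=rootflags=subvol=)\S+")


def options_for_snapshot(options: str, snapshot_subvolume: str) -> str:
    """Return the options string with the root subvolume replaced."""
    # A function is used as the replacement so that backslashes in the
    # subvolume path are never interpreted as regex escapes.
    return SUBVOL_VALUE.sub(lambda _match: snapshot_subvolume, options, count=1)


def name_for_snapshot(stanza_name: str, snapshot_name: str) -> str:
    """Append the snapshot name in parentheses, as in the stanzas above."""
    return f"{stanza_name} ({snapshot_name})"


# Shortened version of the options from the thread (raw string because of "\a").
original = r"root=/dev/mapper/crypt rootflags=subvol=@ rw quiet initrd=\amd-ucode.img"
rewritten = options_for_snapshot(original, "@snapshots/9/snapshot")
```

Everything outside the `subvol` value, including `initrd=\amd-ucode.img`, is passed through untouched.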

Venom1991 commented 3 years ago

A newline is missing here, as there are two "include" directives on one line:

include btrfs-snapshot-stanzas/arch_vmlinuz-linux.confinclude btrfs-snapshot-stanzas/arch_vmlinuz-linux-xanmod.conf

Until I fix this issue, just split the line yourself, but don't delete it. This include step is done only once (during the first run).

~~As for the writable snapshots, I believe these were cached and not recreated. Delete the contents of the /var/lib/refind-btrfs directory (should be one binary file) and try running it again.~~

EDIT: Don't do that. You made Snapper's snapshots writable yourself, so my tool didn't create new ones from them.
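
For anyone who would rather script the manual split of the fused "include" line than edit it by hand, here is a minimal sketch. The function name is hypothetical, and it assumes every included file ends in ".conf" and that the next directive starts immediately after it, as in the line quoted above.

```python
import re

# Zero-width split point: right after ".conf" and right before "include ".
FUSED_INCLUDE = re.compile(r"(?<=\.conf)(?=include\s)")


def split_fused_includes(line: str) -> list[str]:
    """Split a line holding several fused 'include' directives into one per item."""
    return FUSED_INCLUDE.split(line)


fused = (
    "include btrfs-snapshot-stanzas/arch_vmlinuz-linux.conf"
    "include btrfs-snapshot-stanzas/arch_vmlinuz-linux-xanmod.conf"
)
for directive in split_fused_includes(fused):
    print(directive)
```

A line that holds a single directive comes back unchanged as a one-element list, so the function can be applied to every line of refind.conf.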

Th3Whit3Wolf commented 3 years ago

That did it! I just booted off a snapshot.

Venom1991 commented 3 years ago

That's great! If you want, you could also verify this by running:

findmnt /

just to make sure that / is actually a snapshot.

So, I suggest that you open another issue concerning the output of "include" directives to "refind.conf" (the missing newline). You can also start the service and expect writable snapshots to appear in the "/root/.refind-btrfs" directory once Snapper does its thing.

Finally, I had a look at the design of the theme that you're using. Does this theme even show what you're booting into like the default theme does?

Th3Whit3Wolf commented 3 years ago

Output of findmnt /

TARGET SOURCE                                                                    FSTYPE OPTIONS
/      /dev/mapper/crypt[/@/root/.refind-btrfs/rwsnap_2021-01-21_20-00-01_ID332] btrfs  rw,noatime,compress-force=zstd:3,ssd,discard=async,space_cache=v2,autodefrag,subvolid=333,subvol=/@/root/.refind-btrfs/rwsnap_2021-01-21_20-00-01_ID332
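
The OPTIONS column above can also be checked from a script. The sketch below only parses an options string. The function names are hypothetical, and the default directory is the "/root/.refind-btrfs" location used in this thread.

```python
from typing import Optional


def mounted_subvolume(options: str) -> Optional[str]:
    """Return the value of the 'subvol' mount option, or None if it is absent."""
    for option in options.split(","):
        name, _, value = option.partition("=")
        if name == "subvol":
            return value
    return None


def is_refind_btrfs_snapshot(options: str, directory: str = "/root/.refind-btrfs") -> bool:
    """True if the mounted subvolume lives inside the given snapshot directory."""
    subvolume = mounted_subvolume(options)
    return subvolume is not None and f"{directory}/" in subvolume


# The options reported by findmnt in the output above.
options = (
    "rw,noatime,compress-force=zstd:3,ssd,discard=async,space_cache=v2,autodefrag,"
    "subvolid=333,subvol=/@/root/.refind-btrfs/rwsnap_2021-01-21_20-00-01_ID332"
)
print(is_refind_btrfs_snapshot(options))
```

The options string itself can be obtained with `findmnt -n -o OPTIONS /`, which prints only that column without the header line.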

> So, I suggest that you open another issue concerning the output of "include" directives to "refind.conf" (the missing newline). You can also start the service and expect writable snapshots to appear in the "/root/.refind-btrfs" directory once Snapper does its thing.

Will do.

> Finally, I had a look at the design of the theme that you're using. Does this theme even show what you're booting into like the default theme does?

It does if you edit its theme.conf to not hide it.

Venom1991 commented 3 years ago

Oh yeah, I almost forgot about the warning you've encountered. That's an expected outcome, as no snapshots were found for either addition or deletion. Snapshots in the "/root/.refind-btrfs" directory will eventually get deleted; when that happens depends on the "count" option in the config file. If you want them to remain there indefinitely (for whatever reason), you can add their UUIDs to the "exclusion_list" array. It's explained in the config file's comments.
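
To illustrate how "count" and "exclusion_list" might interact, here is a rough sketch. It is not taken from the refind-btrfs source: the `Snapshot` class and the function name are hypothetical, and it assumes that the newest "count" snapshots are kept and that excluded snapshots do not count towards that limit.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Snapshot:
    uuid: str
    created: datetime


def snapshots_to_delete(snapshots, count, exclusion_list):
    """Return the snapshots that fall outside the newest `count`, skipping excluded UUIDs."""
    excluded = set(exclusion_list)
    candidates = sorted(
        (snapshot for snapshot in snapshots if snapshot.uuid not in excluded),
        key=lambda snapshot: snapshot.created,
        reverse=True,  # newest first
    )
    return candidates[count:]


# Hypothetical data: four snapshots, the oldest one is excluded from deletion.
snapshots = [
    Snapshot("uuid-a", datetime(2021, 1, 19, 9, 44)),
    Snapshot("uuid-b", datetime(2021, 1, 19, 10, 0)),
    Snapshot("uuid-c", datetime(2021, 1, 20, 1, 0)),
    Snapshot("uuid-d", datetime(2021, 1, 21, 0, 0)),
]
doomed = snapshots_to_delete(snapshots, count=2, exclusion_list=["uuid-a"])
```

With these values only "uuid-b" is selected: "uuid-d" and "uuid-c" are the two newest, and "uuid-a" is protected by the exclusion list.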