Open hotburger opened 1 year ago
Some experiments to try to collect more information:
1. Run btrfs-search-metadata file /path/to/vmlinuz (from the python-btrfs package) before and after the failure (i.e. once after reinstalling, and once again when boot fails).
2. Run cp --reflink=always /path/to/vmlinuz /root/foo and then reboot?
I don't know how grub would distinguish one reflink to a file from another, much less be fatally broken by it, so I expect experiment 2 will not trigger a grub failure, and we'll see some anomalous feature (non-zero extent offsets? unsupported compression type? hole in kernel file?) from experiment 1.
Hopefully we get some information that can be turned into an actionable grub bug report.
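A minimal sketch of how the two captures from experiment 1 could be scripted so they can be diffed afterwards. The function name, the output directory and the labels are made up for illustration. The sha256sum step is an extra that is not part of the suggestion above: if the hash is identical in both captures, the file contents are intact and only the extent layout differs. btrfs-search-metadata is only invoked if it is installed, and it normally needs root.

```shell
# collect_boot_file_info FILE OUTDIR LABEL  (hypothetical helper)
# Saves a content hash and, when python-btrfs is installed, the extent
# metadata of FILE under OUTDIR, tagged with LABEL (e.g. "fixed", "broken").
collect_boot_file_info() {
    file=$1
    outdir=$2
    label=$3
    mkdir -p "$outdir" || return 1
    base=$(basename "$file")

    # Content hash: equal hashes in both captures mean the data itself
    # did not change between the working and the broken state.
    sha256sum "$file" > "$outdir/$base-$label.sha256" || return 1

    # Extent metadata, using the command quoted in the thread.
    # Skipped silently if the tool is missing; failures are kept in the log.
    if command -v btrfs-search-metadata >/dev/null 2>&1; then
        btrfs-search-metadata file "$file" \
            > "$outdir/$base-$label.log" 2>&1 || true
    fi
    return 0
}
```

Usage would be something like collect_boot_file_info /path/to/vmlinuz /root/grub-debug fixed right after reinstalling, the same call with broken once boot fails, and then a diff of the two .log files.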
This issue stopped happening for a while, so I couldn't replicate it to gather info. It is happening again though. Creating a reflink did not cause the boot to fail. Logs: vmlinuz-6.2.7-broken.log, vmlinuz-6.2.7-fixed.log
Looks like this is fixed in grub but not released yet:
https://git.savannah.gnu.org/cgit/grub.git/commit/?id=7f4e017a1416bcbdca16de4f923679ec9f003171
I had similar boot issues in versions of grub that supposedly have this fixed (it would panic in various random ways), so I switched to a 3 partition layout with:
/ btrfs
/boot ext4
/boot/efi vfat
which works around the problems.
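To check whether a given system already uses such a layout, that is, whether grub still has to read the kernel through its btrfs driver, something like the following could be used. This is only a sketch: the function names are made up, and it assumes GNU stat from coreutils.

```shell
# fs_type_of PATH: print the type of the filesystem that contains PATH,
# as reported by GNU stat (btrfs is reported as "btrfs").
fs_type_of() {
    stat -f -c %T "$1"
}

# on_btrfs PATH: succeed if PATH is stored on a btrfs filesystem.
on_btrfs() {
    [ "$(fs_type_of "$1")" = "btrfs" ]
}
```

For example, if on_btrfs /boot succeeds, the kernel and initramfs are still on btrfs and the workaround above is not in effect.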
Seems like grub's btrfs implementation is not very good yet.
I have the same problem on Manjaro, using kernel 6.6.8-2-MANJARO and grub 2.12. Before entering the grub menu I get this error:
error: start_image() returned 0x800000000000000001.
Failed to boot both default and fallback entries.
Press any key to continue...
I can get into the grub menu after that, but trying to boot results in "error: you need to load the kernel first." and the system freezes...
I am now successfully using @Jorropo's workaround.
I can confirm this on two separate machines running Arch. Here it is usually amd-ucode.img that gets broken and gives "error: premature end of file". The systems boot if I remove it from the boot entry in Grub.
Chrooting into the installation and reinstalling the ucode also fixes it temporarily.
You can set chattr +C on the boot directory before reinstalling the boot loader and see if that helps. bees won't touch file extents created with this flag on. In other words, setting the flag on already existing files changes nothing, but new files will inherit the flag from the directory. This also removes checksum protection from your boot files, so it can only serve as a temporary work-around.
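A rough sketch of that work-around as shell functions. The function names are invented for illustration and /boot in the usage note is an assumed path. It has to run as root on btrfs, and the kernel or boot loader files still have to be reinstalled afterwards so that they are recreated with the flag.

```shell
# has_nocow_flag LINE: succeed if one line of `lsattr -d` output shows
# the C (No_COW) attribute. Only the attribute column before the first
# space is inspected, so a "C" in the path does not count.
has_nocow_flag() {
    case ${1%% *} in
        *C*) return 0 ;;
        *)   return 1 ;;
    esac
}

# prepare_nocow_dir DIR: set No_COW on DIR and verify that it stuck.
# Files created in DIR afterwards inherit the flag; existing files do not.
prepare_nocow_dir() {
    chattr +C "$1" || return 1
    has_nocow_flag "$(lsattr -d "$1")"
}
```

Usage would be prepare_nocow_dir /boot, followed by reinstalling the kernel package or boot loader as described above.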
GRUB gives me this error after running bees for a few hours. It happens consistently whenever I run bees for long enough. I fix it temporarily by reinstalling the kernel package from chroot. I'm assuming bees is deduping the kernel, which grub doesn't like? Strangely this doesn't happen to my Arch install on the same partition.
I assume that Gentoo is storing another copy of the kernel somewhere while Arch doesn't. The only other difference from my Arch install is a separate subvol for /boot.
grub error message: