geckolinux / geckolinux-project

GeckoLinux bug tracker and documentation
https://geckolinux.github.io

Enabled Zstd compression prevents grub from loading new files on btrfs root filesystem #239

Closed greg-weber closed 3 years ago

greg-weber commented 3 years ago

I may be mistaken in my research of this, but I hope it is useful. It seems zstd compression is enabled and /boot is on btrfs, with compression not turned off for /boot. The system failed to boot after updates because new files needed in /boot were written with zstd compression and grub could not read them.

GeckoLinux_ROLLING_Plasma.x86_64-999.210519.0.iso was installed in VirtualBox to test an openSUSE-type install with less setup. The virtual machine was rebooted.

I checked the command to update Tumbleweed/Gecko Rolling and ran zypper dup to completion, then restarted GeckoLinux Rolling. The grub config file didn't load, although grub itself seemed to be in normal mode. Attempting to load the config file didn't bring up the usual grub menu. Attempting to cat the grub config file revealed an error. The grub error is typed out here manually, along with a photo of /etc/fstab and the grub error.

grub> cat (hd0,msdos1)/@/.snapshots/1/snapshot/boot/grub2/grub.cfg
error: ../../grub-core/fs/btrfs.c:1667:compression type 0x3 not supported.

[Image: grub error geckolinux-rolling-plasma-210525]

Searching for the error led to type 0x3 seeming to refer to zstd compression, which is not supported by grub.
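
For reference, the 0x3 in that error matches the compression type numbering of the btrfs on-disk format, where 3 is zstd. A minimal sketch of that mapping in shell follows; btrfs_compress_name is a made-up helper name for illustration, not something GRUB or btrfs-progs ships.

```shell
# Map the compression type number from GRUB's btrfs error to a name.
# Numbering follows the btrfs on-disk format: 0 none, 1 zlib, 2 lzo, 3 zstd.
# btrfs_compress_name is a hypothetical helper name used only in this sketch.
btrfs_compress_name() {
    # $(( ... )) accepts decimal or hex input such as 0x3.
    case "$(( $1 ))" in
        0) echo "none" ;;
        1) echo "zlib" ;;
        2) echo "lzo" ;;
        3) echo "zstd" ;;
        *) echo "unknown" ;;
    esac
}

btrfs_compress_name 0x3   # prints "zstd"
```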

This is where I start to have doubts: while Gecko or perhaps Tumbleweed seems to have an issue with this, I know that I'm running grub on btrfs with zstd myself. I have grub and boot on the btrfs / subvolume with zstd compression enabled. This is on my main laptop with Ubuntu 20.04 LTS, where I manually set up a partition with luks, btrfs, @ and @home subvolumes, and grub-btrfs-apt-timeshift snapshot integration. When setting it up I chose zstd compression for the workstation and it works with no problem. I'm still reporting this because 6 days ago there was a poll for the default filesystem. There is even mention of zstd on btrfs, and it seems there is some issue with it. I have my doubts, as 1200 downloads later I don't see a bug report, so maybe it is just me ha.

Searched around a bit more and found that openSUSE may have disabled zstd. I recall adjusting my partition table for extra space in case grub was a bit big: I left some space after the master boot record and didn't install grub to the partition.

openSUSE disabled zstd. https://build.opensuse.org/package/view_file/openSUSE:Factory/grub2/0001-btrfs-disable-zstd-support-for-i386-pc.patch?expand=1

I've left the virtual machine intact for now and am making a fresh virtual machine to further test.

Perhaps of note, though I don't expect these to have contributed:

(A functional install of openSUSE Leap 15.2 was on virtual hard disk and overwritten with the 3rd option of the 4 available - Phrasing was sort of on the lines of use the space from this prior installation. The subvolumes from Leap are no longer present as expected.)

(A snapshot was attempted from yast in fresh installation. Yast reported an error that there was no snapper config. snapper config was created for root. A snapshot was taken to confirm snapshots were working. I had not used snapper before. 4 more snapshots were created by the zypper dup)

greg-weber commented 3 years ago

I was very tired when looking at this. I didn't really say how much I appreciated the work. I really liked the polish. The fonts and the codecs made the office documents and multimedia instantly usable. I wanted to duplicate it and definitely studied your files, things like how you set up the repos with priorities. Having fewer recommended packages seems to make things quite slim and fit. Before you switched to btrfs as the default I had tried Gecko and took many ideas.

If my report turns out to be poo, at least I need to say thanks.

geckolinux commented 3 years ago

Hi there, thanks very much for the report!

I just did a zypper dup on a GeckoLinux ROLLING Gnome installation installed on real hardware (in UEFI mode) the day of the release with Btrfs and Zstd compression, and it still boots correctly from GRUB.

With that being said, I don't know how it can be working. ;-) Now that you mention this issue, I have discovered reports that openSUSE Tumbleweed doesn't work with Zstd compressed /boot because they disabled it. But I understood the patch you found to be referring to 32-bit systems. I very well could be wrong though. So please do let me know if you find anything else. I'll also do some more testing in a fresh VirtualBox.

geckolinux commented 3 years ago

Another update: This bug does appear for me in VirtualBox with a standard BIOS system (EFI not enabled). So it looks like we'll need to move to LZO compression instead of Zstd. I'll release updated ISOs as soon as possible.

Thanks again for reporting this bug.

geckolinux commented 3 years ago

It looks like the issue can be prevented with sudo chattr -R -c +m /boot before running the system update. Can you think of any downsides or future issues with this method? The advantage is that this would still allow for using Zstd compression on the rest of the filesystem.
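
In that command, -R recurses into /boot, -c clears the compress attribute, and +m sets the "don't compress" attribute. As far as I understand, this only affects data written afterwards, so files that are already compressed stay that way until they are rewritten. A rough way to check the result is to look for entries that lack the m flag. The sketch below assumes lsattr prints a flags field followed by a path on each line; paths_without_nocompress is a hypothetical helper name.

```shell
# Print paths whose attribute field lacks the 'm' (don't compress) flag.
# Reads `lsattr -R`-style lines ("<flags> <path>") on stdin.
# paths_without_nocompress is a hypothetical helper name used only here.
paths_without_nocompress() {
    while read -r flags path; do
        # Blank lines and "dir:" header lines have no second field.
        if [ -n "$path" ]; then
            case "$flags" in
                /*|*m*) ;;                    # header line, or already flagged
                *) printf '%s\n' "$path" ;;   # could still be compressed
            esac
        fi
    done
    return 0
}
```

On a real system this could be fed from `sudo lsattr -R /boot 2>/dev/null | paths_without_nocompress`. Empty output would suggest everything under /boot carries the flag.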

geckolinux commented 3 years ago

Fixed in the latest release: https://github.com/geckolinux/geckolinux-project/releases/tag/210526.999

Thanks again to @greg-weber for the helpful bug report.

greg-weber commented 3 years ago

Wow you are fast! Yeah I thought about configuring btrfs to not use zstd on /boot. The only disadvantage is experienced users will not be able to use grub to view files. This doesn't seem to be a common use case. Thank you for your excellent work.

I'm glad the bug report was useful. Yeah I noticed the patch wasn't quite the one I thought of, but ran out of time before I found some documentation or such to confirm the guess.

Busy day here again. 8am-630pm then called in till 11pm. Barely had made it home. I had checked the bug report when I got home, but never had a chance to reply before I got called in again. At the time I was mixed up and read it as turning off COW for btrfs instead of compression, oops lol.

greg-weber commented 3 years ago

I think i386-pc refers to grub2 legacy BIOS boot, and the one named something like x86_64-efi is for newer EFI boot. Just to make sure this is not confused with whether you're running a 32-bit kernel or 64-bit kernel, as this is pre-OS-kernel in the boot procedure. I just thought that would explain your results booting on EFI hardware. I think you probably already knew this.
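
To see which of the two cases a given machine falls into, one common check is whether the kernel exposes /sys/firmware/efi, which exists only after a UEFI boot. A small sketch, where firmware_boot_mode and its optional argument are hypothetical names introduced here:

```shell
# Report whether the running system was booted via UEFI or legacy BIOS.
# The kernel creates /sys/firmware/efi only on UEFI boots.
# firmware_boot_mode is a hypothetical helper; the optional argument
# (an alternative sysfs root) is only there to make it easy to try out.
firmware_boot_mode() {
    sysfs_root="${1:-/sys}"
    if [ -d "$sysfs_root/firmware/efi" ]; then
        echo "uefi"
    else
        echo "bios"
    fi
}

firmware_boot_mode
```

Here "bios" corresponds to the i386-pc GRUB platform named in the patch, and "uefi" to an EFI platform such as x86_64-efi, regardless of whether the kernel itself is 32-bit or 64-bit.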

geckolinux commented 3 years ago

Thanks for that explanation. The terminology revolving around the boot process is dreadfully confusing and arcane with the addition of EFI and GPT partitioning. For example:

https://www.reddit.com/r/openSUSE/comments/k2g6do/btrfs_root_partition_with_zstd_compression_and/gducug7/

We have to be careful to distinguish "bios boot partition", "boot partition", and "/boot partition", which are all different :)

Frankly I resent the added complexity and failure points that Microsoft forced on the PC world with the UEFI thing, and I freely admit I don't understand it very well. So thanks again for catching this and for the explanation.