oddlama / gentoo-install

A gentoo installer with a TUI interface that supports systemd and OpenRC, EFI and BIOS, as well as variable disk layouts using ext4, zfs, btrfs, luks and mdraid.
MIT License

ZFS Support? #116

Closed rharmonson closed 3 weeks ago

rharmonson commented 4 weeks ago

I have reviewed configure, install, gentoo.conf.example, and scripts/functions.sh, but I cannot tell whether zfs is currently supported or, perhaps, I don't understand how to configure it to set up zfs.

The only options shown by configure are ext4 and btrfs. I found the function check_wanted_programs() in utils.sh and a call to download the archzfs-iso install script. I did look. :D

Your time and guidance is appreciated.

oddlama commented 4 weeks ago

It is. Here you can find the list of available layouts in the code.

But with ZFS there are some caveats to watch out for. Notably, there have been times when the latest zfs package doesn't support the latest available kernel, which results in errors when compiling the zfs module. There is no simple way to detect this ahead of time, so it must be addressed manually by downgrading the kernel if it occurs.
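
If that situation comes up, one common way to hold the kernel back on Gentoo is a package.mask entry. A minimal sketch, assuming the dist-kernel packages and a purely illustrative file name and version cutoff:

# /etc/portage/package.mask/zfs-kernel-cap  (file name and versions are only examples)
# mask kernels newer than the last version the installed zfs module builds against
>sys-kernel/gentoo-kernel-6.6.30
>sys-kernel/gentoo-kernel-bin-6.6.30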

Secondly you won't have much control over the initial datasets that are created. If you want a different layout, you should adjust it after the installer has finished. This is what you currently get:

https://github.com/oddlama/gentoo-install/blob/039d1a8b3585c067722243cab67bfe5ad4a3b6f3/scripts/functions.sh#L500-L505
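
Adjusting the layout afterwards is just ordinary zfs administration on the installed system. A minimal sketch with purely illustrative dataset names, assuming the pool is called rpool as elsewhere in this thread:

zfs create -o mountpoint=/home rpool/home   # add an extra dataset for /home (example only)
zfs list -r -o name,mountpoint rpool        # review the resulting layout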

rharmonson commented 3 weeks ago

I hear you on the kernel, and good callout. In fact, I saw the archiso script fail to install 'zfs' using the current Arch ISO with kernel 6.10. I will download and test an older ISO from https://archive.archlinux.org/iso/.

Thank you for the timely response and patience. I will update shortly.

rharmonson commented 3 weeks ago

Using the January 2024 Arch ISO appears to be working. However, rebooting results in the zfs pool reporting that it was not exported, and the system drops to a shell without loading the OS.

Ran the install again and verified the zfs pool is not being exported (unmounted). My initial thought was that it would be an easy fix using zpool export rpool; however, there appears to be a process (unverified) locking the mount, resulting in "pool is busy." I could force it, but that's not a good idea.

I will keep looking, but if you have a recommendation, I'd love to hear it.

I do want to say that your install script is loads more elegant than my bash install scripts. Mine are really just a list of shell commands. Well done!

oddlama commented 3 weeks ago

I will keep looking, but if you have a recommendation, I'd love to hear it.

The chrooted system is probably still mounted in the live system, so you need to lazily unmount it with umount -l /tmp/gentoo-install/root (lazy ensures that inner mounts like /sys, /dev, ... are orphaned and also unmounted) before exporting the pool.
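
In other words, something along these lines from the live environment, assuming the installer's default mount point and a pool named rpool as above:

umount -l /tmp/gentoo-install/root   # lazily detach the chroot, including inner mounts like /sys and /dev
zpool export rpool                   # with the mounts gone, the pool should no longer be busy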

My initial thought was that it would be an easy fix using zpool export rpool; however, there appears to be a process (unverified) locking the mount, resulting in "pool is busy." I could force it, but that's not a good idea.

If you have already installed the system, you can just boot into a live system with zfs, force import the pool, and then export it again instead of reinstalling. Force importing is never a dangerous action; all it does is ignore the host id of the system that last imported it, and the export explicitly allows the next import to succeed on any host.
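
As a rough sketch of that recovery path, run from a zfs-capable live system (pool name taken from this thread):

zpool import -f rpool   # -f ignores the hostid of the system that last imported the pool
zpool export rpool      # clean export, so the installed system can import it on boot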

In this particular case you can also probably force export the pool, because you know that all logical file writes are complete. The only blockers are the inner mounts, so you should be fine.
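
So in this particular situation something like the following would also be an option, but only because nothing is still writing to the pool:

zpool export -f rpool   # force export; forcibly unmounts any remaining datasets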

I do want to say that your install script is loads more elegant than my bash install scripts. Mine are really just a list of shell commands. Well done!

Thanks, glad it was useful to you!

rharmonson commented 3 weeks ago

I am going to close this issue since I have moved past the initial problem of identifying an Arch ISO that works with gentoo-install, but I will update later with the solution for unmounting the pool.

I am moving from hardware to a virtual machine so I can snapshot the VM in a running state and test efficiently. I have a hypothesis on one of the factors needed to successfully unmount the pool.

rharmonson commented 3 weeks ago

Got rpool to export!

umount -R /tmp/gentoo-install/root   # recursively unmount the chroot and all inner mounts
rmdir /tmp/gentoo-install/root       # remove the now-empty mount point
zpool export rpool                   # export the pool cleanly

My belief is that 'rmdir /tmp/gentoo-install/root' removes a blocker that was preventing the export. No idea of the root cause, but lsof /tmp/gentoo-install/root shows no open files prior to deletion. Rebooted without being dumped into dracut and logged in as root. Yeah!

I may investigate some more later to see if I can find the root cause, but at least I can properly export the zfs pool.

Again, thank you. Have a wonderful day.

oddlama commented 3 weeks ago

You too! I believe lsof doesn't know about kernel processes depending on the mount. If you want to try that again, adding -l to umount might solve the problem. If not, I'm not sure either.

rharmonson commented 3 weeks ago

I failed to share with you that I did try umount -l and it didn't work by itself. When followed by removing the "root" directory, both umount -l and umount -R worked.

I took a cursory look at processes, but there were so many that I shied away, focused elsewhere, and then found the above solution. I did find a number of posts where a process was preventing zfs exports. It's possible it is a process. Something to explore if I revisit it.

rharmonson commented 3 weeks ago

Update

Wanted to share a couple items.

hostid

I want to share a fix for an annoyance when building Gentoo in a chroot using OpenZFS. It is not an issue with gentoo-install.

Executing zpool status rpool results in:

status: Mismatch between pool hostid and system hostid on imported pool.

Fix:

  1. rm /etc/hostid
  2. Run zdb and note the hostid number
  3. Convert it to hex using printf "%x\n" [number]
  4. zgenhostid [hexnumber]

This results in /etc/hostid matching the hostid stored in the pool, and zpool status no longer complains.
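
As a command sketch of those four steps (the grep is just one way to pick the hostid out of the zdb output, and the numbers are only examples):

rm /etc/hostid               # drop the mismatched hostid file
zdb | grep hostid            # note the decimal hostid stored in the pool config, e.g. 2882343476
printf "%x\n" 2882343476     # convert that decimal value to hex, here abcd1234
zgenhostid abcd1234          # write /etc/hostid with the matching value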

use flags

I am still figuring this out but I am using the following at this time:

Global: dist-kernel
Local: sys-fs/zfs-kmod dist-kernel-cap
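
For reference, a sketch of where those flags would typically live on a Gentoo system (the paths are the usual Portage locations, not something gentoo-install manages for you):

# /etc/portage/make.conf
USE="${USE} dist-kernel"

# /etc/portage/package.use/zfs
sys-fs/zfs-kmod dist-kernel-cap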