nix-community / disko

Declarative disk partitioning and formatting using nix [maintainers=@Lassulus @Enzime]
MIT License

Script not working under hetzner rescue system #815

Open yajo opened 2 days ago

yajo commented 2 days ago

I found a "funny" issue.

When running disko-install, there are some calls to the zfs binary.

It turns out that, on a Hetzner rescue system (the one used in Hetzner Cloud/Robot for installing other systems on the machine), zfs is not available per se, but there exists a different zfs binary that prints a warning to you and asks you to press y to install zfs itself.

See this shell session:

root@rescue / # zdb --help
The Hetzner Rescue System does not come with preinstalled ZFS support,
however, we will attempt to compile and install the latest release for you.
Please read the information below thoroughly before entering any response.

ATTENTION

This script will attempt to install the current OpenZFS release
which is available in the OpenZFS git repository to the Rescue
System. If this script fails, do not contact Hetzner Support, as
it is provided AS-IS and Hetzner will not support the installation
or usage of OpenZFS due to License incompatiblity (see below).
Due to github.com limitations, this script only works via IPv4.

Licenses of OpenZFS and Linux are incompatible

OpenZFS is licensed under the Common Development and Distribution
License (CDDL), and the Linux kernel is licensed under the GNU General
Public License Version 2 (GPL-2). While both are free open source
licenses they are restrictive licenses. The combination of them causes
problems because it prevents using pieces of code exclusively available
under one license with pieces of code exclusively available under the
other in the same binary.

Please be aware that distributing of the binaries may lead to infringing.

Press y to accept this.

Installation logs when I press y: zfs-press-y-install-on-hetzner-rescue.log

After installation, zfs seems to finally work:

root@rescue /nix/store/3ys93zsghig63qpcs67vc37x0rwn1zhl-disk-deactivate # zfs --help
usage: zfs command args ...
[...]

So, when running disko-install --flake ..., I got errors like this:

+ lsblk --output-all --json
+ bash -x
++ dirname /nix/store/3ys93zsghig63qpcs67vc37x0rwn1zhl-disk-deactivate/disk-deactivate
+ jq -r --arg disk_to_clear /dev/sda -f /nix/store/3ys93zsghig63qpcs67vc37x0rwn1zhl-disk-deactivate/disk-deactivate.jq
+ set -fu
++ type zdb
++ zdb -l /dev/sda1
++ sed -nr 's/ +name: '\\''(.*)'\\''/\\1/p'
+ zpool=
bash: line 3: syntax error near unexpected token `then'
bash: line 3: `f [[ -n \"${zpool}\" ]]; then zpool destroy -f \"$zpool\"; zpool labelclear -f \"$zpool\"; fi'

Which seems to be related to this: https://github.com/nix-community/disko/blob/d39ee334984fcdae6244f5a8e6ab857479cbaefe/disk-deactivate/disk-deactivate.jq#L17-L18

But in reality they are related to the fact that zfs is not the zfs that Disko is expecting.

I can automate this with a simple yes | zfs command before running disko-install, but I wonder why Disko is using the preinstalled zfs binary. Shouldn't it use the one provided by nixpkgs? That would make the script more predictable.
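To see which zfs-related binaries a script would pick up from the current PATH, a check along these lines might help. This is only a sketch: resolve_origin is a hypothetical helper name, not part of disko, and it assumes a POSIX shell with readlink -f available.

```shell
# Classify where a command resolves from: the Nix store, the host system,
# or nowhere at all. Hypothetical helper, not part of disko.
resolve_origin() {
  _path=$(command -v "$1" 2>/dev/null) || { echo "missing"; return 0; }
  # Follow symlinks so that profile links into the store are classified as nix.
  _path=$(readlink -f "$_path" 2>/dev/null || echo "$_path")
  case "$_path" in
    /nix/store/*) echo "nix" ;;
    *) echo "host" ;;
  esac
}

# Report the origin of each tool that disk-deactivate may call.
for tool in zfs zpool zdb; do
  printf '%s: %s\n' "$tool" "$(resolve_origin "$tool")"
done
```

On the rescue system I would expect this to report "host" for all three, since only the Hetzner wrapper is on PATH there.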

@moduon MT-7504

iFreilicht commented 1 day ago

Yes, disko usually sets up a modified PATH in the scripts it runs:

$ nix build '.#nixosConfigurations.junction.config.system.build.diskoScript' && head -3 result
#! /nix/store/izpf49b74i15pcr9708s3xdwyqs4jxwl-bash-5.2p32/bin/bash
export PATH=/nix/store/x8jzsy0y1zk30mcvav2rh6lrw1gbzzy3-jq-1.7.1-bin/bin:/nix/store/31yx7grsg9qwywd85ci3yy2xvqh8a1ng-gptfdisk-1.0.10/bin:/nix/store/zdlkg4swdw4smrq2xkmkanh93y84m3id-systemd-minimal-256.4/bin:/nix/store/gyxcg3xlfjjbcj4ryg9sx4r2bmgk6lbh-parted-3.6/bin:/nix/store/f3mrhapkqr1lds8x58fh6rwm1lwh8y8c-util-linux-2.39.4-bin/bin:/nix/store/vsyc8jhsr4d9lm2r8yqq9n3j4i66inlj-gnugrep-3.11/bin:/nix/store/5sacm5pwy33dwwak0ffbggs9724a04ni-dosfstools-4.2/bin:/nix/store/xvfcgg85abpndm1j4c2rajm5gy2c74yi-e2fsprogs-1.47.1-bin/bin:/nix/store/w3glp3899fac65gq9b4apbfzfil761md-zfs-user-2.2.6/bin:/nix/store/3rkmqbpa9x1cq16i7yz1rjl02z6i6p61-coreutils-full-9.5/bin:/nix/store/izpf49b74i15pcr9708s3xdwyqs4jxwl-bash-5.2p32/bin:$PATH
umount -Rv "/mnt" || :

This contains /nix/store/w3glp3899fac65gq9b4apbfzfil761md-zfs-user-2.2.6/bin.

disko-install itself only needs a few dependencies, and it declares those as well in package.nix, but all it does is build the diskoScript and run it. This does eventually lead to disk-deactivate being executed, but this should be within the environment set up by diskoScript.
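If reading the long PATH line by eye is tedious, a small helper can list the packages it names. This is a rough sketch under the assumption that the generated script contains a single line starting with export PATH=, as in the output above; path_packages is a hypothetical name.

```shell
# List the Nix store packages named in the `export PATH=` line of a
# generated script, one per line, with the store hash stripped.
# Hypothetical helper for inspection only.
path_packages() {
  grep -m1 '^export PATH=' "$1" \
    | sed -e 's/^export PATH=//' \
    | tr ':' '\n' \
    | sed -n 's|^/nix/store/[a-z0-9]*-\(.*\)/bin$|\1|p'
}
```

Running path_packages result | grep zfs after the nix build above should then print the zfs package if it is present, and nothing otherwise.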

> So, when running disko-install --flake ..., I got errors like this:

Is that all? Or did you pass additional arguments to it? How did you make disko-install available on the rescue system?

yajo commented 1 day ago

> This contains /nix/store/w3glp3899fac65gq9b4apbfzfil761md-zfs-user-2.2.6/bin.

Weird, mine doesn't! The equivalent command:

#! /nix/store/5jw69mbaj5dg4l2bj58acg3gxywfszpj-bash-5.2p26/bin/bash
export PATH=/nix/store/9rxqymz0cb33lix6l2vwhhy0rkjfv4dv-jq-1.7.1-bin/bin:/nix/store/yyf6x0y4vh1zv4nxkjcl09fy7y0jgkh3-gptfdisk-1.0.10/bin:/nix/store/h12bs6r0d1k9xz19cqg05safif06iajx-systemd-minimal-255.6/bin:/nix/store/wcvkkpxx0qknf4icjvlsl9b0lwraslfs-parted-3.6/bin:/nix/store/myisbm1x1lzqnywx92bs2wmvmargxz1g-util-linux-2.39.4-bin/bin:/nix/store/d9xr7s3z0r8rf0ba22q6ilqv68agymdb-gnugrep-3.11/bin:/nix/store/4w8y5r9ds942120v6w16myvk8la9f0ia-dosfstools-4.2/bin:/nix/store/yia51klf5wwjd3p1qkx5wmy3qm5h6iiq-xfsprogs-6.6.0-bin/bin:/nix/store/bimnr5rdv0slzzc13p5h2p2wnkd8yb9d-coreutils-full-9.5/bin:/nix/store/5jw69mbaj5dg4l2bj58acg3gxywfszpj-bash-5.2p26/bin:$PATH
umount -Rv "/mnt" || :

Why do I not have zfs there? 🤔

> Is that all? Or did you pass additional arguments to it?

Yes, I was just omitting them. This is the full command:

nix --accept-flake-config --extra-experimental-features 'nix-command flakes' run "/run/os-flake#disko-install" -- --flake "/run/os-flake#my-host" --disk main /dev/sda --write-efi-boot-entries

> How did you make disko-install available on the rescue system?

I rsync the flake to the system. Then I build from there, because my flake re-exposes it. The relevant part from the flake:

# ...
        packages = lib.optionalAttrs pkgs.stdenv.isLinux {
          inherit (inputs.disko.packages.${system}) disko-install;
        };
# ...

yajo commented 1 day ago

Could the problem be that I'm not going through the next line, which supposedly adds the zfs dependency, because my disk configurations involve no zfs stuff? I'm only using vfat and xfs.

https://github.com/nix-community/disko/blob/d39ee334984fcdae6244f5a8e6ab857479cbaefe/lib/types/zfs.nix#L51

iFreilicht commented 13 hours ago

Ahhh that is a very good observation! diskoScript takes its dependencies from the _packages option, while destroyScript hardcodes them, even though diskoScript eventually calls the same _destroy option.

If you run nix build '.#nixosConfigurations.<host>.config.system.build.destroyScript' && head -3 result, zfs should show up there, in contrast to your previous invocation.
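To compare the two scripts side by side, something like this sketch could be used. It assumes both scripts were built to separate out-links first (the link names disko-script and destroy-script below are made up for illustration), and report_zfs is a hypothetical helper.

```shell
# For each generated script given as an argument, report whether its
# `export PATH=` line references a zfs package from the Nix store.
# Hypothetical helper for inspection only.
report_zfs() {
  for script in "$@"; do
    if grep '^export PATH=' "$script" | grep -q '/nix/store/[a-z0-9]*-zfs'; then
      echo "$script: zfs on PATH"
    else
      echo "$script: no zfs on PATH"
    fi
  done
}

# Example usage, after building each attribute with its own out-link:
#   nix build '.#nixosConfigurations.<host>.config.system.build.diskoScript' -o disko-script
#   nix build '.#nixosConfigurations.<host>.config.system.build.destroyScript' -o destroy-script
#   report_zfs disko-script destroy-script
```

For a configuration without any zfs devices, I would expect the first to report no zfs and the second to report zfs, which matches the mismatch described above.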

This is very clearly a bug. Feel free to submit a PR if you want to. If not, that's fine as well; I'll get around to it.