NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License

ZFS SSH decryption with boot.zfs.extraPools is worse with systemd initrd than with scripted initrd #291223

Open illode opened 8 months ago

illode commented 8 months ago

Describe the bug

Using the scripted initrd, one can run zpool import -a; zfs load-key -a && killall zfs over SSH to decrypt all pools (not just root) and let the boot process continue (wiki).

When using the systemd initrd, this doesn't work, and there is no clean replacement that I could find or think of. Non-root pools have to be decrypted using zfs load-key, then systemd-tty-ask-password-agent needs to be run for the root pools.

Steps To Reproduce

Steps to reproduce the behavior:

  1. Add additional pools to boot.zfs.extraPools
  2. Use the systemd initrd
  3. Attempt to decrypt over SSH

Expected behavior

All available pools can be painlessly imported & decrypted at boot time over SSH.

Additional context

Running systemd-tty-ask-password-agent will allow the boot to continue, but the SSH server shuts down before the prompt for the remaining pools appears, so the boot process gets stuck waiting for decryption credentials for the extraPools.

Using the old command will:

  1. Complain that the killall command can't be found. Easy fix, but adding it still doesn't fix anything.
  2. Leave the systemd-tty-ask-password-agent prompt waiting, so systemd-tty-ask-password-agent has to be run after unlocking everything anyway.
  3. Cause the boot to fail. Since everything was already decrypted, zfs load-key will complain Key load error: Key already loaded for <poolname>, which makes systemd-tty-ask-password-agent error, which makes the decryption systemd unit fail, which stops the boot process.
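
One way to sidestep the "Key already loaded" error in step 3 is to check the keystatus property before loading a key. This is only a sketch: load_key_if_needed is a hypothetical helper name, and it assumes zfs get is usable in the initrd. It does not dismiss the pending systemd-tty-ask-password-agent prompt.

```shell
# load_key_if_needed DATASET: prompt for a key only if none is loaded yet.
# Hypothetical helper, not part of ZFS or NixOS.
load_key_if_needed() {
  local dataset="$1" status
  # keystatus reads "unavailable" while the key still has to be loaded.
  status="$(zfs get -H -o value keystatus "$dataset")"
  if [ "$status" = "unavailable" ]; then
    zfs load-key -L prompt "$dataset"
  fi
  # Any other value (key already loaded, dataset not encrypted) is skipped.
}
```

Calling it for an encryption root whose key is already loaded is then a no-op instead of an error.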

Running zpool import -a && zfs load-key -L prompt <extrapool1> <extrapool2> and then systemd-tty-ask-password-agent works, but it is really clunky and means the command varies per machine.

My (hopefully) temporary workaround is this cumbersome script which loads the keys for all encryption roots except the one systemd-ask-password is trying to decrypt. It also only works if there's a single root pool, and needs grep.

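set -eu
# Import every available pool; do not abort if the import fails.
zpool import -a || true
# Name of the pool that systemd-ask-password is currently prompting for.
PRIMARY_POOL="$(systemd-tty-ask-password-agent --list | cut -f 4 -d " " | cut -f 1 -d ':')"
# All encryption roots except those matching the primary pool.
ENCROOT_LIST="$(zfs list -H -t filesystem -o encroot | grep -v "$PRIMARY_POOL" | sort | uniq)"
# Load keys for the remaining encryption roots (word splitting is intentional).
[[ -n $ENCROOT_LIST ]] && zfs load-key -L prompt $ENCROOT_LIST
# Answer the pending prompt for the primary pool so the boot can continue.
systemd-tty-ask-password-agent --query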
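
The grep -v "$PRIMARY_POOL" filter above matches the pool name as a substring, so a primary pool named rpool would also hide a pool named rpool2. It also lets through the "-" that, as far as I know, zfs list prints for datasets without an encryption root. A stricter filter might look like the sketch below. filter_encroots is a hypothetical helper, and it assumes awk is available in the initrd, which replaces the grep dependency rather than removing it.

```shell
# filter_encroots POOL: read encryption roots on stdin (one per line) and
# print each unique root that does not belong to POOL.
# Hypothetical helper; assumes awk exists in the initrd.
filter_encroots() {
  awk -v pool="$1" '
    $0 == "" || $0 == "-" { next }                   # no encryption root
    $0 == pool || index($0, pool "/") == 1 { next }  # POOL itself or a child
    !seen[$0]++                                      # print each root once
  '
}
```

It would slot into the script as zfs list -H -t filesystem -o encroot | filter_encroots "$PRIMARY_POOL", which also makes the sort | uniq step unnecessary.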

Notify maintainers

@ElvishJerricco


james-atkins commented 3 months ago

Replace killall zfs with systemctl restart zfs-import-${poolname}.service? This should work: restarting the unit kills the password prompt, and because the pool has already been imported, the import service will complete successfully and the system will continue to boot.
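
A minimal sketch of that idea, wrapped in a function so it can cover several pools. restart_import_units is a hypothetical name, and the pool names still have to be supplied by the caller.

```shell
# restart_import_units POOL...: restart the import unit of each named pool.
# Hypothetical helper; the unit name pattern is the one suggested above.
restart_import_units() {
  for poolname in "$@"; do
    systemctl restart "zfs-import-${poolname}.service"
  done
}
```

Usage, with tank and backup as placeholder pool names: zpool import -a; zfs load-key -a && restart_import_units tank backup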

ElvishJerricco commented 3 months ago

Hm, I think the right solution here would be a boot.initrd.zfs.extraPools option, indicating pools that ought to be imported and have their keys loaded during initrd instead of stage 2. That way you could just use systemd-tty-ask-password-agent to respond to all prompts, wouldn't have to manually import things at all, and importing would work when you want to enter passwords over the console instead of SSH.

illode commented 3 months ago

> Replace killall zfs with systemctl restart zfs-import-${poolname}.service?

This almost works, but it depends on:

  1. Knowing the pool name(s) in advance
  2. All targets having the same pools OR changing the decrypt command depending on the target's setup

Some of my machines have more / different pools than others depending on their needs. I would need to create / generate a separate script for each machine, which is unnecessarily cumbersome and complex.

It would definitely be a viable solution for people with simpler setups, though.

illode commented 3 months ago

> Hm, I think the right solution here would be a boot.initrd.zfs.extraPools option, indicating pools that ought to be imported and have their keys loaded during initrd instead of stage 2. That way you could just use systemd-tty-ask-password-agent to respond to all prompts, wouldn't have to manually import things at all, and importing would work when you want to enter passwords over the console instead of SSH.

That sounds like the best solution. I can take a stab at implementing it.