random-archer / mkinitcpio-systemd-tool

Provisioning tool for systemd in initramfs (systemd-tool)
https://www.archlinux.org/packages/community/any/mkinitcpio-systemd-tool/

Fail to boot after upgrading to systemd 242.0-1 #25

Closed jichu4n closed 4 years ago

jichu4n commented 5 years ago

Symptom

After upgrading to systemd 242.0-1 (Arch x86_64), I stopped getting a prompt for the disk encryption secret, and instead would only see the prompt menu:

a) secret agent
s) sys shell
r) reboot
q) quit

Digging a bit into the code, it seems like this is caused by has_crypt_jobs returning false. Selecting sys shell and running systemctl list-jobs indeed returns no results.
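The check described above can be approximated by hand from the sys shell. This is a minimal sketch, assuming the check comes down to looking for a `systemd-cryptsetup@` unit in the job list; the helper name `has_pending_crypt_jobs` is hypothetical and the tool's real `has_crypt_jobs` may work differently.

```shell
#!/bin/sh
# Hypothetical helper (not the tool's actual has_crypt_jobs): succeeds if the
# job listing read from stdin mentions a systemd-cryptsetup@<name>.service unit.
has_pending_crypt_jobs() {
    grep -q 'systemd-cryptsetup@'
}

# Usage from the initramfs sys shell (needs a running systemd):
#   systemctl list-jobs --no-legend | has_pending_crypt_jobs \
#       && echo "unlock job pending" || echo "no cryptsetup job queued"
```

If no cryptsetup job is queued, as reported here, the helper returns non-zero, which matches falling through to the menu instead of the secret prompt.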

Setup

My encryption config is specified as a kernel flag in /etc/default/grub:

GRUB_CMDLINE_LINUX="rd.luks.name=xxxx=cryptroot"

I was wondering if this was no longer working for some reason, and tried adding a corresponding line to /etc/mkinitcpio.d/crypttab:

root UUID=xxx none luks

I then rebuilt the initramfs from within the initramfs shell:

/usr/lib/systemd/systemd-cryptsetup attach cryptroot /dev/disk/by-uuid/xxx
mkdir /mnt
mount -t ext4 /dev/mapper/cryptroot /mnt
mount -t ext4 /dev/xxx /mnt/boot
mount -o bind /proc /mnt/proc
mount -o bind /sys /mnt/sys
mount -o bind /dev /mnt/dev
chroot /mnt
mkinitcpio -p linux

This did not work either. After a reboot, I still got the prompt menu and systemctl list-jobs still returned no results.

Mitigation

I realized I could get the system to boot by decrypting the root partition quickly from the shell, before the root file system systemd unit times out (90s):

/usr/lib/systemd/systemd-cryptsetup attach cryptroot /dev/disk/by-uuid/xxx

After booting into the normal system, I downgraded back to systemd 241.7-2:

sudo pacman -U /var/cache/pacman/pkg/systemd-241.7-2-x86_64.pkg.tar.xz
sudo pacman -U /var/cache/pacman/pkg/systemd-libs-241.7-2-x86_64.pkg.tar.xz
sudo pacman -U /var/cache/pacman/pkg/systemd-sysvcompat-241.7-2-x86_64.pkg.tar.xz
mkinitcpio -p linux

After a reboot, I'm now again greeted by the secret> prompt and am able to unlock normally as usual.

Andrei-Pozolotin commented 5 years ago

please post back when you find the underlying problem with 242.0-1

jichu4n commented 5 years ago

I don't know much about systemd so not sure where to start :)

No idea if it's actually the culprit, but going through the changelogs for 242.0 I found this commit which seemed the most relevant.

freaknils commented 5 years ago

Same here with cryptsetup in /etc/mkinitcpio.d/cryptsetup.

Andrei-Pozolotin commented 5 years ago

  1. I updated a few boxes to systemd 242.0-1 and do NOT see the problem

  2. configuration:

    # uname -a
    Linux serv1 5.0.9-arch1-1-ARCH #1 SMP PREEMPT Sat Apr 20 15:00:46 UTC 2019 x86_64 GNU/Linux
    # systemctl --version
    systemd 242 (242.0-1-arch)
    +PAM +AUDIT -SELINUX -IMA -APPARMOR +SMACK -SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybri
    # cat /proc/cmdline
    BOOT_IMAGE=/vmlinuz-linux edd=off consoleblank=180 initrd=/initramfs-linux.img
  3. I am using efi/syslinux for boot with default systemd-tool - generated cryptsetup with generated entries that look like this: (using mc to look inside *.img)

initramfs-linux.img/ucpio://etc/crypttab

# <mapper name> <block device> <password/keyfile> <crypto options>
root   UUID=xxxx   none   luks,discard
swap   UUID=xxxx   none   luks,discard

initramfs-linux.img/ucpio://etc/fstab

# <block device>  <mount point> <fs type> <options> <dump> <pass>
/dev/mapper/root   /sysroot   auto   x-systemd.device-timeout=9999h   0   1
/dev/mapper/swap   none       swap   x-systemd.device-timeout=9999h   0   1
  4. so the difference to @jichu4n seems to be:
    • not using grub
    • not using explicit kernel command line for cryptsetup

ArchangeGabriel commented 5 years ago

Might be related: https://bugs.archlinux.org/task/62450

jichu4n commented 5 years ago

I think the key difference may be /etc/mkinitcpio.d/fstab.

As I mentioned above (and IIUC @freaknils also mentioned), just adding the device to /etc/mkinitcpio.d/crypttab does not appear to fix this problem.

I didn't try adding the device to /etc/mkinitcpio.d/fstab because grub-mkconfig automatically generates a root=UUID=XXX kernel flag in /boot/grub/grub.cfg, and it works fine in systemd 241.

$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-linux-lts root=UUID=555437a3-eb44-47f3-9534-7600505cc428 rw rd.luks.name=7487e0df-9693-419f-9fa9-a530ce1b2a30=cryptroot quiet

But I'm wondering if this commit means an entry in /etc/mkinitcpio.d/fstab is now required?

iexos commented 5 years ago

I have exactly the same issue, however I am using arm64 and u-boot.

The following is after decrypting manually from the rescue shell:

# uname -a
Linux rock 5.0.10-1-ARCH #1 SMP Sat Apr 27 16:14:23 UTC 2019 aarch64 GNU/Linux

# cat /proc/cmdline
console=ttyS2,1500000 rd.luks.name=bcefef50-92f0-4217-9905-373fb9c73d2c=cryptroot rw rootwait earlycon=uart8250,mmio32,0xff130000

# systemctl --version
systemd 242 (242.19-1-arch)
+PAM +AUDIT -SELINUX -IMA -APPARMOR +SMACK -SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid

the initrd crypttab is empty and fstab only has:

/dev/mapper/rootgroup-root     /sysroot    auto     x-systemd.device-timeout=9999h     0     1

Andrei-Pozolotin commented 5 years ago

  1. basic workaround for now is to remove any device configuration from kernel options:
    # cat /proc/cmdline
    BOOT_IMAGE=/vmlinuz-linux edd=off consoleblank=180 initrd=/initramfs-linux.img
  2. and configure disk devices only via crypttab/fstab generator:
    /etc/mkinitcpio.d/crypttab
    /etc/mkinitcpio.d/fstab
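
To see whether a kernel command line still carries device configuration before trying this workaround, a small filter can help. This is a sketch only: the helper name `cmdline_device_params` is hypothetical, and treating `root=`, `rd.luks.*` and `luks.*` as "device configuration" is an assumption based on the command lines posted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print each kernel parameter that configures disks
# directly on the command line. The parameter list is an assumption.
cmdline_device_params() {
    # $1: the full kernel command line; unquoted on purpose to split on spaces
    for param in $1; do
        case "$param" in
            rd.luks.*|luks.*|root=*) printf '%s\n' "$param" ;;
        esac
    done
}

# Usage on a running system:
#   cmdline_device_params "$(cat /proc/cmdline)"
```

Empty output means the command line matches the shape of the one shown in step 1; any printed parameter would have to move into crypttab/fstab.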

iexos commented 5 years ago

I couldn't get it to work with that.

I removed the entry from cmdline:

# cat /proc/cmdline
console=ttyS2,1500000 rw rootwait earlycon=uart8250,mmio32,0xff130000

and added a line to /etc/mkinitcpio.d/crypttab:

cryptroot           UUID=bcefef50-92f0-4217-9905-373fb9c73d2c       none                luks

or

root           UUID=bcefef50-92f0-4217-9905-373fb9c73d2c       none                luks

(of course generating new image every time)

edit: it worked when adding instead:

rootgroup-root           UUID=bcefef50-92f0-4217-9905-373fb9c73d2c       none                luks

I was confused since the decrypted volume contains LVM and not this directly, but of course crypttab doesn't need to know that.

edit2: never mind, this does not work either. It blocks LVM from doing its thing and actually made things much worse, as I could not even recover with the rescue shell anymore.

tohojo commented 5 years ago

I've just upgraded and now hit this issue as well...

Digging into it, the problem appears to be that the 'cryptsetup' systemd target is not enabled in the initrd.

I can fix this after the device boots by logging in, pressing 's' to get a sys shell, then issuing a systemctl start cryptsetup.target. This immediately asks for a password, but also enables everything so that logging in again in another shell gives me the right password prompt.

I couldn't find any obvious way to add the 'enable' when building the initrd; maybe a parameterised version of the InitrdService directive or something?

jichu4n commented 5 years ago

The fix suggested by @Andrei-Pozolotin above (https://github.com/random-archer/mkinitcpio-systemd-tool/issues/25#issuecomment-488327365) worked for me.

For me the steps were:

Hope that helps.

C0rn3j commented 5 years ago

As a note for others I am NOT hitting this issue with LVM on LUKS even with the root defined as a kernel parameter using systemd-boot.

options rd.luks.name=5054b30f-5441-4052-853b-3be38e4a9a33=cryptlvm root=/dev/ArchVol/root rw

systemd 242.32-3
Linux test 5.2.0-arch2-1-ARCH #1 SMP PREEMPT Mon Jul 8 18:18:54 UTC 2019 x86_64 GNU/Linux
HOOKS=(base systemd autodetect keyboard sd-vconsole modconf block sd-encrypt sd-lvm2 filesystems fsck systemd-tool)
[0] # cat /proc/cmdline
initrd=\intel-ucode.img initrd=\initramfs-linux.img rd.luks.name=5054b30f-5441-4052-853b-3be38e4a9a33=cryptlvm root=/dev/ArchVol/root rw

Everything works perfectly, including remote unlocking.

EDIT: internal wifi however breaks on my laptop, made an issue about that.

Flakebi commented 5 years ago

Thanks for sharing your configuration @C0rn3j! I also had the problem that my system did not ask for a password after upgrading to systemd 242. Now I’m using your mkinitcpio hooks and kernel parameters and it finally works.

Update: The hooks are the important part (I’m using cryptsetup and fstab now), maybe the sd-encrypt is needed now?

hv15 commented 5 years ago

I can also confirm that @C0rn3j configuration works for me.

Andrei-Pozolotin commented 5 years ago

to all: status check: as of systemd 242 (242.84-2-arch) is there still a problem?

systemctl --version
systemd 242 (242.84-2-arch)
+PAM +AUDIT -SELINUX -IMA -APPARMOR +SMACK -SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid

C0rn3j commented 5 years ago

That depends on whether it is intended that you have to use the sd-encrypt and sd-lvm2 hooks along with the systemd-tool one, or whether you should only have systemd-tool.

iexos commented 5 years ago

Still same

systemd 242 (242.84-2-arch)
+PAM +AUDIT -SELINUX -IMA -APPARMOR +SMACK -SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid

Linux rock 5.1.15-2-ARCH #1 SMP Sat Jun 29 13:22:26 MDT 2019 aarch64 GNU/Linux

HOOKS=(base systemd autodetect keyboard modconf block sd-lvm2 filesystems fsck systemd-tool)

# cat /proc/cmdline
console=ttyS2,1500000 rd.luks.name=bcefef50-92f0-4217-9905-373fb9c73d2c=cryptroot rw rootwait earlycon=uart8250,mmio32,0xff130000

Flakebi commented 5 years ago

@iexos does it work if you add the sd-encrypt hook?

HOOKS=(base systemd autodetect keyboard modconf block sd-encrypt sd-lvm2 filesystems fsck systemd-tool)

iexos commented 5 years ago

That does work!

It yields the following error message: Failed to activate with specified passphrase: Device or resource busy. But it's working, so I can ignore it.
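
Since the working and failing setups in this thread differ only in the sd-encrypt hook, a quick check of the HOOKS line may be useful before rebuilding the image. This is a minimal sketch; the helper name `hooks_line_has` is hypothetical, and it assumes the HOOKS array sits on a single line as in the examples above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a single-line HOOKS=(...) entry lists the
# given hook as a whole word.
hooks_line_has() {
    # $1: the HOOKS line, $2: the hook name to look for
    hooks=${1#*\(}       # drop everything up to the opening parenthesis
    hooks=${hooks%%\)*}  # drop the closing parenthesis and anything after it
    case " $hooks " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (path as used on Arch; rebuild with mkinitcpio after editing):
#   hooks_line_has "$(grep '^HOOKS=' /etc/mkinitcpio.conf)" sd-encrypt \
#       || echo "sd-encrypt hook missing"
```

Matching on whole words keeps `systemd-tool` from being mistaken for `systemd`.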

Andrei-Pozolotin commented 4 years ago

closing as obsolete. feel free to re-open if it is still relevant