coreos / fedora-coreos-tracker

Issue tracker for Fedora CoreOS
https://fedoraproject.org/coreos/

[rawhide][x86_64] ext.config.var-mount.luks kola test failure #1836

Open aaradhak opened 12 hours ago

aaradhak commented 12 hours ago

Describe the bug

The ext.config.var-mount.luks kola test failed in the latest rawhide build. The machine entered emergency.target in the initramfs stage, which caused the test to fail.

kola test failure:

[2024-11-19T13:12:28.282Z] --- FAIL: ext.config.var-mount.luks (24.31s)
[2024-11-19T13:12:28.282Z]         harness.go:1823: mach.Start() failed: machine 28319516-ff0b-4c0f-a1b0-7d1d045204ae entered emergency.target in initramfs
[2024-11-19T13:12:28.282Z] FAIL, output in /home/jenkins/agent/workspace/build/tmp/kola-lf7Xi/kola/rerun
[2024-11-19T13:12:28.282Z] Error: harness: test suite failed
[2024-11-19T13:12:28.282Z] 2024-11-19T13:12:25Z cli: harness: test suite failed
[2024-11-19T13:12:28.282Z] failed to execute cmd-kola: exit status 1

This issue started occurring after the clevis package upgrade from 21-6.fc42 to 21-7.fc42:

clevis (21-6.fc42 → 21-7.fc42)
clevis-dracut (21-6.fc42 → 21-7.fc42)
clevis-luks (21-6.fc42 → 21-7.fc42)
clevis-systemd (21-6.fc42 → 21-7.fc42)

Console log: the console log shows that ignition-disks.service failed because the Clevis bind operation that sets up LUKS encryption failed: tpm2_getcap was not found in the initramfs.

console.txt

[   16.662679] ignition[871]: disks: createLuks: op(b): [finished] opening luks device varlog
[   16.666771] ignition[871]: disks: createLuks: op(c): [started]  Clevis bind
[   18.695645] ignition[871]: disks: createLuks: op(c): [failed]   Clevis bind: exit status 1: Cmd: "clevis" "luks" "bind" "-f" "-k" "/tmp/ignition-luks-229972637" "-d" "/run/ignition/dev_aliases/dev/disk/by-partlabel/varlog" "sss" "{\"pins\":{\"tpm2\":{}},\"t\":1}" Stdout: "Warning: keyslot operation could fail as it requires more than available memory.\n" Stderr: "/usr/bin/clevis-encrypt-tpm2: line 137: tpm2_getcap: command not found\nUnable to find non-empty PCR algorithm bank, please check output of tpm2_getcap pcrs\nUnable to perform encryption with PIN 'sss' and config '{\"pins\":{\"tpm2\":{}},\"t\":1}'\nError adding new binding to /run/ignition/dev_aliases/dev/disk/by-partlabel/varlog\n"
[FAILED] Failed to start ignition-disks.service - Ignition (disks).
See 'systemctl status ignition-disks.service' for details.
[DEPEND] Dependency failed for ignition-complete.target - Ignition Complete.
[DEPEND] Dependency failed for initrd.target - Initrd Default Target.

[   18.713497] systemd[1]: ignition-disks.service: Main process exited, code=exited, status=1/FAILURE
[   18.715874] ignition[871]: disks failed
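The stderr above points at tpm2_getcap being absent from the initramfs. One way to confirm this kind of failure, for example from the emergency shell, is to check which tools can be found on PATH. The sketch below is only an illustration: `report_missing_tools` is a hypothetical helper, and whatever tool list is passed to it is an assumption, not the full set of commands clevis needs.

```shell
#!/bin/sh
# Hypothetical helper (not part of clevis, Ignition, or kola):
# print the name of every given command that cannot be found on PATH.
report_missing_tools() {
    for tool in "$@"; do
        # `command -v` is POSIX and returns non-zero when the command is unavailable.
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}
```

Calling `report_missing_tools clevis tpm2_getcap` would print `tpm2_getcap` if that is the only one of the two missing, and nothing if both are present.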

Reproduction steps

git checkout rawhide in fedora-coreos-config
cosa fetch && cosa build
kola run ext.config.var-mount.luks

Expected behavior

The ext.config.var-mount.luks test passes.

Actual behavior

The ext.config.var-mount.luks test fails because the machine enters emergency.target in the initramfs.

System details

[rawhide][x86_64]

Butane or Ignition config

No response

Additional information

No response

dustymabe commented 12 hours ago

any relevant logs from the console that indicate why we ended up in emergency.target?

aaradhak commented 12 hours ago

Override PR - https://github.com/coreos/fedora-coreos-config/pull/3267

aaradhak commented 11 hours ago

> any relevant logs from the console that indicate why we ended up in emergency.target?

I just updated the description with the relevant log information. It looks like ignition-disks.service failed because the clevis bind operation that sets up LUKS encryption failed.

aaradhak commented 7 hours ago

Filed a bugzilla issue against clevis for this - https://bugzilla.redhat.com/show_bug.cgi?id=2327563

jlebon commented 7 hours ago

I think the fix for this is likely on our side. We probably need to add tpm2_getcap to the initrd. E.g. here: https://github.com/coreos/ignition/blob/7a20ab2b65d8d1e7f58f2205b09172a514734d59/dracut/30ignition/module-setup.sh#L49-L60
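A minimal sketch of what such a change might look like, assuming the linked lines are the list of binaries the dracut module installs from its `install()` function (an assumption; the contents of those lines are not reproduced here). `inst_multiple` is dracut's helper for copying binaries into the initrd, so the function below only does something useful when dracut sources it.

```shell
# Sketch of a dracut module-setup.sh fragment (not the actual Ignition module).
install() {
    # The stderr in the description shows clevis-encrypt-tpm2 calling
    # tpm2_getcap, so the binary would need to be copied into the initrd.
    # Assumption: adding it next to the other tools is sufficient.
    inst_multiple tpm2_getcap
}
```

Whether the binary belongs in Ignition's module or in clevis's own dracut module is a separate question that this sketch does not settle.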

aaradhak commented 7 hours ago

I can try to check that change.