latchset / clevis

Automated Encryption Framework
GNU General Public License v3.0

clevis-dracut fails on tpm2 decryption: tpm2_pcrlist: command not found #74

Closed by fchiacchiaretta 5 years ago

fchiacchiaretta commented 5 years ago

Hi, I'm using Fedora 29 Beta and trying to set up LUKS v1 disk decryption at boot with the clevis tpm2 module. After fixing module-setup.sh as per issue #73 (removing clevis-decrypt-http and adding cryptsetup), I got the following error at boot:

dracut-initqueue[402]: /usr/bin/clevis-decrypt-tpm2: line 40: tpm2_pcrlist: command not found

So I added tpm2_pcrlist to module-setup.sh as follows and regenerated the initramfs:

    for cmd in clevis-decrypt-tpm2 \
        tpm2_createprimary \
        tpm2_unseal \
        tpm2_load \
        tpm2_pcrlist; do

        if ! find_binary "$cmd" &>/dev/null; then
            ((ret++))
        fi
    done

    if (($ret == 0)); then
        inst_multiple clevis-decrypt-tpm2 \
            tpm2_createprimary \
            tpm2_unseal \
            tpm2_load \
            tpm2_pcrlist
    fi

Now I'm stuck with the following error:

dracut-initqueue[402]: Creating TPM2 primary key failed!

Binding the volume works properly. I'm currently dual-booting with Windows 10 with BitLocker active; is this setup unsupported?

Best, Federico Chiacchiaretta
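One way to confirm which of these commands actually ended up in the initramfs is to filter a file listing of the image. The sketch below assumes the listing is piped in on stdin (for example from dracut's `lsinitrd`) and that each file appears as the last field of its line; the function name `missing_from_listing` is hypothetical and not part of clevis or dracut.

```shell
#!/bin/sh
# Hypothetical helper: print every required command whose name does not
# appear as the final path component of any line in a listing read from stdin.
# Assumes command names contain no regex metacharacters (true for the tpm2_*
# and clevis-* names discussed here).
missing_from_listing() {
    listing=$(cat)
    for cmd in "$@"; do
        # Match the name at end of line, preceded by "/", whitespace, or
        # the start of the line.
        if ! printf '%s\n' "$listing" | grep -Eq "(^|[/[:space:]])${cmd}\$"; then
            printf '%s\n' "$cmd"
        fi
    done
}

# Example on a dracut-based system (not executed here):
#   lsinitrd | missing_from_listing tpm2_createprimary tpm2_unseal tpm2_load tpm2_pcrlist
```

An empty result means every listed command was found. Symlink entries shown as `name -> target` are not matched by this pattern, so a command that is only a symlink may still be reported as missing.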

martinezjavier commented 5 years ago

@fchiacchiaretta does the following work for you?

$ echo test | sudo clevis encrypt tpm2 '{}' | sudo clevis decrypt
test

If that works, then the problem is neither in clevis nor in the tpm2 stack.

I'm not familiar with BitLocker, do you know if Windows attempts to clear the tpm2 device?

That would be a problem, since it would create a different seed than the one used to derive the primary key when the LUKS key was sealed with the tpm2 during volume binding.

fchiacchiaretta commented 5 years ago

Hi @martinezjavier, thanks for your feedback. The test case you suggested works properly. I also did another test: since my swap partition is encrypted, I tried to unmount and lock it and then ran

$ sudo clevis luks unlock -d /dev/sdaX

The swap partition gets unlocked and automatically mounted, so there should be no problem with the tpm2 stack, nor should Windows be involved here. Is there any difference between running unlock via the CLI on a running system and what the clevis hook does during boot?

fchiacchiaretta commented 5 years ago

I did another test: I edited /usr/bin/clevis-decrypt-tpm2 and changed the tpm2_createprimary call to this (removed -Q and the redirect to /dev/null):

if ! tpm2_createprimary -H "$auth" -g "$hash" -G "$key" \
     -C $TMP/primary.context; then
    echo "Creating TPM2 primary key failed!" >&2
    exit 1
fi

After regenerating the initramfs, this is the output during boot:

dracut-initqueue[403]: ERROR: Could not dlopen library: "device"
dracut-initqueue[403]: ERROR: Could not load tcti, got: "device"
dracut-initqueue[403]: Creating TPM2 primary key failed!

Maybe something else is missing from module-setup.sh; just guessing.
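The `Could not dlopen library` message suggests the TCTI module is loaded at run time rather than linked directly, so it would not normally show up as a shared-library dependency of the tpm2 tools and has to be located explicitly. A minimal sketch for listing candidate files, assuming the libraries follow a `libtss2-tcti-*.so*` naming pattern and live in a Fedora-style `/usr/lib64`; the helper name `list_tcti_libs` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: list TCTI libraries found directly inside a directory.
# The default directory and the libtss2-tcti-*.so* pattern are assumptions.
list_tcti_libs() {
    dir=${1:-/usr/lib64}
    for f in "$dir"/libtss2-tcti-*.so*; do
        # An unmatched glob stays literal, so confirm the file exists.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Calling `list_tcti_libs` with no argument checks the assumed default directory; pass another directory on distributions that keep libraries elsewhere.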

fchiacchiaretta commented 5 years ago

I kept troubleshooting this and added the following to module-setup.sh:

    inst_multiple /etc/services \
        cryptsetup \
        clevis-decrypt-tang \
        clevis-decrypt-sss \
        /usr/libexec/clevis-luks-askpass \
        clevis-decrypt \
        luksmeta \
        clevis \
        mktemp \
        curl \
        jose \
        nc \
        ncat \
        /usr/lib64/libtss2-tcti-device.so.0 \
        /usr/lib64/libtss2-tcti-device.so.0.0.0 \
        /usr/lib64/libtss2-tcti-mssim.so.0 \
        /usr/lib64/libtss2-tcti-mssim.so.0.0.0

About the last 5 rows: ncat is needed because nc is just a symlink to ncat (I was getting "dracut-initqueue[403]: Ncat: No such file or directory."), while the libtss2 entries are needed for device communication with the tpm2 (I don't know whether libtss2-tcti-mssim is actually needed; I didn't test without it).
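Since `nc` turned out to be only a symlink here, it can help to resolve a command to its real target before deciding what to install. A minimal sketch, assuming `readlink -f` is available (GNU coreutils provides it); the helper name `real_target` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the fully resolved path of a command or file,
# following symlinks (for example nc -> ncat).
# Returns non-zero if a bare command name cannot be found in PATH.
real_target() {
    case $1 in
        */*) path=$1 ;;                              # already a path
        *)   path=$(command -v "$1") || return 1 ;;  # look up in PATH
    esac
    readlink -f "$path"
}
```

If `real_target nc` prints a path ending in `ncat`, both names need to be present in the initramfs for the symlink to work.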

Decryption at boot is still not working; now I get:

[25.547841] dracut-initqueue[392]: Job for systemd-cryptsetup@luks-id.service failed because a fatal signal was delivered causing the control process to dump core.
[25.547897] dracut-initqueue[392]: See "systemctl status "systemd-cryptsetup@luks-id.service"" and "journalctl -xe" for details.
[33.626092] dracut-initqueue[392]: Ncat: Connection refused.
[37.431380] dracut-initqueue[392]: Ncat: Connection refused.
[41.218257] dracut-initqueue[392]: Ncat: Connection refused.
[...]

The last message repeats endlessly.

fchiacchiaretta commented 5 years ago

Hi, sorry for the multiple posts. I finally managed to get this working, so I'd rather explain it in detail in a new post. I had to solve 2 main problems:

  1. module-setup.sh needs some modifications; patch below:

    --- module-setup.sh.in  2018-10-12 13:03:31.553009896 +0200
    +++ module-setup.sh.in.new  2018-10-12 13:06:04.449943709 +0200
    @@ -36,7 +36,6 @@
         inst_hook initqueue/settled 60 "$moddir/clevis-hook.sh"
     
         inst_multiple /etc/services \
    -        clevis-decrypt-http \
             clevis-decrypt-tang \
             clevis-decrypt-sss \
             @libexecdir@/clevis-luks-askpass \
    @@ -46,12 +45,18 @@
             mktemp \
             curl \
             jose \
    -        nc
    +        nc \
    +        cryptsetup \
    +        /usr/lib64/libtss2-tcti-device.so.0 \
    +        /usr/lib64/libtss2-tcti-device.so.0.0.0 \
    +        /usr/lib64/libtss2-tcti-mssim.so.0 \
    +        /usr/lib64/libtss2-tcti-mssim.so.0.0.0
     
         for cmd in clevis-decrypt-tpm2 \
             tpm2_createprimary \
             tpm2_unseal \
    -        tpm2_load; do
    +        tpm2_load \
    +        tpm2_pcrlist; do
     
             if ! find_binary "$cmd" &>/dev/null; then
                 ((ret++))
    @@ -62,7 +67,8 @@
             inst_multiple clevis-decrypt-tpm2 \
                 tpm2_createprimary \
                 tpm2_unseal \
    -            tpm2_load
    +            tpm2_load \
    +            tpm2_pcrlist
         fi
     
         dracut_need_initqueue

  2. My partition layout was:

Using this layout, with everything about clevis, LUKS and tpm2 properly set up, I managed to get the root and swap partitions unlocked and mounted, but the home partition always failed. As far as I understand, systemd started unlocking and mounting the home partition at a stage where /usr/libexec/clevis-luks-askpass could no longer be executed. Adding rd.luks.uuid for the home partition to the kernel cmdline did not help.

So the solution was to switch to a single encrypted LVM partition, /dev/sda3, with root, home and swap configured as LVs (a tricky migration, but it worked!).