latchset / clevis

Automated Encryption Framework
GNU General Public License v3.0

Clevis stubbornly keeps trying the first Tang Server (even when not Ready) for unlocking even though other Servers are Online and available #472

Open luckylinux opened 1 month ago

luckylinux commented 1 month ago

I originally reported this issue on the Ubuntu Launchpad bug tracker, since that is where I first noticed it.

However, it seems to occur on every platform.

Essentially, I have 4 Tang servers, each with its key registered in its own LUKS keyslot (in addition to the manual passphrase).

The issue is that if the first Tang server (Tang1) is NOT ready, Clevis keeps trying Tang1 even though Tang2/Tang3/Tang4 are available, online and unlocked. In this specific case Tang1 is booting but not yet unlocked: for security reasons the Tang servers use full disk encryption for / and must be unlocked manually.

Tang1 is "stuck" at the "Please enter the passphrase for xxxxxx" boot message (typical of LUKS without Tang/Clevis or any other automated unlocking). I just checked and, in that phase, Tang1's host networking is DOWN (Tang1 doesn't even reply to ping / ICMP packets).

Eventually, Clevis will contact one of the other servers (after 5 minutes / 300 seconds), but shouldn't this happen much faster?

I would expect Clevis to be smart enough to fall back to Tang2/Tang3/Tang4 ("Tang1 decryption fails, move on to Tang2") or, even better, to query the Tang servers in round-robin or random order.
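The fallback described above could look roughly like the sketch below. It is only an illustration of the requested behavior, not how Clevis works internally: `probe` and `first_reachable` are hypothetical helper names, and the timeouts are arbitrary values chosen for the example.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the Tang advertisement endpoint
# answers within a few seconds (timeout values are assumptions).
probe() {
    curl -sfg --connect-timeout 3 --max-time 5 -o /dev/null "http://$1/adv"
}

# Hypothetical helper: print the first server that answers the probe,
# skipping unreachable ones; return non-zero if none answers.
first_reachable() {
    local server
    for server in "$@"; do
        if probe "$server"; then
            printf '%s\n' "$server"
            return 0
        fi
    done
    return 1
}
```

Calling `first_reachable "${keyservers[@]}"` would walk the list in order; passing the list through `shuf -e` first would give the random order mentioned above.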

sarroutbi commented 1 month ago

Hello @luckylinux. Can you please post what kind of configuration you are using? Are you binding the device multiple times, or are you using the sss pin? Depending on the case, as a quick fix, you could try the other alternative to check whether that works as expected.
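For reference, the sss alternative means a single binding that lists all Tang servers with a threshold, instead of one binding per server. A minimal sketch of building such a configuration follows; `build_sss_config` is a hypothetical helper, the hostnames and `/dev/sdX` are placeholders, and whether this avoids the delay reported in this issue is untested here.

```shell
#!/bin/bash
# Hypothetical helper: build an sss config with threshold t=1
# (any single Tang server suffices) from a list of hosts.
build_sss_config() {
    local pins="" host
    for host in "$@"; do
        # Prepend a comma for every entry except the first
        pins+="${pins:+,}{\"url\":\"http://${host}\"}"
    done
    printf '{"t":1,"pins":{"tang":[%s]}}' "$pins"
}

# Placeholder hostnames, not taken from this issue
cfg=$(build_sss_config tang1.example tang2.example)
echo "$cfg"

# The binding itself would then be something like (not executed here):
# clevis luks bind -d /dev/sdX sss "$cfg"
```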

luckylinux commented 1 month ago

Hi @sarroutbi.

Not sure what "configuration" I am using, to be perfectly honest. I followed a Tang+Clevis tutorial around a year ago and have been using it like this ever since.

This is a script I use for my root-on-LUKS installs; it installs keys for ALL configured Tang servers:

#!/bin/bash

# Determine toolpath if not set already
relativepath="../" # Define relative path to go from this script to the root level of the tool
if [[ ! -v toolpath ]]; then scriptpath=$(cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd ); toolpath=$(realpath --canonicalize-missing "$scriptpath/$relativepath"); fi

# Load configuration (expected to define keyservers, device1, device2 and root_num)
source "$toolpath/config.sh"

# Setup CLEVIS for automated disk unlocking
add_rfc3442_hook() {
  cat << EOF > /etc/initramfs-tools/hooks/add-rfc3442-dhclient-hook
#!/bin/sh

PREREQ=""

prereqs()
{
        echo "\$PREREQ"
}

case \$1 in
prereqs)
        prereqs
        exit 0
        ;;
esac

if [ ! -x /sbin/dhclient ]; then
        exit 0
fi

. /usr/share/initramfs-tools/scripts/functions
. /usr/share/initramfs-tools/hook-functions

# Source: https://github.com/latchset/clevis/issues/148#issuecomment-882103016
local_libdir="/lib/x86_64-linux-gnu"
local_found="" lib="" f=""
for lib in libnss_files libnss_dns libresolv; do
    local_found=""
    for f in "\${local_libdir}/\${lib}.so".?; do
        [ -e "\${f}" ] || continue
        [ "\${verbose}" = "y" ] && echo "dns: \${lib}: \${f}"
        copy_file library "\${f}"
        local_found="\${f}"
    done
    [ -n "\${local_found}" ] || echo "WARNING: no \${local_libdir}/\${lib}.so.? file" 1>&2
done

mkdir -p \$DESTDIR/etc/dhcp/dhclient-exit-hooks.d/
cp -a /etc/dhcp/dhclient-exit-hooks.d/rfc3442-classless-routes \$DESTDIR/etc/dhcp/dhclient-exit-hooks.d/
EOF

  chmod +x /etc/initramfs-tools/hooks/add-rfc3442-dhclient-hook
}

# Install hook
add_rfc3442_hook

# Update APT Lists
apt-get update

# Install clevis on the system and add clevis to the initramfs
apt-get install --yes clevis clevis-luks clevis-initramfs cryptsetup-initramfs

# Ask for password (-r keeps backslashes literal; echo ends the prompt line)
read -r -s -p "Enter encryption password: " password
echo

# For each keyserver
counter=1
for keyserver in "${keyservers[@]}"
do
     # Get TANG Server Key
     curl -sfg "http://$keyserver/adv" -o "/tmp/keyserver-$counter.jws"

     # Check which keys are currently used via CLEVIS
     list_device1=$(clevis luks list -d "$device1-part${root_num}")
     list_device2=$(clevis luks list -d "$device2-part${root_num}")

     # Bind device to the TANG server via CLEVIS
     # Device 1
     if [[ "${list_device1}" == *"${keyserver}"* ]]
     then
         echo "Keyserver <$keyserver> is already installed on $device1"
     else
         echo "Install Keyserver <$keyserver> onto $device1 LUKS Header"
         printf '%s\n' "$password" | clevis luks bind -d "$device1-part${root_num}" tang "{\"url\": \"http://$keyserver\" , \"adv\": \"/tmp/keyserver-$counter.jws\" }"
     fi

     # Device 2
     if [[ "${list_device2}" == *"${keyserver}"* ]]
     then
         echo "Keyserver <$keyserver> is already installed on $device2"
     else
         echo "Install Keyserver <$keyserver> onto $device2 LUKS Header"
         printf '%s\n' "$password" | clevis luks bind -d "$device2-part${root_num}" tang "{\"url\": \"http://$keyserver\" , \"adv\": \"/tmp/keyserver-$counter.jws\" }"
     fi

     # Increment counter
     counter=$((counter+1))
done

# Clear password from memory (unset takes the variable name, not its value)
unset password

# Update initramfs
update-initramfs -c -k all

# Get information
cryptsetup luksDump "$device1-part${root_num}"
cryptsetup luksDump "$device2-part${root_num}"
clevis luks list -d "$device1-part${root_num}"
clevis luks list -d "$device2-part${root_num}"