Open ahrkrak opened 2 years ago
I see this as blocked on using systemd-boot https://github.com/flatcar/Flatcar/issues/867 because of this whole signed PCR policy topic which is needed for unlocking after OS updates.
Any idea when we are planning to resolve this issue? This is important from a security point of view.
I agree and we're always looking for volunteers that have the time and expertise to help out on topics such as this one. This issue involves changes to many parts of the OS.
A first version of this should use the TPM as a simple key store, without additional protection. This has some security benefits over storing the key on the hard disk, since it prevents a "hard disk copy attack", but it doesn't protect against online system compromise, where root or an alternatively booted OS can kindly ask the TPM to give up the key. For us, implementing this is easier because we don't have to deal with unstable PCR measurements and can focus on getting the unlocking and enrollment story right. The final version could use GRUB and some tricks to deal with unpredictable PCR values (e.g., disable auto-updates, or have a way to check the expected values for a certain Flatcar version out of band with Tang or something), but it actually makes far more sense to use systemd-boot to get reliable behavior with signed PCR policies, which keep working when PCR values change and can be precalculated on our side. Then we can enroll the key in the TPM and bind it to the signature being valid. To get there we also have to address some other problems on the way, such as defining a set of command line parameters that are expected and considered secure (the systemd tools are actively developed and should hopefully already be mature by the time we want to use them).
Our internal appsec team has flagged the lack of LUKS2 support. Can this be escalated please @t-lo ?
I wrote some assessment of the current state in https://github.com/flatcar/Flatcar/issues/590#issuecomment-1966647294 What we need is probably docs to explain how these systemd features should be used.
Currently for TPMs that means either not binding to PCRs or disabling auto-updates. One could also try to make use of systemd-pcrlock together with GRUB with some update-integration to loosen security for the update reboot.
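As a rough sketch of the "not binding to PCRs" option: systemd-cryptenroll documents that an empty --tpm2-pcrs= value binds the enrollment to no PCRs at all, while a list such as 7 or 7+11 binds to those PCRs. The helper below only prints the command so it can be reviewed before running it as root. The function name is hypothetical, and the device path is taken from the examples in this thread.

```shell
# Hypothetical helper: print (not run) the enrollment command for a LUKS2 device.
#   pcrs = "none"  -> TPM used as a plain key store, no PCR binding
#   pcrs = "7+11"  -> bind to the listed PCRs (unlocking breaks when they change)
cryptenroll_cmd() {
  local device="$1" pcrs="$2"
  if [ "$pcrs" = "none" ]; then
    pcrs=""   # an empty value means "bind to no PCRs"
  fi
  printf 'systemd-cryptenroll --tpm2-device=auto --tpm2-pcrs=%s %s\n' "$pcrs" "$device"
}

cryptenroll_cmd /dev/disk/by-partlabel/ROOT none
cryptenroll_cmd /dev/disk/by-partlabel/ROOT 7
```

With "none" the key survives OS updates but gets no measured-boot protection; with a PCR list the opposite trade-off applies, which is why auto-updates would have to be disabled in that case.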
The next Alpha will be able to set up a TPM-backed rootfs with clevis or systemd-cryptenroll. The latest PR that needs to land is https://github.com/flatcar/bootengine/pull/93 for systemd-cryptenroll.
There are no docs yet except for the example in the linked PR and the one below. Also for clevis one needs to write a plain Ignition JSON file and for systemd-cryptenroll one needs to add a helper service in the Butane YAML/Ignition JSON config.
Clevis example:
{
  "ignition": {
    "version": "3.4.0"
  },
  "storage": {
    "filesystems": [
      {
        "device": "/dev/mapper/rootencrypted",
        "format": "ext4",
        "label": "ROOT"
      }
    ],
    "luks": [
      {
        "clevis": {
          "tpm2": true
        },
        "device": "/dev/disk/by-partlabel/ROOT",
        "name": "rootencrypted",
        "wipeVolume": true
      }
    ]
  }
}
Hello, I tried the following workflow, but at the next reboot I get a hanging systemd start job for the by-label ROOT.device unit:
#!/bin/bash
set -xe
# This script was tested on Ubuntu 22.04
FLATCAR_IMAGE="https://bincache.flatcar-linux.net/images/amd64/3922.0.0+nightly-20240327-2100/flatcar_production_qemu_image.img.bz2"
wget $FLATCAR_IMAGE
bzip2 -k -d flatcar_production_qemu_image.img.bz2
# install tpm tooling
sudo apt install swtpm swtpm-tools -y
# /run/swtpm/ is the only directory allowed by apparmor
SOCKET_DIR=/run/swtpm/
STATE_DIR=/tmp/mytpm1
mkdir -p "$SOCKET_DIR"
mkdir -p "$STATE_DIR"
SOCK=$SOCKET_DIR/sock
sudo swtpm_setup --tpmstate $STATE_DIR \
  --create-ek-cert --create-platform-cert --lock-nvram --tpm2
sudo swtpm socket --tpmstate "dir=$STATE_DIR" --ctrl "type=unixio,path=$SOCK" --tpm2 &
cat > ignition-root-tpm.yaml <<EOF
variant: flatcar
version: 1.1.0
storage:
  luks:
    - name: rootencrypted
      wipe_volume: true
      device: "/dev/disk/by-partlabel/ROOT"
  filesystems:
    - device: /dev/mapper/rootencrypted
      format: ext4
      label: ROOT
systemd:
  units:
    - name: cryptenroll-helper.service
      enabled: true
      contents: |
        [Unit]
        ConditionFirstBoot=true
        OnFailure=emergency.target
        OnFailureJobMode=isolate
        [Service]
        Type=oneshot
        RemainAfterExit=yes
        ExecStart=systemd-cryptenroll --tpm2-device=auto --unlock-key-file=/etc/luks/rootencrypted --wipe-slot=0 /dev/disk/by-partlabel/ROOT
        ExecStart=rm /etc/luks/rootencrypted
        [Install]
        WantedBy=multi-user.target
EOF
docker run --interactive --rm quay.io/coreos/butane:release \
  --pretty --strict < ignition-root-tpm.yaml > ignition-root-tpm.json
sudo bash ./flatcar_production_qemu.sh -i ignition-root-tpm.json \
  -- -chardev "socket,id=chrtpm,path=$SOCK" -tpmdev emulator,id=tpm0,chardev=chrtpm \
  -device tpm-tis,tpmdev=tpm0 -nographic -vnc :1
# output of journalctl
# test systemd-cryptenroll[2481]: New TPM2 token enrolled as key slot 1.
# performed a reboot
# ROOT device job is hanging
# Do I need to do something more?
The new nightly should work, as it needs this PR https://github.com/flatcar/scripts/pull/1807
Yes, it needs the PR that was just merged. Here are my current steps, extracted from the mantle kola setup.
# Tested with Fedora and permissive selinux, should work with either clevis or cryptenroll
DIR=/var/tmp/swtpm-dir # Note: delete the contents when setting up a new VM
mkdir -p "$DIR"
SOCK=/var/tmp/swtpm-sock
swtpm socket --tpmstate "dir=$DIR" --ctrl "type=unixio,path=$SOCK" --tpm2 &
./flatcar_production_qemu.sh -i ignition-cryptenroll-root-tpm.json -- -chardev "socket,id=chrtpm,path=$SOCK" -tpmdev emulator,id=tpm0,chardev=chrtpm -device tpm-tis,tpmdev=tpm0
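The note about deleting the state directory contents before setting up a new VM could be scripted roughly as follows. This is a minimal sketch: the function name is hypothetical, and it assumes the swtpm process using the directory has already been stopped.

```shell
# Hypothetical helper: wipe and recreate the swtpm state directory so that a
# new VM starts with a fresh (empty) emulated TPM.
reset_swtpm_state() {
  local dir="$1"
  # Refuse obviously dangerous arguments before running rm -rf.
  case "$dir" in
    ""|"/") echo "refusing to reset '$dir'" >&2; return 1 ;;
  esac
  rm -rf -- "$dir"
  mkdir -p -- "$dir"
}

# Usage with the directory from the steps above:
#   reset_swtpm_state /var/tmp/swtpm-dir
```

Reusing old TPM state with a freshly provisioned disk image means the key enrolled for the previous VM no longer matches anything on the new disk, so starting from an empty directory avoids confusing leftovers.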
Disk encryption works in the Flatcar Alpha release, with some limitations due to how GRUB measures the OS state and how that state changes after first boot. One can already try it out as mentioned in the docs PR above, and it gives some security benefits. We hope to improve this in Flatcar and also address some holes in the design.
first_boot file. This could be moved to the initrd. To prevent someone from triggering Ignition again with different userdata, we have to measure the previously applied (untrusted) userdata when Ignition won't run (if it runs, it should measure the userdata it applies now). This way one can still trick the system into running Ignition again, but only with the original userdata, which normally would set up disk encryption, so the data is lost. If it doesn't, the unlocking won't work during the initrd but only afterwards, and since no userdata change was possible, there shouldn't be a malicious effect.
Current situation
Flatcar doesn't currently support full-disk encryption AFAIK.
Impact
1) Without disk encryption Flatcar systems might not pass some security audit / certification requirements.
2) Potential attack vector if someone can get hold of the physical disk inside a Flatcar machine.
Ideal future situation
Disks are encrypted, leveraging the TPM as a secret store to bind them to a specific machine. We can also make it so that the machine must boot the installed Flatcar in order to access the encrypted disk (eliminating the attack vector of booting a different OS on the machine in order to unlock the disk).
Implementation options
Since systemd 248, systemd-cryptsetup supports unlocking LUKS2 volumes with three kinds of security hardware (TPM2, FIDO2 and PKCS#11). See https://0pointer.net/blog/unlocking-luks2-volumes-with-tpm2-fido2-pkcs11-security-hardware-on-systemd-248.html for details.
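For orientation, the three kinds of hardware covered in the linked post (TPM2, FIDO2, PKCS#11) map to three systemd-cryptenroll options. The sketch below only prints the corresponding commands; the function name is hypothetical, the device path is the ROOT partition label used elsewhere in this issue, and "auto" assumes exactly one suitable device or token is present.

```shell
# Hypothetical helper: print the systemd-cryptenroll invocation for one
# hardware type. Nothing is executed; run the printed command as root.
enroll_cmd_for() {
  local kind="$1" device="$2"
  case "$kind" in
    tpm2)   echo "systemd-cryptenroll --tpm2-device=auto $device" ;;
    fido2)  echo "systemd-cryptenroll --fido2-device=auto $device" ;;
    pkcs11) echo "systemd-cryptenroll --pkcs11-token-uri=auto $device" ;;
    *)      echo "unknown hardware type: $kind" >&2; return 1 ;;
  esac
}

for kind in tpm2 fido2 pkcs11; do
  enroll_cmd_for "$kind" /dev/disk/by-partlabel/ROOT
done
```

Only the TPM2 variant allows fully unattended unlocking, which is why it is the relevant one for servers; the FIDO2 and PKCS#11 variants need a token to be plugged in at boot.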
Additional information
We should also consider removable media. While not an issue for most servers, still there is the potential attack vector of inserting a USB drive to exfiltrate data, which could be mitigated by making that drive only work when inserted into the server.