flatcar / Flatcar

Flatcar project repository for issue tracking, project documentation, etc.
https://www.flatcar.org/
Apache License 2.0

[RFE] full-disk encryption leveraging systemd-cryptsetup and TPM #593

Open ahrkrak opened 2 years ago

ahrkrak commented 2 years ago

Current situation

Flatcar doesn't currently support full-disk encryption AFAIK.

Impact

1) Without disk encryption, Flatcar systems might not pass some security audits or certification requirements.
2) An attacker who gets hold of the physical disk from a Flatcar machine can read its contents.

Ideal future situation

Disks are encrypted, with the TPM used as a secret store that binds the disk to a specific machine. We can also require that the machine boots the installed Flatcar in order to access the encrypted disk, which eliminates the attack vector of booting a different OS on the machine in order to unlock the disk.

Implementation options

Since systemd 248, systemd-cryptsetup can unlock LUKS2 volumes with three kinds of security hardware: TPM2 chips, FIDO2 tokens, and PKCS#11 devices. See https://0pointer.net/blog/unlocking-luks2-volumes-with-tpm2-fido2-pkcs11-security-hardware-on-systemd-248.html for details.
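As a rough illustration of the TPM2 path described in the linked post, enrollment on an existing LUKS2 volume could look like the sketch below. The device path and volume name are placeholders that do not come from this issue, and the `run` and `crypttab_line` helpers are hypothetical. The script defaults to a dry run and only prints the command.

```shell
#!/bin/bash
set -euo pipefail

# Placeholders, not taken from this issue: adjust to the real volume.
DEVICE="${DEVICE:-/dev/disk/by-partlabel/DATA}"
NAME="${NAME:-dataencrypted}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Add a TPM2-sealed key slot to an existing LUKS2 volume, bound to PCR 7
# (Secure Boot state). Asks for an existing passphrase when run for real.
run systemd-cryptenroll --tpm2-device=auto --tpm2-pcrs=7 "$DEVICE"

# Matching /etc/crypttab entry so systemd-cryptsetup tries the TPM at boot.
crypttab_line() {
    printf '%s %s none tpm2-device=auto\n' "$1" "$2"
}
crypttab_line "$NAME" "$DEVICE"
```

Setting `DRY_RUN=0` would execute the enrollment for real, which needs root, a TPM2 device, and systemd 248 or newer.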

Additional information

We should also consider removable media. While not an issue for most servers, there is still the potential attack vector of inserting a USB drive to exfiltrate data, which could be mitigated by making that drive only work when inserted into the server.

pothos commented 1 year ago

I see this as blocked on switching to systemd-boot (https://github.com/flatcar/Flatcar/issues/867), because signed PCR policies are needed to keep unlocking working after OS updates.

nikhildeshpande commented 1 year ago

Any idea when this is planned for release? It is important from a security point of view.

jepio commented 1 year ago

I agree and we're always looking for volunteers that have the time and expertise to help out on topics such as this one. This issue involves changes to many parts of the OS.

pothos commented 1 year ago

A first version of this should use the TPM as a simple key store, without binding the key to PCR values. This has some security benefit over storing the key on the hard disk, because it prevents a "hard disk copy attack", but it offers no protection against online system compromise, where root or an alternatively booted OS can simply ask the TPM for the key. For us, this is easier to implement because we don't have to deal with unstable PCR measurements and can focus on getting the unlocking and enrollment story right.

The final version could use GRUB and some tricks to deal with unpredictable PCR values (e.g., disabling auto-updates, or checking the expected values for a certain Flatcar version out-of-band with Tang or something similar). It makes far more sense, though, to use systemd-boot and get reliable behavior from signed PCR policies, which keep working when PCR values change and can be precalculated on our side. We can then enroll the key in the TPM and bind it to the signature being valid. To get there we also have to address other problems along the way, such as defining a set of command line parameters that are expected and considered secure. The systemd tools are under active development and should hopefully be mature by the time we want to use them.
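To make the difference between these stages concrete, the sketch below prints the systemd-cryptenroll options each one would roughly correspond to. The `enroll_args` helper and the mode names are hypothetical. The public key path is an assumption, as is the choice of PCR 7 and PCR 11. The signed-policy options need systemd 252 or newer.

```shell
#!/bin/bash
set -euo pipefail

# Assumed location of the public key that verifies signed PCR policies.
PCR_PUBKEY="${PCR_PUBKEY:-/etc/systemd/tpm2-pcr-public-key.pem}"

# Print the systemd-cryptenroll options for one of the three approaches.
enroll_args() {
    case "$1" in
        keystore)
            # TPM as a plain key store: an empty PCR list binds to no PCRs.
            printf '%s\n' "--tpm2-device=auto --tpm2-pcrs="
            ;;
        pcr)
            # Bind to literal PCR values; unlocking breaks when they change.
            printf '%s\n' "--tpm2-device=auto --tpm2-pcrs=${2:-7}"
            ;;
        signed)
            # Bind to a signature over the expected PCR 11 values instead.
            printf '%s\n' "--tpm2-device=auto --tpm2-pcrs= --tpm2-public-key=$PCR_PUBKEY --tpm2-public-key-pcrs=11"
            ;;
        *)
            echo "unknown mode: $1" >&2
            return 1
            ;;
    esac
}

# Show the full command line for each approach without running anything.
for mode in keystore pcr signed; do
    echo "systemd-cryptenroll $(enroll_args "$mode") /dev/disk/by-partlabel/ROOT"
done
```

Only the `keystore` variant survives an OS update without re-enrollment on a GRUB-based system. The `signed` variant would depend on the systemd-boot work discussed above.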

jrose-qualys commented 8 months ago

Our internal appsec team has flagged the lack of LUKS2 support. Can this be escalated please @t-lo ?

pothos commented 8 months ago

I wrote an assessment of the current state in https://github.com/flatcar/Flatcar/issues/590#issuecomment-1966647294. What we probably need is docs that explain how these systemd features should be used.

Currently for TPMs that means either not binding to PCRs or disabling auto-updates. One could also try to make use of systemd-pcrlock together with GRUB with some update-integration to loosen security for the update reboot.
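For the "disable auto-updates" option, one possible approach is to set `SERVER=disabled` in Flatcar's update configuration. The sketch below assumes the usual `/etc/flatcar/update.conf` location and GNU sed. The `disable_updates` helper is hypothetical, and the demo runs against a scratch file rather than the live configuration.

```shell
#!/bin/bash
set -euo pipefail

# Set SERVER=disabled in an update.conf-style file, replacing any existing
# SERVER= line. The default path is assumed to be Flatcar's update config.
disable_updates() {
    local conf="${1:-/etc/flatcar/update.conf}"
    touch "$conf"
    if grep -q '^SERVER=' "$conf"; then
        sed -i 's/^SERVER=.*/SERVER=disabled/' "$conf"
    else
        echo 'SERVER=disabled' >> "$conf"
    fi
}

# Demonstrate on a scratch file instead of the live configuration.
demo="$(mktemp)"
printf 'GROUP=stable\n' > "$demo"
disable_updates "$demo"
cat "$demo"
rm -f "$demo"
```

On a real machine the function would be called without arguments as root. A running update service would presumably need a restart to pick up the change.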

pothos commented 7 months ago

The next Alpha will be able to set up a TPM-backed rootfs with clevis or systemd-cryptenroll. The last PR that needs to land is https://github.com/flatcar/bootengine/pull/93 for systemd-cryptenroll.

There are no docs yet apart from the example in the linked PR and the one below. For clevis, one needs to write a plain Ignition JSON file; for systemd-cryptenroll, one needs to add a helper service to the Butane YAML/Ignition JSON config.

Clevis example:

{
  "ignition": {
    "version": "3.4.0"
  },
  "storage": {
    "filesystems": [
      {
        "device": "/dev/mapper/rootencrypted",
        "format": "ext4",
        "label": "ROOT"
      }
    ],
    "luks": [
      {
        "clevis": {
          "tpm2": true
        },
        "device": "/dev/disk/by-partlabel/ROOT",
        "name": "rootencrypted",
        "wipeVolume": true
      }
    ]
  }
}

ader1990 commented 7 months ago

Hello, I tried the following workflow, but at the next reboot the systemd start job for the ROOT device unit (/dev/disk/by-label/ROOT) hangs:

#!/bin/bash
set -xe

# This script was tested on Ubuntu 22.04

FLATCAR_IMAGE="https://bincache.flatcar-linux.net/images/amd64/3922.0.0+nightly-20240327-2100/flatcar_production_qemu_image.img.bz2"
wget $FLATCAR_IMAGE
bzip2 -k -d flatcar_production_qemu_image.img.bz2

# install tpm tooling
sudo apt install swtpm swtpm-tools -y

# /run/swtpm/ is the only directory allowed by apparmor
SOCKET_DIR=/run/swtpm
STATE_DIR=/tmp/mytpm1

# /run is root-owned, so the socket directory is created with sudo
sudo mkdir -p "$SOCKET_DIR"
mkdir -p "$STATE_DIR"

SOCK="$SOCKET_DIR/sock"

sudo swtpm_setup --tpmstate "$STATE_DIR" \
    --create-ek-cert --create-platform-cert --lock-nvram --tpm2
sudo swtpm socket --tpmstate "dir=$STATE_DIR" --ctrl "type=unixio,path=$SOCK" --tpm2 &

cat > ignition-root-tpm.yaml <<EOF
variant: flatcar
version: 1.1.0
storage:
  luks:
  - name: rootencrypted
    wipe_volume: true
    device: "/dev/disk/by-partlabel/ROOT"
  filesystems:
    - device: /dev/mapper/rootencrypted
      format: ext4
      label: ROOT
systemd:
  units:
    - name: cryptenroll-helper.service
      enabled: true
      contents: |
        [Unit]
        ConditionFirstBoot=true
        OnFailure=emergency.target
        OnFailureJobMode=isolate
        [Service]
        Type=oneshot
        RemainAfterExit=yes
        ExecStart=systemd-cryptenroll --tpm2-device=auto --unlock-key-file=/etc/luks/rootencrypted --wipe-slot=0 /dev/disk/by-partlabel/ROOT
        ExecStart=rm /etc/luks/rootencrypted
        [Install]
        WantedBy=multi-user.target
EOF

docker run --interactive --rm quay.io/coreos/butane:release \
       --pretty --strict < ignition-root-tpm.yaml > ignition-root-tpm.json

sudo bash ./flatcar_production_qemu.sh -i ignition-root-tpm.json \
    -- -chardev "socket,id=chrtpm,path=$SOCK" -tpmdev emulator,id=tpm0,chardev=chrtpm \
    -device tpm-tis,tpmdev=tpm0 -nographic -vnc :1

# output of journalctl
#  test systemd-cryptenroll[2481]: New TPM2 token enrolled as key slot 1.

# performed a reboot
# ROOT device job is hanging
# Do I need to do something more?

The new nightly should work, as it needs this PR https://github.com/flatcar/scripts/pull/1807
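When debugging a hang like this, it may help to first confirm that the enrollment reported in the journal actually landed in the LUKS2 header. The sketch below is one way to check, assuming that `cryptsetup luksDump` lists the enrollment as a `systemd-tpm2` token. The `has_tpm2_token` helper is hypothetical, and the device path is the one used in the script above.

```shell
#!/bin/bash
set -euo pipefail

# Succeed if `cryptsetup luksDump` output on stdin lists a systemd-tpm2
# token, i.e. a TPM2 enrollment is recorded in the LUKS2 header.
has_tpm2_token() {
    # Read all input (no -q) so the producer never gets a broken pipe.
    grep 'systemd-tpm2' >/dev/null
}

DEVICE="${DEVICE:-/dev/disk/by-partlabel/ROOT}"

# Only query the device when it exists and cryptsetup is installed.
if [ -b "$DEVICE" ] && command -v cryptsetup >/dev/null; then
    if cryptsetup luksDump "$DEVICE" | has_tpm2_token; then
        echo "TPM2 token present on $DEVICE"
    else
        echo "no TPM2 token found on $DEVICE"
    fi
fi
```

A present token only shows that enrollment succeeded. Unlocking at boot still depends on the initrd support from the PR mentioned above.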

pothos commented 7 months ago

Yes, it needs the PR that was just merged. Here are my current steps, extracted from the mantle kola setup.

# Tested with Fedora and permissive selinux, should work with either clevis or cryptenroll
DIR=/var/tmp/swtpm-dir # Note: delete the contents when setting up a new VM
mkdir -p "$DIR"
SOCK=/var/tmp/swtpm-sock
swtpm socket --tpmstate "dir=$DIR" --ctrl "type=unixio,path=$SOCK" --tpm2 &
./flatcar_production_qemu.sh -i ignition-cryptenroll-root-tpm.json \
    -- -chardev "socket,id=chrtpm,path=$SOCK" -tpmdev emulator,id=tpm0,chardev=chrtpm \
    -device tpm-tis,tpmdev=tpm0

pothos commented 6 months ago

Docs PR: https://github.com/flatcar/flatcar-website/pull/317

pothos commented 6 months ago

Disk encryption works in the Flatcar Alpha release, with some limitations that stem from how GRUB measures the OS state and how that state changes after the first boot. One can already try it out as described in the docs PR above, and it gives some security benefits. We hope to improve this in Flatcar and also address some holes in the design.