kairos-io / kairos

The immutable Linux meta-distribution for edge Kubernetes.
https://kairos.io
Apache License 2.0

Cannot perform trusted boot upgrade #2790

Closed nicolaspernoud closed 3 months ago

nicolaspernoud commented 3 months ago

Kairos version:

PRETTY_NAME="Ubuntu 24.04 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo
KAIROS_VERSION="v3.1.1-25-g2a7090f"
KAIROS_ID_LIKE="kairos-core-ubuntu-24.04"
KAIROS_VARIANT="core"
KAIROS_SOFTWARE_VERSION_PREFIX="k3s"
KAIROS_PRETTY_NAME="kairos-core-ubuntu-24.04 v3.1.1-25-g2a7090f"
KAIROS_IMAGE_REPO="quay.io/kairos/ubuntu:24.04-core-amd64-generic-v3.1.1-25-g2a7090f"
KAIROS_ARTIFACT="kairos-ubuntu-24.04-core-amd64-generic-v3.1.1-25-g2a7090f"
KAIROS_FLAVOR_RELEASE="24.04"
KAIROS_FAMILY="ubuntu"
KAIROS_TARGETARCH="amd64"
KAIROS_RELEASE="v3.1.1-25-g2a7090f"
KAIROS_REGISTRY_AND_ORG="quay.io/kairos"
KAIROS_GITHUB_REPO="kairos-io/kairos"
KAIROS_IMAGE_LABEL="24.04-core-amd64-generic-v3.1.1-25-g2a7090f"
KAIROS_HOME_URL="https://github.com/kairos-io/kairos"
KAIROS_ID="kairos"
KAIROS_NAME="kairos-core-ubuntu-24.04"
KAIROS_VERSION_ID="v3.1.1-25-g2a7090f"
KAIROS_FLAVOR="ubuntu"
KAIROS_MODEL="generic"
KAIROS_BUG_REPORT_URL="https://github.com/kairos-io/kairos/issues"

CPU architecture, OS, and Version: Linux localhost 6.8.0-39-generic #39-Ubuntu SMP PREEMPT_DYNAMIC Fri Jul 5 21:49:14 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Describe the bug: Building an upgrade image with the same base image and the same keys as the first image does not work.

To Reproduce: First, we create an image and a VM, then deploy Kairos on it:

#!/bin/bash
set -Eeuxo pipefail
WD="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd ${WD}

COMPANY_NAME="acme"
ISO_NAME="${COMPANY_NAME}-os"
KAIROS_USER=kairos

# Step 0: reset
sudo rm -rf build/ files-iso/ keys/ id_rsa* *.fd *.img Dockerfile*

# Step 1: create the Dockerfile
if [ ! -e "id_rsa" ]; then
  ssh-keygen -t rsa -b 4096 -f ./id_rsa
fi

cat <<EOF >./Dockerfile
FROM quay.io/kairos/ubuntu:24.04-core-amd64-generic-master
# Customizations
RUN echo "test" > /etc/test.txt
EOF

# Step 2: create the cloud init file
mkdir -p ./files-iso
cat <<EOF >./files-iso/cloud_init.yaml
#cloud-config

install:
  reboot: true
  poweroff: false
  auto: true # Required, for automated installations
  bind_mounts:
    - /var/lib/${COMPANY_NAME}

users:
  - name: ${KAIROS_USER}
    passwd: ${KAIROS_USER}
    sudo: ALL=(ALL) NOPASSWD:ALL
    ssh_authorized_keys:
      - $(cat id_rsa.pub)

write_files:
  - path: /var/log/${COMPANY_NAME}.log
    content: |
      # ${COMPANY_NAME^} cloud init done
EOF

# Step 3: generate the keys
if [ ! -e "./keys" ]; then
  mkdir -p ./keys
  docker run -v $PWD/keys:/work/keys -ti --rm quay.io/kairos/osbuilder-tools:latest genkey "${COMPANY_NAME^}" --skip-microsoft-certs-I-KNOW-WHAT-IM-DOING --expiration-in-days 365 -o /work/keys
fi

# Step 4: Build installable medium with keys
if [ ! -e "./build/${ISO_NAME}.iso" ]; then
  IMAGE=${ISO_NAME}:latest
  docker build --tag $IMAGE .
  docker run \
    -ti \
    --rm \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v $PWD/build:/result \
    -v $PWD/keys/:/keys \
    -v $PWD/files-iso:/files-iso \
    quay.io/kairos/osbuilder-tools:v0.300.3 \
    build-uki $IMAGE \
    --name "${ISO_NAME}" \
    --overlay-iso /files-iso \
    --boot-branding "${COMPANY_NAME^} OS" \
    -t iso \
    -d /result/ \
    -k /keys
  sudo chown -Rf $USER:$USER ./build
  sudo chmod -Rf 777 ./build
fi

# Step 5: QEMU Test
MACHINE_NAME="test"
QEMU_IMG="${MACHINE_NAME}.img"
SSH_PORT="2222"
OVMF_CODE="/usr/share/OVMF/OVMF_CODE_4M.ms.fd"
OVMF_VARS_ORIG="/usr/share/OVMF/OVMF_VARS_4M.ms.fd"
OVMF_VARS="$(basename "${OVMF_VARS_ORIG}")"

if [ ! -e "${QEMU_IMG}" ]; then
  qemu-img create -f qcow2 "${QEMU_IMG}" 40G
fi

if [ ! -e "${OVMF_VARS}" ]; then
  cp "${OVMF_VARS_ORIG}" "${OVMF_VARS}"
fi

# TPM Emulator
mkdir -p /tmp/mytpm1
swtpm socket --tpmstate dir=/tmp/mytpm1 \
  --ctrl type=unixio,path=/tmp/mytpm1/swtpm-sock \
  --tpm2 \
  --log level=20 &
TPM_PID=$!

# Start VM
qemu-system-x86_64 \
  -enable-kvm \
  -cpu host -smp cores=4,threads=1 -m 4096 \
  -object rng-random,filename=/dev/urandom,id=rng0 \
  -device virtio-rng-pci,rng=rng0 \
  -name "${MACHINE_NAME}" \
  -drive file="${QEMU_IMG}",format=qcow2 \
  -net nic,model=virtio -net user,hostfwd=tcp::${SSH_PORT}-:22 \
  -vga virtio \
  -machine q35,smm=on \
  -global driver=cfi.pflash01,property=secure,value=on \
  -drive if=pflash,format=raw,unit=0,file="${OVMF_CODE}",readonly=on \
  -drive if=pflash,format=raw,unit=1,file="${OVMF_VARS}" \
  -chardev socket,id=chrtpm,path=/tmp/mytpm1/swtpm-sock \
  -tpmdev emulator,id=tpm0,chardev=chrtpm \
  -device tpm-tis,tpmdev=tpm0 \
  -cdrom ./build/${ISO_NAME}.iso -boot menu=on,splash-time=10000 -monitor stdio

kill $TPM_PID

Then we try to build an upgrade image and deploy it over SSH:

#!/bin/bash
set -Eeuxo pipefail
WD="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd ${WD}

COMPANY_NAME="acme"
CONTAINER_IMAGE=${COMPANY_NAME}-os-v2
KAIROS_IP=$(ip -f inet addr show virbr0 | sed -En -e 's/.*inet ([0-9.]+).*/\1/p')
HOST_IP=$(ip -f inet addr show enp0s31f6 | sed -En -e 's/.*inet ([0-9.]+).*/\1/p')

# Step 1: create the updated Dockerfile
cat <<EOF >./Dockerfile.update
FROM quay.io/kairos/ubuntu:24.04-core-amd64-generic-master
# Customizations
RUN echo "test v2" > /etc/test.txt
EOF

# Build the container image that will be used to generate the keys and installable medium
git clone https://github.com/kairos-io/enki.git
cd enki
docker build -t enki --target tools-image .
cd ${WD}

# Step 2: build the updated Container
# docker build . -t ${HOST_IP}:5000/${CONTAINER_IMAGE} -f Dockerfile.update
docker build . -t ${CONTAINER_IMAGE} -f Dockerfile.update

docker run \
    -ti \
    --rm \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v $PWD/keys:/keys \
    -v $PWD/build:/work \
    enki \
    build-uki $CONTAINER_IMAGE \
    -t uki \
    -d /work/upgrade-image \
    -k /keys

docker run \
    -ti \
    --rm \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v $PWD/keys:/keys \
    -v $PWD/build:/work \
    enki \
    build-uki $CONTAINER_IMAGE \
    -t container \
    -d /work/upgrade-image \
    -k /keys

docker load -i build/upgrade-image/*.tar

# Step 3: start a local registry
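# Remove any previous registry container; with `set -e`, a bare `docker stop`
# would abort the script when no such container exists
docker rm -f registry 2>/dev/null || true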
docker run -d -p 5000:5000 --name registry registry:2.8.3
sleep 5

# Step 4: push the image to the registry
docker push localhost:5000/${CONTAINER_IMAGE}

# Step 5: ssh to kairos and launch update
ssh-keygen -f "/home/$USER/.ssh/known_hosts" -R "[$KAIROS_IP]:2222"
ssh -o "IdentitiesOnly=yes" -i ./id_rsa kairos@$KAIROS_IP -p 2222 "sudo kairos-agent upgrade --source oci:${HOST_IP}:5000/${CONTAINER_IMAGE}:latest"

Expected behavior: The system should upgrade to the new image.

Logs

2024-08-05T08:26:02Z INF Kairos Agent version=v2.13.1
2024-08-05T08:26:02Z INF Kairos System version=v3.1.1-25-g2a7090f
2024-08-05T08:26:02Z INF creating a runtime
2024-08-05T08:26:02Z INF detecting boot state
2024-08-05T08:26:02Z INF Boot Mode boot_mode=active_boot
2024-08-05T08:26:02Z INF Boot in uki mode result=true
2024-08-05T08:26:02Z INF Checking if OCI image 192.168.1.69:5000/stormshield-os-v2:latest exists
2024-08-05T08:26:02Z INF Setting image size to 2985Mb
2024-08-05T08:26:02Z INF Running stage: kairos-uki-upgrade.pre.before

2024-08-05T08:26:02Z INF Done executing stage 'kairos-uki-upgrade.pre.before'

2024-08-05T08:26:02Z INF Running stage: kairos-uki-upgrade.pre

2024-08-05T08:26:02Z INF Done executing stage 'kairos-uki-upgrade.pre'

2024-08-05T08:26:02Z INF Running stage: kairos-uki-upgrade.pre.after

2024-08-05T08:26:02Z INF Done executing stage 'kairos-uki-upgrade.pre.after'

2024-08-05T08:26:02Z INF Running stage: kairos-uki-upgrade.pre.before

2024-08-05T08:26:02Z INF Done executing stage 'kairos-uki-upgrade.pre.before'

2024-08-05T08:26:02Z INF Running stage: kairos-uki-upgrade.pre

2024-08-05T08:26:02Z INF Done executing stage 'kairos-uki-upgrade.pre'

2024-08-05T08:26:02Z INF Running stage: kairos-uki-upgrade.pre.after

2024-08-05T08:26:02Z INF Done executing stage 'kairos-uki-upgrade.pre.after'

2024-08-05T08:26:02Z INF Copying 192.168.1.69:5000/stormshield-os-v2:latest source to /efi
2024-08-05T08:26:02Z ERR dumping the source: symlink /dev/null /efi/etc/systemd/system/systemd-pcrlock-make-policy.service: operation not permitted
1 error occurred:
    * symlink /dev/null /efi/etc/systemd/system/systemd-pcrlock-make-policy.service: operation not permitted
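
One possible reading of this error, offered as an assumption rather than a confirmed diagnosis: /efi is the EFI System Partition, which is normally FAT-formatted, and FAT cannot store symlinks, so unpacking a regular root filesystem image there would fail at the first symlink it contains. A quick way to check that reading on a node is to probe whether a directory accepts symlinks at all. The helper name `supports_symlinks` is hypothetical and not part of Kairos.

```shell
#!/bin/bash
set -euo pipefail

# Return 0 if a symlink can be created inside directory $1, non-zero otherwise.
# The probe file is removed again on success.
supports_symlinks() {
  local dir="$1" probe
  probe="$dir/.symlink-probe.$$"
  if ln -s /dev/null "$probe" 2>/dev/null; then
    rm -f "$probe"
    return 0
  fi
  return 1
}

# Example, to be run on the Kairos node (writing to /efi needs root):
#   supports_symlinks /efi || echo "/efi cannot hold symlinks"
```

If the probe fails on /efi but succeeds elsewhere, the source image being copied probably contains a full root filesystem rather than only EFI artifacts.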

Additional context: The bash snippets provided should be self-sufficient to reproduce the issue. What is the purpose of building a UKI before the container image in the upgrade process?

mudler commented 3 months ago

@nicolaspernoud when you run:

docker run \
    -ti \
    --rm \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v $PWD/keys:/keys \
    -v $PWD/build:/work \
    enki \
    build-uki $CONTAINER_IMAGE \
    -t container \
    -d /work/upgrade-image \
    -k /keys

it generates a container image, which you later import with:

docker load -i build/upgrade-image/*.tar

That's the image that needs to be used for the upgrades - not $CONTAINER_IMAGE.

The docs need improvement here; this is indeed not clear from https://kairos.io/docs/upgrade/trustedboot/

nicolaspernoud commented 3 months ago

Thanks, that did work after tagging the image. It is true that the documentation is not so clear.
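
For readers landing here, the tagging step could look roughly like the sketch below. It assumes `docker load` reports a named image on a "Loaded image: ..." line; the helper names, the registry address and the repository name are hypothetical placeholders, not something Kairos or enki provides.

```shell
#!/bin/bash
# Sketch only: helper names, registry address and tarball path are
# assumptions for illustration, not part of Kairos or enki.
set -euo pipefail

# Print the image reference(s) announced by `docker load` on stdin.
# `docker load` writes lines such as "Loaded image: name:tag".
loaded_image_refs() {
  sed -n 's/^Loaded image: //p'
}

# Build the reference under which the image is pushed to a registry.
# $1: registry host:port, $2: repository name, $3: tag (defaults to "latest")
registry_ref() {
  local registry="$1" repo="$2" tag="${3:-latest}"
  printf '%s/%s:%s\n' "$registry" "$repo" "$tag"
}

# Load the tarball produced by `build-uki -t container`, retag it, push it.
# $1: tarball path, $2: registry host:port, $3: repository name
publish_upgrade_image() {
  local tarball="$1" registry="$2" repo="$3" src dst
  src="$(docker load -i "$tarball" | loaded_image_refs | head -n 1)"
  [ -n "$src" ] || { echo "no tagged image found in $tarball" >&2; return 1; }
  dst="$(registry_ref "$registry" "$repo")"
  docker tag "$src" "$dst"
  docker push "$dst"
  echo "Pushed $dst"
}

# Example (hypothetical values):
#   publish_upgrade_image build/upgrade-image/<name>.tar localhost:5000 acme-os-v2
```

The `kairos-agent upgrade --source oci:...` command from the original script would then point at the pushed reference, using an address the VM can reach, instead of the plain image built from Dockerfile.update.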

nicolaspernoud commented 3 months ago

@mudler: by the way, building enki from source and building a UKI before building the container image both seem unnecessary. Here is a minimal working process: https://github.com/nicolaspernoud/kairos-assessment and, more precisely: https://github.com/nicolaspernoud/kairos-assessment/blob/main/02_build_update_image_and_deploy_with_ssh.sh