So recently there was a refactor (the flag was separated out into a separate `MEASURED_ROOTFS` option), but it shouldn't have changed the code. ~Both Wainer and I have seen that running this locally with non-TEE (e.g. with the ssh-demo) caused issues as well, but I don't think we understood why.~ I'm also puzzled how it passed the tests in the PR, but we had a similar issue with SEV as well 😕
@arronwy - have you got any idea how we can debug this, or check that the merge didn't break something that enforced the integrity check?
@arronwy does the initramfs's init script run in error-exit mode? I am asking because I want to ensure that if https://github.com/kata-containers/kata-containers/blob/CCv0/tools/packaging/static-build/initramfs/init.sh#L35 (`veritysetup`) fails, then the entire boot will fail. The hypothesis being that if `veritysetup` fails unnoticed at that line, then the measured rootfs process is simply ignored by the kernel.
@wainersm but if `veritysetup` fails in line 35, then it won't create the `/dev/mapper/root` device, and then `mount /dev/mapper/root /mnt` will fail on line 36, and then `switch_root` will fail on line 44 (or, it will succeed but then there's an entirely empty filesystem in `/mnt`, so any additional process will fail). Specifically I don't expect the kata-agent to start at all in such a scenario. (But I agree that `set -e` at the top is a good practice.)
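For context, this is roughly what a fail-closed init looks like once `set -e` is in place; a minimal sketch only (not the actual CCv0 `init.sh`), with placeholder device paths and without the real kernel-cmdline parsing:

```bash
#!/bin/sh
# Minimal sketch of a fail-closed initramfs init (NOT the actual CCv0 init.sh;
# device paths and the root hash source are placeholders). With `set -e`, any
# failing step, including veritysetup, aborts the boot instead of silently
# continuing to switch_root.
set -e

mount -t proc none /proc
mount -t sysfs none /sys
mount -t devtmpfs none /dev

# In the real script this comes from the kernel command line.
root_hash="${root_hash:?root hash expected from the kernel cmdline}"

# Open the verity device; if the hash tree or root hash don't match, this
# fails, and thanks to set -e so does the whole boot.
veritysetup open /dev/vda1 root /dev/vda2 "$root_hash"
mount -o ro /dev/mapper/root /mnt

exec switch_root /mnt /sbin/init
```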
Looking at the log ( kata-containers-CCv0-ubuntu-20.04-x86_64-CC_CRI_CONTAINERD-baseline/89 ), I see:

```
...
vmconsole="[ 0.434954] Freeing unused kernel image (initmem) memory: 952K"
vmconsole="[ 0.437958] Write protecting the kernel read-only data: 16384k"
vmconsole="[ 0.442432] Freeing unused kernel image (text/rodata gap) memory: 2044K"
vmconsole="[ 0.442901] Freeing unused kernel image (rodata/data gap) memory: 92K"
vmconsole="[ 0.443086] Run /sbin/init as init process"
vmconsole="[ 0.443216] with arguments:"
vmconsole="[ 0.443279] /sbin/init"
vmconsole="[ 0.443322] with environment:"
vmconsole="[ 0.443383] HOME=/"
vmconsole="[ 0.443426] TERM=linux"
vmconsole="[ 0.498822] systemd[1]: systemd 245.4-4ubuntu3 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)"
...
```
Two things:

1. It starts `/sbin/init`, but our `initramfs.list` copies our `init.sh` (which is `/usr/sbin/init` in the docker image) to `/init` in the initramfs image.
2. It starts systemd, but we expected it to start our init shell script.

Maybe we're building one initrd image but starting the VM with another? Or am I looking in the wrong place?
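One way to sanity-check that would be to list what actually ends up inside the initramfs archive we build; a sketch, with a guessed artifact path:

```bash
# Quick check of which /init an initramfs cpio archive actually contains.
# The path below is a guess at the build artifact; point it at whatever
# initramfs/initrd the kernel is really built or booted with.
initramfs=./tools/packaging/static-build/initramfs/initramfs.cpio.gz
zcat "$initramfs" | cpio -itv | grep -E '(^|[ /])init(\.sh)?$'
# If /init is missing from the initramfs, the kernel falls back to mounting the
# root= device and running its /sbin/init, which would match the systemd boot
# seen in the log above.
```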
So I think I made enough changes to the latest merge to get this working again: https://github.com/kata-containers/kata-containers/pull/7205 -> http://jenkins.katacontainers.io/job/tests-CCv0-ubuntu-20.04-x86_64-CC_CRI_CONTAINERD_K8S-PR/434/console
I don't really understand why, though; adding support back in for initrd generation with TDX seems to be the notable change? https://github.com/kata-containers/kata-containers/pull/7205/files#diff-c4e252149c7d3f7767fa98bf90ccef0e2cfe8a9a9065f08b1cb181df001ea0b8
> @wainersm but if `veritysetup` fails in line 35, then it won't create the `/dev/mapper/root` device, and then `mount /dev/mapper/root /mnt` will fail on line 36, and then `switch_root` will fail on line 44 (or, it will succeed but then there's an entirely empty filesystem in `/mnt`, so any additional process will fail). Specifically I don't expect the kata-agent to start at all in such a scenario. (But I agree that `set -e` at the top is a good practice.)
Thanks @dubek ! I didn't know for sure what the effect of a `veritysetup` failure would be!
> Looking at the log ( kata-containers-CCv0-ubuntu-20.04-x86_64-CC_CRI_CONTAINERD-baseline/89 ), I see: [...]
>
> Two things:
>
> 1. It starts `/sbin/init` but our `initramfs.list` copies our `init.sh` (which is `/usr/sbin/init` in the docker image) to `/init` in the initramfs image.
> 2. It starts systemd but we expected it to start our init shell script.
>
> Maybe we're building one initrd image but starting the VM with another? Or am I looking in the wrong place?
Might be the case. Perhaps we should add a new function to the kata-agent telling it to verify that, once measured boot is activated, the measurement was really performed. Having measured boot give "false positive" results is really a critical bug IMHO and we should look for extra checks.
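As a sketch of what such an extra check could look like (run inside the guest, e.g. from a debug console; device and target names are assumptions taken from the init script, not an existing kata-agent API):

```bash
# Confirm the rootfs really is backed by dm-verity inside the guest.
grep ' / ' /proc/mounts                        # root should be /dev/mapper/root, mounted ro
grep -o 'cc_rootfs_verity[^ ]*' /proc/cmdline  # verity scheme and root hash parameters
# If dmsetup is available in the guest, the root device should be a verity target:
dmsetup table root
# Without dmsetup, /sys still shows whether a device-mapper device is present:
cat /sys/block/dm-0/dm/name 2>/dev/null
```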
Anyway, I could reproduce the issue in my environment; here is more information (I haven't interpreted it yet):
```
ubuntu@ubuntu:~/go/src/github.com/kata-containers/tests$ kata-runtime kata-env --json | jq '.Initrd'
{
  "Path": ""
}
ubuntu@ubuntu:~/go/src/github.com/kata-containers/tests$ kata-runtime kata-env --json | jq '.Image'
{
  "Path": "/opt/confidential-containers/share/kata-containers/kata-ubuntu-latest.image"
}
ubuntu@ubuntu:~/go/src/github.com/kata-containers/tests$ sudo mount -o loop,offset=$((512*6144)) "/opt/confidential-containers/share/kata-containers/kata-ubuntu-latest.image" /mnt
ubuntu@ubuntu:~/go/src/github.com/kata-containers/tests$ cd /mnt/
ubuntu@ubuntu:/mnt$ sudo find . -name "init"
./lib/init
./sbin/init
ubuntu@ubuntu:/mnt$ file sbin/init
sbin/init: symbolic link to /lib/systemd/systemd
ubuntu@ubuntu:/mnt$ ls -l /lib/systemd/systemd
-rwxr-xr-x 1 root root 1620224 Mar 27 17:54 /lib/systemd/systemd
ubuntu@ubuntu:/mnt$ file /lib/systemd/systemd
/lib/systemd/systemd: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=986b4ab2b0c3882200e781081bddb934ef5cf609, for GNU/Linux 3.2.0, stripped
```
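For reference, the `offset=$((512*6144))` in the mount command comes from the image's partition table; something like this shows where the number comes from:

```bash
# The loop-mount offset is the rootfs partition's start sector times the
# 512-byte sector size, as reported by the image's partition table.
sudo fdisk -l /opt/confidential-containers/share/kata-containers/kata-ubuntu-latest.image
# A partition starting at sector 6144 gives offset=$((512*6144)), i.e. the value
# used in the mount command above.
```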
Here is the file that the test case inserts into the image to simulate image tampering:
```
ubuntu@ubuntu:/mnt$ cat etc/aa-offline_fs_kbc-resources.json
{
    "Policy": "",
    "Sigstore Config": "",
    "GPG Keyring": "",
    "Cosign Key": "",
    "Credential": "ewogICAgImF1dGhzIjogewogICAgICAgICJxdWF5LmlvL2thdGEtY29udGFpbmVycy9jb25maWRlbnRpYWwtY29udGFpbmVycy1hdXRoIjogewogICAgICAgICAgICAiYXV0aCI6ICJRWEpoYm1SdmJYRjFZWGwwWlhOMFlXTmpiM1Z1ZEhSb1lYUmtiMlZ6Ym5SbGVHbHpkRHB3WVhOemQyOXlaQW89IgogICAgICAgIH0KICAgIH0KfQ==",
    "default/security-policy/test": "",
    "default/sigstore-config/test": "",
    "default/gpg-public-config/test": "",
    "default/cosign-public-key/test": "",
    "default/credential/test": "ewogICAgImF1dGhzIjogewogICAgICAgICJxdWF5LmlvL2thdGEtY29udGFpbmVycy9jb25maWRlbnRpYWwtY29udGFpbmVycy1hdXRoIjogewogICAgICAgICAgICAiYXV0aCI6ICJRWEpoYm1SdmJYRjFZWGwwWlhOMFlXTmpiM1Z1ZEhSb1lYUmtiMlZ6Ym5SbGVHbHpkRHB3WVhOemQyOXlaQW89IgogICAgICAgIH0KICAgIH0KfQ=="
}
```
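Side note: the `Credential` entries above are plain base64, so they can be decoded to see exactly what the test injects (a docker-style `auths` entry for `quay.io/kata-containers/confidential-containers-auth`); assuming `jq` is available:

```bash
# Decode the injected credential to inspect the content the test adds to the image.
jq -r '.Credential' etc/aa-offline_fs_kbc-resources.json | base64 -d
```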
I was looking for the `dmsetup` binary inside the image but I couldn't find it.
I am starting to think it might be a bug in the cache mechanism, which for the measured boot build involves the shim-v2, kernel and rootfs components and is tricky.
The last rootfs-image cache job failed on June 27th (http://jenkins.katacontainers.io/job/kata-containers-2.0-rootfs-image-cc-x86_64/). The next day (June 28th) the baseline builds, for example http://jenkins.katacontainers.io/view/Daily%20CCv0%20baseline/job/kata-containers-CCv0-ubuntu-20.04-x86_64-CC_CRI_CONTAINERD-baseline/, started to fail. On top of that, it seems that the cached kernel was not built with measured boot support.
Let's have a look at the first baseline CC_CRI_CONTAINERD job that failed: http://jenkins.katacontainers.io/view/Daily%20CCv0%20baseline/job/kata-containers-CCv0-ubuntu-20.04-x86_64-CC_CRI_CONTAINERD-baseline/89/consoleFull
It used a cached kernel:

```
22:02:08 Build kata version 3.2.0-alpha3: cc-kernel
22:02:08 INFO: DESTDIR /tmp/jenkins/workspace/kata-containers-CCv0-ubuntu-20.04-x86_64-CC_CRI_CONTAINERD-baseline/go/src/github.com/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/cc-kernel/destdir
22:02:09 INFO: Using cached tarball of kernel
22:02:09 Downloading tarball from: http://jenkins.katacontainers.io/job/kata-containers-2.0-kernel-cc-x86_64/lastSuccessfulBuild/artifact/artifacts/kata-static-cc-kernel.tar.xz
22:02:09 --2023-06-28 01:02:09-- http://jenkins.katacontainers.io/job/kata-containers-2.0-kernel-cc-x86_64/lastSuccessfulBuild/artifact/artifacts/kata-static-cc-kernel.tar.xz
```
I don't see in the last kernel build (http://jenkins.katacontainers.io/job/kata-containers-2.0-kernel-cc-x86_64/190/console) any message like the one in https://github.com/kata-containers/kata-containers/blob/CCv0/tools/packaging/kernel/build-kernel.sh#L268
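A quick way to check that hypothesis would be to pull the same cached tarball the baseline job downloads and inspect it; a sketch (the file layout inside the tarball is an assumption):

```bash
# Download the cached kernel tarball used by the baseline job and look at its
# size and contents; a build carrying the embedded initramfs should be
# noticeably larger than one without it.
url="http://jenkins.katacontainers.io/job/kata-containers-2.0-kernel-cc-x86_64/lastSuccessfulBuild/artifact/artifacts/kata-static-cc-kernel.tar.xz"
curl -sLo kata-static-cc-kernel.tar.xz "$url"
du -h kata-static-cc-kernel.tar.xz
tar -tJf kata-static-cc-kernel.tar.xz
```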
Haha - I just posted a similar hypothesis in slack that the kernel caching might be the problem! I'll dig into it more tomorrow and see if there is anything I can spot, otherwise I might have to get Mr Caching to take a look!
I think I've solved this. The problem was that when the measured rootfs forward port was merged back into `CCv0`, the variable to trigger it was changed to `MEASURED_ROOTFS`, rather than it just being triggered by `KATA_BUILD_CC`. I remembered to make these changes in the CI test jobs, but not the cache job, so the cc-kernel cache jobs were being built without the initramfs. This explains why the initial PR tests worked: there was a kernel config bump, so the cache wasn't used, but the daily baseline doesn't have these.
I've deleted the last two builds of the cache job and re-run it, and the kernel.tar.xz is back up to 18.70 MB as it was two weeks ago, rather than the 12 MB of the previous job, so I think we should be back to working in tomorrow's baseline. I'm re-running the main->CCv0 test job (https://github.com/kata-containers/kata-containers/pull/7226) now to check before then.
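For the record, the cache-job side of the fix amounts to exporting the new variable before the kernel build; a hedged sketch (the make target name is inferred from the `cc-kernel` asset in the logs above and the `yes` value is an assumption):

```bash
# The kernel build now needs the flag set explicitly instead of being implied
# by KATA_BUILD_CC; target name and value are guesses, adjust to the cache
# job's actual invocation.
export MEASURED_ROOTFS=yes
make -C tools/packaging/kata-deploy/local-build cc-kernel-tarball
```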
Fixed.
Some of the baseline jobs that nightly run to test the CCv0 branch have failed in the test `cannot launch pod with measured boot enabled and rootfs modified`. The test changes the kata guest image, so it expects the pod not to be launched when measured boot is enabled; instead, the pod got launched. The jobs that failed (so far):
Some observations:
(`cc_rootfs_verity.scheme=dm-verity cc_rootfs_verity.hash=64f8f95e1a066b643f91db2e7ed0efadd43d451847dd138d9e4b9209a97d69d4`)
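For anyone new to the thread, this is what those dm-verity parameters are meant to enforce; a sketch with placeholder device paths and the root hash from above:

```bash
# Verify the data device against its hash device and the recorded root hash.
# Once the rootfs has been modified (as the test does), this check, and
# therefore the guest boot, is expected to fail.
veritysetup verify /dev/vda1 /dev/vda2 \
    64f8f95e1a066b643f91db2e7ed0efadd43d451847dd138d9e4b9209a97d69d4
```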