After a lot of compilation issues, I've got it all compiling now and have built a CAA OCI image from it, but when testing it doesn't work:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 35m default-scheduler Successfully assigned default/alpine to sh-libvirt-s390x-e2e-22-04-test-4
Warning FailedCreatePodSandBox 7s (x160 over 35m) kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: remote hypervisor call failed: ttrpc: closed: unknown
so I've clearly broken something during the change
Beraldo has recommended re-generating the hypervisor protos with ttrpc rather than grpc, so I've created a branch with that change in kata-runtime: https://github.com/kata-containers/kata-containers/compare/main...stevenhorsman:hypervisor-ttrpc?expand=1 I'm now reworking my changes to undo some of the ones related to the ttrpc -> grpc switch.
I've updated my branch to use my fork of the kata runtime with the ttrpc changes and I think I get further now, as when I try and start a peer pod the error is:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 7m20s default-scheduler Successfully assigned default/nginx-secret-pod to peer-pods-worker-0
Warning FailedCreatePodSandBox 114s (x26 over 7m20s) kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd container: create prepare snapshot dir: failed to create temp dir: stat /var/lib/containerd-nydus/snapshots: no such file or directory: unknown
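For reference, that error points at the nydus snapshotter not being (fully) set up on the worker; a few hedged checks along these lines can confirm it (the service name and config path are assumptions and may differ depending on how it was installed):
# Is the snapshotter service running on the worker?
sudo systemctl status nydus-snapshotter
# Does the directory from the error message exist?
sudo ls /var/lib/containerd-nydus/snapshots
# Is containerd configured to use the nydus proxy plugin?
sudo grep -A 3 'proxy_plugins' /etc/containerd/config.toml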
We've been tracking the status on this in a slack thread. A rough summary is:
2023/11/29 13:48:42 [adaptor/proxy] CreateContainer: containerID:18a7d2fd310f9d92b2ebf495f50bd5f87a3d8a40bd048432493742f456325804
2023/11/29 13:48:42 [adaptor/proxy] mounts:
2023/11/29 13:48:42 [adaptor/proxy] destination:/proc source:proc type:proc
2023/11/29 13:48:42 [adaptor/proxy] destination:/dev source:tmpfs type:tmpfs
2023/11/29 13:48:42 [adaptor/proxy] destination:/dev/pts source:devpts type:devpts
2023/11/29 13:48:42 [adaptor/proxy] destination:/dev/mqueue source:mqueue type:mqueue
2023/11/29 13:48:42 [adaptor/proxy] destination:/sys source:sysfs type:sysfs
2023/11/29 13:48:42 [adaptor/proxy] destination:/dev/shm source:/run/kata-containers/sandbox/shm type:bind
2023/11/29 13:48:42 [adaptor/proxy] destination:/etc/resolv.conf source:/run/kata-containers/shared/containers/18a7d2fd310f9d92b2ebf495f50bd5f87a3d8a40bd048432493742f456325804-5c2378f47edc3280-resolv.conf type:bind
2023/11/29 13:48:42 [adaptor/proxy] annotations:
2023/11/29 13:48:42 [adaptor/proxy] io.kubernetes.cri.sandbox-cpu-shares: 2
2023/11/29 13:48:42 [adaptor/proxy] io.katacontainers.pkg.oci.bundle_path: /run/containerd/io.containerd.runtime.v2.task/k8s.io/18a7d2fd310f9d92b2ebf495f50bd5f87a3d8a40bd048432493742f456325804
2023/11/29 13:48:42 [adaptor/proxy] io.kubernetes.cri.sandbox-cpu-quota: 0
2023/11/29 13:48:42 [adaptor/proxy] io.kubernetes.cri.sandbox-namespace: default
2023/11/29 13:48:42 [adaptor/proxy] io.kubernetes.cri.container-type: sandbox
2023/11/29 13:48:42 [adaptor/proxy] io.kubernetes.cri.sandbox-memory: 0
2023/11/29 13:48:42 [adaptor/proxy] io.kubernetes.cri.sandbox-name: nginx
2023/11/29 13:48:42 [adaptor/proxy] io.kubernetes.cri.sandbox-uid: 77aae7ac-3160-4e79-9381-74931196d7b1
2023/11/29 13:48:42 [adaptor/proxy] nerdctl/network-namespace: /var/run/netns/cni-459f5739-6f42-6c74-6592-c33541e1cfd4
2023/11/29 13:48:42 [adaptor/proxy] io.kubernetes.cri.sandbox-cpu-period: 100000
2023/11/29 13:48:42 [adaptor/proxy] io.kubernetes.cri.sandbox-log-directory: /var/log/pods/default_nginx_77aae7ac-3160-4e79-9381-74931196d7b1
2023/11/29 13:48:42 [adaptor/proxy] io.kubernetes.cri.sandbox-id: 18a7d2fd310f9d92b2ebf495f50bd5f87a3d8a40bd048432493742f456325804
2023/11/29 13:48:42 [adaptor/proxy] io.katacontainers.pkg.oci.container_type: pod_sandbox
2023/11/29 13:48:42 [adaptor/proxy] Pulling image separately not support on main
2023/11/29 13:48:43 [adaptor/proxy] CreateContainer fails: rpc error: code = Internal desc = failed to mount /run/kata-containers/shared/containers/18a7d2fd310f9d92b2ebf495f50bd5f87a3d8a40bd048432493742f456325804/rootfs to /run/kata-containers/18a7d2fd310f9d92b2ebf495f50bd5f87a3d8a40bd048432493742f456325804/rootfs, with error: ENOENT: No such file or directory
It also shows that nydus options aren't being set. This is for at least two reasons:
I think we are now blocked on these steps before we can go much further, but hopefully my kata runtime PR https://github.com/kata-containers/kata-containers/pull/8520/commits can be merged in the meantime.
We've managed to get some of the steps required for this upstreamed now:
Remaining issues that will need resolving before we can be unblocked here (and there might be more after)
Just an update on this - now that the agent supports image pull on the guest, I've done a bunch of PoC work to push us further in the right direction. The current state is that nydus_snapshotter isn't putting the correct annotation into the storage driver for us to pull on the guest. Fabiano is also seeing this on the local hypervisor, so hopefully we can work it out between us...
The problem we were hitting is noted as issue 4 here: https://github.com/kata-containers/kata-containers/issues/8407#issuecomment-2049144827
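A quick, hedged way to confirm this state on the worker is to check whether the pause/test images are already cached in containerd (per the linked issue, it's these cached images that prevent the guest-pull annotations being set); the cleanup script below then removes them:
# Hedged check: list any cached pause images in containerd's k8s.io namespace
sudo ctr --namespace k8s.io images ls | grep pause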
I ran the following script, kindly provided by Fabiano, on the worker:
test_images_to_remove=(
  "docker.io/rancher/mirrored-pause"
  "registry.k8s.io/pause"
  "quay.io/sjenning/nginx"
  "quay.io/prometheus/busybox"
  "quay.io/confidential-containers/test-images"
)
ctr_args=""
if [ "${KUBERNETES}" = "k3s" ]; then
  ctr_args="--address /run/k3s/containerd/containerd.sock "
fi
ctr_args+="--namespace k8s.io"
ctr_command="sudo -E ctr ${ctr_args}"

for related_image in "${test_images_to_remove[@]}"; do
  # We need to delete the related image
  image_list=($(${ctr_command} i ls -q | grep "$related_image" | awk '{print $1}'))
  if [ "${#image_list[@]}" -gt 0 ]; then
    for image in "${image_list[@]}"; do
      ${ctr_command} i remove "$image"
    done
  fi
  # We need to delete the related content of the image
  IFS="/" read -ra parts <<< "$related_image"
  repository="${parts[0]}"
  image_name="${parts[1]}"
  formatted_image="${parts[0]}=${parts[-1]}"
  image_contents=($(${ctr_command} content ls | grep "${formatted_image}" | awk '{print $1}'))
  if [ "${#image_contents[@]}" -gt 0 ]; then
    # iterate over every matching content entry, not just the first one
    for content in "${image_contents[@]}"; do
      ${ctr_command} content rm "$content"
    done
  fi
done
and after that the image pull on the guest worked and the container is up and running:
2024/04/11 09:50:38 [adaptor/proxy] storages:
2024/04/11 09:50:38 [adaptor/proxy] mount_point:/run/kata-containers/7ce95eeeb93faef7640d7640c0d46ddd5128b4c81904478fc80ab945fedc4b56/rootfs source:docker.io/library/nginx:latest fstype:overlay driver:image_guest_pull
# kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx 1/1 Running 0 76s
so we just need a way to do this more easily on the worker node...
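One possibly simpler option, untested here and much blunter than the script above, might be crictl's image prune, which removes every image not in use by a running container rather than just the pause/test ones:
# Hedged sketch: prunes *all* unused images on the worker, not just the pause/test ones,
# so other cached images will need re-pulling afterwards
sudo crictl rmi --prune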
Ok, I'm going to try and describe how to reliably and reproducibly set up a dev environment for testing this. I'm using libvirt with a kcli cluster and a pretty chunky 16 vCPU, 32GB RAM VM, but I'm not sure it strictly needs to be that large. I will also do some steps, like pushing the podvm image, that are just so people following this can use my images and save some time:
export GOPATH="${HOME}/go"
cloud_api_adaptor_repo="github.com/confidential-containers/cloud-api-adaptor"
cloud_api_adaptor_dir="${GOPATH}/src/${cloud_api_adaptor_repo}"
mkdir -p $(dirname "${cloud_api_adaptor_dir}")
git clone -b main "https://${cloud_api_adaptor_repo}.git" "${cloud_api_adaptor_dir}"
pushd $cloud_api_adaptor_dir/src/cloud-api-adaptor
git remote add sh https://github.com/stevenhorsman/cloud-api-adaptor.git
git fetch sh
git checkout -b kata-runtime-bump sh/kata-runtime-bump
./libvirt/config_libvirt.sh
./libvirt/kcli_cluster.sh create
export KUBECONFIG=$HOME/.kcli/clusters/peer-pods/auth/kubeconfig
sudo snap install yq
yq --version
echo "Install docker"
sudo snap install docker
sudo systemctl start snap.docker.dockerd
sudo systemctl enable snap.docker.dockerd
make podvm-builder podvm-binaries podvm-image
docker image tag quay.io/confidential-containers/podvm-generic-ubuntu-amd64 quay.io/stevenhorsman/podvm-generic-ubuntu-amd64
docker image tag quay.io/confidential-containers/podvm-generic-ubuntu-amd64:c392670e2401a08956ed4f52c4152209617bcde872a71b0a3b87da86b8fad2dc quay.io/stevenhorsman/podvm-generic-ubuntu-amd64:c392670e2401a08956ed4f52c4152209617bcde872a71b0a3b87da86b8fad2dc
docker login quay.io
docker push quay.io/stevenhorsman/podvm-generic-ubuntu-amd64:c392670e2401a08956ed4f52c4152209617bcde872a71b0a3b87da86b8fad2dc
pushd podvm
./hack/download-image.sh ghcr.io/confidential-containers/podvm-generic-ubuntu-amd64:ci-pr1754 . -o podvm.qcow2
popd
export IMAGE="${PWD}/podvm/podvm.qcow2"
ls -al $IMAGE
virsh -c qemu:///system vol-create-as --pool default --name podvm-base.qcow2 --capacity 20G --allocation 2G --prealloc-metadata --format qcow2
virsh -c qemu:///system vol-upload --vol podvm-base.qcow2 $IMAGE --pool default --sparse
virsh -c qemu:///system vol-info --pool default podvm-base.qcow2
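To double-check the upload, you can also list the pool (an extra, optional check):
# Confirm podvm-base.qcow2 shows up in the default storage pool
virsh -c qemu:///system vol-list --pool default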
export LIBVIRT_IP="192.168.122.1"
export SSH_KEY_FILE="id_rsa"
./libvirt/install_operator.sh
You should now have pods:
# kubectl get pods -n confidential-containers-system
NAME READY STATUS RESTARTS AGE
cc-operator-controller-manager-767d88bbb4-9ppb6 2/2 Running 0 2m42s
cc-operator-daemon-install-6kxb5 0/1 ContainerCreating 0 103s
cc-operator-pre-install-daemon-zsvk6 1/1 Running 0 2m21s
cloud-api-adaptor-daemonset-dvskj 1/1 Running 0 2m42s
peerpodconfig-ctrl-caa-daemon-v6mvd 1/1 Running 0 2m21s
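It's also worth checking that the kata-remote runtime class exists, since the peer pod manifests later on rely on it (the operator install should have created it):
# The peer pod manifests below use runtimeClassName: kata-remote
kubectl get runtimeclass kata-remote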
Note: with the new operator peer pod approach we now have two CAA pods - the normal one and the peerpodconfig-ctrl one. This needs resolving, but in the short term we will edit both to use our CAA image quay.io/stevenhorsman/cloud-api-adaptor:dev-c7c48d677c2c1f4c7f4085c4e663d4a10daf1b2f-dirty
Build and push the CAA image:
docker login quay.io
registry=quay.io/stevenhorsman make image
This produces the quay.io/stevenhorsman/cloud-api-adaptor:dev-c7c48d677c2c1f4c7f4085c4e663d4a10daf1b2f-dirty image.
Edit both daemonsets to use it:
kubectl edit ds/peerpodconfig-ctrl-caa-daemon -n confidential-containers-system
kubectl edit ds/cloud-api-adaptor-daemonset -n confidential-containers-system
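If you'd rather not do the interactive edits, something like kubectl set image should also work as a sketch - note that the container name used here (cloud-api-adaptor-con) is an assumption and may need checking against the daemonset spec:
# Hedged, non-interactive alternative to the kubectl edit commands above.
# The container name (cloud-api-adaptor-con) is an assumption - verify it with:
#   kubectl get ds -n confidential-containers-system -o jsonpath='{.items[*].spec.template.spec.containers[*].name}'
CAA_IMAGE="quay.io/stevenhorsman/cloud-api-adaptor:dev-c7c48d677c2c1f4c7f4085c4e663d4a10daf1b2f-dirty"
kubectl -n confidential-containers-system set image ds/peerpodconfig-ctrl-caa-daemon cloud-api-adaptor-con="${CAA_IMAGE}"
kubectl -n confidential-containers-system set image ds/cloud-api-adaptor-daemonset cloud-api-adaptor-con="${CAA_IMAGE}"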
kcli ssh peer-pods-worker-0
test_images_to_remove=( "docker.io/rancher/mirrored-pause" "registry.k8s.io/pause" "quay.io/sjenning/nginx" "quay.io/prometheus/busybox" "quay.io/confidential-containers/test-images" )
ctr_args=""
if [ "${KUBERNETES}" = "k3s" ]; then
ctr_args="--address /run/k3s/containerd/containerd.sock "
fi
ctr_args+="--namespace k8s.io"
ctr_command="sudo -E ctr ${ctr_args}"
for related_image in "${test_images_to_remove[@]}"; do
# We need to delete related image
image_list=($(${ctr_command} i ls -q |grep "$related_image" |awk '{print $1}'))
if [ "${#image_list[@]}" -gt 0 ]; then
for image in "${image_list[@]}"; do
${ctr_command} i remove "$image"
done
fi
# We need to delete related content of image
IFS="/" read -ra parts <<< "$related_image";
repository="${parts[0]}";
image_name="${parts[1]}";
formatted_image="${parts[0]}=${parts[-1]}"
image_contents=($(${ctr_command} content ls | grep "${formatted_image}" | awk '{print $1}'))
if [ "${#image_contents[@]}" -gt 0 ]; then
for content in $image_contents; do
${ctr_command} content rm "$content"
done
fi
done
- Test it by creating a peer pod:
echo '
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
      annotations:
        io.containerd.cri.runtime-handler: kata-remote
    spec:
      runtimeClassName: kata-remote
      containers:
After waiting a little while you should see the pod running:
NAME                     READY   STATUS              RESTARTS   AGE
nginx-5bb58f7796-lqvr5   0/1     ContainerCreating   0          11s
nginx-5bb58f7796-lqvr5   1/1     Running             0          57s
- We can check the CAA logs to see that it was pulled correctly (depending on the order, you might need to check the peerpodconfig-ctrl version of the CAA ds):
kubectl delete deployment nginx
After chatting to Pradipta we've realised that we need to change and remove all of the CAA install/kustomize's installation of the caa-pod now that the peerpodconfig-ctrl is deploying it. There are a lot of references to it, so I'm trying to go through, unpick them and provide alternatives...
After a bunch of updates to resolve the double CAA ds, the latest instructions are a bit simpler:
I'm going to try and describe how to reliably and reproducibly set up a dev environment for testing this. I'm using libvirt with a kcli cluster and a pretty chunky 16 vCPU, 32GB RAM VM, but I'm not sure it strictly needs to be that large. I will also do some steps, like pushing the podvm image, that are just so people following this can use my images and save some time:
Clone my repo (edit this to pick your own fork if you have a branch based on mine)
export GOPATH="${HOME}/go"
cloud_api_adaptor_repo="github.com/confidential-containers/cloud-api-adaptor"
cloud_api_adaptor_dir="${GOPATH}/src/${cloud_api_adaptor_repo}"
mkdir -p $(dirname "${cloud_api_adaptor_dir}")
git clone -b main "https://${cloud_api_adaptor_repo}.git" "${cloud_api_adaptor_dir}"
pushd $cloud_api_adaptor_dir/src/cloud-api-adaptor
git remote add sh https://github.com/stevenhorsman/cloud-api-adaptor.git
git fetch sh
git checkout -b kata-runtime-bump sh/kata-runtime-bump
Set up libvirt and create a kcli cluster
./libvirt/config_libvirt.sh
./libvirt/kcli_cluster.sh create
export KUBECONFIG=$HOME/.kcli/clusters/peer-pods/auth/kubeconfig
Install some pre-reqs
echo "Install docker"
sudo snap install docker
sudo systemctl start snap.docker.dockerd
sudo systemctl enable snap.docker.dockerd
Build and publish a podvm image (you can skip this and use mine if you want)
make podvm-builder podvm-binaries podvm-image
docker image tag quay.io/confidential-containers/podvm-generic-ubuntu-amd64:c392670e2401a08956ed4f52c4152209617bcde872a71b0a3b87da86b8fad2dc quay.io/stevenhorsman/podvm-generic-ubuntu-amd64:c392670e2401a08956ed4f52c4152209617bcde872a71b0a3b87da86b8fad2dc
docker login quay.io
docker push quay.io/stevenhorsman/podvm-generic-ubuntu-amd64:c392670e2401a08956ed4f52c4152209617bcde872a71b0a3b87da86b8fad2dc
Download the podvm qcow2
pushd podvm
./hack/download-image.sh ghcr.io/confidential-containers/podvm-generic-ubuntu-amd64:ci-pr1754 . -o podvm.qcow2
popd
Prepare the libvirt podvm volume
export IMAGE="${PWD}/podvm/podvm.qcow2"
ls -al $IMAGE
virsh -c qemu:///system vol-create-as --pool default --name podvm-base.qcow2 --capacity 20G --allocation 2G --prealloc-metadata --format qcow2
virsh -c qemu:///system vol-upload --vol podvm-base.qcow2 $IMAGE --pool default --sparse
virsh -c qemu:///system vol-info --pool default podvm-base.qcow2
Create a new CAA image (this is optional - you can use mine: ghcr.io/confidential-containers/cloud-api-adaptor:ci-pr1754-dev)
docker login quay.io
registry=quay.io/stevenhorsman make image
Install the operator and libvirt CAA
export CAA_IMAGE="ghcr.io/confidential-containers/cloud-api-adaptor:ci-pr1754-dev"
export SSH_KEY_FILE="id_rsa"
./libvirt/install_operator.sh
After waiting a little while, you should now have pods:
# kubectl get pods -n confidential-containers-system
NAME READY STATUS RESTARTS AGE
cc-operator-controller-manager-7b6c5f84bf-nlqjt 2/2 Running 0 81s
cc-operator-daemon-install-mxxlk 0/1 ContainerCreating 0 23s
cc-operator-pre-install-daemon-sx4p8 1/1 Running 0 44s
peerpodconfig-ctrl-caa-daemon-j7chc 1/1 Running 0 44s
Log into the worker nodes and clean the pause image cache:
kcli ssh peer-pods-worker-0
test_images_to_remove=( "docker.io/rancher/mirrored-pause" "registry.k8s.io/pause" "quay.io/sjenning/nginx" "quay.io/prometheus/busybox" "quay.io/confidential-containers/test-images" )
ctr_args="" if [ "${KUBERNETES}" = "k3s" ]; then ctr_args="--address /run/k3s/containerd/containerd.sock " fi ctr_args+="--namespace k8s.io" ctr_command="sudo -E ctr ${ctr_args}" for related_image in "${test_images_to_remove[@]}"; do
image_list=($(${ctr_command} i ls -q |grep "$related_image" |awk '{print $1}'))
if [ "${#image_list[@]}" -gt 0 ]; then
for image in "${image_list[@]}"; do
${ctr_command} i remove "$image"
done
fi
# We need to delete related content of image
IFS="/" read -ra parts <<< "$related_image";
repository="${parts[0]}";
image_name="${parts[1]}";
formatted_image="${parts[0]}=${parts[-1]}"
image_contents=($(${ctr_command} content ls | grep "${formatted_image}" | awk '{print $1}'))
if [ "${#image_contents[@]}" -gt 0 ]; then
for content in $image_contents; do
${ctr_command} content rm "$content"
done
fi
done
exit
- Test it by creating a peer pod:
echo '
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  annotations:
    io.containerd.cri.runtime-handler: kata-remote
  labels:
    app: nginx
spec:
  runtimeClassName: kata-remote
  containers:
After waiting a little while you should see the pod running:
NAME                     READY   STATUS              RESTARTS   AGE
nginx-5bb58f7796-lqvr5   0/1     ContainerCreating   0          11s
nginx-5bb58f7796-lqvr5   1/1     Running             0          57s
- We can check the CAA logs to see that it was pulled correctly:
$ kubectl logs ds/peerpodconfig-ctrl-caa-daemon -n confidential-containers-system
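To confirm the guest pull, grepping those logs for the image_guest_pull storage driver seen in the earlier CreateContainer output works well:
kubectl logs ds/peerpodconfig-ctrl-caa-daemon -n confidential-containers-system | grep image_guest_pull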
kubectl delete pod nginx
Hi @stevenhorsman I can confirm this is working with a containerd cluster. I was able to reproduce it.
Unfortunately not true with cri-o. :(
$ kubectl logs pod/cc-operator-pre-install-daemon-nmmhs -n confidential-containers-system
INSTALL_COCO_CONTAINERD: false
INSTALL_OFFICIAL_CONTAINERD: false
INSTALL_VFIO_GPU_CONTAINERD: false
INSTALL_NYDUS_SNAPSHOTTER: true
ERROR: cri-o is not yet supported
It looks like the operator does not fully support cri-o yet.
A small update here - I've created new images with the 3.4 release of kata-containers:
quay.io/stevenhorsman/cloud-api-adaptor:dev-040e9bef8fdb4e2ed94cf68ae04e89076f7b3249
quay.io/stevenhorsman/podvm-generic-ubuntu-amd64:9def86b85afa60d06146608c1f27c74a0568534540f0cc2d60c08cb046cfa0db
This is going okay and I have some e2e passes locally, but I can't get them in the e2e CI, as the pull_request_target trigger only pulls the workflow from the main branch, so we first have to get the workflow code that should work into main, and I've spun up https://github.com/confidential-containers/cloud-api-adaptor/pull/1828 to do this.
I also can't fully test it on my fork as it runs on the azure runner that I can't access.
The workflow PR https://github.com/confidential-containers/cloud-api-adaptor/pull/1828 has been merged, so hopefully if I rebase the kata-runtime-bump branch on that we might be able to get some e2e tests running in the PR.
As part of the merge to main effort, we have https://github.com/kata-containers/kata-containers/pull/7046 which is adding the remote hypervisor feature to the kata runtime. Once this is merged we should test out whether the CAA can re-vendor on it and see what issues there are. We also know that as part of these changes we want to remove the gogoprotobuf workaround https://github.com/confidential-containers/cloud-api-adaptor/blob/eb1b368f84825bb83b9033f07228e04cdef3ceb1/go.mod#L162-L164 and align with kata, where we had to do the same thing.