Closed by wainersm 11 months ago
Kata Containers for CoCo 0.8.0 RC1 - commit 424de1cbfa4e1da9ecf9a56b1d1e1a11a4f339cd
guest-components for CoCo 0.8.0 RC1 - commit 615a46ff16ee8670014946d14e44e85cede82f01 (the same commit pinned in kata's 424de1cbfa4e1da9ecf9a56b1d1e1a11a4f339cd)
I've raised PR https://github.com/confidential-containers/cloud-api-adaptor/pull/1559 to cover this section of the release doc:
Update the csi-wrapper and peerpod-ctrl go modules to use the tagged version of cloud-api-adaptor, by running:
go get github.com/confidential-containers/cloud-api-adaptor@v<version>-alpha.1
go mod tidy
in their directories, and removing the local replace references if we needed to add them earlier.
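For illustration, the concrete commands for this release might look like the following sketch; the -dropreplace step only applies if a local replace directive was added earlier, and the directory names are taken from the doc text, so the actual paths in the repo may differ:
cd csi-wrapper                                                                 # or the actual path of the module in the repo
go get github.com/confidential-containers/cloud-api-adaptor@v0.8.0-alpha.1     # pin the tagged pre-release
go mod edit -dropreplace=github.com/confidential-containers/cloud-api-adaptor  # only if a local replace was added earlier
go mod tidy
cd ../peerpod-ctrl && go get github.com/confidential-containers/cloud-api-adaptor@v0.8.0-alpha.1 && go mod tidy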
I've created pre-release https://github.com/confidential-containers/cloud-api-adaptor/releases/tag/v0.8.0-alpha.1 now, which should trigger the podvm build process to test that.
As reported on Slack, I've manually tested the RC podvm and CAA code for IBM Cloud with amd64 and s390x clusters/peer pods
I tested the Azure podvm & CAA images built on the v0.8.0-alpha.1 rev. Looks good; remote attestation also seems to work.
I've run the libvirt e2e tests too:
time="2023-11-06T08:28:06-08:00" level=info msg="Install the cloud-api-adaptor"
Wait for the cc-operator-daemon-install DaemonSet be available
Wait for the pod cc-operator-daemon-install-bbzcq be ready
Wait for the cloud-api-adaptor-daemonset DaemonSet be available
Wait for the pod cloud-api-adaptor-daemonset-dgqt9 be ready
Wait for the kata-remote runtimeclass be created
=== RUN TestLibvirtCreateSimplePod
=== RUN TestLibvirtCreateSimplePod/SimplePeerPod_test
assessment_runner_test.go:202: Expected Pod State: Running
assessment_runner_test.go:203: Current Pod State: Running
=== RUN TestLibvirtCreateSimplePod/SimplePeerPod_test/PodVM_is_created
assessment_helpers_test.go:159: Pulled with nydus-snapshotter driver:2023/11/06 16:35:13 [adaptor/proxy] mount_point:/run/kata-containers/77b955c79d3b1d8338e08a03b2257f75da8f87d59f064d518eb629135b19eebf/rootfs source:docker.io/library/nginx:latest fstype:overlay driver:image_guest_pull
time="2023-11-06T08:35:36-08:00" level=info msg="Deleting pod nginx..."
time="2023-11-06T08:35:51-08:00" level=info msg="Pod nginx has been successfully deleted within 60s"
--- PASS: TestLibvirtCreateSimplePod (162.18s)
--- PASS: TestLibvirtCreateSimplePod/SimplePeerPod_test (162.16s)
--- PASS: TestLibvirtCreateSimplePod/SimplePeerPod_test/PodVM_is_created (1.52s)
=== RUN TestLibvirtCreatePodWithConfigMap
=== RUN TestLibvirtCreatePodWithConfigMap/ConfigMapPeerPod_test
assessment_runner_test.go:202: Expected Pod State: Running
assessment_runner_test.go:203: Current Pod State: Running
=== RUN TestLibvirtCreatePodWithConfigMap/ConfigMapPeerPod_test/Configmap_is_created_and_contains_data
time="2023-11-06T08:37:22-08:00" level=info msg="Data Inside Configmap: Hello, world"
time="2023-11-06T08:37:22-08:00" level=info msg="Deleting Configmap... nginx-configmap"
time="2023-11-06T08:37:22-08:00" level=info msg="Deleting pod nginx-configmap-pod..."
time="2023-11-06T08:37:32-08:00" level=info msg="Pod nginx-configmap-pod has been successfully deleted within 60s"
--- PASS: TestLibvirtCreatePodWithConfigMap (101.40s)
--- PASS: TestLibvirtCreatePodWithConfigMap/ConfigMapPeerPod_test (101.39s)
--- PASS: TestLibvirtCreatePodWithConfigMap/ConfigMapPeerPod_test/Configmap_is_created_and_contains_data (5.71s)
=== RUN TestLibvirtCreatePodWithSecret
=== RUN TestLibvirtCreatePodWithSecret/SecretPeerPod_test
assessment_runner_test.go:202: Expected Pod State: Running
assessment_runner_test.go:203: Current Pod State: Running
=== RUN TestLibvirtCreatePodWithSecret/SecretPeerPod_test/Secret_has_been_created_and_contains_data
time="2023-11-06T08:38:58-08:00" level=info msg="Username from secret inside pod: admin"
time="2023-11-06T08:39:03-08:00" level=info msg="Password from secret inside pod: password"
time="2023-11-06T08:39:03-08:00" level=info msg="Deleting Secret... nginx-secret"
time="2023-11-06T08:39:03-08:00" level=info msg="Deleting pod nginx-secret-pod..."
time="2023-11-06T08:39:13-08:00" level=info msg="Pod nginx-secret-pod has been successfully deleted within 60s"
--- PASS: TestLibvirtCreatePodWithSecret (101.04s)
--- PASS: TestLibvirtCreatePodWithSecret/SecretPeerPod_test (101.03s)
--- PASS: TestLibvirtCreatePodWithSecret/SecretPeerPod_test/Secret_has_been_created_and_contains_data (10.60s)
=== RUN TestLibvirtCreatePeerPodContainerWithExternalIPAccess
=== RUN TestLibvirtCreatePeerPodContainerWithExternalIPAccess/IPAccessPeerPod_test
assessment_runner_test.go:202: Expected Pod State: Running
assessment_runner_test.go:203: Current Pod State: Running
=== RUN TestLibvirtCreatePeerPodContainerWithExternalIPAccess/IPAccessPeerPod_test/Peer_Pod_Container_Connected_to_External_IP
time="2023-11-06T08:40:29-08:00" level=info msg="Output of ping command in busybox : PING www.google.com (142.251.36.36): 56 data bytes\n64 bytes from 142.251.36.36: seq=0 ttl=48 time=15.478 ms\n\n--- www.google.com ping statistics ---\n1 packets transmitted, 1 packets received, 0% packet loss\nround-trip min/avg/max = 15.478/15.478/15.478 ms\n"
time="2023-11-06T08:40:29-08:00" level=info msg="Deleting pod busybox-pod..."
time="2023-11-06T08:40:34-08:00" level=info msg="Pod busybox-pod has been successfully deleted within 60s"
--- PASS: TestLibvirtCreatePeerPodContainerWithExternalIPAccess (80.55s)
--- PASS: TestLibvirtCreatePeerPodContainerWithExternalIPAccess/IPAccessPeerPod_test (80.54s)
--- PASS: TestLibvirtCreatePeerPodContainerWithExternalIPAccess/IPAccessPeerPod_test/Peer_Pod_Container_Connected_to_External_IP (5.38s)
=== RUN TestLibvirtCreatePeerPodWithJob
=== RUN TestLibvirtCreatePeerPodWithJob/JobPeerPod_test
=== RUN TestLibvirtCreatePeerPodWithJob/JobPeerPod_test/Job_has_been_created
assessment_helpers_test.go:239: WARNING: job-pi-78ph2 - StartError
assessment_helpers_test.go:264: SUCCESS: job-pi-hg9hs - Completed - LOG: 3.14156
assessment_runner_test.go:226: Expected Completed status on first attempt
time="2023-11-06T08:42:49-08:00" level=info msg="Deleting Job... job-pi"
time="2023-11-06T08:42:49-08:00" level=info msg="Deleting pods created by job... job-pi-78ph2"
time="2023-11-06T08:42:49-08:00" level=info msg="Deleting pods created by job... job-pi-hg9hs"
--- PASS: TestLibvirtCreatePeerPodWithJob (135.49s)
--- PASS: TestLibvirtCreatePeerPodWithJob/JobPeerPod_test (135.49s)
--- SKIP: TestLibvirtCreatePeerPodWithJob/JobPeerPod_test/Job_has_been_created (0.15s)
=== RUN TestLibvirtCreatePeerPodAndCheckUserLogs
common_suite_test.go:154: Skipping Test until issue kata-containers/kata-containers#5732 is Fixed
--- SKIP: TestLibvirtCreatePeerPodAndCheckUserLogs (0.00s)
=== RUN TestLibvirtCreatePeerPodAndCheckWorkDirLogs
=== RUN TestLibvirtCreatePeerPodAndCheckWorkDirLogs/WorkDirPeerPod_test
=== RUN TestLibvirtCreatePeerPodAndCheckWorkDirLogs/WorkDirPeerPod_test/Peer_pod_with_work_directory_has_been_created
assessment_runner_test.go:260: Log output of peer pod:/other
time="2023-11-06T08:44:04-08:00" level=info msg="Deleting pod workdirpod..."
time="2023-11-06T08:44:09-08:00" level=info msg="Pod workdirpod has been successfully deleted within 60s"
--- PASS: TestLibvirtCreatePeerPodAndCheckWorkDirLogs (80.17s)
--- PASS: TestLibvirtCreatePeerPodAndCheckWorkDirLogs/WorkDirPeerPod_test (80.17s)
--- PASS: TestLibvirtCreatePeerPodAndCheckWorkDirLogs/WorkDirPeerPod_test/Peer_pod_with_work_directory_has_been_created (5.04s)
=== RUN TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithImageOnly
=== RUN TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithImageOnly/EnvVariablePeerPodWithImageOnly_test
=== RUN TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithImageOnly/EnvVariablePeerPodWithImageOnly_test/Peer_pod_with_environmental_variables_has_been_created
assessment_runner_test.go:260: Log output of peer pod:KUBERNETES_SERVICE_PORT=443
KUBERNETES_PORT=tcp://10.96.0.1:443
HOSTNAME=env-variable-in-image
SHLVL=1
HOME=/root
KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
ISPRODUCTION=false
KUBERNETES_SERVICE_HOST=10.96.0.1
PWD=/
time="2023-11-06T08:45:24-08:00" level=info msg="Deleting pod env-variable-in-image..."
time="2023-11-06T08:45:29-08:00" level=info msg="Pod env-variable-in-image has been successfully deleted within 60s"
--- PASS: TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithImageOnly (80.16s)
--- PASS: TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithImageOnly/EnvVariablePeerPodWithImageOnly_test (80.16s)
--- PASS: TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithImageOnly/EnvVariablePeerPodWithImageOnly_test/Peer_pod_with_environmental_variables_has_been_created (5.03s)
=== RUN TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithDeploymentOnly
=== RUN TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithDeploymentOnly/EnvVariablePeerPodWithDeploymentOnly_test
=== RUN TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithDeploymentOnly/EnvVariablePeerPodWithDeploymentOnly_test/Peer_pod_with_environmental_variables_has_been_created
assessment_runner_test.go:260: Log output of peer pod:KUBERNETES_SERVICE_PORT=443
KUBERNETES_PORT=tcp://10.96.0.1:443
HOSTNAME=env-variable-in-config
HOME=/root
PKG_RELEASE=1~bookworm
NGINX_VERSION=1.25.3
KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
NJS_VERSION=0.8.2
KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
ISPRODUCTION=true
KUBERNETES_SERVICE_HOST=10.96.0.1
PWD=/
time="2023-11-06T08:46:55-08:00" level=info msg="Deleting pod env-variable-in-config..."
time="2023-11-06T08:47:00-08:00" level=info msg="Pod env-variable-in-config has been successfully deleted within 60s"
--- PASS: TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithDeploymentOnly (90.34s)
--- PASS: TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithDeploymentOnly/EnvVariablePeerPodWithDeploymentOnly_test (90.34s)
--- PASS: TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithDeploymentOnly/EnvVariablePeerPodWithDeploymentOnly_test/Peer_pod_with_environmental_variables_has_been_created (5.17s)
=== RUN TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithImageAndDeployment
=== RUN TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithImageAndDeployment/EnvVariablePeerPodWithBoth_test
=== RUN TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithImageAndDeployment/EnvVariablePeerPodWithBoth_test/Peer_pod_with_environmental_variables_has_been_created
assessment_runner_test.go:260: Log output of peer pod:KUBERNETES_PORT=tcp://10.96.0.1:443
KUBERNETES_SERVICE_PORT=443
HOSTNAME=env-variable-in-both
SHLVL=1
HOME=/root
KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
ISPRODUCTION=true
KUBERNETES_SERVICE_HOST=10.96.0.1
PWD=/
time="2023-11-06T08:48:15-08:00" level=info msg="Deleting pod env-variable-in-both..."
time="2023-11-06T08:48:20-08:00" level=info msg="Pod env-variable-in-both has been successfully deleted within 60s"
--- PASS: TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithImageAndDeployment (80.25s)
--- PASS: TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithImageAndDeployment/EnvVariablePeerPodWithBoth_test (80.25s)
--- PASS: TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithImageAndDeployment/EnvVariablePeerPodWithBoth_test/Peer_pod_with_environmental_variables_has_been_created (5.09s)
PASS
ok github.com/confidential-containers/cloud-api-adaptor/test/e2e 1879.673s
@wainersm @bpradipt - do you know of any other tests we need to do before we can say that peer pods 0.8.0 alpha.1 testing is complete?
Hmm, when running the e2e test suite (unlike in my manual test), I'm seeing snapshotter errors:
Failed to pull image "nginx": rpc error: code = Unknown desc = failed to pull
and unpack image "docker.io/library/nginx:latest": failed to prepare
extraction snapshot "extract-218425411-EbOv
sha256:ec983b16636050e69677eb81537e955ab927757c23aaf73971ecf5f71fcc262a":
missing CRI reference annotation for snaposhot 8: unknown
Note the typo "snaposhot"; that's from nydus, apparently.
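For anyone else hitting this, a quick sanity check on the worker node can show whether containerd and the nydus snapshotter are wired up; this is only a sketch, and the config path and service name are assumptions that may differ per install:
containerd --version                                  # the remote snapshotter flow needs containerd 1.7+
grep -A3 proxy_plugins /etc/containerd/config.toml    # nydus should be registered as a proxy snapshotter here (assumed config path)
systemctl status nydus-snapshotter                    # assumption: the snapshotter was deployed as a systemd service on the node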
@stevenhorsman I'm managing to run the alpha.1 release with AWS.
The simple pod failed on AWS (AKS) and the remaining tests didn't run because it hit the timeout:
=== RUN TestAwsCreateSimplePod
=== RUN TestAwsCreateSimplePod/SimplePeerPod_test
assessment_runner_test.go:190: timed out waiting for the condition
--- FAIL: TestAwsCreateSimplePod (900.53s)
--- FAIL: TestAwsCreateSimplePod/SimplePeerPod_test (900.53s)
I will get more information.
FYI @bpradipt
@wainersm - Hey Wainer, does your cluster have multiple worker nodes by any chance? I've just realised that my nydus verification test just runs on the first CAA ds pod it finds, so it wouldn't be reliable on a multi-node cluster. I'm looking to fix it soon, but I can back out the test failure in the short term if it's causing issues?
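For reference, a minimal sketch of how the check could pick the CAA pod on the same node as the workload instead of the first one it finds; the label selector and namespace are assumptions:
NODE=$(kubectl get pod nginx -o jsonpath='{.spec.nodeName}')        # node the test pod landed on
kubectl get pods -n confidential-containers-system -l app=cloud-api-adaptor \
  --field-selector spec.nodeName="$NODE" -o name                     # CAA pod running on that same node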
Hi @stevenhorsman, it is a single-node cluster, but it's good to know the tests aren't reliable on a multi-node cluster!
The problem is actually that the region where the framework deployed the cluster does not support confidential VMs, which are now the default since commit e4059a5223bf4a955391b7f87178af9b11809dc2. I will disable CVM and see if it passes the simple tests.
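A quick way to confirm the daemonset is actually running with CVM disabled once the change is applied; this is a sketch and the confidential-containers-system namespace is an assumption:
kubectl -n confidential-containers-system set env ds/cloud-api-adaptor-daemonset --list | grep -i cvm
# expect DISABLECVM=true in the output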
it is a single-node cluster but good to know tests aren't reliable on a multi-node cluster!
I've created https://github.com/confidential-containers/cloud-api-adaptor/pull/1562 which I hope fixes the issue
@bpradipt @stevenhorsman the status of my tests for AWS is:
- DISABLECVM=true
- INSTALL_OFFICIAL_CONTAINERD=true, which forced the operator to supersede the system's containerd with its own 1.7
- Now the "simple pod" fails to start with a nydus-related error:
Warning Failed 48m kubelet Failed to pull image "nginx": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/library/nginx:latest": failed to prepare extraction snapshot "extract-101183212-RW23 sha256:ec983b16636050e69677eb81537e955ab927757c23aaf73971ecf5f71fcc262a": missing CRI reference annotation for snaposhot 3: unknown
Warning Failed 48m kubelet Failed to pull image "nginx": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/library/nginx:latest": failed to prepare extraction snapshot "extract-967947483-KBeo sha256:ec983b16636050e69677eb81537e955ab927757c23aaf73971ecf5f71fcc262a": missing CRI reference annotation for snaposhot 4: unknown
Warning Failed 47m kubelet Failed to pull image "nginx": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/library/nginx:latest": failed to prepare extraction snapshot "extract-758725951-sRqW sha256:ec983b16636050e69677eb81537e955ab927757c23aaf73971ecf5f71fcc262a": missing CRI reference annotation for snaposhot 5: unknown
Does it ring a bell?
My cluster was gone overnight and I am working to re-install it to obtain more information.
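When the cluster is re-installed, something like this sketch should capture the relevant evidence (pod name taken from the events above; run the journalctl line on the worker node):
kubectl describe pod nginx | grep -B1 -A2 "Failed to pull image"     # kubelet events for the failing pull
journalctl -u containerd --since "1 hour ago" | grep -i snapshot     # look for the missing CRI reference annotation error on the node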
Oh, that's relieving to see 😅
but are we talking about AKS or EKS (on AWS)? AKS clusters should bundle containerd 1.7, I think?
Sorry, I meant AWS EKS :)
Yesterday I tried again on AWS EKS on two versions (1.26 and 1.28) of Kubernetes; I got the same missing CRI reference annotation for snaposhot error in both cases.
Then I deployed a kubeadm cluster via kcli on AWS with CentOS Stream 8 workers. This time I wasn't even able to get the podvm properly running (it fails on the "initializing" checks). This is the first time I've tried that installation method, so it might be that I made some mistake.
@mkulke any luck with AKS?
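A few generic checks that can help when a podvm doesn't come up; the namespace and daemonset name are taken from the install logs earlier in this thread, and the virsh line only applies to the libvirt provider:
kubectl -n confidential-containers-system logs ds/cloud-api-adaptor-daemonset --tail=100   # CAA's side of the VM creation
kubectl describe pod <failing-pod>                 # kubelet events, e.g. the initializing/timeout failures
virsh -c qemu:///system list --all                 # libvirt only: confirm the podvm domain was actually created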
@wainersm: not yet, I'm trying to establish a working baseline with CLOUD_PROVIDER=libvirt on a single-node cluster. kcli also gave me grief, so I aborted that approach and I'm trying to deploy libvirt on a single-node cluster created with kubeadm. It will start VMs, but the podvms do not seem to come up properly.
Do you think it makes sense to add the missing CRI reference annotation for snaposhot error to the tracking issue for the remote snapshotter on CC, now that we have this somewhat confirmed?
Good idea. Let me add an entry on that for the peer pods issue.
Done :)
Marked AKS as passing. I also manually tested image decryption with snapshotter pulling; it works as intended.
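For context, a sketch of the kind of pod spec one could use for such a manual check; the image reference is a placeholder, and the runtime-handler annotation value is an assumption about how pulls get routed to the remote snapshotter with the kata-remote runtime class:
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: encrypted-image-test
  annotations:
    io.containerd.cri.runtime-handler: kata-remote   # assumption: lets the CRI pick the remote snapshotter at pull time
spec:
  runtimeClassName: kata-remote
  containers:
  - name: app
    image: <registry>/<encrypted-image>:latest       # placeholder: an image encrypted against the KBS-provisioned key
EOF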
I've created https://github.com/confidential-containers/cloud-api-adaptor/pull/1570 to bump the versions to the 0.8 release
After the release I re-tested everything from scratch locally. For IBM Cloud on s390x every test passed. For libvirt via kcli on x86 they all passed except the nydus pull test:
=== RUN TestLibvirtCreateSimplePodWithNydusAnnotation/SimplePeerPod_test/PodVM_is_created
assessment_helpers_test.go:162: Called PullImage explicitly, not using nydus-snapshotter :2023/11/14 15:08:25 [adaptor/proxy] CreateContainer: calling PullImage for "docker.io/library/alpine:latest" before CreateContainer (cid: "697963694d8aed2171cd3ea83759e0893c9f752b63c0e4047ffba26fc0c8e97d")
assessment_runner_test.go:370: Expected to pull with nydus, but that didn't happen
time="2023-11-14T07:08:34-08:00" level=info msg="Deleting pod alpine..."
time="2023-11-14T07:08:44-08:00" level=info msg="Pod alpine has been successfully deleted within 60s"
--- FAIL: TestLibvirtCreateSimplePodWithNydusAnnotation (117.37s)
--- FAIL: TestLibvirtCreateSimplePodWithNydusAnnotation/SimplePeerPod_test (117.36s)
--- FAIL: TestLibvirtCreateSimplePodWithNydusAnnotation/SimplePeerPod_test/PodVM_is_created (1.57s)
I've manually checked and we aren't using the nydus pull with libvirt, but given that we've called that feature experimental, I'm not sure if that is a problem at this point?
Hi Steve, currently kcli deploys Ubuntu 20.04 nodes with containerd 1.6. As we are not setting INSTALL_OFFICIAL_CONTAINERD=true, it is not replaced with containerd 1.7, so nydus-snapshotter doesn't work. Let me send a PR to update the scripts to use Ubuntu 22.04... hopefully it will work out of the box.
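As a sketch of the remedy under discussion (the exact invocation of the e2e provisioning is an assumption and may differ from the real Make target or script):
# per this thread, setting this before provisioning makes the deployment replace the distro containerd 1.6 with the official 1.7 build
export INSTALL_OFFICIAL_CONTAINERD=true
make test-e2e CLOUD_PROVIDER=libvirt     # assumed e2e entrypoint; adjust to the actual target/script used for libvirt provisioning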
Let me send a PR to update the scripts to use ubuntu 22.04... hopefully it will work out of box.
@wainersm - FYI - I'm testing https://github.com/stevenhorsman/cloud-api-adaptor/tree/containerd-22.04-switch locally at the moment, but if you have already created this PR then let me know and I can switch to it
I didn't create the PR because yesterday I couldn't set up the cluster with Ubuntu 22.04 due to this bug: https://github.com/karmab/kcli/issues/615. I still don't know how to test the fix; I'm not sure whether kcli has nightly builds or not. Do you know? Can you create the cluster with the kcli version you have installed?
Ah, let's use your PR ;)
Can you create the cluster with the kcli version you have installed?
Hmm, so I can create the cluster, but the test still failed, so I need to look into it more. Unfortunately I didn't disable teardown, and I'm having trouble creating the kcli cluster now, so I've had to spin up a new environment.
So to start with, the Ubuntu 22.04 cluster seems correct:
ubuntu@peer-pods-worker-0:~$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
but it still has containerd 1.6:
ubuntu@peer-pods-worker-0:~$ containerd --version
containerd containerd.io 1.6.24 61f9fd88f79f081d64d6fa3bb1a0dc71ec870523
I'm guessing that my libvirt testing on rc1 was before the operator change that removed containerd from always being installed, and that's why it worked then.
Yup, containerd is 1.6 on Ubuntu 22.04. We need to set that env.
Issue tracker for the CAA release as part of CoCo v0.8.0
v0.8.0-alpha.1