confidential-containers / cloud-api-adaptor

Ability to create Kata pods using cloud provider APIs aka the peer-pods approach
Apache License 2.0

Add e2e tests for docker #1845

Open · bpradipt opened 1 month ago

bpradipt commented 1 month ago

Also makes the docker network and podvm image configurable to help with e2e testing, and includes some minor fixes.

bpradipt commented 1 week ago

Properties file used

# Docker configs
CLUSTER_NAME="peer-pods"
DOCKER_HOST="unix:///var/run/docker.sock"
DOCKER_PODVM_IMAGE="quay.io/bpradipt/podvm-docker-image"
DOCKER_NETWORK_NAME="kind"
CAA_IMAGE="quay.io/bpradipt/cloud-api-adaptor"
CAA_IMAGE_TAG="latest"

# KBS configs
KBS_IMAGE=""
KBS_IMAGE_TAG=""

Test results

ubuntu@test-pp:~/cloud-api-adaptor/src/cloud-api-adaptor$ make TEST_PODVM_IMAGE=quay.io/bpradipt/podvm-docker-image TEST_PROVISION=yes CLOUD_PROVIDER=docker TEST_PROVISION_FILE=$(pwd)/docker/provision_docker.properties test-e2e
go test -v -tags=docker -timeout 60m -count=1 ./test/e2e
time="2024-06-20T10:53:18Z" level=info msg="Do setup"
time="2024-06-20T10:53:18Z" level=info msg="Cluster provisioning"
Docker is already installed
Check if kind is already installed
kind is already installed
Check if the cluster peer-pods already exists
Cluster peer-pods already exists
Adding worker label to nodes belonging to: peer-pods
time="2024-06-20T10:53:18Z" level=info msg="Install Cloud API Adaptor"
time="2024-06-20T10:53:18Z" level=info msg="Deploy the Cloud API Adaptor"
time="2024-06-20T10:53:18Z" level=info msg="Install the controller manager"
Wait for the cc-operator-controller-manager deployment be available
time="2024-06-20T10:53:27Z" level=info msg="Customize the overlay yaml file"
time="2024-06-20T10:53:27Z" level=info msg="Updating CAA image with \"quay.io/bpradipt/cloud-api-adaptor\""
time="2024-06-20T10:53:27Z" level=info msg="Updating CAA image tag with \"latest\""
time="2024-06-20T10:53:29Z" level=info msg="Install the cloud-api-adaptor"
Wait for the cc-operator-daemon-install DaemonSet be available
Wait for the pod cc-operator-daemon-install-pczvk be ready
Wait for the cloud-api-adaptor-daemonset DaemonSet be available
Wait for the pod cloud-api-adaptor-daemonset-f8t2j be ready
Wait for the kata-remote runtimeclass be created
time="2024-06-20T10:53:54Z" level=info msg="Installing peerpod-ctrl"
time="2024-06-20T10:53:57Z" level=info msg="Wait for the peerpod-ctrl deployment to be available"
time="2024-06-20T10:54:02Z" level=info msg="Creating namespace 'coco-pp-e2e-test-55e360fb'..."
time="2024-06-20T10:54:02Z" level=info msg="Wait for namespace 'coco-pp-e2e-test-55e360fb' be ready..."
time="2024-06-20T10:54:07Z" level=info msg="Wait for default serviceaccount in namespace 'coco-pp-e2e-test-55e360fb'..."
time="2024-06-20T10:54:07Z" level=info msg="default serviceAccount exists, namespace 'coco-pp-e2e-test-55e360fb' is ready for use"
=== RUN   TestDockerCreateSimplePod
=== RUN   TestDockerCreateSimplePod/SimplePeerPod_test
    assessment_runner.go:265: Waiting for containers in pod: simple-test are ready
=== RUN   TestDockerCreateSimplePod/SimplePeerPod_test/PodVM_is_created
    assessment_helpers.go:175: Pulled with nydus-snapshotter driver:2024/06/20 10:54:10 [adaptor/proxy]         mount_point:/run/kata-containers/d7d8472979abe9faa2a9c844ab9131af3377491083a81cc17a03050fbafbde7c/rootfs source:quay.io/prometheus/busybox:latest fstype:overlay driver:image_guest_pull
time="2024-06-20T10:54:17Z" level=info msg="Deleting pod simple-test..."
time="2024-06-20T10:54:22Z" level=info msg="Pod simple-test has been successfully deleted within 60s"
--- PASS: TestDockerCreateSimplePod (15.12s)
    --- PASS: TestDockerCreateSimplePod/SimplePeerPod_test (15.12s)
        --- PASS: TestDockerCreateSimplePod/SimplePeerPod_test/PodVM_is_created (0.06s)
=== RUN   TestDockerCreatePodWithConfigMap
=== RUN   TestDockerCreatePodWithConfigMap/ConfigMapPeerPod_test
    assessment_runner.go:265: Waiting for containers in pod: busybox-configmap-pod are ready
=== RUN   TestDockerCreatePodWithConfigMap/ConfigMapPeerPod_test/Configmap_is_created_and_contains_data
    assessment_runner.go:415: Output when execute test commands:
time="2024-06-20T10:54:37Z" level=info msg="Deleting Configmap... busybox-configmap"
time="2024-06-20T10:54:37Z" level=info msg="Deleting pod busybox-configmap-pod..."
time="2024-06-20T10:54:42Z" level=info msg="Pod busybox-configmap-pod has been successfully deleted within 60s"
--- PASS: TestDockerCreatePodWithConfigMap (20.17s)
    --- PASS: TestDockerCreatePodWithConfigMap/ConfigMapPeerPod_test (20.17s)
        --- PASS: TestDockerCreatePodWithConfigMap/ConfigMapPeerPod_test/Configmap_is_created_and_contains_data (5.11s)
=== RUN   TestDockerCreatePodWithSecret
=== RUN   TestDockerCreatePodWithSecret/SecretPeerPod_test
    assessment_runner.go:265: Waiting for containers in pod: busybox-secret-pod are ready
=== RUN   TestDockerCreatePodWithSecret/SecretPeerPod_test/Secret_has_been_created_and_contains_data
    assessment_runner.go:415: Output when execute test commands:
time="2024-06-20T10:55:02Z" level=info msg="Deleting Secret... busybox-secret"
time="2024-06-20T10:55:02Z" level=info msg="Deleting pod busybox-secret-pod..."
time="2024-06-20T10:55:07Z" level=info msg="Pod busybox-secret-pod has been successfully deleted within 60s"
--- PASS: TestDockerCreatePodWithSecret (25.66s)
    --- PASS: TestDockerCreatePodWithSecret/SecretPeerPod_test (25.66s)
        --- PASS: TestDockerCreatePodWithSecret/SecretPeerPod_test/Secret_has_been_created_and_contains_data (5.13s)
=== RUN   TestDockerCreatePeerPodContainerWithExternalIPAccess
=== RUN   TestDockerCreatePeerPodContainerWithExternalIPAccess/IPAccessPeerPod_test
    assessment_runner.go:265: Waiting for containers in pod: busybox are ready
=== RUN   TestDockerCreatePeerPodContainerWithExternalIPAccess/IPAccessPeerPod_test/Peer_Pod_Container_Connected_to_External_IP
    assessment_runner.go:415: Output when execute test commands:
time="2024-06-20T10:55:28Z" level=info msg="Deleting pod busybox..."
time="2024-06-20T10:55:33Z" level=info msg="Pod busybox has been successfully deleted within 60s"
--- PASS: TestDockerCreatePeerPodContainerWithExternalIPAccess (25.22s)
    --- PASS: TestDockerCreatePeerPodContainerWithExternalIPAccess/IPAccessPeerPod_test (25.22s)
        --- PASS: TestDockerCreatePeerPodContainerWithExternalIPAccess/IPAccessPeerPod_test/Peer_Pod_Container_Connected_to_External_IP (5.17s)
=== RUN   TestDockerCreatePeerPodWithJob
=== RUN   TestDockerCreatePeerPodWithJob/JobPeerPod_test
=== RUN   TestDockerCreatePeerPodWithJob/JobPeerPod_test/Job_has_been_created
    assessment_helpers.go:291: SUCCESS: job-pi-c4kqx - Completed - LOG: 3.14156
time="2024-06-20T10:55:43Z" level=info msg="Output Log from Pod: 3.14156"
time="2024-06-20T10:55:43Z" level=info msg="Deleting Job... job-pi"
time="2024-06-20T10:55:43Z" level=info msg="Deleting pods created by job... job-pi-c4kqx"
--- PASS: TestDockerCreatePeerPodWithJob (10.07s)
    --- PASS: TestDockerCreatePeerPodWithJob/JobPeerPod_test (10.07s)
        --- PASS: TestDockerCreatePeerPodWithJob/JobPeerPod_test/Job_has_been_created (0.02s)
=== RUN   TestDockerCreatePeerPodAndCheckUserLogs
    common_suite.go:161: Skipping Test until issue kata-containers/kata-containers#5732 is Fixed
--- SKIP: TestDockerCreatePeerPodAndCheckUserLogs (0.00s)
=== RUN   TestDockerCreatePeerPodAndCheckWorkDirLogs
=== RUN   TestDockerCreatePeerPodAndCheckWorkDirLogs/WorkDirPeerPod_test
=== RUN   TestDockerCreatePeerPodAndCheckWorkDirLogs/WorkDirPeerPod_test/Peer_pod_with_work_directory_has_been_created
    assessment_runner.go:362: Log output of peer pod:/other
time="2024-06-20T10:58:18Z" level=info msg="Deleting pod workdirpod..."
time="2024-06-20T10:58:23Z" level=info msg="Pod workdirpod has been successfully deleted within 60s"
--- PASS: TestDockerCreatePeerPodAndCheckWorkDirLogs (160.06s)
    --- PASS: TestDockerCreatePeerPodAndCheckWorkDirLogs/WorkDirPeerPod_test (160.06s)
        --- PASS: TestDockerCreatePeerPodAndCheckWorkDirLogs/WorkDirPeerPod_test/Peer_pod_with_work_directory_has_been_created (5.03s)
=== RUN   TestDockerCreatePeerPodAndCheckEnvVariableLogsWithImageOnly
=== RUN   TestDockerCreatePeerPodAndCheckEnvVariableLogsWithImageOnly/EnvVariablePeerPodWithImageOnly_test
=== RUN   TestDockerCreatePeerPodAndCheckEnvVariableLogsWithImageOnly/EnvVariablePeerPodWithImageOnly_test/Peer_pod_with_environmental_variables_has_been_created
    assessment_runner.go:362: Log output of peer pod:KUBERNETES_SERVICE_PORT=443
        KUBERNETES_PORT=tcp://10.96.0.1:443
        HOSTNAME=env-variable-in-image
        SHLVL=1
        HOME=/root
        KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
        PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
        KUBERNETES_PORT_443_TCP_PORT=443
        KUBERNETES_PORT_443_TCP_PROTO=tcp
        KUBERNETES_SERVICE_PORT_HTTPS=443
        KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
        ISPRODUCTION=false
        KUBERNETES_SERVICE_HOST=10.96.0.1
        PWD=/
time="2024-06-20T10:58:38Z" level=info msg="Deleting pod env-variable-in-image..."
time="2024-06-20T10:58:43Z" level=info msg="Pod env-variable-in-image has been successfully deleted within 60s"
--- PASS: TestDockerCreatePeerPodAndCheckEnvVariableLogsWithImageOnly (20.06s)
    --- PASS: TestDockerCreatePeerPodAndCheckEnvVariableLogsWithImageOnly/EnvVariablePeerPodWithImageOnly_test (20.06s)
        --- PASS: TestDockerCreatePeerPodAndCheckEnvVariableLogsWithImageOnly/EnvVariablePeerPodWithImageOnly_test/Peer_pod_with_environmental_variables_has_been_created (5.02s)
=== RUN   TestDockerCreatePeerPodAndCheckEnvVariableLogsWithDeploymentOnly
=== RUN   TestDockerCreatePeerPodAndCheckEnvVariableLogsWithDeploymentOnly/EnvVariablePeerPodWithDeploymentOnly_test
=== RUN   TestDockerCreatePeerPodAndCheckEnvVariableLogsWithDeploymentOnly/EnvVariablePeerPodWithDeploymentOnly_test/Peer_pod_with_environmental_variables_has_been_created
    assessment_runner.go:362: Log output of peer pod:KUBERNETES_SERVICE_PORT=443
        KUBERNETES_PORT=tcp://10.96.0.1:443
        HOSTNAME=env-variable-in-config
        SHLVL=1
        HOME=/root
        KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
        PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
        KUBERNETES_PORT_443_TCP_PORT=443
        KUBERNETES_PORT_443_TCP_PROTO=tcp
        KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
        KUBERNETES_SERVICE_PORT_HTTPS=443
        ISPRODUCTION=true
        KUBERNETES_SERVICE_HOST=10.96.0.1
        PWD=/
time="2024-06-20T10:58:58Z" level=info msg="Deleting pod env-variable-in-config..."
time="2024-06-20T10:59:03Z" level=info msg="Pod env-variable-in-config has been successfully deleted within 60s"
--- PASS: TestDockerCreatePeerPodAndCheckEnvVariableLogsWithDeploymentOnly (20.20s)
    --- PASS: TestDockerCreatePeerPodAndCheckEnvVariableLogsWithDeploymentOnly/EnvVariablePeerPodWithDeploymentOnly_test (20.20s)
        --- PASS: TestDockerCreatePeerPodAndCheckEnvVariableLogsWithDeploymentOnly/EnvVariablePeerPodWithDeploymentOnly_test/Peer_pod_with_environmental_variables_has_been_created (5.02s)
=== RUN   TestDockerCreatePeerPodAndCheckEnvVariableLogsWithImageAndDeployment
=== RUN   TestDockerCreatePeerPodAndCheckEnvVariableLogsWithImageAndDeployment/EnvVariablePeerPodWithBoth_test
=== RUN   TestDockerCreatePeerPodAndCheckEnvVariableLogsWithImageAndDeployment/EnvVariablePeerPodWithBoth_test/Peer_pod_with_environmental_variables_has_been_created
    assessment_runner.go:362: Log output of peer pod:KUBERNETES_SERVICE_PORT=443
        KUBERNETES_PORT=tcp://10.96.0.1:443
        HOSTNAME=env-variable-in-both
        SHLVL=1
        HOME=/root
        KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
        PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
        KUBERNETES_PORT_443_TCP_PORT=443
        KUBERNETES_PORT_443_TCP_PROTO=tcp
        KUBERNETES_SERVICE_PORT_HTTPS=443
        KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
        ISPRODUCTION=true
        KUBERNETES_SERVICE_HOST=10.96.0.1
        PWD=/
time="2024-06-20T10:59:18Z" level=info msg="Deleting pod env-variable-in-both..."
time="2024-06-20T10:59:23Z" level=info msg="Pod env-variable-in-both has been successfully deleted within 60s"
--- PASS: TestDockerCreatePeerPodAndCheckEnvVariableLogsWithImageAndDeployment (20.08s)
    --- PASS: TestDockerCreatePeerPodAndCheckEnvVariableLogsWithImageAndDeployment/EnvVariablePeerPodWithBoth_test (20.08s)
        --- PASS: TestDockerCreatePeerPodAndCheckEnvVariableLogsWithImageAndDeployment/EnvVariablePeerPodWithBoth_test/Peer_pod_with_environmental_variables_has_been_created (5.02s)
=== RUN   TestDockerCreateNginxDeployment
=== RUN   TestDockerCreateNginxDeployment/Nginx_image_deployment_test
time="2024-06-20T10:59:23Z" level=info msg="Creating nginx deployment..."
time="2024-06-20T10:59:28Z" level=info msg="Current deployment available replicas: 0"
time="2024-06-20T11:05:18Z" level=info msg="nginx deployment is available now"
=== RUN   TestDockerCreateNginxDeployment/Nginx_image_deployment_test/Access_for_nginx_deployment_test
time="2024-06-20T11:05:18Z" level=info msg="Deleting webserver deployment..."
time="2024-06-20T11:05:18Z" level=info msg="Deleting deployment nginx-deployment..."
time="2024-06-20T11:05:23Z" level=info msg="Deployment nginx-deployment has been successfully deleted within 120s"
--- PASS: TestDockerCreateNginxDeployment (360.05s)
    --- PASS: TestDockerCreateNginxDeployment/Nginx_image_deployment_test (360.05s)
        --- PASS: TestDockerCreateNginxDeployment/Nginx_image_deployment_test/Access_for_nginx_deployment_test (0.01s)
=== RUN   TestDockerDeletePod
=== RUN   TestDockerDeletePod/DeletePod_test
    assessment_runner.go:265: Waiting for containers in pod: deletion-test are ready
=== RUN   TestDockerDeletePod/DeletePod_test/Deletion_complete
time="2024-06-20T11:05:33Z" level=info msg="Deleting pod deletion-test..."
time="2024-06-20T11:05:38Z" level=info msg="Pod deletion-test has been successfully deleted within 60s"
--- PASS: TestDockerDeletePod (15.05s)
    --- PASS: TestDockerDeletePod/DeletePod_test (15.05s)
        --- PASS: TestDockerDeletePod/DeletePod_test/Deletion_complete (0.01s)
=== RUN   TestDockerPodToServiceCommunication
=== RUN   TestDockerPodToServiceCommunication/TestExtraPods_test
    assessment_runner.go:265: Waiting for containers in pod: nginx are ready
time="2024-06-20T11:06:03Z" level=info msg="webserver service is available on cluster IP: 10.96.44.88"
Provision extra pod busybox
    assessment_helpers.go:425: Waiting for containers in pod: busybox are ready
=== RUN   TestDockerPodToServiceCommunication/TestExtraPods_test/Failed_to_test_extra_pod.
time="2024-06-20T11:06:19Z" level=info msg="Success to access nginx service. <!DOCTYPE html>\n<html>\n<head>\n<title>Welcome to nginx!</title>\n<style>\nhtml { color-scheme: light dark; }\nbody { width: 35em; margin: 0 auto;\nfont-family: Tahoma, Verdana, Arial, sans-serif; }\n</style>\n</head>\n<body>\n<h1>Welcome to nginx!</h1>\n<p>If you see this page, the nginx web server is successfully installed and\nworking. Further configuration is required.</p>\n\n<p>For online documentation and support please refer to\n<a href=\"http://nginx.org/\">nginx.org</a>.<br/>\nCommercial support is available at\n<a href=\"http://nginx.com/\">nginx.com</a>.</p>\n\n<p><em>Thank you for using nginx.</em></p>\n</body>\n</html>\n"
    assessment_runner.go:516: Output when execute test commands:<!DOCTYPE html>
        <html>
        <head>
        <title>Welcome to nginx!</title>
        <style>
        html { color-scheme: light dark; }
        body { width: 35em; margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif; }
        </style>
        </head>
        <body>
        <h1>Welcome to nginx!</h1>
        <p>If you see this page, the nginx web server is successfully installed and
        working. Further configuration is required.</p>

        <p>For online documentation and support please refer to
        <a href="http://nginx.org/">nginx.org</a>.<br/>
        Commercial support is available at
        <a href="http://nginx.com/">nginx.com</a>.</p>

        <p><em>Thank you for using nginx.</em></p>
        </body>
        </html>
time="2024-06-20T11:06:19Z" level=info msg="Deleting pod nginx..."
time="2024-06-20T11:06:24Z" level=info msg="Pod nginx has been successfully deleted within 60s"
time="2024-06-20T11:06:24Z" level=info msg="Deleting pod busybox..."
time="2024-06-20T11:06:29Z" level=info msg="Pod busybox has been successfully deleted within 60s"
time="2024-06-20T11:06:29Z" level=info msg="Deleting Service... nginx"
--- PASS: TestDockerPodToServiceCommunication (50.32s)
    --- PASS: TestDockerPodToServiceCommunication/TestExtraPods_test (50.32s)
        --- PASS: TestDockerPodToServiceCommunication/TestExtraPods_test/Failed_to_test_extra_pod. (5.13s)
=== RUN   TestDockerPodsMTLSCommunication
=== RUN   TestDockerPodsMTLSCommunication/TestPodsMTLSCommunication_test
    assessment_runner.go:265: Waiting for containers in pod: nginx are ready
time="2024-06-20T11:06:49Z" level=info msg="webserver service is available on cluster IP: 10.96.53.33"
Provision extra pod curl
    assessment_helpers.go:425: Waiting for containers in pod: curl are ready
=== RUN   TestDockerPodsMTLSCommunication/TestPodsMTLSCommunication_test/Pods_communication_with_mTLS
time="2024-06-20T11:08:29Z" level=info msg="Success to access nginx service. <!DOCTYPE html>\n<html>\n<head>\n<title>Welcome to nginx!</title>\n<style>\nhtml { color-scheme: light dark; }\nbody { width: 35em; margin: 0 auto;\nfont-family: Tahoma, Verdana, Arial, sans-serif; }\n</style>\n</head>\n<body>\n<h1>Welcome to nginx!</h1>\n<p>If you see this page, the nginx web server is successfully installed and\nworking. Further configuration is required.</p>\n\n<p>For online documentation and support please refer to\n<a href=\"http://nginx.org/\">nginx.org</a>.<br/>\nCommercial support is available at\n<a href=\"http://nginx.com/\">nginx.com</a>.</p>\n\n<p><em>Thank you for using nginx.</em></p>\n</body>\n</html>\n"
    assessment_runner.go:516: Output when execute test commands:<!DOCTYPE html>
        <html>
        <head>
        <title>Welcome to nginx!</title>
        <style>
        html { color-scheme: light dark; }
        body { width: 35em; margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif; }
        </style>
        </head>
        <body>
        <h1>Welcome to nginx!</h1>
        <p>If you see this page, the nginx web server is successfully installed and
        working. Further configuration is required.</p>

        <p>For online documentation and support please refer to
        <a href="http://nginx.org/">nginx.org</a>.<br/>
        Commercial support is available at
        <a href="http://nginx.com/">nginx.com</a>.</p>

        <p><em>Thank you for using nginx.</em></p>
        </body>
        </html>
time="2024-06-20T11:08:29Z" level=info msg="Deleting Configmap... nginx-conf"
time="2024-06-20T11:08:29Z" level=info msg="Deleting Secret... server-certs"
time="2024-06-20T11:08:29Z" level=info msg="Deleting extra Secret... curl-certs"
time="2024-06-20T11:08:29Z" level=info msg="Deleting pod nginx..."
time="2024-06-20T11:08:34Z" level=info msg="Pod nginx has been successfully deleted within 60s"
time="2024-06-20T11:08:34Z" level=info msg="Deleting pod curl..."
time="2024-06-20T11:08:39Z" level=info msg="Pod curl has been successfully deleted within 60s"
time="2024-06-20T11:08:39Z" level=info msg="Deleting Service... nginx"
--- PASS: TestDockerPodsMTLSCommunication (130.38s)
    --- PASS: TestDockerPodsMTLSCommunication/TestPodsMTLSCommunication_test (130.38s)
        --- PASS: TestDockerPodsMTLSCommunication/TestPodsMTLSCommunication_test/Pods_communication_with_mTLS (5.17s)
=== RUN   TestDockerKbsKeyRelease
    docker_test.go:102: Skipping kbs related test as kbs is not deployed
--- SKIP: TestDockerKbsKeyRelease (0.00s)
PASS
time="2024-06-20T11:08:39Z" level=info msg="Deleting namespace 'coco-pp-e2e-test-55e360fb'..."

time="2024-06-20T11:08:49Z" level=info msg="Namespace 'coco-pp-e2e-test-55e360fb' has been successfully deleted within 60s"
Deleting the kind cluster
Deleting cluster "kind" ...
Uninstalling kind
Uninstalling Docker
Reading package lists...
Building dependency tree...
Reading state information...
The following packages were automatically installed and are no longer required:
  conntrack cri-tools ebtables kubernetes-cni libltdl7 libslirp0 pigz
  slirp4netns socat
Use 'sudo apt autoremove' to remove them.
The following packages will be REMOVED:
  containerd.io* docker-buildx-plugin* docker-ce* docker-ce-cli*
  docker-ce-rootless-extras* docker-compose-plugin*
0 upgraded, 0 newly installed, 6 to remove and 17 not upgraded.
After this operation, 434 MB disk space will be freed.
(Reading database ... 98637 files and directories currently installed.)
Removing docker-ce (5:26.1.3-1~ubuntu.22.04~jammy) ...
Removing containerd.io (1.6.32-1) ...
Removing docker-buildx-plugin (0.14.0-1~ubuntu.22.04~jammy) ...
Removing docker-ce-cli (5:26.1.3-1~ubuntu.22.04~jammy) ...
Removing docker-ce-rootless-extras (5:26.1.3-1~ubuntu.22.04~jammy) ...
Removing docker-compose-plugin (2.27.0-1~ubuntu.22.04~jammy) ...
Processing triggers for man-db (2.10.2-1) ...
(Reading database ... 98402 files and directories currently installed.)
Purging configuration files for docker-ce (5:26.1.3-1~ubuntu.22.04~jammy) ...
Purging configuration files for containerd.io (1.6.32-1) ...
time="2024-06-20T11:09:11Z" level=info msg="Delete the Cloud API Adaptor installation"
time="2024-06-20T11:09:11Z" level=info msg="Uninstall the cloud-api-adaptor"
ok      github.com/confidential-containers/cloud-api-adaptor/src/cloud-api-adaptor/test/e2e 953.256s

bpradipt commented 1 week ago

I ran the e2e tests on an Ubuntu 22.04 VM with 8GB RAM and 4 vCPUs.

stevenhorsman commented 1 week ago

@bpradipt - I'm trying to run the e2e test and my nodes are not ready, so the install just hangs. Describing them I see:

container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized

Should the kind_cluster script have sorted that, or do you know if there is some manual pre-req I've missed?

bpradipt commented 1 week ago

@bpradipt - I'm trying to run the e2e test and my nodes are not ready, so the install just hangs. Describing them I see:

container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized

Should the kind_cluster script have sorted that, or do you know if there is some manual pre-req I've missed?

The kind installation script should have taken care of it. Are you trying on an existing system or a new system? Any other details on the environment to help understand what's happening? Also, can you check the output of kubectl get pods -A and verify whether the calico pods are up or not.
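
For reference, a quick way to run that check (a sketch - the label selector assumes the stock calico manifest):

# List the calico CNI pods and confirm the calico-node pods are Running
kubectl get pods -n kube-system -l k8s-app=calico-node -o wide
# The nodes should move to Ready once the CNI is initialized
kubectl get nodes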

stevenhorsman commented 1 week ago

The kind installation script should have taken care of it. Are you trying on an existing system or a new system? Any other details on the environment to help understand what's happening?

It is a brand new VM and I picked Ubuntu 22.04 with 4 vCPUs and 8GB RAM to match your tested set-up.

It doesn't look like calico/flannel have been installed:

# kubectl get pods -A
NAMESPACE                        NAME                                              READY   STATUS    RESTARTS   AGE
confidential-containers-system   cc-operator-controller-manager-546574cf87-5m427   0/2     Pending   0          47m
kube-system                      coredns-5d78c9869d-c6n6f                          0/1     Pending   0          49m
kube-system                      coredns-5d78c9869d-pn4ts                          0/1     Pending   0          49m
kube-system                      etcd-peer-pods-control-plane                      1/1     Running   0          49m
kube-system                      kube-apiserver-peer-pods-control-plane            1/1     Running   0          49m
kube-system                      kube-controller-manager-peer-pods-control-plane   1/1     Running   0          49m
kube-system                      kube-proxy-ltvgv                                  1/1     Running   0          49m
kube-system                      kube-proxy-xffwm                                  1/1     Running   0          48m
kube-system                      kube-scheduler-peer-pods-control-plane            1/1     Running   0          49m
local-path-storage               local-path-provisioner-5b77c697fd-rpfr9           0/1     Pending   0          49m

The pending pods are due to the nodes not being ready:

# kubectl get nodes
NAME                      STATUS     ROLES           AGE   VERSION
peer-pods-control-plane   NotReady   control-plane   50m   v1.27.11
peer-pods-worker          NotReady   worker          50m   v1.27.11
# kubectl describe node peer-pods-worker
Name:               peer-pods-worker
Roles:              worker
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=peer-pods-worker
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/worker=worker
                    node.kubernetes.io/worker=
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: unix:///run/containerd/containerd.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Thu, 20 Jun 2024 05:39:31 -0700
Taints:             node.kubernetes.io/not-ready:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  peer-pods-worker
  AcquireTime:     <unset>
  RenewTime:       Thu, 20 Jun 2024 06:30:04 -0700
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Thu, 20 Jun 2024 06:25:39 -0700   Thu, 20 Jun 2024 05:39:31 -0700   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Thu, 20 Jun 2024 06:25:39 -0700   Thu, 20 Jun 2024 05:39:31 -0700   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Thu, 20 Jun 2024 06:25:39 -0700   Thu, 20 Jun 2024 05:39:31 -0700   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Thu, 20 Jun 2024 06:25:39 -0700   Thu, 20 Jun 2024 05:39:31 -0700   KubeletNotReady              container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
Addresses:
  InternalIP:  172.18.0.3
  Hostname:    peer-pods-worker
Capacity:
  cpu:                4
  ephemeral-storage:  259915780Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             8127940Ki
  pods:               110
Allocatable:
  cpu:                4
  ephemeral-storage:  259915780Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             8127940Ki
  pods:               110
System Info:
  Machine ID:                 4e3be70594c240ddac4b967a2cc4500f
  System UUID:                8984a844-205f-48e7-926b-48247e972ffc
  Boot ID:                    f08205e5-ca48-4bc2-94da-8b31b73bf4f4
  Kernel Version:             5.15.0-107-generic
  OS Image:                   Debian GNU/Linux 12 (bookworm)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.7.13
  Kubelet Version:            v1.27.11
  Kube-Proxy Version:         v1.27.11
PodCIDR:                      192.168.1.0/24
PodCIDRs:                     192.168.1.0/24
ProviderID:                   kind://docker/peer-pods/peer-pods-worker
Non-terminated Pods:          (1 in total)
  Namespace                   Name                CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                ------------  ----------  ---------------  -------------  ---
  kube-system                 kube-proxy-xffwm    0 (0%)        0 (0%)      0 (0%)           0 (0%)         50m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests  Limits
  --------           --------  ------
  cpu                0 (0%)    0 (0%)
  memory             0 (0%)    0 (0%)
  ephemeral-storage  0 (0%)    0 (0%)
  hugepages-1Gi      0 (0%)    0 (0%)
  hugepages-2Mi      0 (0%)    0 (0%)
Events:
  Type    Reason                   Age                From             Message
  ----    ------                   ----               ----             -------
  Normal  Starting                 50m                kube-proxy
  Normal  Starting                 50m                kubelet          Starting kubelet.
  Normal  NodeHasSufficientMemory  50m (x2 over 50m)  kubelet          Node peer-pods-worker status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    50m (x2 over 50m)  kubelet          Node peer-pods-worker status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     50m (x2 over 50m)  kubelet          Node peer-pods-worker status is now: NodeHasSufficientPID
  Normal  NodeAllocatableEnforced  50m                kubelet          Updated Node Allocatable limit across pods
  Normal  RegisteredNode           50m                node-controller  Node peer-pods-worker event: Registered Node peer-pods-worker in Controller

Is there any other info that might be helpful, or things I can try? Sorry, I appreciate this is more of a kind issue than anything else, but I don't have much experience using it and want to try and test the e2e set-up.

bpradipt commented 1 week ago

The kind installation script should have taken care of it. Are you trying on an existing system or a new system? Any other details on the environment to help understand what's happening?

It is a brand new VM and I picked Ubuntu 22.04 with 4 vCPUs and 8GB RAM to match your tested set-up.

It doesn't look like calico/flannel have been installed:

# kubectl get pods -A
NAMESPACE                        NAME                                              READY   STATUS    RESTARTS   AGE
confidential-containers-system   cc-operator-controller-manager-546574cf87-5m427   0/2     Pending   0          47m
kube-system                      coredns-5d78c9869d-c6n6f                          0/1     Pending   0          49m
kube-system                      coredns-5d78c9869d-pn4ts                          0/1     Pending   0          49m
kube-system                      etcd-peer-pods-control-plane                      1/1     Running   0          49m
kube-system                      kube-apiserver-peer-pods-control-plane            1/1     Running   0          49m
kube-system                      kube-controller-manager-peer-pods-control-plane   1/1     Running   0          49m
kube-system                      kube-proxy-ltvgv                                  1/1     Running   0          49m
kube-system                      kube-proxy-xffwm                                  1/1     Running   0          48m
kube-system                      kube-scheduler-peer-pods-control-plane            1/1     Running   0          49m
local-path-storage               local-path-provisioner-5b77c697fd-rpfr9           0/1     Pending   0          49m

For some reason calico is not installed. The following line from the kind_cluster.sh script should have deployed it:

...
 # Deploy calico
    kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/calico.yaml || exit 1
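
If that apply step failed silently, it can be re-run by hand against the kind cluster (a sketch, assuming kind's default kind-peer-pods context name for a cluster called peer-pods):

kubectl --context kind-peer-pods apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/calico.yaml
# Wait for the calico-node DaemonSet to become ready before retrying the tests
kubectl --context kind-peer-pods -n kube-system rollout status daemonset/calico-node --timeout=300s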

The pending pods are due to the nodes not being ready:

# kubectl get nodes
NAME                      STATUS     ROLES           AGE   VERSION
peer-pods-control-plane   NotReady   control-plane   50m   v1.27.11
peer-pods-worker          NotReady   worker          50m   v1.27.11

kubectl describe node peer-pods-worker

<SNIP - same node description as quoted in full in the previous comment>

Is there any other info that might be helpful, or things I can try? Sorry I appreciate this is more of a kind issue than anything else, but I don't have much experience using it and want to try and test the e2e set-up.

Nothing that I can think of right now. Let me spend some time figuring out what could be causing this issue.

bpradipt commented 1 week ago

@stevenhorsman I added a prereqs.sh helper script that can be used, if needed, to install the required prerequisites for the tests. Please try following the README https://github.com/confidential-containers/cloud-api-adaptor/blob/d509fccb8fd0ad2549c06b41105ef02776ed3ab6/src/cloud-api-adaptor/docker/README.md#running-the-caa-e2e-tests and see if it helps.
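
For illustration, the flow from that README is roughly as follows (the prereqs.sh path and invocation are assumptions here - the README is the authoritative reference):

cd cloud-api-adaptor/src/cloud-api-adaptor
# Install Docker, kind and the other test prerequisites (hypothetical invocation)
./docker/prereqs.sh
# Then run the e2e tests as before
make TEST_PODVM_IMAGE=quay.io/bpradipt/podvm-docker-image TEST_PROVISION=yes CLOUD_PROVIDER=docker TEST_PROVISION_FILE=$(pwd)/docker/provision_docker.properties test-e2e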

Example properties file to try

# Docker configs
CLUSTER_NAME="peer-pods"
DOCKER_HOST="unix:///var/run/docker.sock"
DOCKER_PODVM_IMAGE="quay.io/bpradipt/podvm-docker-image"
DOCKER_NETWORK_NAME="kind"
CAA_IMAGE="quay.io/bpradipt/cloud-api-adaptor"
CAA_IMAGE_TAG="latest"

# KBS configs
KBS_IMAGE=""
KBS_IMAGE_TAG=""

stevenhorsman commented 1 week ago

@stevenhorsman I added a prereqs.sh helper script that can be used, if needed, to install the required prerequisites for the tests. Please try following the README https://github.com/confidential-containers/cloud-api-adaptor/blob/d509fccb8fd0ad2549c06b41105ef02776ed3ab6/src/cloud-api-adaptor/docker/README.md#running-the-caa-e2e-tests and see if it helps.

Sure, will do. Lots of meetings atm, but I'll try and get to it by EoD tomorrow.

stevenhorsman commented 1 week ago

Sure, will do. Lots of meetings atm, but I'll try and get to it by EoD tomorrow.

The pre-reqs script did the trick. I'm not sure why, but the installation worked after using it. I hit the image pull error though :(

bpradipt commented 2 days ago

@stevenhorsman can we move ahead with this PR? The test flakiness is not really related to the provider.

stevenhorsman commented 8 hours ago

@stevenhorsman can we move ahead with this PR? The test flakiness is not really related to the provider.

Good

@stevenhorsman can we move ahead with this PR? The test flakiness is not really related to the provider.

Yeah, I think that is fair, but as we believe that any users will hit the failure, maybe we need to add some "temporary" 🤞 doc about the problem we see and the ctr fetch required to solve it?

bpradipt commented 8 hours ago

@stevenhorsman can we move ahead with this PR? The test flakiness is not really related to the provider.

Good

@stevenhorsman can we move ahead with this PR? The test flakiness is not really related to the provider.

Yeah, I think that is fair, but as we believe that any users will hit the failure, maybe we need to add some "temporary" 🤞 doc about the problem we see and the ctr fetch required to solve it?

Yeah. It should be a generic doc IMHO, as it can affect any provider - maybe under troubleshooting: https://github.com/confidential-containers/cloud-api-adaptor/tree/main/src/cloud-api-adaptor/docs/troubleshooting ?

stevenhorsman commented 7 hours ago

Yeah. It should be a generic doc IMHO, as it can affect any provider - maybe under troubleshooting: https://github.com/confidential-containers/cloud-api-adaptor/tree/main/src/cloud-api-adaptor/docs/troubleshooting ?

That's a good idea. We seem to be pretty much guaranteed to hit that with the docker provider, so maybe linking to that section from the docker provider docs makes sense too?

bpradipt commented 3 hours ago

Yeah. It should be a generic doc IMHO, as it can affect any provider - maybe under troubleshooting: https://github.com/confidential-containers/cloud-api-adaptor/tree/main/src/cloud-api-adaptor/docs/troubleshooting ?

That's a good idea. We seem to be pretty much guaranteed to hit that with the docker provider, so maybe linking to that section from the docker provider docs makes sense too?

@stevenhorsman done. PTAL

wainersm commented 16 minutes ago

Hi @bpradipt !

I gave it a try on a fresh Ubuntu 22.04 (4 vCPUs and 8 GB memory). The tests ran but none passed. Wondering if I should use a different podvm image than quay.io/bpradipt/podvm-docker-image and/or rebase the branch on main?

$ make TEST_PODVM_IMAGE=quay.io/bpradipt/podvm-docker-image TEST_PROVISION=yes CLOUD_PROVIDER=docker TEST_PROVISION_FILE=$(pwd)/test/provisioner/docker/provision_docker.properties test-e2e
go: downloading github.com/spf13/cobra v1.7.0

<SNIP>

go: downloading github.com/xlab/treeprint v1.2.0
go: downloading github.com/go-playground/locales v0.14.1
go: downloading github.com/moby/spdystream v0.2.0
go: downloading go.starlark.net v0.0.0-20200306205701-8dd3e2ee1dd5
time="2024-07-01T14:44:54Z" level=info msg="Do setup"
time="2024-07-01T14:44:54Z" level=info msg="Cluster provisioning"
Check if the cluster peer-pods already exists
No kind clusters found.
fs.inotify.max_user_watches = 524288
fs.inotify.max_user_instances = 512
Creating a kind cluster
Creating cluster "peer-pods" ...
 • Ensuring node image (kindest/node:v1.27.11) 🖼  ...
 ✓ Ensuring node image (kindest/node:v1.27.11) 🖼
 • Preparing nodes 📦 📦   ...
 ✓ Preparing nodes 📦 📦
 • Writing configuration 📜  ...
 ✓ Writing configuration 📜
 • Starting control-plane 🕹️  ...
 ✓ Starting control-plane 🕹️
 • Installing StorageClass 💾  ...
 ✓ Installing StorageClass 💾
 • Joining worker nodes 🚜  ...
 ✓ Joining worker nodes 🚜
Set kubectl context to "kind-peer-pods"
You can now use your cluster with:

kubectl cluster-info --context kind-peer-pods

Thanks for using kind! 😊
poddisruptionbudget.policy/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
serviceaccount/calico-node created
serviceaccount/calico-cni-plugin created
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpfilters.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/caliconodestatuses.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipreservations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/kubecontrollersconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrole.rbac.authorization.k8s.io/calico-cni-plugin created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-cni-plugin created
daemonset.apps/calico-node created
deployment.apps/calico-kube-controllers created
Adding worker label to nodes belonging to: peer-pods
time="2024-07-01T14:46:37Z" level=info msg="Podvm uploading"
Using default tag: latest
latest: Pulling from bpradipt/podvm-docker-image
e2d942cf87b3: Already exists

<SNIP>

9c7825678839: Pull complete
Digest: sha256:4cd6eb142c17b2120fdc5914daae4b889a4a0702e53c3250b1ea4744f3d31d2a
Status: Downloaded newer image for quay.io/bpradipt/podvm-docker-image:latest
quay.io/bpradipt/podvm-docker-image:latest
time="2024-07-01T14:47:12Z" level=info msg="Install Cloud API Adaptor"
time="2024-07-01T14:47:13Z" level=info msg="Deploy the Cloud API Adaptor"
time="2024-07-01T14:47:13Z" level=info msg="Install the controller manager"
Wait for the cc-operator-controller-manager deployment be available
time="2024-07-01T14:48:21Z" level=info msg="Customize the overlay yaml file"
time="2024-07-01T14:48:21Z" level=info msg="Updating CAA image with \"quay.io/bpradipt/cloud-api-adaptor\""
time="2024-07-01T14:48:21Z" level=info msg="Updating CAA image tag with \"latest\""
time="2024-07-01T14:48:24Z" level=info msg="Install the cloud-api-adaptor"
Wait for the cc-operator-daemon-install DaemonSet be available
Wait for the pod cc-operator-daemon-install-tmgtt be ready
Wait for the cloud-api-adaptor-daemonset DaemonSet be available
Wait for the pod cloud-api-adaptor-daemonset-fgbpr be ready
Wait for the kata-remote runtimeclass be created
time="2024-07-01T14:51:24Z" level=info msg="Installing peerpod-ctrl"
time="2024-07-01T14:51:41Z" level=info msg="Wait for the peerpod-ctrl deployment to be available"
time="2024-07-01T14:52:26Z" level=info msg="Creating namespace 'coco-pp-e2e-test-fc6cbe60'..."
time="2024-07-01T14:52:26Z" level=info msg="Wait for namespace 'coco-pp-e2e-test-fc6cbe60' be ready..."
time="2024-07-01T14:52:31Z" level=info msg="Wait for default serviceaccount in namespace 'coco-pp-e2e-test-fc6cbe60'..."
time="2024-07-01T14:52:31Z" level=info msg="default serviceAccount exists, namespace 'coco-pp-e2e-test-fc6cbe60' is ready for use"
=== RUN   TestDockerCreateSimplePod
=== RUN   TestDockerCreateSimplePod/SimplePeerPod_test
    assessment_runner.go:262: timed out waiting for the condition
--- FAIL: TestDockerCreateSimplePod (600.03s)
    --- FAIL: TestDockerCreateSimplePod/SimplePeerPod_test (600.03s)
=== RUN   TestDockerCreatePodWithConfigMap
=== RUN   TestDockerCreatePodWithConfigMap/ConfigMapPeerPod_test
    assessment_runner.go:262: timed out waiting for the condition
--- FAIL: TestDockerCreatePodWithConfigMap (600.03s)
    --- FAIL: TestDockerCreatePodWithConfigMap/ConfigMapPeerPod_test (600.03s)
=== RUN   TestDockerCreatePodWithSecret
=== RUN   TestDockerCreatePodWithSecret/SecretPeerPod_test
    assessment_runner.go:262: timed out waiting for the condition
--- FAIL: TestDockerCreatePodWithSecret (600.04s)
    --- FAIL: TestDockerCreatePodWithSecret/SecretPeerPod_test (600.04s)
=== RUN   TestDockerCreatePeerPodContainerWithExternalIPAccess
=== RUN   TestDockerCreatePeerPodContainerWithExternalIPAccess/IPAccessPeerPod_test
    assessment_runner.go:262: timed out waiting for the condition
--- FAIL: TestDockerCreatePeerPodContainerWithExternalIPAccess (600.02s)
    --- FAIL: TestDockerCreatePeerPodContainerWithExternalIPAccess/IPAccessPeerPod_test (600.02s)
=== RUN   TestDockerCreatePeerPodWithJob
=== RUN   TestDockerCreatePeerPodWithJob/JobPeerPod_test
    assessment_runner.go:235: timed out waiting for the condition
=== RUN   TestDockerCreatePeerPodWithJob/JobPeerPod_test/Job_has_been_created
--- FAIL: TestDockerCreatePeerPodWithJob (600.02s)
    --- FAIL: TestDockerCreatePeerPodWithJob/JobPeerPod_test (600.02s)
        --- FAIL: TestDockerCreatePeerPodWithJob/JobPeerPod_test/Job_has_been_created (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
        panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x17878b2]

goroutine 732 [running]:
testing.tRunner.func1.2({0x1925980, 0x2d3e300})
        /usr/local/go/src/testing/testing.go:1631 +0x24a
testing.tRunner.func1()
        /usr/local/go/src/testing/testing.go:1634 +0x377
panic({0x1925980?, 0x2d3e300?})
        /usr/local/go/src/runtime/panic.go:770 +0x132
github.com/confidential-containers/cloud-api-adaptor/src/cloud-api-adaptor/test/e2e.GetSuccessfulAndErroredPods({_, _}, _, {_, _}, {{{0x0, 0x0}, {0x0, 0x0}}, {{0xc001dd63e4, ...}, ...}, ...})
        /home/ubuntu/cloud-api-adaptor/src/cloud-api-adaptor/test/e2e/assessment_helpers.go:264 +0x3d2
github.com/confidential-containers/cloud-api-adaptor/src/cloud-api-adaptor/test/e2e.(*TestCase).Run.func2({0x1fe2ee0, 0x2dd7100}, 0xc002e0f1e0, 0xc0004c79d0?)
        /home/ubuntu/cloud-api-adaptor/src/cloud-api-adaptor/test/e2e/assessment_runner.go:320 +0x1da
sigs.k8s.io/e2e-framework/pkg/env.(*testEnv).executeSteps(0xc000127540, {0x1fe2ee0?, 0x2dd7100?}, 0xc002e0f1e0, {0xc0009b7f40?, 0xc0009b7f60?, 0x53f9dc?})
        /home/ubuntu/go/pkg/mod/sigs.k8s.io/e2e-framework@v0.1.0/pkg/env/env.go:421 +0x8b
sigs.k8s.io/e2e-framework/pkg/env.(*testEnv).processTestFeature.(*testEnv).execFeature.func1.1(0xc002e0f1e0)
        /home/ubuntu/go/pkg/mod/sigs.k8s.io/e2e-framework@v0.1.0/pkg/env/env.go:452 +0xaa
testing.tRunner(0xc002e0f1e0, 0xc002609170)
        /usr/local/go/src/testing/testing.go:1689 +0xfb
created by testing.(*T).Run in goroutine 627
        /usr/local/go/src/testing/testing.go:1742 +0x390
FAIL    github.com/confidential-containers/cloud-api-adaptor/src/cloud-api-adaptor/test/e2e     3457.888s
FAIL
make: *** [Makefile:96: test-e2e] Error 1

stevenhorsman commented 13 minutes ago

@wainersm - I think you are hitting the ctr fetch issue? See https://github.com/confidential-containers/cloud-api-adaptor/blob/18d251f6c749f02635522f728465fcd9fd37c577/src/cloud-api-adaptor/docs/troubleshooting/nydus-snapshotter.md for more info. And as I commented above, it is very annoying - it has taken me over 2 hours to run through all the e2e tests for this!
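
For anyone else hitting this, the workaround referenced above has roughly the following shape (a sketch only - the node name and image ref are examples, and the linked nydus-snapshotter doc is the authoritative reference):

# Pre-fetch the image content into containerd inside the affected kind worker
# node (kind nodes are just docker containers, so docker exec reaches them)
docker exec peer-pods-worker ctr -n k8s.io content fetch docker.io/library/nginx:latest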