cnti-testcatalog / testsuite

📞📱☎️📡🌐 Cloud Native Telecom Initiative (CNTI) Test Catalog is a tool to check for and provide feedback on the use of K8s + cloud native best practices in networking applications and platforms
https://wiki.lfnetworking.org/display/LN/Test+Catalog
Apache License 2.0

[BUG] Registry spec tests not passing due to insecure registry #2001

Open svteb opened 5 months ago

svteb commented 5 months ago

Describe the bug

This bug affects the spec tests tagged private_registry_rolling and private_registry_version. The spec tests deploy an insecure (plain HTTP) registry inside the cluster, and the kubelet, which attempts the pull over HTTPS, cannot pull the image from it.

Events:
  Type     Reason     Age               From               Message
  ----     ------     ----              ----               -------
  Normal   Scheduled  19s               default-scheduler  Successfully assigned default/coredns-coredns-75cb5bbc47-5mnps to minikube-m03
  Normal   BackOff    17s               kubelet            Back-off pulling image "registry.default.svc.cluster.local:5000/coredns:1.6.7"
  Warning  Failed     17s               kubelet            Error: ImagePullBackOff
  Normal   Pulling    3s (x2 over 18s)  kubelet            Pulling image "registry.default.svc.cluster.local:5000/coredns:1.6.7"
  Warning  Failed     3s (x2 over 18s)  kubelet            Failed to pull image "registry.default.svc.cluster.local:5000/coredns:1.6.7": rpc error: code = Unknown desc = failed to pull and unpack image "registry.default.svc.cluster.local:5000/coredns:1.6.7": failed to resolve reference "registry.default.svc.cluster.local:5000/coredns:1.6.7": failed to do request: Head "https://registry.default.svc.cluster.local:5000/v2/coredns/manifests/1.6.7": http: server gave HTTP response to HTTPS client
  Warning  Failed     3s (x2 over 18s)  kubelet            Error: ErrImagePull

As a result, all 3 of the tests fail due to timeouts.
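The failure signature in the events above can be checked for mechanically. The following is only a minimal sketch: the function names `is_http_https_mismatch` and `failed_images` are hypothetical helpers, not part of the testsuite, and they assume the event text has the same wording as the kubelet messages quoted above.

```shell
#!/bin/bash

# Hypothetical helper: reads event text on stdin and exits 0 when it contains
# the "HTTP response to HTTPS client" error quoted in this issue.
is_http_https_mismatch() {
    grep -qF 'server gave HTTP response to HTTPS client'
}

# Hypothetical helper: reads event text on stdin and prints each distinct
# image reference that appears in a 'Failed to pull image "..."' message.
failed_images() {
    sed -n 's/.*Failed to pull image "\([^"]*\)".*/\1/p' | sort -u
}

# Usage (not executed here; POD is a placeholder for the failing pod's name):
#   kubectl describe pod "$POD" | is_http_https_mismatch && echo "insecure registry problem"
#   kubectl describe pod "$POD" | failed_images
```

A match means the registry answered in plain HTTP while containerd expected HTTPS, which distinguishes this failure from other ImagePullBackOff causes such as a missing image or an unresolvable registry name.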

To Reproduce

  1. Create KinD/minikube cluster.
  2. crystal build src/cnf-testsuite.cr
  3. ./cnf-testsuite setup
  4. crystal spec spec/workload/registry_spec.cr:92 <- should point to the line of the first failing test
  5. Wait until the dockerd, registry and cluster-tools pods are deployed; the tested coredns deployment should then fail.
  6. kubectl describe pod coredns-coredns-...

Expected behavior

The image should get pulled from the registry. The most convenient way to resolve this would likely be to deploy a secure registry, or to do some hacking to allow the insecure registry (although doing this in a cluster-agnostic way is likely infeasible).
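For kind specifically, one way to allow an insecure registry is to patch containerd at cluster creation time through the `containerdConfigPatches` field of the kind cluster config. The sketch below is an assumption about how that could look for the registry address in this issue, not something the testsuite does today; the function name `kind_insecure_registry_config` is hypothetical, and the node must still be able to resolve the registry service name for the pull to succeed.

```shell
#!/bin/bash

# Assumption: registry address taken from the failing image pull in this issue.
REGISTRY_HOST="registry.default.svc.cluster.local"
REGISTRY_PORT="5000"

# Hypothetical helper: prints a kind cluster config that tells containerd to
# reach the given registry (host:port) over plain HTTP via a mirror entry.
kind_insecure_registry_config() {
    local registry="$1"
    cat <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
containerdConfigPatches:
- |-
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."${registry}"]
    endpoint = ["http://${registry}"]
EOF
}

# Usage (not executed here):
#   kind_insecure_registry_config "${REGISTRY_HOST}:${REGISTRY_PORT}" > kind-config.yaml
#   kind create cluster --config kind-config.yaml
```

This only covers kind, and it relies on the same `registry.mirrors` containerd setting that the minikube workaround in this issue edits by hand, so it does not address the cluster-agnostic concern.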

Device: Linux, Ubuntu server 22.04, x86
Fails on both kind and minikube
kind version: v0.22.0
minikube version: v1.32.0
kubectl version: v1.23.13

Additional context

I ran into some DNS issues (the registry service name being unresolvable), but we have some proxy problems on our side, so that is probably unrelated to the actual spec test.

My workaround

I made some scripts for minikube that allow the insecure registry to work. They might be a bit destructive, so use with caution.

This is the script that I call after creating the cluster:

# Name of the configuration script
SCRIPT_NAME="configure-containerd.sh"

# Get a list of all minikube nodes
NODES=$(minikube node list | awk '{print $1}')

# Loop over each node
for NODE in $NODES; do
    echo "Deploying to $NODE..."

    # Copy the script to the current node
    minikube cp "./$SCRIPT_NAME" "$NODE:/home/docker/$SCRIPT_NAME"

    # Add execution rights
    minikube ssh -n "$NODE" -- "sudo chmod +x /home/docker/$SCRIPT_NAME"

    # Execute the script on the node
    minikube ssh -n "$NODE" -- "sudo /home/docker/$SCRIPT_NAME"
done

This script is executed on every minikube node (configure-containerd.sh):

#!/bin/bash

# Update /etc/resolv.conf (uncomment the two lines below IN CASE OF DNS RESOLUTION ERRORS)
# echo "nameserver 10.96.0.10" >> /etc/resolv.conf
# echo "search svc.cluster.local cluster.local" >> /etc/resolv.conf

# Configuration parameters
REGISTRY_IP="registry.default.svc.cluster.local"
REGISTRY_PORT="5000"

# Prepare the endpoint URL
ENDPOINT="http://${REGISTRY_IP}:${REGISTRY_PORT}"

# Step 1: Back up the existing config.toml if it exists
# (mv already removes the original, so no separate rm is needed)
echo "Backing up existing configuration"
if [ -f /etc/containerd/config.toml ]; then
    sudo mv /etc/containerd/config.toml "$HOME/config.toml.bak"
fi

# Step 2: Generate the default containerd configuration if none exists
echo "Generating new default configuration"
if [ ! -f /etc/containerd/config.toml ]; then
    sudo mkdir -p /etc/containerd
    sudo containerd config default | sudo tee /etc/containerd/config.toml > /dev/null
fi

# Step 3: Add insecure registry configuration for the registry
echo "Adding insecure registry"
sudo sed -i "/\[plugins.\"io.containerd.grpc.v1.cri\".registry.configs\]/c \\
      [plugins.\"io.containerd.grpc.v1.cri\".registry.configs.\"${REGISTRY_IP}:${REGISTRY_PORT}\"]\\
      [plugins.\"io.containerd.grpc.v1.cri\".registry.configs.\"${REGISTRY_IP}:${REGISTRY_PORT}\".tls]\\
        ca_file = \"\"\\
        cert_file = \"\"\\
        insecure_skip_verify = true\\
        key_file = \"\"" /etc/containerd/config.toml

sudo sed -i "/\[plugins.\"io.containerd.grpc.v1.cri\".registry.mirrors\]/c \\
      [plugins.\"io.containerd.grpc.v1.cri\".registry.mirrors.\"${REGISTRY_IP}:${REGISTRY_PORT}\"]\\
        endpoint = [\"${ENDPOINT}\"]" /etc/containerd/config.toml

# Step 4: Restart containerd to apply changes
echo "Applying containerd changes"
sudo systemctl restart containerd
sudo systemctl enable containerd
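
After the script has run, it can be useful to confirm that the mirror entry actually landed in the config before retrying the spec test. The following is a minimal sketch of such a check; `has_http_mirror` is a hypothetical helper, and it assumes the entry is written in the same form that the sed commands above produce.

```shell
#!/bin/bash

# Hypothetical check: exits 0 when the given containerd config file contains
# both a mirror table for the registry (host:port) and a plain-HTTP endpoint.
has_http_mirror() {
    local config="$1"
    local registry="$2"
    grep -qF "registry.mirrors.\"${registry}\"]" "$config" &&
        grep -qF "endpoint = [\"http://${registry}\"]" "$config"
}

# Usage on a node (not executed here):
#   has_http_mirror /etc/containerd/config.toml "registry.default.svc.cluster.local:5000" \
#       && echo "mirror configured" || echo "mirror missing"
```

If the check reports the mirror as missing, the most likely cause is that the generated default config did not contain the exact table headers the sed patterns match, in which case the config is left unchanged.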