osism / issues

This repository is used for bug reports that are cross-project or not bound to a specific repository (or to an unknown repository).
https://www.osism.tech

[bug] unable to deploy K8s #1067

Open scoopex opened 3 days ago

scoopex commented 3 days ago

OSISM release version

latest

What's the problem?

While deploying Kubernetes with release 7.0.5, the deployment hangs in the task that waits for the MetalLB resources and eventually fails with a timeout.

$ osism apply kubernetes

ok: [st01-ctl-r01-u27] => (item=st01-ctl-r01-u29)

TASK [k3s_server_post : Wait for MetalLB resources] ****************************

STILL ALIVE [task 'k3s_server_post : Wait for MetalLB resources' is running] ***

STILL ALIVE [task 'k3s_server_post : Wait for MetalLB resources' is running] ***

STILL ALIVE [task 'k3s_server_post : Wait for MetalLB resources' is running] ***

STILL ALIVE [task 'k3s_server_post : Wait for MetalLB resources' is running] ***
failed: [st01-ctl-r01-u27] (item=controller) => {"ansible_loop_var": "item", "changed": false, "cmd": ["k3s", "kubectl", "wait", "deployment", "--namespace=metallb-system", "controller", "--for", "condition=Available=True", "--timeout=240s"], "delta": "0:04:00.195189", "end": "2024-06-28 06:15:39.406837", "item": {"condition": "--for condition=Available=True", "description": "controller", "name": "controller", "resource": "deployment"}, "msg": "non-zero return code", "rc": 1, "start": "2024-06-28 06:11:39.211648", "stderr": "error: timed out waiting for the condition on deployments/controller", "stderr_lines": ["error: timed out waiting for the condition on deployments/controller"], "stdout": "", "stdout_lines": []}
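
Since the task only reports that the `controller` deployment never became Available, a quick next step is to list the pods in the namespace that are not fully ready. A minimal sketch, assuming the default table output of `kubectl get pod`; the helper name `not_ready_pods` is made up for illustration:

```shell
# Print name and status of pods whose READY column (e.g. 0/1) shows
# fewer ready containers than expected.
# Reads the table output of `kubectl get pod` on stdin.
not_ready_pods() {
    # NR > 1 skips the header line; $2 is READY, $3 is STATUS.
    awk 'NR > 1 { split($2, r, "/"); if (r[1] != r[2]) print $1, $3 }'
}
```

It could be used as `k3s kubectl get pod --namespace=metallb-system | not_ready_pods`. Note that pods of finished jobs (status `Completed`) would also be listed, because their READY column shows `0/1`.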

References to existing reports

References to existing bug reports, mailing lists, ...

Severity

medium

Urgency

low

scoopex commented 3 days ago

I investigated the situation and found the following problem:

root@st01-ctl-r01-u27:~# k3s kubectl get pod --namespace=metallb-system --selector=component=controller
NAME                          READY   STATUS              RESTARTS   AGE
controller-5f56cd6f78-9kfkp   0/1     ContainerCreating   0          9d

root@st01-ctl-r01-u27:~# k3s kubectl describe pod --namespace=metallb-system controller-5f56cd6f78-9kfkp
Name:             controller-5f56cd6f78-9kfkp
Namespace:        metallb-system
Priority:         0
Service Account:  controller
Node:             st01-ctl-r01-u27/10.10.21.12
Start Time:       Tue, 18 Jun 2024 12:10:54 +0000
Labels:           app=metallb
                  component=controller
                  pod-template-hash=5f56cd6f78
Annotations:      prometheus.io/port: 7472
                  prometheus.io/scrape: true
Status:           Pending
IP:               
IPs:              <none>
Controlled By:    ReplicaSet/controller-5f56cd6f78
Containers:
  controller:
    Container ID:  
    Image:         quay.io/metallb/controller:v0.14.3
    Image ID:      
    Ports:         7472/TCP, 9443/TCP
    Host Ports:    0/TCP, 0/TCP
    Args:
      --port=7472
      --log-level=info
      --tls-min-version=VersionTLS12
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Liveness:       http-get http://:monitoring/metrics delay=10s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:monitoring/metrics delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      METALLB_ML_SECRET_NAME:  memberlist
      METALLB_DEPLOYMENT:      controller
    Mounts:
      /tmp/k8s-webhook-server/serving-certs from cert (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-psh4t (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   False 
  Initialized                 True 
  Ready                       False 
  ContainersReady             False 
  PodScheduled                True 
Volumes:
  cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  webhook-server-cert
    Optional:    false
  kube-api-access-psh4t:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                   From     Message
  ----     ------                  ----                  ----     -------
  Warning  FailedCreatePodSandBox  65m (x57 over 13h)    kubelet  Failed to create pod sandbox: rpc error: code = DeadlineExceeded desc = failed to get sandbox image "rancher/mirrored-pause:3.6": failed to pull image "rancher/mirrored-pause:3.6": failed to pull and unpack image "docker.io/rancher/mirrored-pause:3.6": failed to resolve reference "docker.io/rancher/mirrored-pause:3.6": failed to do request: Head "https://registry-1.docker.io/v2/rancher/mirrored-pause/manifests/3.6": dial tcp [2600:1f18:2148:bc02:2640:1b90:cea6:b6b5]:443: i/o timeout
  Warning  FailedCreatePodSandBox  50m (x326 over 14h)   kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox image "rancher/mirrored-pause:3.6": failed to pull image "rancher/mirrored-pause:3.6": failed to pull and unpack image "docker.io/rancher/mirrored-pause:3.6": failed to resolve reference "docker.io/rancher/mirrored-pause:3.6": failed to do request: Head "https://registry-1.docker.io/v2/rancher/mirrored-pause/manifests/3.6": dial tcp [2600:1f18:2148:bc01:20a3:9c3e:d4a7:9fb]:443: i/o timeout
  Warning  FailedCreatePodSandBox  45m (x62 over 14h)    kubelet  Failed to create pod sandbox: rpc error: code = DeadlineExceeded desc = failed to get sandbox image "rancher/mirrored-pause:3.6": failed to pull image "rancher/mirrored-pause:3.6": failed to pull and unpack image "docker.io/rancher/mirrored-pause:3.6": failed to resolve reference "docker.io/rancher/mirrored-pause:3.6": failed to do request: Head "https://registry-1.docker.io/v2/rancher/mirrored-pause/manifests/3.6": dial tcp [2600:1f18:2148:bc01:20a3:9c3e:d4a7:9fb]:443: i/o timeout
  Warning  FailedCreatePodSandBox  35m (x332 over 14h)   kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox image "rancher/mirrored-pause:3.6": failed to pull image "rancher/mirrored-pause:3.6": failed to pull and unpack image "docker.io/rancher/mirrored-pause:3.6": failed to resolve reference "docker.io/rancher/mirrored-pause:3.6": failed to do request: Head "https://registry-1.docker.io/v2/rancher/mirrored-pause/manifests/3.6": dial tcp [2600:1f18:2148:bc00:41e1:f57f:e2e2:5e54]:443: i/o timeout
  Warning  FailedCreatePodSandBox  30m (x325 over 14h)   kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox image "rancher/mirrored-pause:3.6": failed to pull image "rancher/mirrored-pause:3.6": failed to pull and unpack image "docker.io/rancher/mirrored-pause:3.6": failed to resolve reference "docker.io/rancher/mirrored-pause:3.6": failed to do request: Head "https://registry-1.docker.io/v2/rancher/mirrored-pause/manifests/3.6": dial tcp [2600:1f18:2148:bc02:2640:1b90:cea6:b6b5]:443: i/o timeout
  Warning  FailedCreatePodSandBox  12m (x4 over 23m)     kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox image "rancher/mirrored-pause:3.6": failed to pull image "rancher/mirrored-pause:3.6": failed to pull and unpack image "docker.io/rancher/mirrored-pause:3.6": failed to resolve reference "docker.io/rancher/mirrored-pause:3.6": failed to do request: Head "https://registry-1.docker.io/v2/rancher/mirrored-pause/manifests/3.6": dial tcp [2600:1f18:2148:bc01:20a3:9c3e:d4a7:9fb]:443: i/o timeout
  Warning  FailedCreatePodSandBox  12m                   kubelet  Failed to create pod sandbox: rpc error: code = DeadlineExceeded desc = failed to get sandbox image "rancher/mirrored-pause:3.6": failed to pull image "rancher/mirrored-pause:3.6": failed to pull and unpack image "docker.io/rancher/mirrored-pause:3.6": failed to resolve reference "docker.io/rancher/mirrored-pause:3.6": failed to do request: Head "https://registry-1.docker.io/v2/rancher/mirrored-pause/manifests/3.6": dial tcp [2600:1f18:2148:bc01:20a3:9c3e:d4a7:9fb]:443: i/o timeout
  Warning  FailedCreatePodSandBox  11m                   kubelet  Failed to create pod sandbox: rpc error: code = DeadlineExceeded desc = failed to get sandbox image "rancher/mirrored-pause:3.6": failed to pull image "rancher/mirrored-pause:3.6": failed to pull and unpack image "docker.io/rancher/mirrored-pause:3.6": failed to resolve reference "docker.io/rancher/mirrored-pause:3.6": failed to do request: Head "https://registry-1.docker.io/v2/rancher/mirrored-pause/manifests/3.6": dial tcp [2600:1f18:2148:bc02:2640:1b90:cea6:b6b5]:443: i/o timeout
  Warning  FailedCreatePodSandBox  8m23s (x12 over 25m)  kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox image "rancher/mirrored-pause:3.6": failed to pull image "rancher/mirrored-pause:3.6": failed to pull and unpack image "docker.io/rancher/mirrored-pause:3.6": failed to resolve reference "docker.io/rancher/mirrored-pause:3.6": failed to do request: Head "https://registry-1.docker.io/v2/rancher/mirrored-pause/manifests/3.6": dial tcp [2600:1f18:2148:bc00:41e1:f57f:e2e2:5e54]:443: i/o timeout
  Warning  FailedCreatePodSandBox  7m42s (x3 over 24m)   kubelet  Failed to create pod sandbox: rpc error: code = DeadlineExceeded desc = failed to get sandbox image "rancher/mirrored-pause:3.6": failed to pull image "rancher/mirrored-pause:3.6": failed to pull and unpack image "docker.io/rancher/mirrored-pause:3.6": failed to resolve reference "docker.io/rancher/mirrored-pause:3.6": failed to do request: Head "https://registry-1.docker.io/v2/rancher/mirrored-pause/manifests/3.6": dial tcp [2600:1f18:2148:bc00:41e1:f57f:e2e2:5e54]:443: i/o timeout
  Warning  FailedCreatePodSandBox  33s (x10 over 24m)    kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox image "rancher/mirrored-pause:3.6": failed to pull image "rancher/mirrored-pause:3.6": failed to pull and unpack image "docker.io/rancher/mirrored-pause:3.6": failed to resolve reference "docker.io/rancher/mirrored-pause:3.6": failed to do request: Head "https://registry-1.docker.io/v2/rancher/mirrored-pause/manifests/3.6": dial tcp [2600:1f18:2148:bc02:2640:1b90:cea6:b6b5]:443: i/o timeout

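The pull of the sandbox image `rancher/mirrored-pause:3.6` times out while dialing IPv6 addresses of registry-1.docker.io, so it seems that k3s does not use the specified proxy settings.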
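
One way to check this on the node is to look at the environment file of the k3s service. The following is a minimal sketch, assuming the node follows the upstream k3s convention where the install script writes `HTTP_PROXY`, `HTTPS_PROXY` and `NO_PROXY` to `/etc/systemd/system/k3s.service.env`. The function name `check_k3s_proxy_env` is made up for illustration, and an OSISM-deployed node may keep these settings elsewhere.

```shell
# Report which proxy variables are missing from a k3s environment file.
# The default path is an assumption: it is where the upstream k3s install
# script writes the service environment; an OSISM-managed node may differ.
check_k3s_proxy_env() {
    env_file="${1:-/etc/systemd/system/k3s.service.env}"
    if [ ! -r "$env_file" ]; then
        echo "cannot read $env_file"
        return 2
    fi
    missing=""
    for var in HTTP_PROXY HTTPS_PROXY NO_PROXY; do
        # Accept upper- or lower-case names with a non-empty value.
        if ! grep -Eiq "^[[:space:]]*${var}=.+" "$env_file"; then
            missing="$missing $var"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}
```

Running `check_k3s_proxy_env` without arguments on st01-ctl-r01-u27 would show whether the variables are present in that file. Because systemd reads the environment file when the service starts, changes to it only take effect after the k3s service is restarted.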

scoopex commented 3 days ago

https://github.com/k3s-io/k3s/pull/3553 may be useful