Looks like a quay issue??
kubelet Failed to pull image "quay.io/bitnami/nginx:latest": rpc error: code = Unknown desc = pinging container registry quay.io: Get "https://quay.io/v2/": dial tcp 54.144.203.57:443: connect: no route to host
Yeah, it looks like the sampleapp Pod can't reach Quay... but strangely enough okd-cluster-provision can. Any tips for troubleshooting this scenario?
Yeah, that is mostly because we pull some things from Docker Hub (provisioning) and pull from Quay for the infra steps (and the sample app); maybe we could converge on pulling everything from the same source.
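If it helps to confirm that split, one quick check (a sketch, assuming curl is available on the affected worker) is to probe both registries' API endpoints from the node and compare:

# hypothetical check from the worker: does docker.io respond while quay.io does not?
curl -sSI https://registry-1.docker.io/v2/ | head -n 1
curl -sSI https://quay.io/v2/ | head -n 1

If only the quay.io probe fails with "no route to host", the problem is the path to Quay's addresses rather than the registry itself.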
Hmmm, it looks more like a routing issue; the worker can't reach Quay to pull the image.
kubelet Failed to pull image "quay.io/bitnami/nginx:latest": rpc error: code = Unknown desc = pinging container registry quay.io: Get "https://quay.io/v2/": dial tcp 34.225.41.113:443: connect: no route to host
After logging into worker1:
[core@worker1 ~]$ ssh test@8.8.8.8
ssh: connect to host 8.8.8.8 port 22: No route to host
[core@worker1 ~]$ ssh -p 443 quay.io
ssh: connect to host quay.io port 443: No route to host
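Since both probes fail the same way, a few more checks on worker1 can narrow it down to routing vs. DNS vs. firewalling; a rough checklist (standard iproute2/curl tooling assumed on the node):

ip route show                                     # is there a sane default route on the expected interface?
getent hosts quay.io                              # does name resolution work from the worker at all?
curl -v --connect-timeout 5 https://quay.io/v2/   # the same "no route to host" shows up here if routing is broken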
@Gl1TcH-1n-Th3-M4tR1x I have been experiencing similar issues across all the distros... This PR fixed things for the CI: https://github.com/Kubeinit/kubeinit/pull/655. Did you manage to get past that problem?
@Gl1TcH-1n-Th3-M4tR1x there was a major breakage because podman versions were not consistent across the different components that are deployed; after #666 I haven't reproduced this anymore.
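For anyone hitting this later, one way to confirm the versions line up (a sketch, assuming the hosts in ./kubeinit/inventory are reachable with Ansible ad-hoc commands) is:

# hypothetical ad-hoc check: report the podman version on every inventory host
ansible -i ./kubeinit/inventory all -m command -a "podman --version"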
I can successfully deploy the cluster now.
Describe the bug
After running a deployment using the container image, it fails during the sampleapp validation.
To Reproduce
Steps to reproduce the behavior:
TASK [kubeinit.kubeinit.kubeinit_apps : Wait until pods are running] **
task path: /root/.ansible/collections/ansible_collections/kubeinit/kubeinit/roles/kubeinit_apps/tasks/sampleapp.yml:48
Using module file /usr/lib/python3.9/site-packages/ansible/modules/command.py
Pipelining is enabled.
<10.0.0.253> ESTABLISH SSH CONNECTION FOR USER: root
<10.0.0.253> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="root"' -o ConnectTimeout=10 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=accept-new -i '~/.ssh/okdcluster_id_rsa' -o 'ProxyCommand=ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=accept-new -i ~/.ssh/okdcluster_id_rsa -W %h:%p -q root@nyctea' -o ControlPath=/root/.ansible/cp/2a631e4199 10.0.0.253 '/bin/sh -c '"'"'/usr/bin/python3 && sleep 0'"'"''
<10.0.0.253> (1, b'\n{"changed": true, "stdout": "", "stderr": "", "rc": 1, "cmd": "set -o pipefail\nkubectl get pods --namespace=sampleapp | grep Running\n", "start": "2022-04-01 14:55:34.527107", "end": "2022-04-01 14:55:34.597990", "delta": "0:00:00.070883", "failed": true, "msg": "non-zero return code", "invocation": {"module_args": {"executable": "/bin/bash", "_raw_params": "set -o pipefail\nkubectl get pods --namespace=sampleapp | grep Running\n", "_uses_shell": true, "warn": false, "stdin_add_newline": true, "strip_empty_ends": true, "argv": null, "chdir": null, "creates": null, "removes": null, "stdin": null}}}\n', b'')
<10.0.0.253> Failed to connect to the host via ssh:
FAILED - RETRYING: [localhost -> service]: Wait until pods are running (60 retries left). Result was: { "attempts": 1, "changed": false, "cmd": "set -o pipefail\nkubectl get pods --namespace=sampleapp | grep Running\n", "delta": "0:00:00.070883", "end": "2022-04-01 14:55:34.597990", "invocation": { "module_args": { "_raw_params": "set -o pipefail\nkubectl get pods --namespace=sampleapp | grep Running\n", "_uses_shell": true, "argv": null, "chdir": null, "creates": null, "executable": "/bin/bash", "removes": null, "stdin": null, "stdin_add_newline": true, "strip_empty_ends": true, "warn": false } }, "msg": "non-zero return code", "rc": 1, "retries": 61, "start": "2022-04-01 14:55:34.527107", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []
< 59 attempts later >
fatal: [localhost -> service(10.0.0.253)]: FAILED! => { "attempts": 60, "changed": false, "cmd": "set -o pipefail\nkubectl get pods --namespace=sampleapp | grep Running\n", "delta": "0:00:00.071734", "end": "2022-04-01 15:00:52.670565", "invocation": { "module_args": { "_raw_params": "set -o pipefail\nkubectl get pods --namespace=sampleapp | grep Running\n", "_uses_shell": true, "argv": null, "chdir": null, "creates": null, "executable": "/bin/bash", "removes": null, "stdin": null, "stdin_add_newline": true, "strip_empty_ends": true, "warn": false } }, "msg": "non-zero return code", "rc": 1, "start": "2022-04-01 15:00:52.598831", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": [] }
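The task that times out is only retrying the check shown in the log; running the same thing by hand on the service node (10.0.0.253) makes the underlying state visible:

# what the Ansible retry loop runs, verbatim:
set -o pipefail
kubectl get pods --namespace=sampleapp | grep Running
# dropping the grep shows why it never matches (the pods sit in ImagePullBackOff):
kubectl get pods --namespace=sampleapp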
oc get deployments -n sampleapp
NAME        READY   UP-TO-DATE   AVAILABLE   AGE
sampleapp   0/4     4            0           3h32m
sh-4.4# oc describe pod sampleapp-6684887657-vqnfw -n sampleapp
Name:         sampleapp-6684887657-vqnfw
Namespace:    sampleapp
Priority:     0
Node:         worker1/10.0.0.3
Start Time:   Fri, 01 Apr 2022 18:30:23 +0000
Labels:       app=sampleapp
              pod-template-hash=6684887657
Annotations:  k8s.v1.cni.cncf.io/network-status:
                [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.102.0.9" ], "default": true, "dns": {} }]
              k8s.v1.cni.cncf.io/networks-status:
                [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.102.0.9" ], "default": true, "dns": {} }]
              openshift.io/scc: restricted
Status:       Pending
IP:           10.102.0.9
IPs:
  IP:  10.102.0.9
Controlled By:  ReplicaSet/sampleapp-6684887657
Containers:
  nginx:
    Container ID:
    Image:          quay.io/bitnami/nginx:latest
    Image ID:
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Environment:
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-nll5f (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-nll5f:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:
QoS Class:       BestEffort
Node-Selectors:
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason          Age                     From               Message
  Normal   Scheduled       8m27s                   default-scheduler  Successfully assigned sampleapp/sampleapp-6684887657-vqnfw to worker1
  Normal   AddedInterface  8m25s                   multus             Add eth0 [10.102.0.9/23] from openshift-sdn
  Warning  Failed          7m24s                   kubelet            Failed to pull image "quay.io/bitnami/nginx:latest": rpc error: code = Unknown desc = pinging container registry quay.io: Get "https://quay.io/v2/": dial tcp 34.225.41.113:443: connect: no route to host
  Normal   Pulling         6m34s (x4 over 8m25s)   kubelet            Pulling image "quay.io/bitnami/nginx:latest"
  Warning  Failed          6m28s (x3 over 8m19s)   kubelet            Failed to pull image "quay.io/bitnami/nginx:latest": rpc error: code = Unknown desc = pinging container registry quay.io: Get "https://quay.io/v2/": dial tcp 54.144.203.57:443: connect: no route to host
  Warning  Failed          6m28s (x4 over 8m19s)   kubelet            Error: ErrImagePull
  Warning  Failed          6m15s (x6 over 8m18s)   kubelet            Error: ImagePullBackOff
  Normal   BackOff         3m20s (x18 over 8m18s)  kubelet            Back-off pulling image "quay.io/bitnami/nginx:latest"
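For completeness: once the worker's route to quay.io is fixed, the kubelet back-off will retry the pull on its own; to force it sooner, something along these lines should work from the cluster:

# restart the rollout so new pods trigger fresh image pulls, then watch them come up
oc -n sampleapp rollout restart deployment/sampleapp
oc -n sampleapp get pods -w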
podman run --rm -it -v ~/.ssh/id_rsa:/root/.ssh/id_rsa:z -v ~/.ssh/id_rsa.pub:/root/.ssh/id_rsa.pub:z -v ~/.ssh/config:/root/.ssh/cong:z -v ./kubeinit/inventory:/kubeinit/kubeinit/inventory quay.io/kubeinit/kubeinit:2.0.1 -vvv --user root -e kubeinit_spec=okd-libvirt-1-3-1 -i ./kubeinit/inventory ./kubeinit/playbook.yml