ansible / awx-operator

An Ansible AWX operator for Kubernetes built with Operator SDK and Ansible. 🤖
https://www.github.com/ansible/awx
Apache License 2.0

ErrImagePull Error/Unable to retrieve some image pull secrets (redhat-operators-pull-secret) #1751

Open njohnsn opened 8 months ago

njohnsn commented 8 months ago

Please confirm the following

Bug Summary

Trying to deploy awx-operator 2.12.2 and getting "Error: ImagePullBackOff" events. The reason given is "Unable to retrieve some image pull secrets (redhat-operators-pull-secret);".

AWX Operator version

2.12.2

AWX version

The deployment never gets far enough to determine this.

Kubernetes platform

other (please specify in additional information)

Kubernetes/Platform version

v1.28.7+k3s1

Modifications

no

Steps to reproduce

make deploy
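
(For context, with the Operator SDK-style Makefile in this repo, make deploy renders config/default with kustomize and applies it to the current kube context. A rough sketch of a typical invocation follows; NAMESPACE and VERSION are Makefile variables assumed from that scaffolding, so check the Makefile in your checkout.)

# hedged sketch only; variable names assumed from the Operator SDK-style Makefile
export NAMESPACE=awx
export VERSION=2.12.2
make deploy
# roughly equivalent to: kustomize build config/default | kubectl apply -f -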

Expected results

AWX pods in a Running state

Actual results

nmjoo@awx-ext:~/awx-operator$ kubectl get pods -n awx
NAME                                               READY   STATUS             RESTARTS   AGE
awx-operator-controller-manager-589cdd869b-v4tjk   1/2     ImagePullBackOff   0          19m

nmjoo@awx-ext:~/awx-operator$ kubectl describe pod awx-operator-controller-manager-589cdd869b-v4tjk -n awx

  Warning  Failed                           9m48s (x3 over 10m)  kubelet            Error: ErrImagePull
  Normal   BackOff                          9m35s (x3 over 10m)  kubelet            Back-off pulling image "quay.io/ansible/awx-operator:2.12.2"
  Warning  Failed                           9m35s (x3 over 10m)  kubelet            Error: ImagePullBackOff
  Warning  FailedToRetrieveImagePullSecret  39s (x47 over 10m)   kubelet            Unable to retrieve some image pull secrets (redhat-operators-pull-secret); attempting to pull the image may not succeed.
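
(The secret warning can be checked independently of the pull failure. A couple of hedged checks; the ServiceAccount name is assumed from the deployment name above.)

# does the referenced pull secret exist at all in the namespace?
kubectl -n awx get secret redhat-operators-pull-secret
# which imagePullSecrets does the operator ServiceAccount reference?
# (ServiceAccount name assumed from the deployment name)
kubectl -n awx get serviceaccount awx-operator-controller-manager -o jsonpath='{.imagePullSecrets}'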

Additional information

Client Version: v1.28.7+k3s1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.7+k3s1

Operator Logs

nmjoo@awx-ext:~/awx-operator$ kubectl get events --field-selector involvedObject.name=awx-operator-controller-manager-589cdd869b-v4tjk -n awx
LAST SEEN   TYPE      REASON                            OBJECT                                                 MESSAGE
24m         Normal    Scheduled                         pod/awx-operator-controller-manager-589cdd869b-v4tjk   Successfully assigned awx/awx-operator-controller-manager-589cdd869b-v4tjk to awx-ext
24m         Normal    Pulled                            pod/awx-operator-controller-manager-589cdd869b-v4tjk   Container image "gcr.io/kubebuilder/kube-rbac-proxy:v0.15.0" already present on machine
24m         Normal    Created                           pod/awx-operator-controller-manager-589cdd869b-v4tjk   Created container kube-rbac-proxy
24m         Normal    Started                           pod/awx-operator-controller-manager-589cdd869b-v4tjk   Started container kube-rbac-proxy
24m         Warning   Failed                            pod/awx-operator-controller-manager-589cdd869b-v4tjk   Failed to pull image "quay.io/ansible/awx-operator:2.12.2": failed to pull and unpack image "quay.io/ansible/awx-operator:2.12.2": failed to extract layer sha256:86426b9e591db2cdd8eba8085aa38b705422152b49960696308d988f33f3d741: failed to unmount /var/lib/rancher/k3s/agent/containerd/tmpmounts/containerd-mount1724821168: failed to unmount target /var/lib/rancher/k3s/agent/containerd/tmpmounts/containerd-mount1724821168: device or resource busy: unknown
24m         Warning   Failed                            pod/awx-operator-controller-manager-589cdd869b-v4tjk   Failed to pull image "quay.io/ansible/awx-operator:2.12.2": failed to pull and unpack image "quay.io/ansible/awx-operator:2.12.2": failed to extract layer sha256:86426b9e591db2cdd8eba8085aa38b705422152b49960696308d988f33f3d741: failed to unmount /var/lib/rancher/k3s/agent/containerd/tmpmounts/containerd-mount3361607561: failed to unmount target /var/lib/rancher/k3s/agent/containerd/tmpmounts/containerd-mount3361607561: device or resource busy: unknown
23m         Normal    Pulling                           pod/awx-operator-controller-manager-589cdd869b-v4tjk   Pulling image "quay.io/ansible/awx-operator:2.12.2"
23m         Warning   Failed                            pod/awx-operator-controller-manager-589cdd869b-v4tjk   Failed to pull image "quay.io/ansible/awx-operator:2.12.2": failed to pull and unpack image "quay.io/ansible/awx-operator:2.12.2": failed to extract layer sha256:86426b9e591db2cdd8eba8085aa38b705422152b49960696308d988f33f3d741: failed to unmount /var/lib/rancher/k3s/agent/containerd/tmpmounts/containerd-mount305204241: failed to unmount target /var/lib/rancher/k3s/agent/containerd/tmpmounts/containerd-mount305204241: device or resource busy: unknown
23m         Warning   Failed                            pod/awx-operator-controller-manager-589cdd869b-v4tjk   Error: ErrImagePull
23m         Normal    BackOff                           pod/awx-operator-controller-manager-589cdd869b-v4tjk   Back-off pulling image "quay.io/ansible/awx-operator:2.12.2"
23m         Warning   Failed                            pod/awx-operator-controller-manager-589cdd869b-v4tjk   Error: ImagePullBackOff
4m21s       Warning   FailedToRetrieveImagePullSecret   pod/awx-operator-controller-manager-589cdd869b-v4tjk   Unable to retrieve some image pull secrets (redhat-operators-pull-secret); attempting to pull the image may not succeed.
nmjoo@awx-ext:~/awx-operator$ 
njohnsn commented 8 months ago

Here is a log after I completely deleted and reinstalled k3s:

nmjoo@awx-ext:~/awx-operator$ kubectl get events --field-selector involvedObject.name=awx-operator-controller-manager-589cdd869b-hzlv7 -n awx
LAST SEEN   TYPE      REASON                            OBJECT                                                 MESSAGE
79s         Normal    Scheduled                         pod/awx-operator-controller-manager-589cdd869b-hzlv7   Successfully assigned awx/awx-operator-controller-manager-589cdd869b-hzlv7 to awx-ext
79s         Normal    Pulling                           pod/awx-operator-controller-manager-589cdd869b-hzlv7   Pulling image "gcr.io/kubebuilder/kube-rbac-proxy:v0.15.0"
77s         Normal    Pulled                            pod/awx-operator-controller-manager-589cdd869b-hzlv7   Successfully pulled image "gcr.io/kubebuilder/kube-rbac-proxy:v0.15.0" in 1.856s (1.856s including waiting)
77s         Normal    Created                           pod/awx-operator-controller-manager-589cdd869b-hzlv7   Created container kube-rbac-proxy
77s         Normal    Started                           pod/awx-operator-controller-manager-589cdd869b-hzlv7   Started container kube-rbac-proxy
70s         Warning   Failed                            pod/awx-operator-controller-manager-589cdd869b-hzlv7   Failed to pull image "quay.io/ansible/awx-operator:2.12.2": failed to pull and unpack image "quay.io/ansible/awx-operator:2.12.2": failed to extract layer sha256:86426b9e591db2cdd8eba8085aa38b705422152b49960696308d988f33f3d741: failed to unmount /var/lib/rancher/k3s/agent/containerd/tmpmounts/containerd-mount3984500326: failed to unmount target /var/lib/rancher/k3s/agent/containerd/tmpmounts/containerd-mount3984500326: device or resource busy: unknown
51s         Warning   Failed                            pod/awx-operator-controller-manager-589cdd869b-hzlv7   Failed to pull image "quay.io/ansible/awx-operator:2.12.2": failed to pull and unpack image "quay.io/ansible/awx-operator:2.12.2": failed to extract layer sha256:86426b9e591db2cdd8eba8085aa38b705422152b49960696308d988f33f3d741: failed to unmount /var/lib/rancher/k3s/agent/containerd/tmpmounts/containerd-mount2900992326: failed to unmount target /var/lib/rancher/k3s/agent/containerd/tmpmounts/containerd-mount2900992326: device or resource busy: unknown
24s         Normal    Pulling                           pod/awx-operator-controller-manager-589cdd869b-hzlv7   Pulling image "quay.io/ansible/awx-operator:2.12.2"
17s         Warning   Failed                            pod/awx-operator-controller-manager-589cdd869b-hzlv7   Failed to pull image "quay.io/ansible/awx-operator:2.12.2": failed to pull and unpack image "quay.io/ansible/awx-operator:2.12.2": failed to extract layer sha256:86426b9e591db2cdd8eba8085aa38b705422152b49960696308d988f33f3d741: failed to unmount /var/lib/rancher/k3s/agent/containerd/tmpmounts/containerd-mount3255831239: failed to unmount target /var/lib/rancher/k3s/agent/containerd/tmpmounts/containerd-mount3255831239: device or resource busy: unknown
17s         Warning   Failed                            pod/awx-operator-controller-manager-589cdd869b-hzlv7   Error: ErrImagePull
5s          Warning   FailedToRetrieveImagePullSecret   pod/awx-operator-controller-manager-589cdd869b-hzlv7   Unable to retrieve some image pull secrets (redhat-operators-pull-secret); attempting to pull the image may not succeed.
5s          Normal    BackOff                           pod/awx-operator-controller-manager-589cdd869b-hzlv7   Back-off pulling image "quay.io/ansible/awx-operator:2.12.2"
5s          Warning   Failed                            pod/awx-operator-controller-manager-589cdd869b-hzlv7   Error: ImagePullBackOff
nmjoo@awx-ext:~/awx-operator$ 
TheRealHaoLiu commented 8 months ago

Not caused by the image pull secret; see:

Failed to pull image "quay.io/ansible/awx-operator:2.12.2": failed to pull and unpack image "quay.io/ansible/awx-operator:2.12.2": failed to extract layer sha256:86426b9e591db2cdd8eba8085aa38b705422152b49960696308d988f33f3d741: failed to unmount /var/lib/rancher/k3s/agent/containerd/tmpmounts/containerd-mount3255831239: failed to unmount target /var/lib/rancher/k3s/agent/containerd/tmpmounts/containerd-mount3255831239: device or resource busy: unknown
device or resource busy: unknown

This is likely caused by an issue related to your k8s cluster's storage.
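
If you want to confirm that, a few generic checks on the k3s node itself (hedged sketch; paths taken from the error above, adjust for your host):

# any temporary mounts left behind by containerd during layer extraction?
mount | grep containerd-mount
# free space and inodes under the k3s data directory
df -h /var/lib/rancher
df -i /var/lib/rancher
# what containerd has pulled so far (k3s bundles crictl)
sudo k3s crictl images | grep awx-operator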

kmf commented 8 months ago

Same here

valkiriaaquatica commented 2 months ago

Hey @kmf @njohnsn, as shown here: https://github.com/ansible/awx-operator/issues/922 and as @kurokobo mentioned, I just created a dummy secret:

kubectl -n awx create secret docker-registry redhat-operators-pull-secret \
  --docker-server=dummy.example.com \
  --docker-username=dummy \
  --docker-password=dummy

Then delete the pod to force it to restart, and it works fine.
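
For example (hedged: the deployment name is taken from the pod name above, and the control-plane=controller-manager label is assumed from the Operator SDK scaffolding):

# restart the operator so it picks up the dummy secret
kubectl -n awx rollout restart deployment awx-operator-controller-manager
# or delete the pod directly and let the Deployment recreate it
kubectl -n awx delete pod -l control-plane=controller-manager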