red-hat-storage / ocs-ci

https://ocs-ci.readthedocs.io/en/latest/

Cannot create optional-operators operator in ppc64le #3854

Closed gitsridhar closed 2 years ago

gitsridhar commented 3 years ago

Tried to deploy OCS 4.7 on OCP 4.7 using the ocs-ci OCS deployment path, and got this error:

14:51:25 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-marketplace get packagemanifest -n openshift-marketplace --selector=catalog=optional-operators -o yaml
14:51:26 - MainThread - ocs_ci.deployment.deployment - ERROR - Requested packageManifest: local-storage-operator with selector: catalog=optional-operators not found!

[test@nx123-ahv logs]$ oc get pods -n openshift-marketplace
NAME                                    READY   STATUS             RESTARTS   AGE
certified-operators-mqn47               1/1     Running            0          19m
community-operators-57x45               1/1     Running            0          18m
marketplace-operator-5bd654c6fd-rlb8b   1/1     Running            0          19m
optional-operators-8wvzq                0/1     ImagePullBackOff   0          18m
redhat-marketplace-4sbpj                1/1     Running            0          18m
redhat-operators-dhjdt                  1/1     Running            0          18m
[test@nx123-ahv logs]$

oc describe pod optional-operators-8wvzq -n openshift-marketplace:

Events:
  Type     Reason          Age                    From               Message
  Normal   Scheduled       18m                    default-scheduler  Successfully assigned openshift-marketplace/optional-operators-8wvzq to worker-0
  Normal   AddedInterface  18m                    multus             Add eth0 [10.129.2.20/23]
  Normal   Pulling         17m (x4 over 18m)      kubelet            Pulling image "quay.io/openshift-qe-optional-operators/ocp4-index:latest"
  Warning  Failed          17m (x4 over 18m)      kubelet            Failed to pull image "quay.io/openshift-qe-optional-operators/ocp4-index:latest": rpc error: code = Unknown desc = Error reading manifest latest in quay.io/openshift-qe-optional-operators/ocp4-index: unauthorized: access to the requested resource is not authorized
  Warning  Failed          17m (x4 over 18m)      kubelet            Error: ErrImagePull
  Warning  Failed          16m (x7 over 18m)      kubelet            Error: ImagePullBackOff
  Normal   BackOff         3m28s (x66 over 18m)   kubelet            Back-off pulling image "quay.io/openshift-qe-optional-operators/ocp4-index:latest"

We are using the secret for quay.io/rhceph-dev provided to us by the OCS team. I guess we need one for quay.io/openshift-qe-optional-operators as well.
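For anyone triaging a similar failure, a minimal Python sketch of how pods stuck on an image pull could be picked out of `oc get pods -n openshift-marketplace -o json` output. The function name `pods_with_pull_failures` and the trimmed sample data are hypothetical and are not part of ocs-ci; the sketch only relies on the standard pod status fields (`status.containerStatuses[].state.waiting.reason`).

```python
import json

# Waiting reasons that the kubelet reports when an image cannot be pulled.
PULL_FAILURE_REASONS = {"ErrImagePull", "ImagePullBackOff"}


def pods_with_pull_failures(pod_list_json):
    """Return (pod name, image, reason) for containers waiting on a failed image pull."""
    failures = []
    for pod in json.loads(pod_list_json).get("items", []):
        for status in pod.get("status", {}).get("containerStatuses", []):
            # "waiting" is absent when the container is running or terminated.
            waiting = status.get("state", {}).get("waiting") or {}
            if waiting.get("reason") in PULL_FAILURE_REASONS:
                failures.append(
                    (pod["metadata"]["name"], status.get("image"), waiting["reason"])
                )
    return failures


# Trimmed, hand-written sample shaped like `oc get pods -o json` output.
sample_pod_list = json.dumps({
    "items": [
        {
            "metadata": {"name": "redhat-operators-dhjdt"},
            "status": {"containerStatuses": [
                {"image": "example.invalid/redhat-operators:latest",
                 "state": {"running": {}}},
            ]},
        },
        {
            "metadata": {"name": "optional-operators-8wvzq"},
            "status": {"containerStatuses": [
                {"image": "quay.io/openshift-qe-optional-operators/ocp4-index:latest",
                 "state": {"waiting": {"reason": "ImagePullBackOff"}}},
            ]},
        },
    ]
})

for name, image, reason in pods_with_pull_failures(sample_pod_list):
    print(f"{name}: {reason} while pulling {image}")
```

The registry host in each reported image is a hint at which pull-secret entry to check next.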

vasukulkarni commented 3 years ago

@clacroix12 @b-ranto are we missing it for ppc?

clacroix12 commented 3 years ago

@gitsridhar does your pull secret include an auth section for brew.registry.redhat.io? If not then you will need to update your pull-secret again. The optional operators catalogsource was changed in 4.7 and requires an additional section to be added to your pull secret.
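A minimal Python sketch of the check described above, assuming the pull secret has already been decoded to its usual JSON form with a top-level `auths` map. The helper name `missing_registry_auths` and the placeholder credentials are hypothetical; only the registry names come from this thread.

```python
import json


def missing_registry_auths(pull_secret_json, required_registries):
    """Return the required registries that have no entry under "auths"."""
    auths = json.loads(pull_secret_json).get("auths", {})
    return [registry for registry in required_registries if registry not in auths]


# Hypothetical pull secret with a placeholder credential, for illustration only.
example_secret = json.dumps({
    "auths": {
        "quay.io/rhceph-dev": {"auth": "PLACEHOLDER"},
    }
})

required = ["quay.io/rhceph-dev", "brew.registry.redhat.io"]
print(missing_registry_auths(example_secret, required))
# prints ['brew.registry.redhat.io']
```

An empty list would mean every required registry has an auth section; it does not prove the credentials themselves are valid.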

gitsridhar commented 3 years ago

@clacroix12 I emailed and got a token for brew.registry.redhat.io. We are attempting an OCS 4.7 deploy on OCP 4.7 with it.

gitsridhar commented 3 years ago

We tried, and the optional-operators pod did not come up in the ppc64le environment:

(venv) [root@ocp47-327a-bastion-0 ocs-ci]# oc get pods -n openshift-marketplace
NAME                                    READY   STATUS    RESTARTS   AGE
certified-operators-rz65f               1/1     Running   0          15m
community-operators-8lcwk               1/1     Running   0          15m
marketplace-operator-5c9fbdc468-xd6cc   1/1     Running   0          22m
optional-operators-wj6pn                0/1     Error     8          16m
redhat-marketplace-xmv89                1/1     Running   0          15m
redhat-operators-76d4f                  1/1     Running   0          16m

Events:
  Type     Reason          Age                  From               Message
  Normal   Scheduled       17m                  default-scheduler  Successfully assigned openshift-marketplace/optional-operators-wj6pn to worker-0
  Normal   AddedInterface  17m                  multus             Add eth0 [10.131.0.23/23]
  Normal   Pulled          16m                  kubelet            Successfully pulled image "quay.io/openshift-qe-optional-operators/ocp4-index:latest" in 6.166276676s
  Normal   Pulled          16m                  kubelet            Successfully pulled image "quay.io/openshift-qe-optional-operators/ocp4-index:latest" in 2.976482166s
  Normal   Pulled          16m                  kubelet            Successfully pulled image "quay.io/openshift-qe-optional-operators/ocp4-index:latest" in 3.008807809s
  Normal   Pulling         16m (x4 over 17m)    kubelet            Pulling image "quay.io/openshift-qe-optional-operators/ocp4-index:latest"
  Normal   Pulled          16m                  kubelet            Successfully pulled image "quay.io/openshift-qe-optional-operators/ocp4-index:latest" in 3.083899181s
  Normal   Created         16m (x4 over 16m)    kubelet            Created container registry-server
  Normal   Started         16m (x4 over 16m)    kubelet            Started container registry-server
  Warning  BackOff         113s (x73 over 16m)  kubelet            Back-off restarting failed container
(venv) [root@ocp47-327a-bastion-0 ocs-ci]#
(venv) [root@ocp47-327a-bastion-0 ocs-ci]# oc logs -f pod/optional-operators-wj6pn -n openshift-marketplace
standard_init_linux.go:219: exec user process caused: exec format error

clacroix12 commented 3 years ago

@gitsridhar it looks like the updated pull secret is working as you didn't hit issues pulling the image.

(venv) [root@ocp47-327a-bastion-0 ocs-ci]# oc logs -f pod/optional-operators-wj6pn -n openshift-marketplace
standard_init_linux.go:219: exec user process caused: exec format error

Do you have any more detail around the actual issue the pod is seeing? I'm not really sure what the problem is based on this.

If the cluster is still up would you please email me a link to the kubeconfig so I can investigate? I'd like to see what process is causing the pod to not come up properly.

gitsridhar commented 3 years ago

oc logs from the pod shows 'exec format error', which means the binary in the image was not built for this node's architecture, so this is not a multi-arch image. It sounds like the image is x86-only.
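One way to confirm this kind of mismatch is to look at the image's raw manifest: a multi-arch image is published as a manifest list whose entries each carry a `platform.architecture`, while a single-arch image has a plain manifest with no such list. A minimal Python sketch, assuming the raw manifest JSON has already been fetched (for example with `skopeo inspect --raw`); the function name `manifest_architectures` and both sample manifests are hypothetical and do not describe the actual ocp4-index image.

```python
import json


def manifest_architectures(raw_manifest_json):
    """Return the architectures in a manifest list, or [] for a single-arch manifest."""
    manifest = json.loads(raw_manifest_json)
    return sorted(
        entry["platform"]["architecture"]
        for entry in manifest.get("manifests", [])
        if "platform" in entry
    )


# Hand-written sample: a manifest list covering three architectures.
multi_arch = json.dumps({
    "schemaVersion": 2,
    "manifests": [
        {"digest": "sha256:aaa", "platform": {"architecture": "amd64", "os": "linux"}},
        {"digest": "sha256:bbb", "platform": {"architecture": "ppc64le", "os": "linux"}},
        {"digest": "sha256:ccc", "platform": {"architecture": "s390x", "os": "linux"}},
    ],
})

# Hand-written sample: a plain image manifest, which has no "manifests" key.
single_arch = json.dumps({
    "schemaVersion": 2,
    "config": {"digest": "sha256:ddd"},
    "layers": [],
})

print("ppc64le" in manifest_architectures(multi_arch))   # prints True
print(manifest_architectures(single_arch))               # prints []
```

An empty result only says the image is not a manifest list; the architecture of a single-arch image is recorded in its config blob, which this sketch does not read.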

vasukulkarni commented 3 years ago

@b-ranto @andrewschoen FYI

b-ranto commented 3 years ago

It looks like some images in the optional qe operators are not built for IBM P or Z. An example is the one hitting the error:

https://quay.io/repository/openshift-qe-optional-operators/ocp4-index-cpaas?tab=tags

It seems to be single-arch only. I am not sure who builds these images, but we should ping them about providing multi-arch support where it is missing.

gitsridhar commented 3 years ago

@b-ranto Thanks for your clarification. I will explore who can provide multi-arch images. Or we will wait for 4.7 to GA and proceed from there. Thanks.

zmc commented 3 years ago

From the ART team: OCP QE owns that org. If nobody else has pinged them by Monday, I can.

gitsridhar commented 3 years ago

@zmc Can we bring this back? We have a similar situation with 4.8 as well. Can we fix this so that we do not hit this problem with 4.9?

gitsridhar commented 2 years ago

We hit the same issue in 4.9 as well. Can this be fixed now for 4.9?

aditidukle commented 2 years ago

@zmc Could you please tell me how I can contact OCP QE? Also, I need access to https://quay.io/repository/openshift-qe-optional-operators/ocp4-index-cpaas?tab=tags . I get a "no repo found" message when trying to access the link.

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 30 days if no further activity occurs.

github-actions[bot] commented 2 years ago

This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.