operator-framework / operator-lifecycle-manager

A management framework for extending Kubernetes with Operators
https://olm.operatorframework.io
Apache License 2.0
1.72k stars 545 forks source link

Reference to "cp" command in bundle_unpacker.go has wrong arch #2769

Open devops-42 opened 2 years ago

devops-42 commented 2 years ago

ARCH: arm64 DISTRO: microk8s

Hi,

when deploying a new operator in k8s a job template is rendered containing initContainers. The first one issues a cp command:

         command:
            - /bin/cp
            - '-Rv'
            - /bin/cpb
            - /util/cpb

While the tooling itself is on the correct CPU architecture, the cp command is still on amd64. This breaks the operator installation. The affected file is most likely:

https://github.com/operator-framework/operator-lifecycle-manager/blob/5a7f8033dfc04150d1c21ef4e86fd7bc00bbfa39/pkg/controller/bundle/bundle_unpacker.go#L138

Any chance either to fix this? As a workaround the first initContainer could be dropped, the second one needs to be slightly modified:

        - name: pull
          image: >-
            quay.io/operatorhubio/grafana-operator@sha256:85a6c64273b3b4fe65e4937b300bf9fddd24a7f1ece7ff9b7de925c77669fc17
          command:
            - /bin/cpb
            - /bundle
          resources:
            requests:
              cpu: 10m
              memory: 50Mi
          volumeMounts:
            - name: bundle
              mountPath: /bundle
            - name: util
              mountPath: /util

Thanks for your help.

exdx commented 2 years ago

Hi @devops-42, thank you for bringing this up -- OLM supports multiple archs so we definitely treat this as a bug. It seems possible that the underlying utilImage that the init container is built on (that contains the cp binary) is not multi-arch? Although I assume it would be, considering OLM is used on different platforms successfully.

We will need to triage this further and come back with a proposed solution.

kbrighton commented 2 years ago

So, I was just testing a similar issue that I was having, and it's still happening with 0.22.0. From the looks of it, it may be with the busybox install that the arm64 olm operator itself uses, even on the arm64 version. I attempted to exec into a running OLM operator pod using this command: kubectl exec -i -t -n olm olm-operator-5984b4c9d7-j6rs8 -c olm-operator -- sh -c "clear; (bash || ash || sh)"

And this was the error message I recieved:

error: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "a1550e636fbc0f81f82d340cdca71095273d24c8 ea77b5ff6c631b286ef44a28": OCI runtime exec failed: exec failed: unable to start container process: exec /busybox/sh: exec format error: unknown So, it very well could be that the olm binary works correctly on arm64, which is great, but the underlying busybox just needs to be replaced with an arm64 compiled version.