operator-framework / operator-sdk

SDK for building Kubernetes applications. Provides high level APIs, useful abstractions, and project scaffolding.
https://sdk.operatorframework.io
Apache License 2.0

Cannot deploy the memcached bundle following the quickstart #4587

Closed antonlisovenko closed 3 years ago

antonlisovenko commented 3 years ago

Bug Report

What did you do?

Followed the quickstart instructions:

operator-sdk olm install

operator-sdk init --domain example.com --repo github.com/example/memcached-operator
operator-sdk create api --group cache --version v1alpha1 --kind Memcached --resource --controller
export OPERATOR_IMG="docker.io/antonlisovenko/memcached-operator:v0.0.1"
make docker-build docker-push IMG=$OPERATOR_IMG
make bundle IMG=$OPERATOR_IMG
export BUNDLE_IMG="docker.io/antonlisovenko/memcached-operator-bundle:v0.0.1"
make bundle-build BUNDLE_IMG=$BUNDLE_IMG
make docker-push IMG=$BUNDLE_IMG
operator-sdk run bundle $BUNDLE_IMG

What did you expect to see?

The bundle to be installed.

What did you see instead? Under which circumstances?

The last command fails with this error:

FATA[0120] Failed to run bundle: create catalog: error creating registry pod: error creating registry pod: registry pod did not become ready: error waiting for registry pod docker-io-antonlisovenko-memcached-operator-bundle-v0-0-1 to run: timed out waiting for the condition

Pod is not ready:

➜  kubectl get pod docker-io-antonlisovenko-memcached-operator-bundle-v0-0-1 
NAME                                                        READY   STATUS    RESTARTS   AGE
docker-io-antonlisovenko-memcached-operator-bundle-v0-0-1   0/1     Pending   0          25m

And its status doesn't show much:

status:
  phase: Pending
  qosClass: BestEffort

Environment

Operator type:

Kubernetes cluster type:

"vanilla" (Kind)

$ operator-sdk version

operator-sdk version: "v1.4.2", commit: "4b083393be65589358b3e0416573df04f4ae8d9b", kubernetes version: "v1.19.4", go version: "go1.15.8", GOOS: "darwin", GOARCH: "amd64"

$ go version (if language is Go)

$ kubectl version

Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:50:19Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.1", GitCommit:"206bcadf021e76c27513500ca24182692aabd17e", GitTreeState:"clean", BuildDate:"2020-09-14T07:30:52Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}

Possible Solution

Additional context

estroz commented 3 years ago

@antonlisovenko this is likely being caused by your operator image not being present in the cluster registry (and possibly your bundle image). Try running

kind load docker-image "$OPERATOR_IMG"
kind load docker-image "$BUNDLE_IMG"
operator-sdk run bundle "$BUNDLE_IMG"

/language go /triage support

antonlisovenko commented 3 years ago

Hi @estroz,

I'm not sure that's the issue, since both images are pushed to and pulled from Docker Hub.

I tried your suggestion anyway. I first had to delete the pod docker-io-antonlisovenko-memcached-operator-bundle-v0-0-1 and the catalogsources.operators.coreos.com resource, because operator-sdk run bundle complained that they already existed. The pod is still stuck in Pending.

Could you recommend further steps to diagnose the problem? Are there logs somewhere that might help?

estroz commented 3 years ago

Try:

kubectl describe pod docker-io-antonlisovenko-memcached-operator-bundle-v0-0-1

antonlisovenko commented 3 years ago

> kubectl describe pod docker-io-antonlisovenko-memcached-operator-bundle-v0-0-1
Name:         docker-io-antonlisovenko-memcached-operator-bundle-v0-0-1
Namespace:    mongodb-atlas-system
Priority:     0
Node:         <none>
Labels:       <none>
Annotations:  <none>
Status:       Pending
IP:
IPs:          <none>
Containers:
  registry-grpc:
    Image:      quay.io/operator-framework/upstream-opm-builder:latest
    Port:       50051/TCP
    Host Port:  0/TCP
    Command:
      /bin/sh
      -c
      /bin/mkdir -p /database && \
      /bin/opm registry add -d /database/index.db -b docker.io/antonlisovenko/memcached-operator-bundle:v0.0.1 --mode=semver && \
      /bin/opm registry serve -d /database/index.db -p 50051

    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-vbvvr (ro)
Volumes:
  default-token-vbvvr:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-vbvvr
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:          <none>
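
In the output above, Node: <none> shows that the pod was never assigned to a node, and Events: <none> shows that no event explains why. Below is a minimal sketch of follow-up checks, assuming kubectl points at the Kind cluster. The helper names registry_pod_name and diagnose_pending_pod are hypothetical. The naming rule (every non-alphanumeric character of the bundle image reference becomes a hyphen) is inferred only from the pod name seen in this thread, not from operator-sdk documentation.

```shell
# Hypothetical helper: derive the registry pod name from a bundle image
# reference, using the pattern observed in this thread (assumption).
registry_pod_name() {
  printf '%s' "$1" | tr -c 'a-zA-Z0-9' '-'
}

# Hypothetical helper: collect scheduling-related information for a pod
# that stays in Pending. Arguments: pod name, namespace (default: default).
diagnose_pending_pod() {
  pod="$1"
  ns="${2:-default}"

  # Are the nodes Ready? A pod cannot be scheduled without a usable node.
  kubectl get nodes -o wide

  # Are the control plane pods (including the scheduler) running?
  kubectl get pods -n kube-system

  # Any events that mention this pod, oldest first.
  kubectl get events -n "$ns" \
    --field-selector "involvedObject.name=$pod" \
    --sort-by=.lastTimestamp

  # Pod conditions, e.g. whether PodScheduled was ever reported.
  kubectl get pod "$pod" -n "$ns" -o jsonpath='{.status.conditions}'
  echo
}
```

For the pod in this thread, the call would be diagnose_pending_pod "$(registry_pod_name "$BUNDLE_IMG")" mongodb-atlas-system, since the describe output reports that namespace.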

antonlisovenko commented 3 years ago

OK, found the issue. The Kind cluster had been created with a local registry (https://kind.sigs.k8s.io/docs/user/local-registry/), and that setup somehow interfered. This is strange, because other pods referencing images on the Internet worked fine.

Recreating the cluster with a plain kind create cluster fixed the problem.

Thanks for your help @estroz
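
For anyone hitting the same symptom, the fix described above can be sketched as the sequence below. It assumes the operator and bundle images are already pushed, as in the original report. The helpers run and recreate_kind_cluster and the DRY_RUN switch are hypothetical conveniences, not part of kind or operator-sdk. Deleting the cluster also removes OLM, so it is reinstalled before the bundle is run again.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: replace a Kind cluster with a plain one (no local
# registry configuration) and retry the bundle installation.
# Arguments: cluster name (default: kind), bundle image reference.
recreate_kind_cluster() {
  name="${1:-kind}"
  bundle_img="$2"

  run kind delete cluster --name "$name"
  run kind create cluster --name "$name"
  run kubectl cluster-info --context "kind-$name"

  # The new cluster is empty, so OLM has to be installed again.
  run operator-sdk olm install
  run operator-sdk run bundle "$bundle_img"
}
```

Setting DRY_RUN=1 before calling recreate_kind_cluster kind "$BUNDLE_IMG" prints the commands without executing them, which is a cheap way to review the sequence first.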