operator-framework / operator-lifecycle-manager

A management framework for extending Kubernetes with Operators
https://olm.operatorframework.io
Apache License 2.0
1.69k stars · 543 forks

CRD cannot be created successfully during installation of OLM v0.24.0 (same bug as #2778: Applying OLM CRDs fails due to last-applied-configuration annotation) #2968

Open · shaojini opened this issue 1 year ago

shaojini commented 1 year ago

Bug Report

What did you do?

root@K8s-master:~# docker login
root@K8s-master:~# export olm_release=v0.24.0
root@K8s-master:~# kubectl apply -f https://github.com/operator-framework/operator-lifecycle-manager/releases/download/${olm_release}/crds.yaml

What did you expect to see?

Successful installation.

What did you see instead? Under which circumstances?

customresourcedefinition.apiextensions.k8s.io/catalogsources.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/installplans.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/olmconfigs.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operatorconditions.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operatorgroups.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operators.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/subscriptions.operators.coreos.com created
The CustomResourceDefinition "clusterserviceversions.operators.coreos.com" is invalid: metadata.annotations: Too long: must have at most 262144 bytes

When I switched to operator-sdk to install OLM, the installation showed no errors; however, the packageserver APIService reports AVAILABLE as "False":

root@K8s-master:~# operator-sdk olm install

INFO[0000] Fetching CRDs for version "latest"
INFO[0000] Fetching resources for resolved version "latest"
I0519 20:39:28.599145 64106 request.go:690] Waited for 1.01264153s due to client-side throttling, not priority and fairness, request: GET:https://192.168.0.7:6443/apis/cilium.io/v2?timeout=32s
INFO[0008] Creating CRDs and resources
INFO[0008] Creating CustomResourceDefinition "catalogsources.operators.coreos.com"
INFO[0008] Creating CustomResourceDefinition "clusterserviceversions.operators.coreos.com"
INFO[0008] Creating CustomResourceDefinition "installplans.operators.coreos.com"
INFO[0008] Creating CustomResourceDefinition "olmconfigs.operators.coreos.com"
INFO[0008] Creating CustomResourceDefinition "operatorconditions.operators.coreos.com"
INFO[0008] Creating CustomResourceDefinition "operatorgroups.operators.coreos.com"
INFO[0008] Creating CustomResourceDefinition "operators.operators.coreos.com"
INFO[0008] Creating CustomResourceDefinition "subscriptions.operators.coreos.com"
INFO[0009] Creating Namespace "olm"
INFO[0009] Creating Namespace "operators"
INFO[0009] Creating ServiceAccount "olm/olm-operator-serviceaccount"
INFO[0009] Creating ClusterRole "system:controller:operator-lifecycle-manager"
INFO[0009] Creating ClusterRoleBinding "olm-operator-binding-olm"
INFO[0009] Creating OLMConfig "cluster"
I0519 20:39:38.648832 64106 request.go:690] Waited for 1.447733128s due to client-side throttling, not priority and fairness, request: GET:https://192.168.0.7:6443/apis/operators.coreos.com/v1alpha2?timeout=32s
INFO[0012] Creating Deployment "olm/olm-operator"
INFO[0012] Creating Deployment "olm/catalog-operator"
INFO[0012] Creating ClusterRole "aggregate-olm-edit"
INFO[0012] Creating ClusterRole "aggregate-olm-view"
INFO[0012] Creating OperatorGroup "operators/global-operators"
INFO[0012] Creating OperatorGroup "olm/olm-operators"
INFO[0012] Creating ClusterServiceVersion "olm/packageserver"
INFO[0012] Creating CatalogSource "olm/operatorhubio-catalog"
INFO[0012] Waiting for deployment/olm-operator rollout to complete
INFO[0012] Waiting for Deployment "olm/olm-operator" to rollout: 0 of 1 updated replicas are available
INFO[0014] Deployment "olm/olm-operator" successfully rolled out
INFO[0014] Waiting for deployment/catalog-operator rollout to complete
INFO[0014] Deployment "olm/catalog-operator" successfully rolled out
INFO[0014] Waiting for deployment/packageserver rollout to complete
INFO[0014] Waiting for Deployment "olm/packageserver" to appear
INFO[0015] Waiting for Deployment "olm/packageserver" to rollout: 0 of 2 updated replicas are available
INFO[0028] Deployment "olm/packageserver" successfully rolled out
INFO[0028] Successfully installed OLM version "latest"

NAME                                           NAMESPACE   KIND                       STATUS
catalogsources.operators.coreos.com                        CustomResourceDefinition   Installed
clusterserviceversions.operators.coreos.com                CustomResourceDefinition   Installed
installplans.operators.coreos.com                          CustomResourceDefinition   Installed
olmconfigs.operators.coreos.com                            CustomResourceDefinition   Installed
operatorconditions.operators.coreos.com                    CustomResourceDefinition   Installed
operatorgroups.operators.coreos.com                        CustomResourceDefinition   Installed
operators.operators.coreos.com                             CustomResourceDefinition   Installed
subscriptions.operators.coreos.com                         CustomResourceDefinition   Installed
olm                                                        Namespace                  Installed
operators                                                  Namespace                  Installed
olm-operator-serviceaccount                    olm         ServiceAccount             Installed
system:controller:operator-lifecycle-manager               ClusterRole                Installed
olm-operator-binding-olm                                   ClusterRoleBinding         Installed
cluster                                                    OLMConfig                  Installed
olm-operator                                   olm         Deployment                 Installed
catalog-operator                               olm         Deployment                 Installed
aggregate-olm-edit                                         ClusterRole                Installed
aggregate-olm-view                                         ClusterRole                Installed
global-operators                               operators   OperatorGroup              Installed
olm-operators                                  olm         OperatorGroup              Installed
packageserver                                  olm         ClusterServiceVersion      Installed
operatorhubio-catalog                          olm         CatalogSource              Installed

However,

root@K8s-master:~# kubectl get apiservices.apiregistration.k8s.io v1.packages.operators.coreos.com

E0519 21:20:02.374158 88834 memcache.go:287] couldn't get resource list for packages.operators.coreos.com/v1: the server is currently unable to handle the request
E0519 21:20:02.375416 88834 memcache.go:121] couldn't get resource list for packages.operators.coreos.com/v1: the server is currently unable to handle the request
E0519 21:20:02.377921 88834 memcache.go:121] couldn't get resource list for packages.operators.coreos.com/v1: the server is currently unable to handle the request
E0519 21:20:02.380392 88834 memcache.go:121] couldn't get resource list for packages.operators.coreos.com/v1: the server is currently unable to handle the request
E0519 21:20:02.405270 88834 memcache.go:287] couldn't get resource list for packages.operators.coreos.com/v1: the server is currently unable to handle the request
E0519 21:20:02.427846 88834 memcache.go:121] couldn't get resource list for packages.operators.coreos.com/v1: the server is currently unable to handle the request
E0519 21:20:02.433485 88834 memcache.go:121] couldn't get resource list for packages.operators.coreos.com/v1: the server is currently unable to handle the request
E0519 21:20:02.436921 88834 memcache.go:121] couldn't get resource list for packages.operators.coreos.com/v1: the server is currently unable to handle the request
NAME                               SERVICE                     AVAILABLE                      AGE
v1.packages.operators.coreos.com   olm/packageserver-service   False (FailedDiscoveryCheck)   40m

The packages.operators.coreos.com API is not functioning:

root@K8s-master:~# kubectl get crd | grep operators.coreos.com

E0519 21:28:58.773330 94129 memcache.go:287] couldn't get resource list for packages.operators.coreos.com/v1: the server is currently unable to handle the request
E0519 21:28:58.774865 94129 memcache.go:121] couldn't get resource list for packages.operators.coreos.com/v1: the server is currently unable to handle the request
E0519 21:28:58.777895 94129 memcache.go:121] couldn't get resource list for packages.operators.coreos.com/v1: the server is currently unable to handle the request
E0519 21:28:58.781929 94129 memcache.go:121] couldn't get resource list for packages.operators.coreos.com/v1: the server is currently unable to handle the request
catalogsources.operators.coreos.com           2023-05-19T17:39:35Z
clusterserviceversions.operators.coreos.com   2023-05-19T17:39:35Z
installplans.operators.coreos.com             2023-05-19T17:39:35Z
olmconfigs.operators.coreos.com               2023-05-19T17:39:35Z
operatorconditions.operators.coreos.com       2023-05-19T17:39:35Z
operatorgroups.operators.coreos.com           2023-05-19T17:39:35Z
operators.operators.coreos.com                2023-05-19T17:39:35Z
subscriptions.operators.coreos.com            2023-05-19T17:39:35Z

Environment

root@K8s-master:~# kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.3", GitCommit:"9e644106593f3f4aa98f8a84b23db5fa378900bd", GitTreeState:"clean", BuildDate:"2023-03-15T13:40:17Z", GoVersion:"go1.19.7", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.4", GitCommit:"f89670c3aa4059d6999cb42e23ccb4f0b9a03979", GitTreeState:"clean", BuildDate:"2023-04-12T12:05:35Z", GoVersion:"go1.19.8", Compiler:"gc", Platform:"linux/amd64"}


grokspawn commented 1 year ago

Hi @shaojini, thanks for the issue.
The underlying problem is expressed in the message:

The CustomResourceDefinition "clusterserviceversions.operators.coreos.com" is invalid: metadata.annotations: Too long: must have at most 262144 bytes

which is displayed when the kubectl apply fails. Unfortunately, it leaves the CRDs incompletely installed and, worse, creates a situation where repeating the install operation won't help.

The underlying cause is the kubectl.kubernetes.io/last-applied-configuration annotation, which kubectl adds whenever it performs a kubectl apply. The annotation duplicates essentially the whole object, so any CRD whose serialized size exceeds the 262144-byte annotation limit can no longer be applied.
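The effect of that limit can be sketched without a cluster: scan each YAML document in a manifest and flag any whose serialized size already exceeds the 262144-byte annotation budget that kubectl apply would need for the last-applied-configuration copy. The file name and contents below are hypothetical stand-ins, not the real OLM manifests.

```shell
# Write a small stand-in manifest; in practice this would be the downloaded crds.yaml.
cat > /tmp/demo-manifest.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: small-object
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: another-small-object
EOF

# Sum the bytes of each document (separated by ---) and flag any over the limit.
awk '
  /^---$/ { if (size > 262144) big++; size = 0; next }
          { size += length($0) + 1 }
  END     {
            if (size > 262144) big++
            if (big) print big " document(s) exceed the apply annotation limit"
            else     print "all documents fit under the apply limit"
          }
' /tmp/demo-manifest.yaml
```

A document flagged by this check would fail kubectl apply with the same "Too long" error even though kubectl create accepts it, since create never writes the annotation.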

The easiest resolution is manual: clean up the old resources and use kubectl create instead of kubectl apply (which does not create the problematic annotation). Since you are working within a kind cluster, the easiest approach is:

kind delete clusters kind
kind create cluster
export olm_release=v0.24.0
kubectl create -f https://github.com/operator-framework/operator-lifecycle-manager/releases/download/${olm_release}/crds.yaml

I'll try to get this bug on the agenda for discussion, since we've evidently passed the threshold where users start encountering it, and it's not great UX.

Could you update this issue with the source of your instructions that got you started here? It may be that we need to update them as well.

Thanks!

grokspawn commented 1 year ago

Follow-up: I put this on the agenda for the community meeting on 23rd May.

shaojini commented 1 year ago


Thanks for the reply, @grokspawn.

I removed all resources OLM used, deleted the cluster entirely, removed the old versions of kubeadm, kubectl, and kubelet, reinstalled new versions, and created a new cluster.

After installing cert-manager via:

root@K8s-master:~# kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.11.2/cert-manager.yaml
root@K8s-master:~# kubectl --namespace cert-manager wait --for condition=ready pod -l app.kubernetes.io/instance=cert-manager

I got the following results when following your instructions:

root@K8s-master:~# export olm_release=v0.24.0
root@K8s-master:~# kubectl create -f https://github.com/operator-framework/operator-lifecycle-manager/releases/download/${olm_release}/crds.yaml
customresourcedefinition.apiextensions.k8s.io/catalogsources.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/clusterserviceversions.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/installplans.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/olmconfigs.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operatorconditions.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operatorgroups.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operators.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/subscriptions.operators.coreos.com created

root@K8s-master:~# kubectl create -f https://github.com/operator-framework/operator-lifecycle-manager/releases/download/${olm_release}/olm.yaml
namespace/olm created
namespace/operators created
serviceaccount/olm-operator-serviceaccount created
clusterrole.rbac.authorization.k8s.io/system:controller:operator-lifecycle-manager created
clusterrolebinding.rbac.authorization.k8s.io/olm-operator-binding-olm created
olmconfig.operators.coreos.com/cluster created
deployment.apps/olm-operator created
deployment.apps/catalog-operator created
clusterrole.rbac.authorization.k8s.io/aggregate-olm-edit created
clusterrole.rbac.authorization.k8s.io/aggregate-olm-view created
operatorgroup.operators.coreos.com/global-operators created
operatorgroup.operators.coreos.com/olm-operators created
clusterserviceversion.operators.coreos.com/packageserver created
catalogsource.operators.coreos.com/operatorhubio-catalog created

It looks as if OLM was installed successfully; however, it was not:

root@K8s-master:~# kubectl get apiservices.apiregistration.k8s.io v1.packages.operators.coreos.com
NAME                               SERVICE                     AVAILABLE                      AGE
v1.packages.operators.coreos.com   olm/packageserver-service   False (FailedDiscoveryCheck)   89s

root@K8s-master:~# kubectl get crds
NAME                                          CREATED AT
catalogsources.operators.coreos.com           2023-05-22T09:19:42Z
certificaterequests.cert-manager.io           2023-05-22T09:17:11Z
certificates.cert-manager.io                  2023-05-22T09:17:11Z
challenges.acme.cert-manager.io               2023-05-22T09:17:11Z
ciliumclusterwidenetworkpolicies.cilium.io    2023-05-22T09:02:56Z
ciliumendpoints.cilium.io                     2023-05-22T09:02:54Z
ciliumexternalworkloads.cilium.io             2023-05-22T09:02:54Z
ciliumidentities.cilium.io                    2023-05-22T09:02:54Z
ciliumloadbalancerippools.cilium.io           2023-05-22T09:02:54Z
ciliumnetworkpolicies.cilium.io               2023-05-22T09:02:55Z
ciliumnodeconfigs.cilium.io                   2023-05-22T09:02:54Z
ciliumnodes.cilium.io                         2023-05-22T09:02:54Z
clusterissuers.cert-manager.io                2023-05-22T09:17:11Z
clusterserviceversions.operators.coreos.com   2023-05-22T09:19:42Z
installplans.operators.coreos.com             2023-05-22T09:19:42Z
issuers.cert-manager.io                       2023-05-22T09:17:11Z
olmconfigs.operators.coreos.com               2023-05-22T09:19:42Z
operatorconditions.operators.coreos.com       2023-05-22T09:19:42Z
operatorgroups.operators.coreos.com           2023-05-22T09:19:42Z
operators.operators.coreos.com                2023-05-22T09:19:42Z
orders.acme.cert-manager.io                   2023-05-22T09:17:11Z
subscriptions.operators.coreos.com            2023-05-22T09:19:42Z
traces.gadget.kinvolk.io                      2023-05-22T09:11:10Z

The packageserver service fails to install. What is wrong? Thanks.

shaojini commented 1 year ago


The pods of packageserver have been created:

root@K8s-master:~# kubectl get pod -n olm
NAME                                READY   STATUS    RESTARTS   AGE
catalog-operator-594448c75b-6xfzj   1/1     Running   0          113m
olm-operator-5589585cd-k7rz5        1/1     Running   0          113m
operatorhubio-catalog-ljns9         1/1     Running   0          113m
packageserver-5758fb5f7-cjp9h       1/1     Running   0          113m
packageserver-5758fb5f7-hf8qc       1/1     Running   0          113m

However, the packageserver APIService still has not become available.

grokspawn commented 1 year ago

Hi @shaojini. This is why I asked for the source of your instructions: that is just one step of installing OLM. In order to query the packagemanifests API, you also have to install the rest of OLM via

kubectl apply -f https://github.com/operator-framework/operator-lifecycle-manager/releases/download/v0.24.0/olm.yaml

If you delete and re-create your kind cluster and follow the quickstart at https://olm.operatorframework.io/docs/getting-started/, it includes the CRDs, APIServices, and default catalog sources, leaving the install fully ready. It's a much smoother user experience.

awgreene commented 1 year ago

Created https://github.com/operator-framework/operator-lifecycle-manager/issues/2969 to document how to perform an upgrade.

shaojini commented 1 year ago

Hi @grokspawn. I did use the link you gave to install OLM. Now the main problem is that the apiServices cannot be installed successfully. I tried to find the problem with the following command:

root@K8s-master:~# kubectl describe csv packageserver -n olm

Name:         packageserver
Namespace:    olm
Labels:       olm.version=v0.24.0
Annotations:  olm.operatorGroup: olm-operators
              olm.operatorNamespace: olm
              olm.targetNamespaces: olm
API Version:  operators.coreos.com/v1alpha1
Kind:         ClusterServiceVersion

.........

                Image:              quay.io/operator-framework/olm@sha256:f9ea8cef95ac9b31021401d4863711a5eec904536b449724e0f00357548a31e7
            Image Pull Policy:  Always

.........

Events:
Type     Reason               Age                  From                        Message
Normal   RequirementsUnknown  27m                  operator-lifecycle-manager  requirements not yet checked
Normal   InstallWaiting       17m (x4 over 27m)    operator-lifecycle-manager  apiServices not installed
Normal   AllRequirementsMet   12m (x5 over 27m)    operator-lifecycle-manager  all requirements found, attempting install
Warning  InstallCheckFailed   12m (x6 over 22m)    operator-lifecycle-manager  install timeout
Normal   InstallSucceeded     7m40s (x9 over 27m)  operator-lifecycle-manager  waiting for install components to report healthy
Normal   NeedsReinstall       2m40s (x7 over 22m)  operator-lifecycle-manager  apiServices not installed

The result shows that an image needs to be downloaded. The main problem might be that Docker Hub is currently rate-limiting pulls from the public Docker registry (I have hit the same problem creating a pod that pulls images from Docker Hub). To work around Docker Hub pull limits, you can provide pull credentials in the Kubernetes pod specification, but I don't know how that can be done for the apiServices installation.

acelinkio commented 1 year ago

This can be partially avoided if you force the kubectl commands to use server-side apply, which avoids adding the last-applied-configuration annotation to the metadata.

However, for those attempting to apply these manifests via GitOps, you will face https://github.com/kubernetes-sigs/structured-merge-diff/issues/130, where the manifests never converge. The manifests in this repository are missing the ContainerPort protocol, which caused ArgoCD to barf.
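The missing field looks like this in a container spec (a hypothetical fragment, not copied from the OLM manifests): without an explicit protocol, the API server defaults the field to TCP, so the live object never matches the source manifest and the server-side apply or GitOps diff never converges. Declaring it explicitly avoids the mismatch:

```yaml
# Hypothetical container port entry; the port number is illustrative.
ports:
  - containerPort: 5443
    protocol: TCP   # explicit protocol keeps SSA/GitOps diffs stable
```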

There really needs to be more thought put into the ease of use of this project. There is no supported way to deploy it declaratively: no Helm chart support, no way to deploy via GitOps.

ARM installation works when you use their shell script, but it still requires a manual edit to one of the deployments.

fgiloux commented 1 year ago

Hi @acelinkio,

Thanks for pointing out the GitOps considerations; that's a really good point. I think we should add the port protocol to the manifests. That said, it is possible to configure ArgoCD, and I believe other GitOps tools, to ignore specific fields.

kubectl replace can also be used as an alternative to server-side apply.

@shaojini I think your initial request has been addressed. The Slack channel may be a better place to help you troubleshoot the apiservice issue. Note that the images are pulled from quay.io, not Docker Hub. Looking at the logs of the olm-operator may help.

fgiloux commented 1 year ago

@acelinkio Checking the latest version, the port protocol has been added to the manifests, also here, last month. It should be part of the next release.

acelinkio commented 1 year ago

Hey @fgiloux. Here is some additional context on my remarks.

I attempted to apply the Helm chart directly inside this repo from ArgoCD, using the v0.24.0 tag. The missing-port issue came up after doing server-side apply, and the manifests were rejected. I could probably have forced something to eventually work, but I gave up because charts are not supported and are not the direction this project aims to support. #829

So I followed the docs. The base README https://github.com/operator-framework/operator-lifecycle-manager/tree/master#installation points to the installation docs https://github.com/operator-framework/operator-lifecycle-manager/blob/master/doc/install/install.md. The installation docs then defer to the release page for installing a specific release: https://github.com/operator-framework/operator-lifecycle-manager/releases. The release notes have you download and run https://github.com/operator-framework/operator-lifecycle-manager/releases/download/v0.24.0/install.sh

At that point I could have deconstructed the shell script to identify what it does, then written a declarative equivalent. Instead I just ran the shell script and moved on.

When I went to install an operator, I faced another OLM problem and opened an issue: #2971. After applying a workaround, my experience with operators was not great: the ArgoCD operator does not have an ARM set of images, and the MetalLB operator had ARM images but I couldn't get it to work.

Just trying to share my user experience. Other hobbyists I have talked to avoid OLM entirely.

fgiloux commented 1 year ago

@acelinkio thanks for taking the time to provide this valuable feedback.

gdoteof commented 1 year ago

Just trying to share my user experience. Other hobbyists I have talked to avoid OLM entirely.

I am many hours into realizing that this is a bug and not my noobery. I am also on ARM and have found that operators themselves work well enough, but any attempt at using OLM is a nightmare of endless architecture-related errors.

shaojini commented 1 year ago

Hi.

It seems the new version, v0.25.0, has fixed the "OLM fails to install packageserver with FailedDiscoveryCheck error" problem.

root@k8s-master:~# kubectl get apiservices.apiregistration.k8s.io v1.packages.operators.coreos.com
NAME                               SERVICE                     AVAILABLE   AGE
v1.packages.operators.coreos.com   olm/packageserver-service   True        14m

ghost commented 11 months ago

Encountered the same problem, using version v0.25.0.

root@ubuntu:~/k8/olm# ll
total 205656
-rw-r--r-- 1 root root    773165 Sep 14 15:35 crds.yaml
-rw-r--r-- 1 root root      1886 Sep 14 15:41 install.sh
-rw-r--r-- 1 root root     10730 Sep 14 15:41 olm.yaml
-rw-r--r-- 1 root root 121468017 Sep 14 15:42 operator-lifecycle-manager_0.25.0_linux_amd64.tar.gz
-rwxr-xr-x 1 root root  88330149 Sep 14 15:29 operator-sdk_linux_amd64
root@ubuntu:~/k8/olm# kubectl apply -f crds.yaml
customresourcedefinition.apiextensions.k8s.io/catalogsources.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/installplans.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/olmconfigs.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operatorconditions.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operatorgroups.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operators.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/subscriptions.operators.coreos.com created
The CustomResourceDefinition "clusterserviceversions.operators.coreos.com" is invalid: metadata.annotations: Too long: must have at most 262144 bytes
root@ubuntu:~/k8/olm# vim crds.yaml
root@ubuntu:~/k8/olm# vim crds.yaml
root@ubuntu:~/k8/olm# vim crds.yaml
root@ubuntu:~/k8/olm# kubectl get deployments -n olm
No resources found in olm namespace.
root@ubuntu:~/k8/olm# kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.17", GitCommit:"953be8927218ec8067e1af2641e540238ffd7576", GitTreeState:"clean", BuildDate:"2023-02-22T13:34:27Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.17", GitCommit:"953be8927218ec8067e1af2641e540238ffd7576", GitTreeState:"clean", BuildDate:"2023-02-22T13:27:46Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"}

ghost commented 11 months ago

This works for me.

root@ubuntu:~/k8/olm# kubectl apply -f crds.yaml  --server-side=true
customresourcedefinition.apiextensions.k8s.io/catalogsources.operators.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/clusterserviceversions.operators.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/installplans.operators.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/olmconfigs.operators.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/operatorconditions.operators.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/operatorgroups.operators.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/operators.operators.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/subscriptions.operators.coreos.com serverside-applied

jeffcollaboro commented 9 months ago

This issue seems to have returned in v0.26.0.