tmforum-oda / oda-canvas

Apache License 2.0

GKE Autopilot Incompatibility #226

Closed vances closed 5 months ago

vances commented 5 months ago

Long story short

There appears to be an incompatibility with the Autopilot mode of operation in Google Kubernetes Engine (GKE).

Description

Following the installation instructions mostly works; however, compcrdwebhook fails to start. Increasing the timeout, as documented, has no effect.

Looking at the logs, we see that the problem is: denied by managed-namespaces-limitation.

The exact command to reproduce the issue

```bash
gcloud container clusters create-auto oda-cluster
gcloud container clusters get-credentials oda-cluster
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update
kubectl create namespace istio-system
helm install istio-base istio/base -n istio-system --set defaultRevision=default
helm install istiod istio/istiod -n istio-system --wait
kubectl create namespace istio-ingress
helm install istio-ingress istio/gateway -n istio-ingress --wait
helm repo add oda-canvas https://tmforum-oda.github.io/oda-canvas
helm repo update
helm install --set cert-manager.leaseWaitTimeonStartup=100 canvas oda-canvas/canvas-oda -n canvas --create-namespace
```

The full output of the command that failed

```
$ kubectl logs pod/canvas-cert-manager-cainjector-b487b6c49-bdt7r -n cert-manager
...
I0417 06:18:46.915621 1 leaderelection.go:248] attempting to acquire leader lease kube-system/cert-manager-cainjector-leader-election...
E0417 06:18:46.951512 1 leaderelection.go:334] error initially creating leader election record: leases.coordination.k8s.io is forbidden: User "system:serviceaccount:cert-manager:canvas-cert-manager-cainjector" cannot create resource "leases" in API group "coordination.k8s.io" in the namespace "kube-system": GKE Warden authz [denied by managed-namespaces-limitation]: the namespace "kube-system" is managed and the request's verb "create" is denied
```
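To spot this kind of denial quickly in a long log, a small filter can pull out the namespace that GKE Warden refused. This is only a sketch: the function name `warden_denied_namespaces` is made up for illustration, and the pattern assumes the message wording shown in the output above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read log text on stdin and print each namespace
# that a GKE Warden "is managed" denial refers to (one per line, deduplicated).
warden_denied_namespaces() {
  sed -n 's/.*the namespace "\([^"]*\)" is managed.*/\1/p' | sort -u
}

# Against a live cluster (pod name taken from the report above):
#   kubectl logs pod/canvas-cert-manager-cainjector-b487b6c49-bdt7r -n cert-manager | warden_denied_namespaces

# Offline demonstration using the two log lines quoted in this report.
warden_denied_namespaces <<'EOF'
I0417 06:18:46.915621 1 leaderelection.go:248] attempting to acquire leader lease kube-system/cert-manager-cainjector-leader-election...
E0417 06:18:46.951512 1 leaderelection.go:334] error initially creating leader election record: leases.coordination.k8s.io is forbidden: User "system:serviceaccount:cert-manager:canvas-cert-manager-cainjector" cannot create resource "leases" in API group "coordination.k8s.io" in the namespace "kube-system": GKE Warden authz [denied by managed-namespaces-limitation]: the namespace "kube-system" is managed and the request's verb "create" is denied
EOF
```

The informational line is dropped because it contains no denial message; only the error line contributes a namespace.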

Environment

Google Kubernetes Engine (GKE) Autopilot

gusjer commented 5 months ago

In Autopilot mode, GKE takes complete control of the control plane of the Kubernetes cluster. cert-manager tries to create a Lease object in the kube-system namespace to elect a leader, but this namespace is mainly used for the control plane, and Autopilot keeps full control of it, not allowing others to use it. This problem has been reported to cert-manager more than once, for example: https://github.com/cert-manager/cert-manager/issues/5625

I guess you should install the canvas like this:

helm install --set cert-manager.leaseWaitTimeonStartup=100 --set global.leaderElection.namespace=cert-manager canvas oda-canvas/canvas-oda -n canvas --create-namespace

I have no access to a Google Cloud account, so I have no way to test my guess.
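The two `--set` flags suggested above can also be kept in a values file, which may be easier to reuse across installs. The sketch below only writes and prints that file; the file name `canvas-gke-autopilot-values.yaml` is an arbitrary choice for illustration, and the keys are simply the dotted `--set` paths written out as nested YAML.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical file name; any path works.
values_file="canvas-gke-autopilot-values.yaml"

# Nested form of:
#   --set cert-manager.leaseWaitTimeonStartup=100
#   --set global.leaderElection.namespace=cert-manager
cat > "$values_file" <<'EOF'
global:
  leaderElection:
    namespace: cert-manager
cert-manager:
  leaseWaitTimeonStartup: 100
EOF

cat "$values_file"

# Then, against a real cluster, pass the file instead of the --set flags:
#   helm install -f canvas-gke-autopilot-values.yaml canvas oda-canvas/canvas-oda -n canvas --create-namespace
```

Moving the leader election namespace to cert-manager means the Lease is created in a namespace that Autopilot does not manage, which is the point of the suggestion.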

vances commented 5 months ago

Thank you @gusjer, changing the leader election namespace did the trick!

gcloud container clusters create-auto oda-cluster
gcloud container clusters get-credentials oda-cluster
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update
kubectl create namespace istio-system
helm install istio-base istio/base -n istio-system --set defaultRevision=default
helm install istiod istio/istiod -n istio-system --wait
kubectl create namespace istio-ingress
helm install istio-ingress istio/gateway -n istio-ingress --wait
helm repo add oda-canvas https://tmforum-oda.github.io/oda-canvas
helm repo update
helm install --set global.leaderElection.namespace=cert-manager canvas oda-canvas/canvas-oda -n canvas --create-namespace