kubernetes / minikube

Run Kubernetes locally
https://minikube.sigs.k8s.io/
Apache License 2.0

minikube addons enable gcp-auth --refresh hangs forever #14897

Open henryrior opened 2 years ago

henryrior commented 2 years ago

What Happened?

My team uses the gcp-auth addon. minikube addons enable gcp-auth works fine, but when we add the --refresh flag to rotate credentials, it hangs forever. Adding the --alsologtostderr flag shows that it gets to here and then hangs indefinitely:

I0901 16:26:05.539862 1822 out.go:177] ▪ Using image gcr.io/k8s-minikube/gcp-auth-webhook:v0.0.10
▪ Using image gcr.io/k8s-minikube/gcp-auth-webhook:v0.0.10
I0901 16:26:05.560484 1822 out.go:177] ▪ Using image k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.0
▪ Using image k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.0
I0901 16:26:05.579669 1822 addons.go:345] installing /etc/kubernetes/addons/gcp-auth-ns.yaml
I0901 16:26:05.579684 1822 ssh_runner.go:362] scp memory --> /etc/kubernetes/addons/gcp-auth-ns.yaml (700 bytes)
I0901 16:26:05.594879 1822 addons.go:345] installing /etc/kubernetes/addons/gcp-auth-service.yaml
I0901 16:26:05.594895 1822 ssh_runner.go:362] scp memory --> /etc/kubernetes/addons/gcp-auth-service.yaml (788 bytes)
I0901 16:26:05.609378 1822 addons.go:345] installing /etc/kubernetes/addons/gcp-auth-webhook.yaml
I0901 16:26:05.609393 1822 ssh_runner.go:362] scp memory --> /etc/kubernetes/addons/gcp-auth-webhook.yaml (4843 bytes)
I0901 16:26:05.622049 1822 ssh_runner.go:195] Run: sudo KUBECONFIG=/var/lib/minikube/kubeconfig /var/lib/minikube/binaries/v1.24.3/kubectl apply -f /etc/kubernetes/addons/gcp-auth-ns.yaml -f /etc/kubernetes/addons/gcp-auth-service.yaml -f /etc/kubernetes/addons/gcp-auth-webhook.yaml

While we can work around this by manually deleting the gcp-auth secret with kubectl delete secret gcp-auth and re-running the enable command, this has caused issues in automated scripts.
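For automated scripts, one way to keep the hang from blocking a pipeline is to bound the refresh with a time limit and fall back to the workaround when the limit is hit. The following is only a sketch under stated assumptions: the helper names run_with_timeout and refresh_gcp_auth, the 60-second default, and the MINIKUBE and KUBECTL override variables are hypothetical and not part of minikube. It also assumes that "the enable command" in the workaround means the plain enable without --refresh, and that the secret lives in the current namespace.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, the 60s default, and the MINIKUBE/KUBECTL
# override variables are assumptions, not part of minikube.
MINIKUBE="${MINIKUBE:-minikube}"
KUBECTL="${KUBECTL:-kubectl}"

# Run a command in the background and send it SIGTERM after <limit> seconds.
# Returns the command's exit status, or 124 if it was terminated by SIGTERM.
run_with_timeout() {
  local limit="$1"; shift
  "$@" &
  local cmd_pid=$!
  # Watchdog: detached from our output so it never holds a pipe open.
  ( sleep "$limit"; kill -TERM "$cmd_pid" 2>/dev/null ) </dev/null >/dev/null 2>&1 &
  local watchdog_pid=$!
  local status=0
  wait "$cmd_pid" || status=$?
  # The command is done (or was killed); the watchdog is no longer needed.
  kill "$watchdog_pid" 2>/dev/null || true
  wait "$watchdog_pid" 2>/dev/null || true
  # 143 = 128 + SIGTERM (15): treat it as a timeout.
  if [ "$status" -eq 143 ]; then
    return 124
  fi
  return "$status"
}

# Try the refresh with a time limit; on failure or timeout, apply the
# workaround from this issue: delete the secret, then enable again.
refresh_gcp_auth() {
  local limit="${1:-60}"
  if run_with_timeout "$limit" $MINIKUBE addons enable gcp-auth --refresh; then
    return 0
  fi
  echo "refresh did not finish cleanly; falling back to the workaround" >&2
  $KUBECTL delete secret gcp-auth
  run_with_timeout "$limit" $MINIKUBE addons enable gcp-auth
}
```

A script would call refresh_gcp_auth (optionally with a limit in seconds, e.g. refresh_gcp_auth 120) and check its exit status. The watchdog is written by hand rather than with the coreutils timeout command because that command is not installed by default on macOS.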

Attach the log file

logs.txt

Operating System

macOS (Default)

Driver

No response

klaases commented 2 years ago

Hi @henryrior, did this work ok in the past, or is this something that has not yet worked?

Code Reference: https://github.com/kubernetes/minikube/blob/c5d2c652aa170f8e00d91289f2deb8cb52dfe441/pkg/addons/addons_gcpauth.go#L251

henryrior commented 2 years ago

Hey @klaases, assuming this is the --refresh logic: unfortunately I only started using minikube last month, so I can't say whether it used to work. I can ask my colleagues, but the refresh flag is currently hanging for them too.

spowelljr commented 2 years ago

Hi @henryrior, could you run the command with the --alsologtostderr flag and upload the output so we can see what is occurring? Thanks.

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

bastiankistner commented 1 year ago

This is the output I see:


❯ minikube addons enable gcp-auth --refresh --alsologtostderr

I0228 09:14:57.264549   17806 out.go:296] Setting OutFile to fd 1 ...
I0228 09:14:57.265492   17806 out.go:348] isatty.IsTerminal(1) = true
I0228 09:14:57.265500   17806 out.go:309] Setting ErrFile to fd 2...
I0228 09:14:57.265505   17806 out.go:348] isatty.IsTerminal(2) = true
I0228 09:14:57.266003   17806 root.go:334] Updating PATH: /Users/bastian/.minikube/bin
I0228 09:14:57.266014   17806 oci.go:567] shell is pointing to dockerd inside minikube. will unset to use host
I0228 09:14:57.279324   17806 out.go:177] 💡  gcp-auth is an addon maintained by Google. For any concerns contact minikube on GitHub.
You can view the list of minikube maintainers at: https://github.com/kubernetes/minikube/blob/master/OWNERS
💡  gcp-auth is an addon maintained by Google. For any concerns contact minikube on GitHub.
You can view the list of minikube maintainers at: https://github.com/kubernetes/minikube/blob/master/OWNERS
I0228 09:14:57.286653   17806 config.go:180] Loaded profile config "minikube": Driver=docker, ContainerRuntime=docker, KubernetesVersion=v1.25.3
I0228 09:14:57.287298   17806 addons.go:65] Setting gcp-auth=true in profile "minikube"
I0228 09:14:57.287481   17806 mustload.go:65] Loading cluster: minikube
I0228 09:14:57.287562   17806 config.go:180] Loaded profile config "minikube": Driver=docker, ContainerRuntime=docker, KubernetesVersion=v1.25.3
I0228 09:14:57.289189   17806 cli_runner.go:164] Run: docker container inspect minikube --format={{.State.Status}}
I0228 09:14:57.416840   17806 host.go:66] Checking if "minikube" exists ...
I0228 09:14:57.417350   17806 cli_runner.go:164] Run: docker container inspect -f "'{{(index (index .NetworkSettings.Ports "8443/tcp") 0).HostPort}}'" minikube
I0228 09:14:57.492605   17806 ssh_runner.go:362] scp memory --> /var/lib/minikube/google_application_credentials.json (295 bytes)
I0228 09:14:57.492705   17806 cli_runner.go:164] Run: docker container inspect -f "'{{(index (index .NetworkSettings.Ports "22/tcp") 0).HostPort}}'" minikube
I0228 09:14:57.540621   17806 sshutil.go:53] new ssh client: &{IP:127.0.0.1 Port:53131 SSHKeyPath:/Users/bastian/.minikube/machines/minikube/id_rsa Username:docker}
I0228 09:14:58.109854   17806 ssh_runner.go:362] scp memory --> /var/lib/minikube/google_cloud_project (11 bytes)
I0228 09:14:58.141936   17806 addons.go:227] Setting addon gcp-auth=true in "minikube"
W0228 09:14:58.141961   17806 addons.go:236] addon gcp-auth should already be in state true
I0228 09:14:58.142372   17806 host.go:66] Checking if "minikube" exists ...
I0228 09:14:58.142670   17806 cli_runner.go:164] Run: docker container inspect minikube --format={{.State.Status}}
I0228 09:14:58.185983   17806 ssh_runner.go:195] Run: cat /var/lib/minikube/google_application_credentials.json
I0228 09:14:58.186045   17806 cli_runner.go:164] Run: docker container inspect -f "'{{(index (index .NetworkSettings.Ports "22/tcp") 0).HostPort}}'" minikube
I0228 09:14:58.233280   17806 sshutil.go:53] new ssh client: &{IP:127.0.0.1 Port:53131 SSHKeyPath:/Users/bastian/.minikube/machines/minikube/id_rsa Username:docker}
I0228 09:14:58.332163   17806 out.go:177]     ▪ Using image gcr.io/k8s-minikube/gcp-auth-webhook:v0.0.13
    ▪ Using image gcr.io/k8s-minikube/gcp-auth-webhook:v0.0.13
I0228 09:14:58.342215   17806 out.go:177]     ▪ Using image k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.0
    ▪ Using image k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.0
I0228 09:14:58.347683   17806 addons.go:419] installing /etc/kubernetes/addons/gcp-auth-ns.yaml
I0228 09:14:58.347694   17806 ssh_runner.go:362] scp memory --> /etc/kubernetes/addons/gcp-auth-ns.yaml (700 bytes)
I0228 09:14:58.365320   17806 addons.go:419] installing /etc/kubernetes/addons/gcp-auth-service.yaml
I0228 09:14:58.365335   17806 ssh_runner.go:362] scp memory --> /etc/kubernetes/addons/gcp-auth-service.yaml (788 bytes)
I0228 09:14:58.381300   17806 addons.go:419] installing /etc/kubernetes/addons/gcp-auth-webhook.yaml
I0228 09:14:58.381317   17806 ssh_runner.go:362] scp memory --> /etc/kubernetes/addons/gcp-auth-webhook.yaml (5389 bytes)
I0228 09:14:58.396525   17806 ssh_runner.go:195] Run: sudo KUBECONFIG=/var/lib/minikube/kubeconfig /var/lib/minikube/binaries/v1.25.3/kubectl apply -f /etc/kubernetes/addons/gcp-auth-ns.yaml -f /etc/kubernetes/addons/gcp-auth-service.yaml -f /etc/kubernetes/addons/gcp-auth-webhook.yaml

spowelljr commented 1 year ago

The command that's hanging is set up to retry, but the timeout is 2 minutes, which is likely much too long. I'm thinking we should lower the timeout, and then this may be resolved on retry.

spowelljr commented 1 year ago

@bastiankistner Did you create the cluster with an older version of minikube and then update the minikube binary afterwards? Also, is this a consistent issue? I.e., if you cancel the command and try again, does it still fail?

spowelljr commented 1 year ago

This is most likely an infinite retry without a timeout in refreshExistingPods, which would explain why we only ever see it with --refresh. Will fix this up in a bit.
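To illustrate the difference between an unbounded retry and one with a deadline, here is a generic shell sketch. It is not minikube's Go code and makes no claim about how refreshExistingPods is written; the function name retry_with_deadline and its parameters are hypothetical.

```shell
# Hypothetical helper: retry a command until it succeeds or a deadline passes.
# Usage: retry_with_deadline <max_seconds> <pause_seconds> <command...>
retry_with_deadline() {
  local max="$1" pause="$2"; shift 2
  local start now attempt=0
  start=$(date +%s)
  while :; do
    attempt=$((attempt + 1))
    if "$@"; then
      return 0
    fi
    now=$(date +%s)
    # Without this check the loop would spin forever on a persistent failure.
    if [ $((now - start)) -ge "$max" ]; then
      echo "giving up after $attempt attempts (${max}s limit reached)" >&2
      return 1
    fi
    echo "attempt $attempt failed, retrying in ${pause}s" >&2
    sleep "$pause"
  done
}
```

The deadline is only checked between attempts, so a single attempt that never returns would still hang; bounding each attempt needs a separate per-command timeout.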

bastiankistner commented 1 year ago

> @bastiankistner Did you create the cluster with an older version of minikube and then update the minikube binary afterwards? Also, is this a consistent issue? I.e., if you cancel the command and try again, does it still fail?

That might indeed be the case. Is it common to have issues when I upgrade the binary after the cluster was created?

My temporary workaround is the following:

kubectl --namespace=${NAMESPACE} create secret docker-registry europe-west1-docker-pkg-dev-pull-secret \
        --docker-server=https://europe-west1-docker.pkg.dev \
        --docker-username=oauth2accesstoken \
        --docker-password="$(gcloud auth print-access-token)" \
        --docker-email=a@b.com \
        --save-config \
        --dry-run=client -o yaml | kubectl apply -f -

But I assume the GOOGLE_APPLICATION_CREDENTIALS might also expire, so a working solution for both would be great.

It is a consistent issue. The refresh command has never succeeded so far. What does work, though, is disabling the addon and re-enabling it. This completes successfully.
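The disable-and-re-enable workaround can be scripted. A minimal sketch, assuming the default profile; the function name reset_gcp_auth and the MINIKUBE override variable are hypothetical and exist only so the logic can be tried without a cluster.

```shell
# Sketch only: reset_gcp_auth and the MINIKUBE override are assumptions.
MINIKUBE="${MINIKUBE:-minikube}"

reset_gcp_auth() {
  # Stop if disabling fails, so the addon is not re-enabled on top of a
  # half-removed state.
  $MINIKUBE addons disable gcp-auth || return 1
  $MINIKUBE addons enable gcp-auth
}
```

Calling reset_gcp_auth in place of the --refresh command returns a non-zero status if either step fails.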

spowelljr commented 1 year ago

> That might indeed be the case. Is it common to have issues when I upgrade the binary after the cluster was created?

If that were the issue, you should have seen an apply error like the following, which your log doesn't show:

apply failed, will retry: sudo KUBECONFIG=/var/lib/minikube/kubeconfig /var/lib/minikube/binaries/v1.21.2/kubectl apply -f /etc/kubernetes/addons/gcp-auth-ns.yaml -f /etc/kubernetes/addons/gcp-auth-service.yaml -f /etc/kubernetes/addons/gcp-auth-webhook.yaml: Process exited with status 1

I'm working on a branch and have the apply failure resolved along with another bug you're not experiencing. I'm then going to add a timeout to the infinite retry and add some logging to see if that's where you're experiencing the issue. I'll give you a link to the binary with the fixes once I have a PR up. What OS do you use so I can provide the correct binary to you?

bastiankistner commented 1 year ago

I'm running macOS m1 (darwin/arm64)

spowelljr commented 1 year ago

Here's a binary you can use. Make sure to run it with --alsologtostderr, as it has improved logging. Let me know the result.

https://github.com/kubernetes/minikube/releases/latest/download/minikube-darwin-arm64