upmc-enterprises / registry-creds

Allow for AWS ECR, Google Registry, & Azure Container Registry credentials to be refreshed inside your Kubernetes cluster via ImagePullSecrets

client-go: No cached connection was available #92

Open Pwntus opened 4 years ago

Pwntus commented 4 years ago

Hi, I'm running this on AKS and it works perfectly for a day.

After that it fails silently, and when I view the pod log I see the following message repeated every second:

ERROR: logging before flag.Parse: E0414 12:25:24.718421       1 reflector.go:199] github.com/upmc-enterprises/registry-creds/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Namespace: Get https://<redacted>.hcp.northeurope.azmk8s.io:443/api/v1/namespaces?resourceVersion=0: http2: no cached connection was available

It appears to originate from the k8s Go-client, where a cached connection is missing. Could it be a bug in the client or is it maybe something happening on the AKS level?

Specifically here: https://github.com/upmc-enterprises/registry-creds/blob/master/vendor/k8s.io/client-go/tools/cache/reflector.go#L198

Maybe the Go-client should be updated?

Thankful for any suggestions!

deini commented 4 years ago

I just started having the same issue. Also running AKS.

Pwntus commented 4 years ago

Seems to be an issue with Golang and HTTP/2.

Related issues: https://github.com/jcmoraisjr/haproxy-ingress/issues/467 https://github.com/kubernetes/kubernetes/issues/49740

I will try to build this repo again with updated k8s Go-client.
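For readers who want to test the HTTP/2 theory directly, one option is to force a Go HTTP client onto HTTP/1.1. The sketch below relies only on the documented `net/http` behaviour that a non-nil, empty `TLSNextProto` map disables HTTP/2 on a transport. The function name `newHTTP1OnlyTransport` is hypothetical, nothing here is taken from registry-creds or client-go, and whether this removes the "no cached connection" error on AKS is not confirmed anywhere in this thread.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
)

// newHTTP1OnlyTransport (hypothetical helper) copies the default transport
// and disables its automatic HTTP/2 support. Per the net/http docs, a
// non-nil, empty TLSNextProto map is the documented way to do this.
func newHTTP1OnlyTransport() *http.Transport {
	tr := http.DefaultTransport.(*http.Transport).Clone()
	tr.ForceAttemptHTTP2 = false
	tr.TLSNextProto = map[string]func(authority string, c *tls.Conn) http.RoundTripper{}
	return tr
}

func main() {
	client := &http.Client{Transport: newHTTP1OnlyTransport()}
	tr := client.Transport.(*http.Transport)
	fmt.Println("HTTP/2 disabled:", tr.TLSNextProto != nil && len(tr.TLSNextProto) == 0)
}
```

How such a transport would be wired into the Kubernetes client used by this repo depends on the vendored client-go version and is left out of the sketch.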

donifer commented 4 years ago

Did you guys find a workaround? Seeing the same issue and need to constantly recreate the registry-creds pod.

Pwntus commented 4 years ago

No, I did not manage to upgrade Golang in this repo. I'm still getting this issue, but it happens very irregularly, so I strongly suspect a race condition in HTTP/2. The temporary fix is to recreate the deployment, but there is no way to know when it has failed.
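Because the failure is silent, one way to make it visible is to count consecutive failed calls inside the process and react once a limit is reached, for example by exiting so that Kubernetes restarts the container. The following is a minimal sketch of that idea, not something registry-creds does today; `failureWatchdog`, `Observe`, and `errNoCachedConn` are hypothetical names introduced for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoCachedConn stands in for the error text reported in this issue.
var errNoCachedConn = errors.New("http2: no cached connection was available")

// failureWatchdog (hypothetical) counts consecutive failures of a
// periodic operation such as listing namespaces.
type failureWatchdog struct {
	limit       int // consecutive failures tolerated before tripping
	consecutive int // failures seen since the last success
}

// Observe records one result and reports whether the limit was reached.
// Any success resets the counter.
func (w *failureWatchdog) Observe(err error) bool {
	if err == nil {
		w.consecutive = 0
		return false
	}
	w.consecutive++
	return w.consecutive >= w.limit
}

func main() {
	w := &failureWatchdog{limit: 3}
	results := []error{nil, errNoCachedConn, errNoCachedConn, errNoCachedConn}
	for i, err := range results {
		if w.Observe(err) {
			// A real process might call os.Exit(1) here instead of printing.
			fmt.Printf("limit reached on call %d\n", i+1)
			return
		}
	}
}
```

Resetting on success keeps a single transient error from triggering a restart; only an unbroken run of failures trips the watchdog.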

akshayks commented 4 years ago

@Pwntus I have a fork of registry-creds built against the latest version of the Kubernetes golang SDK. All tests seem to pass but I have not yet deployed the container to a Kubernetes cluster and observed its behavior to determine if the original issue has been addressed. I can submit a PR if you think it will be useful.

Pwntus commented 4 years ago

@akshayks Thanks, I will try your version asap and report back if I experience the same issue.

stevesloka commented 4 years ago

Hey everyone, I can help get this repo updated if you would like.

Pwntus commented 4 years ago

@stevesloka That would be wonderful!

akshayks commented 4 years ago

@Pwntus I forked this repo and pushed changes to the client-go-v0.19.3 branch. I will hopefully get around to testing it myself sometime this week. In the meantime, let me know how your testing proceeds.

jfdumont commented 3 years ago

Same issue on two almost identical Rancher clusters:

Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:41:49Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}

registry-creds: Image: upmcenterprises/registry-creds:1.10

Rescheduling the registry-creds pod resolves the issue for a while.

akshayks commented 3 years ago

@Pwntus Did you have a chance to test out my fork yet?

Gegonz commented 3 years ago

Any update on this topic? For now we have had to set up a CronJob to restart registry-creds every 12 hours.

Keralin commented 3 years ago

This happens to us from time to time, so we have set up a liveness probe that checks the logs. If it works with the updated SDK, can we push it to master?
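The log-based check mentioned above could look roughly like the sketch below: it reports a failure when the most recent log lines all contain the error string from this issue. The thread does not say how the probe obtains the log text, so that part is omitted; `trailingFailures`, `unhealthy`, and the threshold value are hypothetical, and the sample lines in `main` are abbreviated from the message quoted at the top of the issue.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// staleConnMarker is the error text reported in this issue.
const staleConnMarker = "http2: no cached connection was available"

// trailingFailures returns how many consecutive non-blank lines at the
// end of r contain the marker.
func trailingFailures(r io.Reader) (int, error) {
	n := 0
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024) // tolerate long log lines
	for sc.Scan() {
		line := sc.Text()
		if strings.TrimSpace(line) == "" {
			continue
		}
		if strings.Contains(line, staleConnMarker) {
			n++
		} else {
			n = 0 // any other line breaks the run
		}
	}
	return n, sc.Err()
}

// unhealthy reports whether the log text ends with at least threshold
// consecutive failure lines.
func unhealthy(logText string, threshold int) bool {
	n, err := trailingFailures(strings.NewReader(logText))
	return err == nil && n >= threshold
}

func main() {
	sample := "reflector.go:199] Failed to list *v1.Namespace: " + staleConnMarker + "\n"
	logText := "startup complete\n" + sample + sample + sample
	// A probe binary would exit non-zero when this is true.
	fmt.Println("unhealthy:", unhealthy(logText, 3))
}
```

Counting only the trailing run means an old burst of errors followed by normal log lines does not mark the pod as failed.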

malcolm061990 commented 1 year ago

@stevesloka Do you plan to fix this issue?