Open Pwntus opened 4 years ago
I just started having the same issue. Also running AKS.
Seems to be an issue with Golang and HTTP/2.
Related issues: https://github.com/jcmoraisjr/haproxy-ingress/issues/467 https://github.com/kubernetes/kubernetes/issues/49740
I will try to build this repo again with updated k8s Go-client.
Did you guys find a workaround? Seeing the same issue and need to constantly recreate the registry-creds pod.
No, I did not manage to upgrade Golang in this repo. I'm still getting this issue but it happens very irregularly, so I very much suspect it's a race condition in HTTP/2. Temporary fix is to recreate the deployment, but you will never know when it fails.
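For anyone who wants to test the HTTP/2 theory, here is a minimal sketch of a Go HTTP client that never negotiates HTTP/2. It uses only the standard library: a non-nil, empty `TLSNextProto` map is the switch that `net/http` documents for this. The function name `newHTTP1OnlyTransport` is made up for illustration, and wiring such a transport into the vendored client-go is not shown here, since that depends on the client-go version in this repo.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"time"
)

// newHTTP1OnlyTransport (hypothetical helper) returns a transport that will
// not upgrade connections to HTTP/2. net/http treats a non-nil, empty
// TLSNextProto map as "HTTP/2 disabled" for clients.
func newHTTP1OnlyTransport() *http.Transport {
	return &http.Transport{
		Proxy:           http.ProxyFromEnvironment,
		TLSNextProto:    map[string]func(string, *tls.Conn) http.RoundTripper{},
		IdleConnTimeout: 90 * time.Second,
	}
}

func main() {
	tr := newHTTP1OnlyTransport()
	client := &http.Client{Transport: tr, Timeout: 30 * time.Second}

	// Report the configuration instead of calling a real API server.
	disabled := tr.TLSNextProto != nil && len(tr.TLSNextProto) == 0
	fmt.Println("HTTP/2 disabled:", disabled)
	fmt.Println("client timeout:", client.Timeout)
}
```

If the failures stop with HTTP/2 turned off, that would support the race-condition suspicion. It would not prove it.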
@Pwntus I have a fork of registry-creds built against the latest version of the Kubernetes golang SDK. All tests seem to pass but I have not yet deployed the container to a Kubernetes cluster and observed its behavior to determine if the original issue has been addressed. I can submit a PR if you think it will be useful.
@akshayks Thanks, I will try your version asap and report back if I experience the same issue.
Hey everyone, I can help get this repo updated if you would like.
@stevesloka That would be wonderful!
@Pwntus I forked this repo and pushed changes to the client-go-v0.19.3 branch. I will hopefully get around to testing it myself sometime this week. In the meantime, let me know how your testing proceeds.
Same issue on two almost identical Rancher clusters:
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:41:49Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
registry-creds: Image: upmcenterprises/registry-creds:1.10
Rescheduling the registry-creds pod resolves the issue for a while.
@Pwntus Did you have a chance to test out my fork yet?
Any update on this topic? For now we have to set up a CronJob to restart registry-creds every 12h.
This is something that happens sometimes, and we have set up a liveness probe that looks at the logs. If it works with the updated SDK, can we push it to master?
@stevesloka Do you plan to fix this issue?
Hi, I'm running this on AKS and it works perfectly for a day.
After that it will silently fail and when I view the pod-log I see a continuous list of the following messages each second:
It appears to originate from the k8s Go-client, where a cached connection is missing. Could it be a bug in the client or is it maybe something happening on the AKS level?
Specifically here: https://github.com/upmc-enterprises/registry-creds/blob/master/vendor/k8s.io/client-go/tools/cache/reflector.go#L198
Maybe the Go-client should be updated?
Thankful for any suggestions!