max-allan-surevine opened 3 years ago
@max-allan-surevine Do you mind showing the command you are running with the flags?
Oops! Yes, how did I miss that, will edit! It was on the same line as my triple quote so got swallowed by the markdown.
Ok, some things I noticed. You are using a timeout of 10 seconds (`--timeout 10s`). Do you want this to be longer? Also, can you try passing the `--wait` flag?
I set the 10s timeout so that it should time out before the 11-second wait, to highlight the fact that it is not respecting the timeout value I set. I would actually want it to be higher, but setting it to less than the 11s message highlights that it is using neither the 5m default nor the 10s supplied value.
```
[master] $ helm delete files --timeout 10s --wait
Error: unknown flag: --wait
[master] $ helm delete files --timeout 10s
I0616 10:59:21.444921 41729 request.go:668] Waited for 1.176294145s due to client-side throttling, not priority and fairness, request: GET:https://api.local:443/apis/pipelines.openshift.io/v1alpha1?timeout=32s
I0616 10:59:31.446800 41729 request.go:668] Waited for 11.177602333s due to client-side throttling, not priority and fairness, request: GET:https://api.local:443/apis/monitoring.coreos.com/v1?timeout=32s
release "files" uninstalled
[master] $ helm install files --timeout 10s --wait -f ../files.yaml chart
I0616 11:00:04.039816 41786 request.go:668] Waited for 1.167701664s due to client-side throttling, not priority and fairness, request: GET:https://api.local:443/apis/workspace.devfile.io/v1alpha1?timeout=32s
I0616 11:00:14.238909 41786 request.go:668] Waited for 11.366030019s due to client-side throttling, not priority and fairness, request: GET:https://api.local:443/apis/caching.internal.knative.dev/v1alpha1?timeout=32s
Error: timed out waiting for the condition
[master] $ helm install files --timeout 10s --wait -f ../files.yaml chart
Error: cannot re-use a name that is still in use
```
The "Error: timed out" happens after about 30s. Not the default 5m0s that "--timeout" is set to according to the docs and not the 10s I set on the CLI. With a 10s timeout, I should never see the "waited for 11s" message. Right?
And now I have a deployment which is in who knows what state? Clearly something timed out and failed, but something successfully completed. It didn't wait for 5 minutes or 10secs. If it did wait for 5mins, this error probably wouldn't happen.
Hence the title of the bug : Cannot change the timeout on API calls Whatever I set on the CLI , it always uses 32s.
```
[master] $ helm delete --timeout 5m0s files
I0616 11:10:10.950751 42031 request.go:668] Waited for 1.153073128s due to client-side throttling, not priority and fairness, request: GET:https://api.local:443/apis/jenkins.io/v1alpha3?timeout=32s
I0616 11:10:21.150205 42031 request.go:668] Waited for 11.352028467s due to client-side throttling, not priority and fairness, request: GET:https://api.local:443/apis/planetscale.com/v1alpha1?timeout=32s
release "files" uninstalled
```
Still ends each API call with `?timeout=32s`.
> Still ends each API call with `?timeout=32s`.

This is the timeout for individual requests, which I'd expect `client-go` to retry. This timeout is also configured when creating the REST client from kubeconfig. If `--timeout 10s` is given, I'd expect the context for the request to be cancelled, so the error message you get should be different.

Also, given that the release has been uninstalled, the message seems to be only a warning, right?

This issue seems like a feature request to be able to configure this default: https://github.com/soltysh/kubernetes/blob/7bd48a7e2325381cb777d0ea1ff89b2ecece23b6/staging/src/k8s.io/client-go/discovery/discovery_client.go#L51
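For reference, client-go applies that 32s discovery default only when no timeout is already set on the config, so setting `rest.Config.Timeout` before building clients avoids it. A minimal sketch, not Helm code; the kubeconfig path and the 2-minute value are illustrative:

```go
package main

import (
	"fmt"
	"time"

	"k8s.io/client-go/discovery"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a rest.Config from the local kubeconfig (path is illustrative).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}

	// The discovery client applies its hard-coded 32s default only when
	// config.Timeout is zero, so setting it here overrides the
	// "?timeout=32s" seen in the request URLs above.
	config.Timeout = 2 * time.Minute

	dc, err := discovery.NewDiscoveryClientForConfig(config)
	if err != nil {
		panic(err)
	}

	groups, err := dc.ServerGroups()
	if err != nil {
		panic(err)
	}
	fmt.Printf("discovered %d API groups\n", len(groups.Groups))
}
```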
From the help for install: `--timeout duration   time to wait for any individual Kubernetes operation (like Jobs for hooks) (default 5m0s)`
Is creating an object like a secret or a deployment or ...whatever it is doing... not an "individual operation"? What is an individual Kubernetes operation?

Going by the documentation of `--timeout`, this is not a feature request. It is at least a bug in the documentation of what the timeout actually means. But I'd prefer it if someone fixed the timeout rather than re-documenting it.
Yes, it is a warning, but sometimes, if the cluster or network is slow, it becomes an error: "Error: timed out waiting for the condition". And if the install is slow to complete, then the rollback operations can be slow too, sometimes exceeding the 32s timeout, so the rollback fails to complete successfully and leaves a mess.
@max-allan-surevine good points. I think the documentation for `--timeout` could also be clarified then. Looking briefly at the code, it seems `Timeout` is only used for executing hooks if you don't specify `--wait`? I think improving the documentation should be treated as a separate issue from the timeouts I mentioned before.
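For anyone tracing this from the SDK side, `--wait` and `--timeout` map onto fields of the install action. A small sketch using the Helm v3 Go SDK (values are illustrative; this shows the SDK surface, not the internal code path discussed above):

```go
package sketch

import (
	"time"

	"helm.sh/helm/v3/pkg/action"
)

// newInstall mirrors the CLI flags on the SDK's install action:
// client.Wait corresponds to --wait and client.Timeout to --timeout.
// Note that neither field changes the per-request 32s discovery
// timeout discussed in this thread.
func newInstall(cfg *action.Configuration) *action.Install {
	client := action.NewInstall(cfg)
	client.ReleaseName = "files" // name taken from the examples above
	client.Wait = true
	client.Timeout = 10 * time.Minute
	return client
}
```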
This issue has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.
Not stale please
I'm also running into issues with this when installing large helm charts due to our VPN. Being able to set a timeout or throttle concurrent calls would be extremely helpful.
A good example is this chart which installs many sub charts: https://github.com/newrelic/helm-charts/tree/master/charts/nri-bundle
This issue has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.
This is still a problem.
The solution is as simple as 2×2: add a new command-line argument like `--api-server-timeout` for helm and pass its value to the client-go library.
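A hedged sketch of what that wiring could look like (the flag name, helper names, and plumbing are all assumptions, not Helm's actual code):

```go
package sketch

import (
	"time"

	"github.com/spf13/pflag"
	"k8s.io/client-go/rest"
)

// addAPIServerTimeoutFlag registers the proposed flag; the 32s default
// matches client-go's current hard-coded discovery timeout.
func addAPIServerTimeoutFlag(fs *pflag.FlagSet) *time.Duration {
	return fs.Duration("api-server-timeout", 32*time.Second,
		"timeout for individual Kubernetes API requests")
}

// applyAPIServerTimeout copies the flag value onto the rest.Config before
// any clients are built from it, so the discovery client's default (which
// only applies when Timeout is zero) never kicks in.
func applyAPIServerTimeout(cfg *rest.Config, timeout time.Duration) {
	cfg.Timeout = timeout
}
```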
This issue has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.
Still relevant
This issue has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.
This is still a problem.
Still a problem.
Can someone suggest a workaround please? Retries aren't helping us, as we have a VPN between our on-prem network and the cloud VNet which can become choked for many hours.
Maybe run helm from a pod or VM that doesn't cross a VPN?
This issue has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.
Since it's been a while since my suggestion and there's been no further conversation about this, I'm going to go ahead and close it.
We are still facing this problem on clusters with 100+ CRDs.
Same here: random timeouts. Would love the option to change the API call timeout.
Please support this.
@joejulian, could we reopen this? We are running on microk8s directly against the host. The `/openapi/v3` endpoints can take >30 seconds to return the schema with a large number of CRDs on the cluster.

I don't think we can address https://github.com/hashicorp/terraform-provider-helm/issues/1156 either, until this is addressed.
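To put a number on that, one way to measure the raw `/openapi/v3` latency with client-go (a minimal sketch; the kubeconfig path is an assumption):

```go
package main

import (
	"context"
	"fmt"
	"time"

	"k8s.io/client-go/discovery"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	// Leave config.Timeout at zero so the discovery client applies its
	// 32s default, reproducing what helm does today.
	dc, err := discovery.NewDiscoveryClientForConfig(config)
	if err != nil {
		panic(err)
	}

	// Time a raw GET of /openapi/v3; on clusters with many CRDs this is
	// where the >30s responses show up.
	start := time.Now()
	body, err := dc.RESTClient().Get().AbsPath("/openapi/v3").DoRaw(context.Background())
	if err != nil {
		panic(err)
	}
	fmt.Printf("fetched %d bytes in %s\n", len(body), time.Since(start))
}
```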
Sure, done. 🙌
This issue has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.
Hi, this is a requested feature within our organization as well. Could someone take a look at the review above?
Thank you.
My organisation's OpenShift cluster has many CRDs and throttles the client connection (if I understand it correctly). Often, when busy, the throttling/performance is so bad that helm operations fail. I'd like to increase the timeout on the API calls, which looks like a `--timeout` setting. However, if I try to change the timeout (to a value lower than the typical throttle delay), it still appears to have a 32s timeout (and doesn't fail due to the request taking too long).
An example of a failure looks the same as above, but after the last "waited for" line I see:

(I use `--atomic` normally now because of this problem!)
I would like to be able to increase the timeout from 32s to a higher value. I know the API server is overloaded and would rather helm wait a few more seconds for it than have ME wait until 4 AM to deploy my helm chart when nobody else is around...
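As a side note, the "Waited for ... due to client-side throttling" messages come from client-go's client-side rate limiter rather than the server, and that limiter is configured on the same `rest.Config`. A sketch of loosening it (the values are illustrative assumptions, and helm does not currently expose these as flags):

```go
package sketch

import (
	"k8s.io/client-go/rest"
)

// loosenClientSideThrottling raises client-go's token-bucket rate limits,
// which produce the "Waited for ... due to client-side throttling" log
// lines when exceeded. Defaults are QPS=5 and Burst=10; the values below
// are illustrative.
func loosenClientSideThrottling(cfg *rest.Config) {
	cfg.QPS = 50
	cfg.Burst = 100
}
```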
Output of `helm version`:

Output of `kubectl version`: kubectl has been removed. There was a suggestion this issue was fixed in recent versions of the openshift client (oc).

Cloud Provider/Platform (AKS, GKE, Minikube etc.): OpenShift