fluxcd / helm-controller

The GitOps Toolkit Helm reconciler, for declarative Helming
https://fluxcd.io
Apache License 2.0

Lots of client-side throttling on clusters with Azure Service Operator #567

Closed masterphenix closed 2 weeks ago

masterphenix commented 1 year ago

Hello, we are seeing a huge number of client-side throttling errors in the Helm Controller on a cluster that has 150 CRDs, most of them deployed by Azure Service Operator (an equivalent of Crossplane, but specifically for Azure). The controller is throttled so heavily that reconciling HelmReleases fails with this error:

✗ client rate limiter Wait returned an error: context deadline exceeded

Example of client-side throttling errors:

I1124 09:38:12.606756       7 request.go:682] Waited for 1.014627573s due to client-side throttling, not priority and fairness, request: GET:https://10.200.0.1:443/apis/flowcontrol.apiserver.k8s.io/v1beta2?timeout=32s
I1124 09:38:23.277039       7 request.go:682] Waited for 1.014878006s due to client-side throttling, not priority and fairness, request: GET:https://10.200.0.1:443/apis/signalrservice.azure.com/v1alpha1api20211001?timeout=32s
I1124 09:38:33.700502       7 request.go:682] Waited for 1.004668059s due to client-side throttling, not priority and fairness, request: GET:https://10.200.0.1:443/apis/rabbitmq.com/v1alpha1?timeout=32s
I1124 09:38:43.937738       7 request.go:682] Waited for 1.007247339s due to client-side throttling, not priority and fairness, request: GET:https://10.200.0.1:443/apis/dbforpostgresql.azure.com/v1alpha1api20210601storage?timeout=32s
I1124 09:38:54.428014       7 request.go:682] Waited for 1.015669903s due to client-side throttling, not priority and fairness, request: GET:https://10.200.0.1:443/apis/authorization.azure.com/v1beta20200801preview?timeout=32s
I1124 09:39:04.657925       7 request.go:682] Waited for 1.000940206s due to client-side throttling, not priority and fairness, request: GET:https://10.200.0.1:443/apis/policy/v1beta1?timeout=32s
I1124 09:39:15.203590       7 request.go:682] Waited for 1.009544821s due to client-side throttling, not priority and fairness, request: GET:https://10.200.0.1:443/apis/compute.azure.com/v1beta20201201storage?timeout=32s
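
To see which API groups the throttled requests are hitting, log lines like the ones above can be tallied with a small script. This is only a sketch: the regular expression assumes the message format shown in these logs with waits reported in seconds, and the name `summarize_throttling` is made up for illustration.

```python
import re
from collections import Counter

# Assumed format, taken from the log lines above:
#   "Waited for <seconds>s due to client-side throttling, ...,
#    request: GET:https://<host>/apis/<group>/<version>?timeout=32s"
# Waits reported in other units (e.g. "500ms") are not matched.
THROTTLE_RE = re.compile(
    r"Waited for (?P<wait>[\d.]+)s due to client-side throttling.*"
    r"request: (?P<verb>[A-Z]+):https?://[^/]+/apis/(?P<gv>[^?\s]+)"
)


def summarize_throttling(lines):
    """Return (Counter of throttled requests per group/version, total seconds waited)."""
    counts = Counter()
    total_wait = 0.0
    for line in lines:
        match = THROTTLE_RE.search(line)
        if match is None:
            continue  # not a throttling message in the assumed format
        counts[match.group("gv")] += 1
        total_wait += float(match.group("wait"))
    return counts, total_wait
```

Feeding the controller log through `summarize_throttling` and printing `counts.most_common()` would show whether the waits are spread across every group/version on the cluster, as they appear to be in the excerpt above, or concentrated on a few.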

Version of the controller is v0.26.0

We do not encounter this behavior on clusters with fewer CRDs (without Azure Service Operator).
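
The throttled requests in the logs are all GETs to `/apis/<group>/<version>`, which looks like one discovery call per API group/version. A rough way to see why the CRD count matters is to model the client-side limiter as a token bucket and queue one request per group/version. This is a simplified sketch, not the controller's actual code: the function name is hypothetical, the `qps=5` and `burst=10` values are illustrative placeholders rather than helm-controller's real settings, and all requests are assumed to be queued at once with no network latency.

```python
def throttle_waits(num_requests, qps, burst):
    """Seconds each request waits under a token-bucket limiter.

    Simplifying assumptions: the bucket starts full with `burst` tokens,
    refills at `qps` tokens per second, and all requests are queued at
    time zero.
    """
    waits = []
    for i in range(num_requests):
        tokens_short = i + 1 - burst  # tokens that must be refilled first
        waits.append(max(0.0, tokens_short / qps))
    return waits


# Illustrative numbers only: 150 group/versions versus 20.
large_cluster = throttle_waits(150, qps=5.0, burst=10)
small_cluster = throttle_waits(20, qps=5.0, burst=10)
print(large_cluster[-1], small_cluster[-1])
```

Under these assumptions the last request waits in proportion to the number of group/versions beyond the burst size, so a single discovery pass can take long enough to run into a reconcile deadline on a cluster with many CRDs while going unnoticed on a small one.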

angelbarrera92 commented 1 year ago

Same here. In my case it happens with the Crossplane CRDs, on the same controller version (0.26.0). The same Flux helm-controller configuration works fine without the Crossplane CRDs.

It causes issues on the helm-controller pods: the controller stops and cannot run for more than a few seconds.

Why does this controller need to query CRDs other than the Helm-related ones?