project-codeflare / instascale

On-demand Kubernetes/OpenShift cluster scaling and aggregated resource provisioning
Apache License 2.0

Client side throttling #173

Open VanillaSpoon opened 1 year ago

VanillaSpoon commented 1 year ago

Describe the Bug

Instascale appears to be experiencing client-side throttling caused by concurrent requests to the Kubernetes API.

Upon setup the logs display:

I1109 12:21:14.184474 1 request.go:690] Waited for 1.046770548s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/nfd.openshift.io/v1alpha1?timeout=32s

I'm opening this issue to start a discussion on reasonable ways to overcome the client-side throttling.

Codeflare Stack Component Versions

Please specify the component versions in which you have encountered this bug.

Instascale

What Have You Already Tried to Debug the Issue?

Whilst looking into causes, I increased the Burst and QPS values for restconfig from their defaults (qps=20, burst=30) to 50 and 100. This resolved the throttling. However, client-side throttling is a protective measure that prevents the Kubernetes API server from being overwhelmed by too many requests in a short amount of time. QPS defines the number of queries per second the client can make before throttling is expected to happen, and Burst defines the maximum burst of throttle-free requests. Changing these values is therefore not a solution, but rather a means of finding where the throttling was occurring.

Screenshots, Console Output, Logs, etc.


I1109 12:21:14.184474       1 request.go:690] Waited for 1.046770548s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/nfd.openshift.io/v1alpha1?timeout=32s
2023-11-09T12:21:16Z    INFO    controller-runtime.metrics    Metrics server is starting to listen    {"addr": "127.0.0.1:8080"}
2023-11-09T12:21:16Z    INFO    setup    starting manager
2023-11-09T12:21:16Z    INFO    Starting server    {"kind": "health probe", "addr": ":8081"}
2023-11-09T12:21:16Z    INFO    Starting server    {"path": "/metrics", "kind": "metrics", "addr": "127.0.0.1:8080"}
I1109 12:21:16.841811       1 leaderelection.go:248] attempting to acquire leader lease instascale-system/03fb6faf.my.domain...
I1109 12:21:35.107635       1 leaderelection.go:258] successfully acquired lease instascale-system/03fb6faf.my.domain
