wbuchwalter / Kubernetes-acs-engine-autoscaler

[Deprecated] Node-level autoscaler for Kubernetes clusters created with acs-engine.
Other
71 stars 22 forks source link

Autoscaler does not respect resource limits #52

Closed jackzampolin closed 7 years ago

jackzampolin commented 7 years ago

The autoscaler fails when running up against resource limits and gets into a state where it doesn't scale at all!

2017-09-01 22:27:47,081 - autoscaler.scaler - INFO - ====Scaling for 40 pods ====
2017-09-01 22:27:47,085 - autoscaler.scaler - DEBUG - units_needed: 10
2017-09-01 22:27:47,085 - autoscaler.scaler - DEBUG - units_requested: 10
2017-09-01 22:27:47,085 - autoscaler.scaler - DEBUG - k8sagent actual capacity: 1 , units requested: 10
2017-09-01 22:27:47,085 - autoscaler.scaler - INFO - New capacity requested for pool k8sagent: 11 agents (current capacity: 1 agents)
2017-09-01 22:27:47,086 - autoscaler.scaler - DEBUG - remaining pending: 0
2017-09-01 22:27:47,103 - autoscaler.engine_scaler - INFO - Deployment autoscaler-deployment-2453c506 started...
2017-09-01 22:27:48,280 - autoscaler.cluster - ERROR - Azure Error: InvalidTemplateDeployment
Message: The template deployment 'autoscaler-deployment-2453c506' is not valid according to the validation procedure. The tracking id is '2aead6c9-8a9d-4e0c-93f9-94617821a646'. See inner errors for details. Please see https://aka.ms/arm-deploy for usage details.
Exception Details:
    Error Code: QuotaExceeded
    Message: Operation results in exceeding quota limits of Core. Maximum allowed: 20, Current in use: 6, Additional requested: 40.
    Target: None
2017-09-01 22:27:48,282 - autoscaler - WARNING - backoff: 120
jackzampolin commented 7 years ago

I've found a pretty good way around this. After increasing the resource limits on Azure, I also decreased the sleeptime to 15 seconds so the autoscaler requests nodes in smaller batches. I'm going to go ahead and close this.