atlassian / escalator

Escalator is a batch or job optimized horizontal autoscaler for Kubernetes
Apache License 2.0

[BUG] Large pods may be neglected if pod's requested resources do not trigger scale up #224

Closed hittingray closed 1 year ago

hittingray commented 1 year ago

Describe the bug

If a large pod is pending but its request does not push the cluster's total requested resources past the scale-up threshold, and no individual node has enough free resources to fit it, the pod may never be scheduled, or may take a long time to be scheduled.

To Reproduce

Consider the following scenario, where each node has 1000m of allocatable CPU and no pods are pending. The cluster state looks like this:

| Node Name | Sum of pod requested CPU |
|-----------|--------------------------|
| Node 1    | 850m                     |
| Node 2    | 850m                     |
| Node 3    | 850m                     |
| Node 4    | 850m                     |
| Node 5    | 850m                     |

In this state, total requested CPU across the cluster is 4250m of 5000m (85%). Suppose the scale-up threshold is 90% (4500m). Submitting a pod with a CPU request of 200m would raise total requested CPU to 4450m (89%), which is not enough to trigger a scale-up. Since no node has more than 150m of CPU free, the pod cannot be scheduled onto any node and may stay pending indefinitely, unless another pod is submitted that pushes the cluster over the threshold.

Expected behavior

Escalator recognises that, although the cluster is not over the scale-up threshold, there is a pending pod that cannot fit on any currently available node, and triggers a scale-up.

Screenshots or Logs

Kubernetes Cluster Version v1.24.12

Escalator Version v1.13.1 (or whatever version we are using internally)

Additional context