buildkite / agent-stack-k8s

Spin up an autoscaling stack of Buildkite Agents on Kubernetes
MIT License

Prevent node pool autoscaler from evicting jobs in progress #235

Closed: c2h5oh closed this 9 months ago

c2h5oh commented 10 months ago

I've noticed that the GKE node pool autoscaler will evict jobs in progress if the autoscaling policy is set to optimize-utilization. The default setting (balanced) doesn't have that problem.

Based on the GKE autoscaling policy docs, this can be prevented by adding the cluster-autoscaler.kubernetes.io/safe-to-evict: "false" annotation to the job pods.

This is not GKE-specific; the annotation is documented in the Kubernetes reference: https://kubernetes.io/docs/reference/labels-annotations-taints/#cluster-autoscaler-kubernetes-io-safe-to-evict
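
For illustration, here is a minimal sketch of where the annotation would go, assuming it is set on the pod template of a Kubernetes Job. The Job name, container name, and image are placeholders for this example and are not taken from what agent-stack-k8s actually generates.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: example-buildkite-job        # hypothetical name
spec:
  template:
    metadata:
      annotations:
        # Tells the cluster autoscaler not to evict this pod when scaling down.
        # The value must be the string "false", hence the quotes.
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
    spec:
      restartPolicy: Never
      containers:
        - name: agent                # hypothetical container name
          image: buildkite/agent     # placeholder image
```

The annotation has to end up on the pod itself, which is why this sketch puts it under spec.template.metadata rather than on the Job's own metadata.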