buildkite / elastic-ci-stack-for-aws

An auto-scaling cluster of build agents running in your own AWS VPC
https://buildkite.com/docs/quickstart/elastic-ci-stack-aws
MIT License
417 stars, 275 forks

Option to hibernate agents after a timeout #1348

Open gondalez opened 4 months ago

gondalez commented 4 months ago

Is your feature request related to a problem? Please describe.

Background

Using warm/always-on agents is great for improving build times for disk-cache heavy operations like fetching and building dependencies.

Warm agents are always-on agents with autoscaling disabled, so their disk cache persists between builds.

Problem

Keeping the agents always on adds cost, since the instances are billed even while idle.

Describe the solution you'd like

Use EC2 hibernation.

I imagine a variable like ScaleInIdlePeriod, where the instances in the group hibernate after an idle period.
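To make the idea concrete, here is a rough sketch of the idle check in Python. It is not how the stack works today. The function name `should_hibernate` and its arguments are hypothetical, and `idle_period` stands in for the ScaleInIdlePeriod-like variable. The hibernation itself is not performed here. Assuming the instance was launched with hibernation enabled, it would be a separate EC2 StopInstances request with the Hibernate flag set.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def should_hibernate(
    last_job_finished_at: datetime,
    running_jobs: int,
    idle_period: timedelta,
    now: Optional[datetime] = None,
) -> bool:
    """Return True when an agent instance has been idle long enough to hibernate.

    Hypothetical helper for illustration only:
    - last_job_finished_at: when the most recent job on this instance ended
    - running_jobs: number of jobs currently running on this instance
    - idle_period: the idle timeout (the ScaleInIdlePeriod-like variable)
    """
    if now is None:
        now = datetime.now(timezone.utc)

    # Never hibernate an instance that is still doing work.
    if running_jobs > 0:
        return False

    # Hibernate only once the idle time has reached the configured period.
    return now - last_job_finished_at >= idle_period


if __name__ == "__main__":
    finished = datetime.now(timezone.utc) - timedelta(minutes=15)
    if should_hibernate(finished, running_jobs=0, idle_period=timedelta(minutes=10)):
        # Assumption: this is where the instance would be stopped with
        # hibernation requested (EC2 StopInstances with Hibernate=true).
        print("instance is idle: request hibernation")
```

Something outside the instance would then need to start it again when a new build is scheduled, which is the part that has to fit in with autoscaling.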

For warm agents this seems great because the disk cache is maintained while the instance cost drops. Only storage costs would accumulate during hibernation.

Also, coming out of hibernation should be quick when a new build starts.

wolfeidau commented 3 months ago

@gondalez it would be great to see how this would work with autoscaling. We would be open to looking at an implementation in a PR of some sort, but it does seem fairly specialised.