buildkite / buildkite-agent-scaler

📈A lambda for scaling an AutoScalingGroup based on Buildkite metrics

Scaling on Waiting has unexpected behavior #199

Open baderbuddy opened 1 week ago

baderbuddy commented 1 week ago

Because every job behind a block is considered waiting, turning on scaling based on waiting jobs can cause a lot of unexpected scaling. In our organization we use a lot of block steps for optional things like rollbacks that, most of the time, are never executed, so we have plenty of jobs stuck in a waiting state that never get cleaned up.

Adding a little more logic to exclude older blocked steps, or jobs behind a block, would make this option much more useful.

DrJosh9000 commented 1 day ago

Thanks @baderbuddy, that's a good point.

Currently the metrics endpoint only exposes a single "waiting" metric, which includes all kinds of waiting jobs. But I think that even if we did break the metric down into different categories (such as behind a block vs not), there might not be a common rule that we could apply to all kinds of pipelines.