woodpecker-ci / autoscaler

Scale your woodpecker agents automatically to the moon and back based on the current load.
Apache License 2.0

Specify agent delete grace period #73

Closed anbraten closed 2 months ago

anbraten commented 8 months ago

Currently, unused agents are deleted as soon as the reconciliation loop runs again. It could be helpful to have a grace period before removing agents instead.

xoxys commented 7 months ago

This is exactly what WOODPECKER_AGENT_ALLOWED_STARTUP_TIME does, but since you renamed it from WOODPECKER_MIN_AGE for some reason in https://github.com/woodpecker-ci/autoscaler/pull/5/commits/51e63152b4e8ff10c081350b9a440321bef6b4c3, the variable name is now misleading...

anbraten commented 7 months ago

Not really. That setting only makes sure an agent stays alive for at least x minutes. Imagine an agent that has been actively running tasks for 1 hour. Afterwards it has nothing to do for a few seconds and gets removed immediately, instead of waiting x more minutes in which it might receive more tasks. So the rule has to be: last task done + x minutes.

OrvilleQ commented 4 months ago

I really hope this feature will be implemented in the near future.

I also had a problem with the autoscaler deleting agents so fast that if I push a new commit and Woodpecker cancels the last workflow, I have to wait another 4 or 5 minutes for a new agent to be created. This is really annoying.

OrvilleQ commented 4 months ago

Also, in order to achieve as much cost optimization as possible, there should probably be a smarter set of agent removal rules.

Take Hetzner, for example. If I understand correctly, they calculate the cost of VPS and IPs on an hourly basis: if a resource is used for less than an hour, it is still billed as a full hour.

For maximum cost optimization, it might be a good idea to have an idle window and a deletion window. When an agent is created, it enters an idle window of, say, 58 minutes. The agent should not be deleted during this time, even if no CI/CD workflow is running, because the service provider will still charge for the full hour even if the agent is removed early. After the idle window ends, the agent enters a deletion window of, e.g., 2 minutes. During this time the agent should be deleted if it meets the removal conditions; if it is still running workflows, it should move on to the next idle window.

anbraten commented 4 months ago

So we somehow need the following options, right?: