madecoste / swarming

Automatically exported from code.google.com/p/swarming
Apache License 2.0

Use Google Compute Engine Autoscaler and Instance Group Manager API #189

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Goal:
Reduce the number of instances during low-usage periods and increase it 
dynamically during high-usage periods, using Autoscaler.
https://cloud.google.com/compute/docs/autoscaler/

I propose the following instance-specific tuning parameters:
- A: minimum number of persistent instances.
- B: maximum number of dynamic instances.
- C: minimum number of idle bots at any time.
- D: maximum number of idle bots at any time.
- E: predefined time from requesting a new VM to the time it becomes available.
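
To make the interplay of A through D concrete, here is a minimal sketch of the sizing decision. It is not the proposed implementation. It assumes the total instance count is capped at A + B, that `current_size` already includes VMs that were requested but are not up yet, and that those pending VMs (still within E) count as upcoming idle capacity. The names `ScalingParams`, `target_size` and `pending_vms` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingParams:
    """Hypothetical container for the tuning parameters A-D."""
    min_persistent: int  # A: instances that are always kept
    max_dynamic: int     # B: extra instances allowed on top of A (assumption)
    min_idle: int        # C: fewest idle bots tolerated
    max_idle: int        # D: most idle bots tolerated


def target_size(params, current_size, idle_bots, pending_vms=0):
    """Returns the desired total number of instances.

    current_size is assumed to include pending_vms, i.e. VMs that were
    requested less than E ago and have not shown up as bots yet.
    """
    # Pending VMs will become idle bots soon, so count them when deciding
    # whether to grow. This avoids requesting the same capacity twice.
    effective_idle = idle_bots + pending_vms
    if effective_idle < params.min_idle:
        desired = current_size + (params.min_idle - effective_idle)
    elif idle_bots > params.max_idle:
        desired = current_size - (idle_bots - params.max_idle)
    else:
        desired = current_size
    # Clamp to [A, A + B].
    low = params.min_persistent
    high = params.min_persistent + params.max_dynamic
    return max(low, min(high, desired))
```

For example, with A=2, B=10, C=1 and D=3, a group of 5 instances with no idle bot grows to 6, and the same group with 6 idle bots shrinks to 2.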

Action items:
- MVP is to create a cronjob which calls the Autoscaler API[1] to either add or 
remove instances. Hardcode values [A-E] for now to get it out faster.
- Create a feedback loop based on BotEvents to determine the E value.
- Create support for autoscaling per OS. The initial value would be hardcoded 
to Ubuntu-12.04; eventually it would support Ubuntu-12.04, Ubuntu-14.04 and 
Windows-2008ServerR2-SP1.
- Add ability to edit GlobalConfig to stop using hardcoded values.
- Eventually switch from cronjob to taskqueue to get sub-minute adjustment.

[1] https://cloud.google.com/compute/docs/autoscaler/v1beta2/
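
One possible shape for the E feedback loop, as a rough sketch only: take the observed delays between requesting a VM and its bot first showing up, and use a high percentile as E. How those delays would be extracted from BotEvents is not specified here, so the input is just a list of seconds. The function name, the 90th percentile and the 300 second fallback are all assumptions.

```python
import math


def estimate_provision_seconds(durations, percentile=0.9, default=300.0):
    """Estimates E from observed request-to-available delays, in seconds.

    Uses the nearest-rank percentile so the result is always one of the
    observed values. Falls back to `default` (an arbitrary placeholder)
    when there are no observations yet.
    """
    if not durations:
        return default
    ordered = sorted(durations)
    rank = max(1, math.ceil(percentile * len(ordered)))
    return ordered[rank - 1]
```

A high percentile rather than the mean keeps a few slow boots from causing premature extra requests.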

Original issue reported on code.google.com by maruel@chromium.org on 10 Dec 2014 at 7:16

GoogleCodeExporter commented 9 years ago
Are you planning to put it in the Swarming service? The LUCI design has a 
separate component that manages machine allocation.

Original comment by vadimsh@chromium.org on 10 Dec 2014 at 7:40

GoogleCodeExporter commented 9 years ago
Due to issues with the autoscaler, I'll have to use the instance group manager 
API instead. Rationale will be in the CL. In short, with the autoscaler, you 
don't control which VMs get killed, and they are forcibly killed even if a task 
is running on them. The IGM API looks applicable as-is and gives the necessary 
control to decide exactly which VMs get killed. The important APIs are 
"deleteInstances" to remove instances and "resize" to add instances.
https://cloud.google.com/compute/docs/instance-groups/manager/v1beta2/
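
A minimal sketch of how the two calls could be chosen, assuming the service knows which instances are currently idle. It makes no real API request: it only returns which IGM method ("resize" or "deleteInstances", as named above) would be called and with what argument. The function name and the tuple format are hypothetical.

```python
def plan_igm_calls(current_size, desired_size, idle_instances):
    """Returns a list of (method_name, argument) tuples describing IGM calls.

    Growing uses "resize" with the new total size. Shrinking uses
    "deleteInstances" with an explicit list of idle instance names, so an
    instance that is running a task is never selected.
    """
    if desired_size > current_size:
        return [("resize", desired_size)]
    if desired_size < current_size:
        surplus = current_size - desired_size
        # Sort for a deterministic choice; only idle instances are eligible.
        victims = sorted(idle_instances)[:surplus]
        if victims:
            return [("deleteInstances", victims)]
    return []
```

If fewer idle instances exist than the surplus, the plan deletes only the idle ones and leaves the rest for a later pass.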

Original comment by maruel@chromium.org on 16 Jan 2015 at 4:33

GoogleCodeExporter commented 9 years ago

Original comment by maruel@chromium.org on 16 Jan 2015 at 4:39

GoogleCodeExporter commented 9 years ago
After all, we'll be able to use both.

Original comment by maruel@chromium.org on 29 Jan 2015 at 3:34