Azure / ACS

Azure Container Service - Bug Tracker + Announcements

VMSS/auto-scaling support for Kubernetes on ACS #28

Open jamesbannan opened 7 years ago

jamesbannan commented 7 years ago

Hi,

At the moment, it appears that VMSS is not supported for Kubernetes nodes on ACS, only Availability Sets. This means that auto-scaling is not an option, and any node scaling has to be done manually.

I took a look at ACS-Engine, but while it supports VMSS, it does not support it for Kubernetes either. So I'm guessing this is a limitation of the core ACS platform. Is that correct? If so, it would be great to have VMSS support. I am dealing with customers who are moving their containers from AWS to Azure, and the lack of auto-scaling is a blocker.

Thanks, James

abdelhegazi commented 7 years ago

Thanks James,

Likewise, this is a big blocker for us too. I'm not sure whether MS is aware of how serious this is. Anyway, thanks for having a look.

Abdel

JackQuincy commented 7 years ago

Hi @jamesbannan and @abdelhegazi, sorry for the slow response. We can't use VMSS for Kubernetes because it doesn't support scenarios we require, namely PVs (persistent volumes). Trust me, I would love to. We do support scaling the agent pool up and down via a PUT API, which can be accessed through the CLI "az acs scale" command, a template deployment, or the portal. An autoscaler could be written on top of the CLI or template deployments. I would suggest that the autoscaler wait for ongoing deployments to finish before scheduling another one, and that it only schedule one when the desired number of VMs has changed.

We have told the VMSS team about the needed scenarios and the scenarios are on their backlog, but I don't know when we will be able to switch to VMSS instead of VMs. Thanks, Jack
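The two suggestions above (wait for any ongoing deployment, and only act when the desired VM count changes) can be sketched roughly as follows. This is a minimal illustration, not an existing tool: `reconcile`, `build_scale_command`, and the `deployment_in_progress` flag are hypothetical names, and how the caller detects an ongoing deployment is left as an assumption. The only real command used is "az acs scale" with its `--resource-group`, `--name`, and `--new-agent-count` options.

```python
import subprocess
from typing import Callable, List, Optional, Sequence


def build_scale_command(resource_group: str, cluster_name: str, agent_count: int) -> List[str]:
    """Build the `az acs scale` invocation for the requested agent count."""
    return [
        "az", "acs", "scale",
        "--resource-group", resource_group,
        "--name", cluster_name,
        "--new-agent-count", str(agent_count),
    ]


def _run_cli(cmd: Sequence[str]) -> None:
    """Run the CLI and raise if it exits with a non-zero status."""
    subprocess.run(list(cmd), check=True)


def reconcile(
    current_count: int,
    desired_count: int,
    deployment_in_progress: bool,  # assumption: supplied by the caller
    resource_group: str,
    cluster_name: str,
    run: Callable[[Sequence[str]], object] = _run_cli,
) -> Optional[List[str]]:
    """Issue at most one scale request; return the command run, or None if skipped."""
    if deployment_in_progress:
        return None  # wait for the ongoing deployment to finish first
    if desired_count == current_count:
        return None  # no change in the desired number of VMs
    cmd = build_scale_command(resource_group, cluster_name, desired_count)
    run(cmd)
    return cmd
```

Passing a different `run` callable makes it possible to dry-run the logic without touching a real cluster.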

jamesbannan commented 7 years ago

Thanks @JackQuincy for the information - much appreciated. I have used the "az acs scale" command and it works well. Have you seen any mechanism by which the command can be triggered based on metric thresholds? For example, using an external performance monitor to monitor the available CPU and RAM of the K8s nodes, and then triggering the "scale" command once a pre-defined threshold is met?

Thanks, James
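For illustration only, the threshold idea described above could look something like the sketch below. Everything in it is an assumption: the function name `desired_agent_count`, the 80% and 30% thresholds, the one-node step, and the idea that an external monitor supplies average CPU and RAM utilisation as fractions between 0 and 1. It only decides on a target count; triggering the "scale" command is a separate step.

```python
def desired_agent_count(
    current_count: int,
    cpu_utilisation: float,   # assumed: average across nodes, 0.0 to 1.0
    mem_utilisation: float,   # assumed: average across nodes, 0.0 to 1.0
    high: float = 0.80,       # hypothetical scale-up threshold
    low: float = 0.30,        # hypothetical scale-down threshold
    min_agents: int = 1,
    max_agents: int = 10,
) -> int:
    """Return the target agent count for the given utilisation readings."""
    target = current_count
    if cpu_utilisation > high or mem_utilisation > high:
        target = current_count + 1   # either resource is under pressure
    elif cpu_utilisation < low and mem_utilisation < low:
        target = current_count - 1   # both resources are mostly idle
    # Keep the result inside the allowed pool size.
    return max(min_agents, min(max_agents, target))
```

Scaling up when either metric is high but scaling down only when both are low is a deliberately conservative choice, so that a cluster short on memory is not shrunk just because CPU is idle.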

JackQuincy commented 7 years ago

I haven't seen one based on CPU and memory, but I've seen them based on things queued up in ACS. Here is an example: https://github.com/wbuchwalter/Kubernetes-acs-engine-autoscaler . It operates on ACS-Engine instead of ACS, but it does look at the current workload to decide what to do next. That repo used to support ACS, but the author wasn't doing it the way I suggested above:

"I would suggest the auto-scaler waits for ongoing deployments to finish before scheduling another one and that it only schedules one if there is a change in the desired number of vms."

That, together with the fact that ACS currently can't run operations on your VMs outside of provisioning (so we don't cordon and drain, etc.), led him to decide not to support ACS.

A couple more details: "az acs scale" does not cordon and drain a node, because we don't have SSH or similar access to the cluster. That is why this repo moved away from supporting ACS.
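Since the scale operation itself does not cordon and drain, an autoscaler with its own access to the cluster would presumably need to do that before removing a node. A rough sketch, assuming `kubectl` is installed and already configured for the cluster; the helper names and the `--ignore-daemonsets` choice are illustrative, and the node name is supplied by the caller.

```python
import subprocess
from typing import Callable, List, Sequence


def cordon_and_drain_commands(node_name: str) -> List[List[str]]:
    """Return the kubectl commands that cordon a node and then drain it."""
    return [
        # Mark the node unschedulable so no new pods land on it.
        ["kubectl", "cordon", node_name],
        # Evict the pods running on the node; DaemonSet pods are skipped.
        ["kubectl", "drain", node_name, "--ignore-daemonsets"],
    ]


def _run_cli(cmd: Sequence[str]) -> None:
    """Run one command and raise if it exits with a non-zero status."""
    subprocess.run(list(cmd), check=True)


def cordon_and_drain(node_name: str, run: Callable[[Sequence[str]], object] = _run_cli) -> None:
    """Run the commands in order, stopping at the first failure."""
    for cmd in cordon_and_drain_commands(node_name):
        run(cmd)
```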

shayansarkar commented 6 years ago

While this isn't exactly a cluster-scaling solution, I ran across this preview feature:

https://github.com/Azure/aci-connector-k8s

It looks like you can connect your cluster to the Azure Container Instances service and run pods in there if you need to temporarily scale up your system. Is this a feature that is going to be finalized any time soon?

JackQuincy commented 6 years ago

Another team runs that repository, so I'm not sure of their roadmap or timeline. I'd open an issue on their repository and ask there.