cc: @andrewsykim
/assign andrewsykim
@mwielgus thanks for the ping
@klausenbusk feel free to start work on this issue! Here are some ideas I have so far:
Both of these solutions should have ways to reference snapshots.
So what we need for creating a new droplet is:
That could be stored durably in a CRD; we then just need some nodeGroup -> CRD mapping logic. Every node group should have its own config, although it could inherit some defaults from a default CRD (like SSH keys and droplet size, and even the snapshot ID; snapshot names should work across regions, I think).
@andrewsykim what do you think?
You'll probably want user data too, but in general that seems like the right direction to me. I would even consider having a separate CRD for droplet and droplet groups, but that's an implementation detail to address later.
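A rough sketch of what such a CRD might look like as Go API types; every name here (DropletNodeGroup and its fields) is hypothetical and only meant to illustrate the shape of the config, not an existing API:

```go
// Hypothetical CRD types for a droplet node group; field names are
// illustrative only.
package v1alpha1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// DropletNodeGroup maps a cluster-autoscaler node group to the
// DigitalOcean settings needed to create droplets for it.
type DropletNodeGroup struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec DropletNodeGroupSpec `json:"spec"`
}

type DropletNodeGroupSpec struct {
	MinSize int `json:"minSize"`
	MaxSize int `json:"maxSize"`

	// Droplet settings; unset fields could fall back to a default CRD.
	Region     string   `json:"region,omitempty"`
	Size       string   `json:"size,omitempty"`       // e.g. "s-2vcpu-4gb"
	SnapshotID string   `json:"snapshotID,omitempty"` // image preconfigured to join the cluster
	SSHKeys    []string `json:"sshKeys,omitempty"`
	UserData   string   `json:"userData,omitempty"` // cloud-init for bootstrapping
}
```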
Instead of using a snapshot, we could let the autoscaler create the droplet and then let an external script initialize it. I think that could make sense, as there are a "million ways" to set up k8s (bootkube and kubeadm, to name a few).
Autoscaler -> IncreaseSize -> Create Droplet -> HTTP POST (ip, ssh key) to another pod..
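A minimal sketch of that flow, assuming the godo client library and a hypothetical in-cluster bootstrap service reachable at bootstrapURL; the package name, function name, and all values are illustrative:

```go
package doprovider

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"

	"github.com/digitalocean/godo"
)

// createAndHandOff is what IncreaseSize could call for each new node:
// create the droplet, then hand it over to an external bootstrapper.
func createAndHandOff(ctx context.Context, client *godo.Client, bootstrapURL, name string) error {
	// 1. Create the droplet from a stock image.
	droplet, _, err := client.Droplets.Create(ctx, &godo.DropletCreateRequest{
		Name:   name,
		Region: "fra1",        // example region
		Size:   "s-2vcpu-4gb", // example size slug
		Image:  godo.DropletCreateImage{Slug: "ubuntu-18-04-x64"},
		SSHKeys: []godo.DropletCreateSSHKey{
			{Fingerprint: "aa:bb:cc:..."}, // placeholder fingerprint
		},
	})
	if err != nil {
		return fmt.Errorf("creating droplet: %v", err)
	}

	// 2. Notify another pod that runs the actual k8s bootstrap
	//    (kubeadm, bootkube, ...). It can poll the DO API for the
	//    droplet's IP once the droplet is active.
	payload, _ := json.Marshal(map[string]interface{}{
		"dropletID": droplet.ID,
		"name":      droplet.Name,
	})
	resp, err := http.Post(bootstrapURL, "application/json", bytes.NewReader(payload))
	if err != nil {
		return fmt.Errorf("notifying bootstrapper: %v", err)
	}
	defer resp.Body.Close()
	return nil
}
```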
Any updates, anyone? I am using StackPoint's autoscaler and it is working just fine. I am wondering if they are using a fork of this repo. I would appreciate some input on it and any updates on how to help. Thanks!
@JorgeCeja They use a fork: https://github.com/StackPointCloud/autoscaler/tree/stackpointio/cluster-autoscaler/cloudprovider/spc which uses the SPC API to create/delete droplets.
Nice, thanks! I guess I'll be stuck with SPC until this gets resolved. In the meantime, I will give it a shot and see how far I can get implementing it. If it looks like it will take too long, I am willing to open a bounty!
@JorgeCeja
Quick question: have you been able to scale up DigitalOcean using the autoscaler? When I use SPC with the autoscaler solution, I get an error during initialization: Error installing node_autoscaler: Failed to set up autoscaler, cannot get machine specs.
Have you encountered this kind of problem with DigitalOcean?
Thanks
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
:+1: would love to use this with my Rancher 2 cluster
/remove-lifecycle rotten
If someone is looking to implement this, they will likely want to look at leveraging cluster-api. There is already a DigitalOcean provider available, and it makes scaling nodes trivial (e.g. you can run kubectl scale machineset <a machineset name> --replicas 5 to scale your cluster to 5 nodes).
There is some effort to implement Cluster API support in CA: https://github.com/kubernetes/enhancements/issues/609. The main issue as of now is the fact that CA absolutely needs to be able to delete a specific machine, not just scale down to a given number of replicas. There is an ongoing discussion on how to extend Cluster API to support this.
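For reference, this is roughly the part of CA's cloudprovider.NodeGroup interface that drives that discussion; the method set is abbreviated and the comments here are paraphrased explanations, not the upstream doc comments:

```go
// Abbreviated excerpt of cluster-autoscaler's cloudprovider.NodeGroup
// interface (apiv1 is "k8s.io/api/core/v1"; most methods omitted).
type NodeGroup interface {
	// Scale-up only needs to grow the group by a delta.
	IncreaseSize(delta int) error

	// Scale-down removes specific nodes that CA has picked as
	// underutilized, so the provider must be able to delete exactly
	// these machines rather than just lower a replica count.
	DeleteNodes(nodes []*apiv1.Node) error

	// DecreaseTargetSize shrinks the target size without deleting
	// already-running nodes (used when requested nodes never came up).
	DecreaseTargetSize(delta int) error

	MinSize() int
	MaxSize() int
	TargetSize() (int, error)
	// ...
}
```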
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.
cc @timoreimann
Hi,
I'm going to look into adding autoscaler support for DigitalOcean. Is there a way we can reopen this issue? I just want to make sure people who follow it get updates on where we are.
Thanks
/reopen
@andrewsykim: Reopened this issue.
But now, instead of the CRD-for-node-templates solution proposed above, you'll be doing it with node pools, right @fatih? If so, it would be nice to have node pools that have 0 nodes but are configured to auto-scale; then, using node labels and affinities, the autoscaler could know which pool to use... I currently have a use case where I need very powerful nodes for certain CI tasks that I don't want running all the time.
@dave08 it'll probably be tightly integrated with our node pools, indeed. I'm still investigating how to implement it. I'll post here occasionally with updates. Once I have a working version you'll be able to test it, and then we can figure out what to improve on our end.
By the way, I also think pools should probably have minSize and maxSize when auto-scaling is enabled... @fatih
@fatih When will this actually be released in our DOKS clusters? Does it depend on the k8s version deployed? Thanks a lot for the work :+1: !
@dave08 We're now planning to incorporate this into our new base images. We're still working on it so I can't give a time right now.
> When will this actually be released in our DOKS clusters?
Yes! Either that or you'll be able to install it for an existing cluster afterwards.
> Does it depend on the k8s version deployed?
We're planning to release it beginning with the v1.15.x versions. It's still in the early phases, so we don't know what it'll look like in the end. We're going to update this issue or let people know once it's finished.
@dave08 we are going to use digitalocean/DOKS#5 to track the integration effort. Feel free to subscribe to that issue to be notified of any progress made.
I'm not exactly sure how to implement this, but I think the easiest way would be creating a droplet from a snapshot that is already configured to join the existing cluster.
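For the record, a hedged sketch of what that could look like with godo: it mirrors the create call sketched further up in the thread, but boots from a snapshot ID instead of a stock image (all names and values are illustrative):

```go
package doprovider

import (
	"context"

	"github.com/digitalocean/godo"
)

// createFromSnapshot creates a droplet from a pre-built snapshot that
// already contains kubelet and the join configuration for this cluster.
func createFromSnapshot(ctx context.Context, client *godo.Client, name string, snapshotID int) (*godo.Droplet, error) {
	droplet, _, err := client.Droplets.Create(ctx, &godo.DropletCreateRequest{
		Name:   name,
		Region: "fra1",        // example region
		Size:   "s-2vcpu-4gb", // example size slug
		// Boot from the preconfigured snapshot instead of a stock image.
		Image: godo.DropletCreateImage{ID: snapshotID},
		Tags:  []string{"cluster-autoscaler"}, // simplifies later lookup and deletion
	})
	return droplet, err
}
```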