kubernetes / autoscaler

Autoscaling components for Kubernetes
Apache License 2.0

Implement DigitalOcean cloud provider #254

Closed: klausenbusk closed this issue 5 years ago

klausenbusk commented 7 years ago

I'm not exactly sure how to implement this, but I think the easiest way would be creating a droplet from a snapshot that is already configured to join the existing cluster.
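
For illustration only, a minimal sketch of what booting a worker from such a pre-configured snapshot could look like with the godo client (assuming a recent version); the token, snapshot ID, region, and size below are placeholders, not anything that exists in this repo:

```go
package main

import (
	"context"
	"log"

	"github.com/digitalocean/godo"
)

func main() {
	// Placeholder token; in a real provider this would come from a secret.
	client := godo.NewFromToken("DO_API_TOKEN")

	// Boot a worker from a pre-baked snapshot that joins the cluster on
	// first boot. The snapshot ID, region and size are made up.
	req := &godo.DropletCreateRequest{
		Name:   "k8s-worker-1",
		Region: "nyc3",
		Size:   "s-2vcpu-4gb",
		Image:  godo.DropletCreateImage{ID: 12345678},
	}

	droplet, _, err := client.Droplets.Create(context.Background(), req)
	if err != nil {
		log.Fatalf("creating droplet: %v", err)
	}
	log.Printf("created droplet %d (%s)", droplet.ID, droplet.Name)
}
```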

mwielgus commented 7 years ago

cc: @andrewsykim

andrewsykim commented 7 years ago

/assign andrewsykim

andrewsykim commented 7 years ago

@mwielgus thanks for the ping

@klausenbusk feel free to start work on this issue! Here are some ideas I have so far:

Both of these solutions should have ways to reference snapshots

klausenbusk commented 7 years ago

So what we need for creating a new droplet is:

That should be doable with a CRD; we then just need some nodeGroup -> CRD mapping logic. Every nodeGroup should have its own config, although it could inherit some defaults from a default CRD (like SSH keys and droplet size, or even the snapshot ID; snapshot names should work across regions, I think).

@andrewsykim what do you think?

andrewsykim commented 7 years ago

You'll probably want user data too, but in general that seems like the right direction to me. I would even consider having separate CRDs for droplets and droplet groups, but that's an implementation detail to address later.
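
Purely as an illustration of that direction, a hypothetical Go type for such a node-group CRD might look roughly like this; every name and field below is made up, combining the SSH keys, droplet size, snapshot, and user data mentioned above:

```go
package v1alpha1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// DropletNodeGroup is a hypothetical CRD describing how the autoscaler would
// create droplets for one node group; all names here are illustrative.
type DropletNodeGroup struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec DropletNodeGroupSpec `json:"spec"`
}

// DropletNodeGroupSpec holds the per-group droplet template. Fields could
// fall back to a cluster-wide default object when left empty.
type DropletNodeGroupSpec struct {
	Size         string   `json:"size"`                // droplet size slug, e.g. "s-2vcpu-4gb"
	Region       string   `json:"region"`              // e.g. "nyc3"
	SnapshotName string   `json:"snapshotName"`        // snapshot to boot workers from
	SSHKeyIDs    []string `json:"sshKeyIDs,omitempty"` // SSH keys to inject
	UserData     string   `json:"userData,omitempty"`  // optional cloud-init user data
}
```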

klausenbusk commented 6 years ago

Instead of using a snapshot, we could let the autoscaler create the droplet and then let an external script initialize it. I think that could make sense, as there are a "million ways" to set up k8s (bootkube and kubeadm, to name a few).

Autoscaler -> IncreaseSize -> Create Droplet -> HTTP POST (IP, SSH key) to another pod.
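
A rough sketch of that flow in Go, assuming a hypothetical bootstrap pod that receives the new droplet's IP and SSH key over HTTP and then runs whatever provisioning (kubeadm, bootkube, ...) the operator prefers; the struct, endpoint, and payload are invented for illustration:

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"time"

	"github.com/digitalocean/godo"
)

// dropletNodeGroup is a hypothetical node group backed by DigitalOcean droplets.
type dropletNodeGroup struct {
	client             *godo.Client
	name, region, size string
	sshKeyFingerprint  string
	bootstrapURL       string // external bootstrap pod, e.g. a kubeadm-join service
}

// IncreaseSize creates `delta` droplets and hands each one off to the
// external bootstrapper over HTTP, mirroring the flow described above.
func (ng *dropletNodeGroup) IncreaseSize(delta int) error {
	ctx := context.Background()
	for i := 0; i < delta; i++ {
		droplet, _, err := ng.client.Droplets.Create(ctx, &godo.DropletCreateRequest{
			Name:    fmt.Sprintf("%s-%d", ng.name, time.Now().UnixNano()),
			Region:  ng.region,
			Size:    ng.size,
			Image:   godo.DropletCreateImage{Slug: "ubuntu-18-04-x64"},
			SSHKeys: []godo.DropletCreateSSHKey{{Fingerprint: ng.sshKeyFingerprint}},
		})
		if err != nil {
			return fmt.Errorf("creating droplet: %w", err)
		}

		// In practice you would poll until the droplet is active and has an IP.
		ip, err := droplet.PublicIPv4()
		if err != nil {
			return err
		}

		// Notify the external bootstrap pod so it can initialize the new node.
		payload, _ := json.Marshal(map[string]string{"ip": ip, "sshKey": ng.sshKeyFingerprint})
		if _, err := http.Post(ng.bootstrapURL, "application/json", bytes.NewReader(payload)); err != nil {
			return fmt.Errorf("notifying bootstrapper: %w", err)
		}
	}
	return nil
}
```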

JorgeCeja commented 6 years ago

Any updates, anyone? I am using StackPoint's autoscaler and it is working just fine. I am wondering if they are using a fork of this repo. I would appreciate some input on it and any updates on how to help. Thanks!

klausenbusk commented 6 years ago

@JorgeCeja They use a fork: https://github.com/StackPointCloud/autoscaler/tree/stackpointio/cluster-autoscaler/cloudprovider/spc, which uses the SPC API to create/delete droplets.

JorgeCeja commented 6 years ago

Nice, thanks! I guess I'll be stuck with SPC until this gets resolved. In the meantime, I will give it a shot and see how far I can get implementing it. If it looks like it will take too long, I am willing to open a bounty!

igauravsehrawat commented 6 years ago

@JorgeCeja Quick question: have you been able to scale up DigitalOcean using the autoscaler? When I use SPC with the autoscaler solution, I get an error during initialization: `Error installing node_autoscaler: Failed to set up autoscaler, cannot get machine specs.`

Have you encountered this kind of problem with DigitalOcean?

Thanks

fejta-bot commented 6 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot commented 6 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten

kamushadenes commented 6 years ago

:+1: would love to use this with my Rancher 2 cluster

kamushadenes commented 6 years ago

/remove-lifecycle rotten

scruplelesswizard commented 6 years ago

If someone is looking to implement this, they will likely want to look at leveraging cluster-api. There is already a DigitalOcean provider available, and it makes scaling nodes trivial (e.g. you can run `kubectl scale machineset <machineset name> --replicas 5` to scale your cluster to 5 nodes).

MaciekPytel commented 6 years ago

There is some effort to implement Cluster API support in CA: https://github.com/kubernetes/enhancements/issues/609. The main issue as of now is the fact that CA absolutely needs to be able to delete a specific machine, not just scale down to a given number of replicas. There is an ongoing discussion on how to extend Cluster API to support this.
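
For context, cluster-autoscaler's cloud provider integration is built around a NodeGroup interface that, among other things, requires removing specific nodes rather than only setting a replica count. Roughly (a simplified excerpt; see cluster-autoscaler/cloudprovider for the authoritative definition):

```go
package cloudprovider

import apiv1 "k8s.io/api/core/v1"

// NodeGroup (simplified excerpt): the parts of the interface relevant here.
type NodeGroup interface {
	// IncreaseSize scales the group up by delta nodes; a replica-count
	// style API maps to this cleanly.
	IncreaseSize(delta int) error

	// DeleteNodes must remove *these specific* nodes, which is why
	// "scale the MachineSet down to N replicas" alone is not enough.
	DeleteNodes(nodes []*apiv1.Node) error

	MinSize() int
	MaxSize() int
	TargetSize() (int, error)
}
```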

fejta-bot commented 5 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot commented 5 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten

fejta-bot commented 5 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close

k8s-ci-robot commented 5 years ago

@fejta-bot: Closing this issue.

In response to [this](https://github.com/kubernetes/autoscaler/issues/254#issuecomment-477593030):

> Rotten issues close after 30d of inactivity.
> Reopen the issue with `/reopen`.
> Mark the issue as fresh with `/remove-lifecycle rotten`.
>
> Send feedback to sig-testing, kubernetes/test-infra and/or [fejta](https://github.com/fejta).
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

andrewsykim commented 5 years ago

cc @timoreimann

fatih commented 5 years ago

Hi,

I'm going to look into adding autoscaler support for DigitalOcean. Is there a way we can reopen this issue? Just want to make sure people who follow this issue are kept updated on where we are.

Thanks

andrewsykim commented 5 years ago

/reopen

k8s-ci-robot commented 5 years ago

@andrewsykim: Reopened this issue.

In response to [this](https://github.com/kubernetes/autoscaler/issues/254#issuecomment-511810507):

> /reopen

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

dave08 commented 5 years ago

But now, instead of the CRD-for-node-template solution proposed above, you'll be doing it with node pools, right @fatih? If so, it would be nice to have node pools that have 0 nodes but are configured to auto-scale; then, using node labels and affinities, the autoscaler could know which pool to use... I currently have a use case where I need very powerful nodes for certain CI tasks that I don't want to have running all the time.
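
To make that use case concrete, a CI pod could pin itself to such a pool with a node selector, and the autoscaler would then scale that pool up from zero only while the pod is pending; the label key and pool name below are made up, not an existing DOKS convention:

```go
package main

import corev1 "k8s.io/api/core/v1"

// ciPod returns a hypothetical CI pod pinned to a dedicated "ci-heavy" node
// pool via a node selector; the label key and pool name are illustrative.
func ciPod() *corev1.Pod {
	return &corev1.Pod{
		Spec: corev1.PodSpec{
			NodeSelector: map[string]string{
				"example.com/node-pool": "ci-heavy",
			},
			Containers: []corev1.Container{{
				Name:  "build",
				Image: "golang:1.12",
			}},
		},
	}
}
```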

fatih commented 5 years ago

@dave08 It'll probably be tightly integrated with our node pools, indeed. I'm still investigating how to implement it. I'll post updates here occasionally. Once I have a working version, you'll be able to test it and then we can figure out what to improve on our end.

dave08 commented 5 years ago

By the way, I think pools should probably also have a minSize and maxSize when auto-scaling is enabled... @fatih

dave08 commented 5 years ago

@fatih When will this actually be released in our DOKS clusters? Does it depend on the k8s version deployed? Thanks a lot for the work :+1:!

fatih commented 5 years ago

@dave08 We're now planning to incorporate this into our new base images. We're still working on it, so I can't give a timeline right now.

> When will this actually be released in our DOKS clusters?

Yes! Either that, or you'll be able to install it on an existing cluster afterwards.

> Does it depend on the k8s version deployed?

We're planning to release it beginning with the v1.15.x versions. It's still in the early phases, so we don't know how it'll look in the end. We're going to update this issue or let people know once it's finished.

timoreimann commented 5 years ago

@dave08 we are going to use digitalocean/DOKS#5 to track the integration effort. Feel free to subscribe to that issue to be notified of any progress made.