kube-hetzner / terraform-hcloud-kube-hetzner

Optimized and Maintenance-free Kubernetes on Hetzner Cloud in one command!

Use with cluster autoscaler #235

Closed BlakeB415 closed 1 year ago

BlakeB415 commented 2 years ago

How would this be used with cluster autoscaler? Does this module support cluster autoscaler? If so, how would I configure it?

mysticaltech commented 2 years ago

Hey @BlakeB415, we do not support cluster autoscaler yet! PRs are always welcome!

p4block commented 2 years ago

Chipping in on the matter: for that, I've been testing Kubermatic's KubeOne, which will get you a cluster on Hetzner with a machine-controller able to scale the number of physical nodes the way you scale a Deployment.

That said I very much prefer this project's simplicity and all the goodies it has, so seeing that functionality here would be amazing.

Maybe it is possible to install their machine controller on a cluster provisioned with this project, haven't tried yet.

JustinGuese commented 2 years ago

Yes, an autoscaling cluster would be the only missing link, and then it's perfect in my opinion... I don't know how it would work, but could the master node contain/create a limited-usage Hetzner API token that triggers a scale-up at >80% load or something?

Probably not really secure, right?

JustinGuese commented 2 years ago

@BlakeB415 cluster-autoscaler already seems to include a Hetzner provider: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider

So in theory it shouldn't be too much work? I'm open to taking a look if it's not too involved.

mysticaltech commented 2 years ago

@JustinGuese Exactly, I imagine that the Kubernetes native autoscaler would be ideal to "sense" the need to autoscale, but this project runs with Terraform, so the actual autoscaling needs to be done by changing the node count at the nodepool level, in the kube.tf file. Recently I stumbled upon this project that maybe could help https://github.com/runatlantis/atlantis.

Maybe combining the two would get us something working?! It would of course require us to run Atlantis somewhere. We could also couple that with a simple NextJS web app (or Python Streamlit app) that would allow us to manage all of that, and bake in more "intelligence".

The latter option would require a side project, we could call it KubeHetzner UI, or something similar. But all these are just suppositions, maybe there is a shorter path, through a Rancher server for instance?!

p4block commented 2 years ago

Disclaimer: The following is just my view on things and may not be the best idea

The number of non-control-plane nodes is not something Terraform is good at managing; that should be the job of the autoscaler/machine-controller. Ideally we define the nodepools in Terraform like we are doing now, but then Kubernetes-native systems manage how many nodes per pool to provision.

The magic of provisioning nodes without a user armed with Terraform on their computer/CD system is implemented by Kubermatic's machine-controller, but how it actually works escapes me. From my limited testing of KubeOne, only the control plane is provisioned with Terraform and the worker pools are all managed through the machine-controller.

There's also https://github.com/syself/cluster-api-provider-hetzner, but it goes a step beyond and gets rid of Terraform entirely, which I also need to try.

mysticaltech commented 2 years ago

@p4block It's very interesting, and we could do that by saving a worker-node snapshot "image" during deployment and later letting the autoscaler use it to deploy new worker nodes.

Exactly how to do that needs to be researched; it can possibly be done entirely in Terraform, but at least I know it can be done via the hcloud CLI.

Really worth researching more! PRs welcome :)

JustinGuese commented 2 years ago

@mysticaltech Yeah, with an external service for sure, but I guess we're leaving open source then...

Something I could imagine with the native solution would be to somehow grab the requirements from Terraform and insert them at the end. I guess the hardest part, like @p4block said, would be the image of a worker node...

The native scaler requires:


- `HCLOUD_CLOUD_INIT`: Base64-encoded cloud-init YAML with the commands to join the cluster. Sample: [examples/cloud-init.txt (for Kubernetes 1.20.1)](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/hetzner/examples/cloud-init.txt)
- `HCLOUD_IMAGE`: Defaults to `ubuntu-20.04`, see https://docs.hetzner.cloud/#images. You can also use an image ID (e.g. `15512617`) or a label selector associated with a custom snapshot (e.g. `customized_ubuntu=true`); in the latter case, the most recent snapshot is used.
- `HCLOUD_NETWORK`: Default empty. The name of the network that is used in the cluster, see https://docs.hetzner.cloud/#networks
- `HCLOUD_FIREWALL`: Default empty. The name of the firewall that is used in the cluster, see https://docs.hetzner.cloud/#firewalls
- `HCLOUD_SSH_KEY`: Default empty. This SSH key will have access to the freshly created server, see https://docs.hetzner.cloud/#ssh-keys

Do you have an idea whether we can grab these values? I mean, except for the image, it should be doable?
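For illustration, here is roughly where I imagine the last three could come from on the Terraform side (just a sketch; the resource addresses are placeholders, not necessarily this module's actual names):

```terraform
# Just a sketch: mapping existing Terraform resources to the env vars the
# Hetzner cluster autoscaler expects. The resource addresses (hcloud_network.k3s,
# hcloud_firewall.k3s, hcloud_ssh_key.k3s) are placeholders.
locals {
  autoscaler_env = {
    HCLOUD_NETWORK  = hcloud_network.k3s.name
    HCLOUD_FIREWALL = hcloud_firewall.k3s.name
    HCLOUD_SSH_KEY  = hcloud_ssh_key.k3s.name
    # HCLOUD_IMAGE and HCLOUD_CLOUD_INIT still need a snapshot and a join
    # cloud-init, which is the missing piece discussed in this thread.
  }
}
```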

mysticaltech commented 2 years ago

@JustinGuese Thanks for extracting and sharing those details. We have all these values!! The only thing missing is creating a snapshot at the end of the install and passing its ID to HCLOUD_IMAGE.

It's completely doable, but I just do not have the bandwidth to work on this right away; I will help as best I can.

This will help get us there: https://registry.terraform.io/providers/hetznercloud/hcloud/latest/docs/resources/snapshot.
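Something along these lines might do it (an untested sketch; `hcloud_server.agents[0]` and the label key are just examples, not this module's exact resource addresses):

```terraform
# Untested sketch: snapshot one agent node after provisioning and tag it with a
# label the autoscaler can later select on via HCLOUD_IMAGE. The resource
# address hcloud_server.agents[0] and the label key are examples only.
resource "hcloud_snapshot" "autoscaler_image" {
  server_id   = hcloud_server.agents[0].id
  description = "kube-hetzner agent image for the cluster autoscaler"

  labels = {
    autoscaler = "true"
  }
}
```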

mysticaltech commented 2 years ago

This problem has become a lot simpler with the above findings. Just one thing: ideally we would create a different snapshot for each kind of agent nodepool. That way, choosing the kind of server we want to autoscale with would be easy.

mysticaltech commented 2 years ago

Folks, does no one want to give that a shot? It's not that hard, especially if we just choose to take a snapshot of the first nodepool definition. I will help, but I don't have the bandwidth to do it all by myself right away.

JustinGuese commented 2 years ago

:D yeah same, I might have some time in September

mysticaltech commented 2 years ago

After thinking about the networking aspect of this, we would need one dedicated nodepool for autoscaling (easy peasy) that we would not extend manually. We could, for instance, add an attribute autoscaling = true, select based on the value of this, and, if true, select the right server network (which is, in fact, a subnet) dedicated to that autoscaling nodepool. So, of course, no more manual deploys on that nodepool; we would ignore the count attribute anyway.

So basically, deploy one node at least in that nodepool, snapshot it, and then prepare all of the details that the Hetzner autoscaler needs, as cited above by @JustinGuese. That will be used to feed the autoscaler.

That said, we would need, from the get-go, at least a max_number_nodes_autoscaler variable so as to properly reserve the server_network (IPs) for up to that number of nodes in the subnet created just for the autoscaler nodepool.

(To understand the above, just see how the current nodepools are created and networked together).
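As a very rough sketch of that reservation (the resource names, network zone, and IP range are illustrative; the module computes its actual ranges differently):

```terraform
# Illustrative only: reserve a dedicated subnet for the autoscaled nodepool so
# that its IP space is never handed out to the manually defined nodepools.
# Resource names, network zone, and IP range are examples, not the module's
# actual values.
resource "hcloud_network_subnet" "autoscaler" {
  network_id   = hcloud_network.k3s.id
  type         = "cloud"
  network_zone = "eu-central"
  ip_range     = "10.255.0.0/16"
}
```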


This logic would need to live in an autoscaling.tf file, similar to agent.tf but not entirely the same.

Low-hanging fruit, folks! Pick it up, and I will help you!!

codeagencybe commented 2 years ago

Is there somebody already working on this auto scaling feature? Any progress? Stuck? Feedback?

mysticaltech commented 2 years ago

Hello @codeagencybe, I have outlined above how I think it should flow but have not yet had time to work on that feature. It probably needs just a few hours, as the "path" seems obstacle-free.

I can't say when I will take this on, but in the meantime, if anyone of you folks wants to give this a shot, please do so - I will be very responsive on the PR! 🙏

codeagencybe commented 2 years ago

Hello @mysticaltech, if I had the knowledge to create it, I would have done it for you, but unfortunately I'm still learning this language, so I can't help you with the development. But I'm happy to help at any other point where I can: testing, documenting, translating, ... If there is anything like that you need, let me know.

Nigma1337 commented 1 year ago

I'm quite interested in looking at this, but I can't really seem to find where the variables (snapshot ID, etc.) would be fed in. Here's the documentation I found: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/hetzner/README.md, which has this YAML: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/hetzner/examples/cluster-autoscaler-run-on-master.yaml. So, would we somehow create and apply the YAML from within Terraform, or what are your thoughts, @mysticaltech?

mysticaltech commented 1 year ago

@Nigma1337 Very glad to hear it. You are indeed correct; it does not accept the snapshot ID but instead a label set on the snapshot.


About the YAML, the easiest solution would be to add it to the templates folder as hetzner_autoscaler_config.yaml.tpl, load it, and replace the needed values as is done for the other template files in agents.tf.

Maybe the logic is simple enough to just live in agents.tf, with no need to create another file as suggested above, but see what feels best at the moment of implementation.
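For instance, something like this (just a sketch; the template variables are illustrative, and `local.autoscaler_cloud_init` is a hypothetical local holding the join cloud-init):

```terraform
# Sketch: render the autoscaler manifest from a template, in the same spirit as
# the module's other *.yaml.tpl files. All variable names are illustrative, and
# local.autoscaler_cloud_init is a hypothetical local holding the join cloud-init.
locals {
  hetzner_autoscaler_manifest = templatefile(
    "${path.module}/templates/hetzner_autoscaler_config.yaml.tpl",
    {
      cloud_init_b64 = base64encode(local.autoscaler_cloud_init)
      image_selector = "autoscaler=true" # label set on the snapshot
      network_name   = hcloud_network.k3s.name
      firewall_name  = hcloud_firewall.k3s.name
      ssh_key_name   = hcloud_ssh_key.k3s.name
    }
  )
}
```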

mysticaltech commented 1 year ago

@Nigma1337 I have just downloaded the GitHub mobile app, so I will be more responsive on this issue if you decide to go through with this. Let's get it done!!! 🚀 🤞

Nigma1337 commented 1 year ago

Made an initial commit on my fork https://github.com/Nigma1337/terraform-hcloud-kube-hetzner/tree/autoscaling

I can't really seem to figure out how I'd do the logic of only installing the autoscaler if max nodes > 0; I've never worked on a Terraform project of this size.

mysticaltech commented 1 year ago

Wonderful, @Nigma1337; it's a good start. Please do open a draft PR, so that I can contribute to it too.

Something that should be changed is the snapshot.id; that is actually not needed or wanted, as the selection is based on the unique label you give it (see above).

For the subnet logic, it's a bit delicate, as the 10.0.0.0/8 CIDR is compartmentalized in a certain way for the control plane and agent nodepools. Basically, control planes are 10.0..., 10.1.0..., 10.2.0..., and agents 10.255.0..., 10.254.0..., 10.253.0... So in our case, we can just take 10.255.0... for the initial support of one and only one autoscaling nodepool. Later on, we can expand to up to 10 autoscaling nodepools, but let's start simple first, haha. I propose contributing that logic, but please do not hesitate to try. You can find examples of how those CIDRs are calculated in control-planes.tf and agents.tf.

Also, the way I see the logic coming together, it's best to separate it into its own autoscaling.tf; there's no need for it to be in either agents.tf or main.tf.

And to simplify matters even more, let's just NOT give any definition of what that autoscaling nodepool might be. Let's just copy the definition of the last agent nodepool.

That way, we would need only three variables: min_autoscaling_count, max_autoscaling_count, and enable_autoscaling (boolean). Please correct me if I am wrong!
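In Terraform terms, that could be as simple as (a sketch; descriptions and defaults are illustrative):

```terraform
# Sketch of the three proposed inputs; descriptions and defaults are illustrative.
variable "enable_autoscaling" {
  description = "Whether to deploy the Hetzner cluster autoscaler."
  type        = bool
  default     = false
}

variable "min_autoscaling_count" {
  description = "Minimum number of autoscaled agent nodes."
  type        = number
  default     = 0
}

variable "max_autoscaling_count" {
  description = "Maximum number of autoscaled agent nodes."
  type        = number
  default     = 3
}
```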

This is looking good, my friend! This baby is coming soon 🚀

ifeulner commented 1 year ago

Wouldn't it be nicer to create a snapshot with Packer and then use that snapshot? Or use a snapshot from the first control-plane node (which is always needed)?

otavio commented 1 year ago

As far as I know, Packer isn't supported on Hetzner. @ifeulner, have you ever done it?

mysticaltech commented 1 year ago

As @otavio said, Packer is not supported by the Kubernetes autoscaler (I do not see it working in that context, at least). And why would you want to scale your control planes? We could do that later on, but for now, scaling a particular agent nodepool (copying its definition at least, like the first or last one) is the priority, as it would provide something that works.

@ifeulner What you seem to want to do (deploy more nodes after the initial launch), you can already do right now, just by adding nodepools or increasing the count of already-present nodepools.

ifeulner commented 1 year ago

@otavio Packer works on Hetzner in general to build snapshots; I just updated it for Ubuntu, see this repo.

In hcloud_server you can use a snapshot, so what needs to be done is to have a snapshot for MicroOS. This snapshot could then also be used in the CRDs for the autoscaler, with a corresponding cloud-init.
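For reference, picking up such a snapshot by label could look roughly like this (just a sketch, not the code in my branch; the label selector and server settings are examples):

```terraform
# Sketch only (not the code in my branch): select the most recent snapshot by
# label and boot a server from it. The label selector and server settings are
# examples.
data "hcloud_image" "microos_snapshot" {
  with_selector = "microos-snapshot=true"
  most_recent   = true
}

resource "hcloud_server" "from_snapshot" {
  name        = "example-agent"
  image       = data.hcloud_image.microos_snapshot.id
  server_type = "cpx21"
  location    = "fsn1"
}
```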

Or am I missing something?

ifeulner commented 1 year ago

@mysticaltech It's not about scaling the control planes, just about creating a proper snapshot out of the first node to be used later.

mysticaltech commented 1 year ago

@ifeulner I understand but it's not needed, we can just ask Hetzner to create a snapshot of the first node in the last agent nodepool. But thanks for sharing those alternative ideas 🙏

ifeulner commented 1 year ago

That's right, but doing it on the first node would avoid the whole download-and-install procedure of the MicroOS image for the additional nodes.

mysticaltech commented 1 year ago

@ifeulner I get this, but we must take a snapshot of a live image, as the conditions and configurations will be very similar. But noted; now we know of that other option we could leverage!

mysticaltech commented 1 year ago

@BlakeB415 @JustinGuese @codeagencybe @ifeulner Thanks to the initial work of @Nigma1337, we now have PR #352, which at least deploys the autoscaler successfully.

Heavy testing is needed, and please do not hesitate to open more PRs pointing to the autoscaling branch; I will review and merge those ASAP.

I count on your cooperation, let's make this "dream" feature come true! ✨


mysticaltech commented 1 year ago

For clarity and simplicity, all that is needed for testing is to define a new variable in your kube.tf: max_number_nodes_autoscaler. If it has a value > 0, a snapshot will be taken from the FIRST NODE OF THE FIRST AGENT NODEPOOL.
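For example (the module source line and the value are shown for illustration only; keep the rest of your kube.tf as it is):

```terraform
# Sketch: opting in to the autoscaler test from kube.tf. The module source line
# and the value are illustrative; keep your existing inputs as they are.
module "kube-hetzner" {
  source = "kube-hetzner/kube-hetzner/hcloud"

  # ... your existing configuration ...

  max_number_nodes_autoscaler = 3
}
```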

ifeulner commented 1 year ago

Hi @mysticaltech, you were faster than me ;) But I have started a version using the control plane for snapshotting and having a dedicated cloud-init template, which allows more control over the node startup phase. A few things I noticed:

The scaler does not do anything without load, so something like

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: hello-world
  name: hello-world
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello-world
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - image: rancher/hello-world
        imagePullPolicy: Always
        name: hello-world
        ports:
        - containerPort: 80
          protocol: TCP
        resources:
          limits:
            cpu: 1000m
            memory: 1024Mi
          requests:
            cpu: 1000m
            memory: 1024Mi

would help to test...

mysticaltech commented 1 year ago

@ifeulner Wonderful, that's very good to hear! It's important that the image is taken from an agent, so that it has the correct k3s config.

Those are wonderful contributions above! About the image, I ended up using the Bitnami image, which is well-optimized and always fresh!

About the CA, that is definitely a problem; happy that you proposed the solution above. A PR to the autoscaling branch would be really appreciated; I will continue working on this tonight or tomorrow morning!

Thanks and keep up the good work.

mysticaltech commented 1 year ago

Talking about the k3s config, I guess we will need to update it in the not-yet-implemented CLOUD_INIT section, to give it the correct IPs.

ifeulner commented 1 year ago

You also need to ensure that the snapshot is emptied - that's why I also prefer an as-early-as-possible snapshot. In cloud-init, a new identity has to be created, and it must be ensured that there is no old/wrong data left from the snapshot. I will put my stuff together and create a merge request later - maybe we can then get a good result out of the two approaches...

mysticaltech commented 1 year ago

@ifeulner Perfect then, the ball is in your court! 🙏

ifeulner commented 1 year ago

@mysticaltech Not much time today, but here you can find my latest code: https://github.com/ifeulner/kube-hetzner/tree/autoscaling

Nodes are created by the autoscaler (providing the snapshot ID works fine), but the networking is messed up. It's not possible to provide subnets via HCLOUD_NETWORK, only networks; therefore the nodes cannot register completely yet.

mysticaltech commented 1 year ago

@ifeulner Thanks for your excellent work. It is a serious roadblock if HCLOUD_NETWORK does not accept subnet IDs.

However, please look at #352, as some fixes were added to the labels and the locations of the certs. Maybe it will help clarify the picture!

mysticaltech commented 1 year ago

Ah... @ifeulner, we could give it the network's name and just NOT allocate 10.255... to a subnet. We leave that IP space free, and since all other IP spaces are reserved, the autoscaled machines will automatically be allocated addresses from that "free" pool. Please try! (Along with the fixes mentioned above.) 🙏

mysticaltech commented 1 year ago

Please folks, test #353 by @ifeulner 🚀✨

mysticaltech commented 1 year ago

This is launching soon; you can already fork the branch and test it!