berops / claudie

Cloud-agnostic managed Kubernetes
https://docs.claudie.io/
Apache License 2.0

Feature: Faster cluster provisioning and autoscaling #1231

Open bernardhalas opened 4 months ago

bernardhalas commented 4 months ago

Motivation

Downloading various packages and binaries takes up a significant portion of the cluster provisioning and autoscaler execution time. There is room to optimize this part.

Description

We could speed this up by using our own pre-populated images on providers that allow it. Ansibler and kube-eleven would then run faster because the binaries would already be present. Where that is not possible (e.g. on providers where we can't deploy our images, or on static nodes), the usual ansibler and kube-eleven flows will take care of the download.
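The "use what the image already provides, download the rest" idea can be illustrated with a small sketch. This is not Claudie's, ansibler's, or kube-eleven's actual code: the function names, the binary names, and the `fake_download` helper are hypothetical and only stand in for whatever the real flows do.

```python
import os
import tempfile
from typing import Callable, Iterable, List


def missing_binaries(required: Iterable[str], bin_dir: str) -> List[str]:
    """Return the names in `required` that are not executable files in `bin_dir`."""
    missing = []
    for name in required:
        path = os.path.join(bin_dir, name)
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            missing.append(name)
    return missing


def ensure_binaries(
    required: Iterable[str],
    bin_dir: str,
    download: Callable[[str, str], None],
) -> List[str]:
    """Download only what is missing and return the names that were downloaded.

    On a node booted from a pre-populated image the result would be empty;
    on a stock image or a static node it would list everything fetched.
    """
    todo = missing_binaries(required, bin_dir)
    for name in todo:
        download(name, bin_dir)
    return todo


def fake_download(name: str, bin_dir: str) -> None:
    """Hypothetical stand-in for a real download: writes an executable stub."""
    path = os.path.join(bin_dir, name)
    with open(path, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(path, 0o755)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as bin_dir:
        # Pretend the image already shipped one of the two binaries.
        fake_download("kubelet", bin_dir)
        fetched = ensure_binaries(["kubelet", "kubeadm"], bin_dir, fake_download)
        print("downloaded:", fetched)
```

Because the check is per binary, the same code path would serve pre-baked images, partially populated images, and static nodes without needing to know which kind of node it is running on.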

Note 1: We should assess this approach against custom pre-baked Flatcar, Fedora CoreOS or openSUSE MicroOS images.

Note 2: It would help tremendously in this task if we knew whether we can get rid of WireGuard and use just Cilium for bridging nodes across various networks.

Exit criteria

FYI @MiroslavRepka

Danielss89 commented 4 months ago

Sounds cool. Can Claudie check whether a node is using the image or not and figure out what to do? For example, on OneProvider I can upload images which the servers I create can use. So it might be a dedicated server in Claudie, but still use the image.

fritz-net commented 2 days ago

Maybe an alternative for networking could be Kilo: https://kilo.squat.ai/. As far as I remember, Claudie requires the nodes to be on public IPs anyway. I used Kilo in a k8s cluster that I set up with kubeadm.