giantswarm / roadmap

Giant Swarm Product Roadmap
https://github.com/orgs/giantswarm/projects/273
Apache License 2.0

Flatcar Releases for CAPI Images #2978

Open njuettner opened 7 months ago

njuettner commented 7 months ago

Current state

Currently we build our OS images for each provider (CAPA/CAPZ/CAPV) with at least 3-4 Kubernetes releases, always taking the latest Flatcar stable release.

That means that, depending on build time, we might create images with different Flatcar stable releases once a new one becomes available.
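To make the issue concrete, here is an illustrative sketch of the build matrix when the Flatcar release is pinned per run instead of resolved to "latest stable" at build time (all version numbers are made up for the example):

```shell
# Pinning the Flatcar release makes two builds of the same (provider, k8s)
# pair produce the same image regardless of when they run.
# All versions below are illustrative, not real pinned values.
FLATCAR="3815.2.0"
images=""
for provider in capa capz capv; do
  for k8s in v1.28.9 v1.29.4 v1.30.0; do
    images="${images} ${provider}-flatcar-${FLATCAR}-${k8s}"
  done
done
echo "$images"
```

Without the pin, the `FLATCAR` value silently changes between runs, which is exactly the inconsistency described above.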


Problem

Desired state

AverageMarcus commented 7 months ago

Thanks @njuettner 👍 I've added a couple of distinct tasks that can be done in isolation.

One thing I'm still not clear about, though, is how we decide which version and channel to build automatically. When building manually it's easy, as you can provide those values, but with automated builds triggered by a new Kubernetes release it's not so easy. The simple option would be the latest stable Flatcar, but that doesn't make sense if we're having to use a different version to deal with some issue we're seeing (e.g. the Cilium thing previously).
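The "latest stable, with an escape hatch" policy could be sketched like this (the variable names are hypothetical, and the latest-stable value would really be resolved from the Flatcar release feed):

```shell
# Default to the latest stable Flatcar release, but honour an explicit
# override for cases where a specific version must be used (e.g. the
# Cilium issue mentioned above). Names and values here are assumptions.
LATEST_STABLE="3815.2.0"   # would be resolved from the Flatcar release feed
FLATCAR_VERSION="${FLATCAR_OVERRIDE:-$LATEST_STABLE}"
echo "building with Flatcar ${FLATCAR_VERSION}"
```

Setting `FLATCAR_OVERRIDE` in the automation config would then pin all builds until the override is removed again.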

Aside from that, currently there's no triggering of builds when a new Flatcar version is available - is that desired? If so, what Kubernetes versions should we build? (As we have no way of tracking what versions are being used and we're very far from using the latest Kubernetes versions)

How do we want to handle when new versions of Flatcar aren't available for all providers? (E.g. Azure is notoriously slow at getting the new versions available)

calvix commented 7 months ago

Just thinking out loud. I wonder if there is some better way to handle Kubernetes versions rather than creating a completely new AMI for each version, even if the only thing that changes is the kubelet binary.

For example, what about having a single AMI per specific Flatcar version and Kubernetes minor release - `capa-ami-flatcar-1234.0-v1.30-gs` - that would store the kubelet binaries for all patches (1.30.0, 1.30.1, 1.30.2, and so on), and during machine launch it would just choose the right binary to use. That would save us a significant number of AMIs that need to be stored.

AverageMarcus commented 7 months ago

If you're willing to build a system to create those images then sure 😅

calvix commented 7 months ago

> If you're willing to build a system to create those images then sure 😅

Shouldn't be very hard - similar to what we do with Teleport. It's just adding a new Ansible config which downloads a bunch of kubelet binaries and stores them in some path, and then in the cluster repo adding a pre-kubeadm command which copies the right binary into the kubelet location.
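The pre-kubeadm selection step could look roughly like this (the directory layout, paths, and demo values are assumptions, not Giant Swarm's actual setup):

```shell
#!/bin/sh
# Sketch: at machine launch, copy the requested kubelet patch version out of
# the set pre-baked into the image. Layout and paths are hypothetical.
set -e

select_kubelet() {
  # $1 = desired version, $2 = pre-baked binary store, $3 = install target
  src="$2/$1/kubelet"
  [ -f "$src" ] || { echo "kubelet $1 not pre-baked in this image" >&2; return 1; }
  cp "$src" "$3"
  chmod +x "$3"
}

# Throwaway demo layout so the sketch is runnable end-to-end:
store=$(mktemp -d)
mkdir -p "$store/v1.30.1"
printf 'fake-kubelet' > "$store/v1.30.1/kubelet"
select_kubelet v1.30.1 "$store" "$store/kubelet"
```

In the real cluster repo the desired version would come from the machine spec rather than a hard-coded demo value, and the target would be the path kubelet is started from.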

AverageMarcus commented 7 months ago

But we use upstream image-builder for all the building. It's very much tied to specific versions and configs. We'd need to basically replace it all with our own stuff.

calvix commented 7 months ago

> But we use upstream image-builder for all the building. It's very much tied to specific versions and configs. We'd need to basically replace it all with our own stuff.

AFAIK the only version-specific things are the kubelet and kubeadm binaries. So we just need to download those on top of the existing image-builder setup. I don't think any other changes are needed - as mentioned above, just add a simple extra Ansible task which downloads the binaries.
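The build-time half of that idea - staging one kubelet per patch release of a minor - could be sketched as follows (the patch list and destination path are assumptions; the real task would run inside the image build):

```shell
#!/bin/sh
# Sketch: stage a kubelet binary for every patch of one Kubernetes minor
# into a versioned directory baked into the image. Values are illustrative.
set -e
MINOR="1.30"
PATCHES="0 1 2"          # assumed patch list; would be resolved upstream
DEST=$(mktemp -d)        # stand-in for a fixed path like /opt/kubernetes-versions
for p in $PATCHES; do
  v="v${MINOR}.${p}"
  mkdir -p "${DEST}/${v}"
  # A real build would download the binary from the official host, e.g.:
  # curl -sSL "https://dl.k8s.io/release/${v}/bin/linux/amd64/kubelet" \
  #   -o "${DEST}/${v}/kubelet"
  : > "${DEST}/${v}/kubelet"   # placeholder so the sketch runs offline
done
ls "$DEST"
```

An equivalent Ansible task would just loop over the same patch list with `get_url`, which is presumably what the "simple extra Ansible task" above would amount to.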

AverageMarcus commented 7 months ago

@njuettner based on the following:

Cost saving: cleaning up all images which are not used automatically (this will be owned by provider team or Turtles not by Tinkerers)

Would it make sense for me to migrate https://github.com/giantswarm/giantswarm/issues/26684 to your board instead? (and update it to not be CAPA-specific)

ericgraf commented 1 month ago

@njuettner FYI I added another issue to the task list https://github.com/giantswarm/roadmap/issues/3465