kubernetes-sigs / cluster-api-provider-aws

Kubernetes Cluster API Provider AWS provides consistent deployment and day 2 operations of "self-managed" and EKS Kubernetes clusters on AWS.
http://cluster-api-aws.sigs.k8s.io/
Apache License 2.0

Improve EKS AMI Lookups #3663

Open luthermonson opened 2 years ago

luthermonson commented 2 years ago

/kind feature

Describe the solution you'd like: I would like to propose changing the AMI lookups to standardize EKS support and establish some common terms.

Problem: Two pull requests (one, two) added support for pulling the latest AMI ID for EKS-optimized images from SSM. TL;DR on the SSM method: AWS keeps these public parameter values up to date whenever it builds new images, so SSM is a guaranteed place to find the most recent AMI ID. Problems with this are:

Solution: The existing AMI search is perfectly adequate for finding AMI IDs. It takes a few simple lookup params (template, baseos, kubeversion), performs an AMI search, loops over the results, and returns the ID of the most recent image. It also allows wildcards and offers more levers in the query params than the EKS SSM keys do. So I propose the following:

Pros: Enables ubuntu-18.04 and ubuntu-20.04 support for EKS. Enables ARM64 support across the entire project, since x86_64 is currently hardcoded in all DescribeImages queries. Standardizes the supported BaseOS values and gives them names and constants to validate against.
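To illustrate the "search, then pick the most recent" step with the architecture passed in as a parameter instead of a hardcoded x86_64, here is a minimal sketch. It works on an in-memory slice rather than calling EC2, and it assumes the candidates already came from a name-filtered search. The `Image` type, `LatestImage`, and the AMI IDs are hypothetical, not existing CAPA code.

```go
package main

import (
	"fmt"
	"time"
)

// Image is a hypothetical, trimmed-down stand-in for an EC2 image record.
type Image struct {
	ID           string
	Architecture string // e.g. "x86_64" or "arm64"
	CreationDate string // RFC 3339 timestamp
}

// LatestImage returns the most recently created image for the given
// architecture, or an error if none of the candidates match.
func LatestImage(images []Image, arch string) (Image, error) {
	var (
		latest     Image
		latestTime time.Time
		found      bool
	)
	for _, img := range images {
		if img.Architecture != arch {
			continue
		}
		created, err := time.Parse(time.RFC3339, img.CreationDate)
		if err != nil {
			return Image{}, fmt.Errorf("parsing creation date of %s: %w", img.ID, err)
		}
		if !found || created.After(latestTime) {
			latest, latestTime, found = img, created, true
		}
	}
	if !found {
		return Image{}, fmt.Errorf("no image found for architecture %q", arch)
	}
	return latest, nil
}

func main() {
	// Made-up candidates, as if returned by a name-filtered image search.
	candidates := []Image{
		{ID: "ami-old-x86", Architecture: "x86_64", CreationDate: "2022-01-10T00:00:00.000Z"},
		{ID: "ami-new-x86", Architecture: "x86_64", CreationDate: "2022-06-01T00:00:00.000Z"},
		{ID: "ami-arm", Architecture: "arm64", CreationDate: "2022-03-15T00:00:00.000Z"},
	}
	img, err := LatestImage(candidates, "arm64")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(img.ID)
}
```

Passing the architecture as an argument keeps the selection logic identical for x86_64 and ARM64; only the caller decides which one it wants.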

Cons: Deprecates some APIs that may be in use. However, the one I'm looking at is AMIReference.EKSOptimizedLookupType, which is not even documented and is part of the experimental EKS features.

Sample list of BaseOS values across self-managed (capa) and EKS, from my current WIP:

Amazon2         = "amazon-2"          // capa, eks
Amazon2GPU      = "amazon-2-gpu"      // eks
Ubuntu1804      = "ubuntu-18.04"      // capa, eks
Ubuntu2004      = "ubuntu-20.04"      // capa, eks
CentOS7         = "centos-7"          // capa
FlatcarStable   = "flatcar-stable"    // capa
BottleRocket    = "bottlerocket"      // eks
Windows2019Core = "windows-2019-core" // eks
Windows2019Full = "windows-2019-full" // eks
Windows2022Core = "windows-2022-core" // eks (coming soon)
Windows2022Full = "windows-2022-full" // eks (coming soon)
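One way these constants could be used for validation is sketched below. `ValidateBaseOS` and the `supportedBaseOS` table are hypothetical names, not existing CAPA code. The capa/eks targets are taken from the comments in the list above, and the "coming soon" Windows 2022 entries are left out.

```go
package main

import "fmt"

// BaseOS names copied from the list above.
const (
	Amazon2         = "amazon-2"
	Amazon2GPU      = "amazon-2-gpu"
	Ubuntu1804      = "ubuntu-18.04"
	Ubuntu2004      = "ubuntu-20.04"
	CentOS7         = "centos-7"
	FlatcarStable   = "flatcar-stable"
	BottleRocket    = "bottlerocket"
	Windows2019Core = "windows-2019-core"
	Windows2019Full = "windows-2019-full"
)

// supportedBaseOS maps each BaseOS name to the targets noted in the
// comments of the list ("capa" for self-managed, "eks" for EKS).
var supportedBaseOS = map[string]map[string]bool{
	Amazon2:         {"capa": true, "eks": true},
	Amazon2GPU:      {"eks": true},
	Ubuntu1804:      {"capa": true, "eks": true},
	Ubuntu2004:      {"capa": true, "eks": true},
	CentOS7:         {"capa": true},
	FlatcarStable:   {"capa": true},
	BottleRocket:    {"eks": true},
	Windows2019Core: {"eks": true},
	Windows2019Full: {"eks": true},
}

// ValidateBaseOS reports an error when the name is unknown or is not
// listed for the requested target.
func ValidateBaseOS(baseOS, target string) error {
	targets, ok := supportedBaseOS[baseOS]
	if !ok {
		return fmt.Errorf("unknown base OS %q", baseOS)
	}
	if !targets[target] {
		return fmt.Errorf("base OS %q is not supported for %q", baseOS, target)
	}
	return nil
}

func main() {
	fmt.Println(ValidateBaseOS(Ubuntu2004, "eks")) // accepted, so the error is nil
	fmt.Println(ValidateBaseOS(CentOS7, "eks"))    // rejected: centos-7 is capa only
}
```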

The name templates for EKS from my WIP look like this:

// go template for AMI Name Lookup
var eksAMINameFormats = map[string]string{
    Amazon2:         "amazon-eks-node-{{.K8sVersion}}-v*", //amazon-2
    Amazon2GPU:      "",
    Ubuntu1804:      "ubuntu-eks/k8s_{{.K8sVersion}}/images/*18.04*",
    Ubuntu2004:      "ubuntu-eks/k8s_{{.K8sVersion}}/images/*20.04*",
    BottleRocket:    "bottlerocket-aws-k8s-{{.K8sVersion}}*",
    Windows2019Core: "Windows_Server-2019-English-Core-EKS_Optimized-{{.K8sVersion}}-*", // coming soon
    Windows2019Full: "Windows_Server-2019-English-Full-EKS_Optimized-{{.K8sVersion}}-*", // coming soon
    Windows2022Core: "Windows_Server-2022-English-Core-EKS_Optimized-{{.K8sVersion}}-*", // currently unavailable
    Windows2022Full: "Windows_Server-2022-English-Full-EKS_Optimized-{{.K8sVersion}}-*", // currently unavailable
    CentOS7:         "",                                                                 // unavailable, leave empty so ok check fails and return default, add pattern if they become available
    FlatcarStable:   "",                                                                 // unavailable, leave empty so ok check fails and return default, add pattern if they become available
}
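As a rough sketch of how such a template could be turned into a search pattern, the snippet below renders the format for a BaseOS with `text/template` and falls back to a default when the entry is missing or empty. `RenderAMIName`, `nameData`, and the choice of the Amazon Linux 2 pattern as the default are assumptions for illustration, and only a subset of the map is reproduced.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// defaultBaseOS is an assumption: the issue does not say which pattern
// is "the default" that gets returned for empty entries.
const defaultBaseOS = "amazon-2"

// A subset of the name formats shown above.
var eksAMINameFormats = map[string]string{
	"amazon-2":     "amazon-eks-node-{{.K8sVersion}}-v*",
	"ubuntu-20.04": "ubuntu-eks/k8s_{{.K8sVersion}}/images/*20.04*",
	"centos-7":     "", // unavailable for EKS, falls back to the default
}

// nameData holds the values exposed to the templates.
type nameData struct {
	K8sVersion string
}

// RenderAMIName returns the AMI name pattern for baseOS and k8sVersion,
// using the default format when baseOS has no (or an empty) entry.
func RenderAMIName(baseOS, k8sVersion string) (string, error) {
	format, ok := eksAMINameFormats[baseOS]
	if !ok || format == "" {
		format = eksAMINameFormats[defaultBaseOS]
	}
	tmpl, err := template.New("ami-name").Parse(format)
	if err != nil {
		return "", fmt.Errorf("parsing name format for %q: %w", baseOS, err)
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, nameData{K8sVersion: k8sVersion}); err != nil {
		return "", fmt.Errorf("rendering name format for %q: %w", baseOS, err)
	}
	return buf.String(), nil
}

func main() {
	for _, baseOS := range []string{"ubuntu-20.04", "centos-7"} {
		name, err := RenderAMIName(baseOS, "1.22")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", baseOS, name)
	}
}
```

The rendered string keeps its wildcards, so it can be passed on as the name filter of an image search.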


luthermonson commented 2 years ago

Related Issues

Skarlso commented 2 years ago

The proposal looks reasonable to me, but I'm not very familiar with the AMI lookup code at the moment. I will take a look during code review accordingly.

That said, I like this approach. If I understand it correctly, it adds more flexibility to how we use the AMI lookup and makes it much easier to add support for other types.

richardcase commented 2 years ago

/triage accepted
/priority important-soon
/milestone v1.6.0
/area provider/eks

Skarlso commented 2 years ago

Heh, turns out there is already an ancient issue about something like this: https://github.com/kubernetes-sigs/cluster-api-provider-aws/issues/1869

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 1 year ago

This issue is labeled with priority/important-soon but has not been updated in over 90 days, and should be re-triaged. Important-soon issues must be staffed and worked on either currently, or very soon, ideally in time for the next release.

You can:

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

k8s-ci-robot commented 1 year ago

This issue is currently awaiting triage.

If CAPA/CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten