kubernetes / kops

Kubernetes Operations (kOps) - Production Grade k8s Installation, Upgrades and Management
https://kops.sigs.k8s.io/
Apache License 2.0

Upgrade to Karpenter 0.32 and v1beta1 #16143

Open rifelpet opened 9 months ago

rifelpet commented 9 months ago

/kind feature

1. Describe IN DETAIL the feature/behavior/change you would like to see.

Karpenter v0.32 was released with a v1beta1 API that has significant changes from the v1alpha APIs.

There is a migration guide that covers the new CRDs.

2. Feel free to provide a design supporting your feature request.

It looks like all CRD API fields used in kops' template have a 1:1 translation to new fields. At the very least we'll need to enable pruning to clean up the old custom resources. I'm not sure if pruning will work for both custom resources and their CRDs.

The aws.enableENILimitedPodDensity that we currently set has been removed:

> The aws.enablePodENI was dropped since Karpenter will now always assume that vpc.amazonaws.com/pod-eni resource exists. The aws.enableENILimitedPodDensity was dropped since you can now override the --max-pods value for kubelet in the spec.kubelet.maxPods for NodeClaims or NodeClaimTemplates

It's not clear what that means exactly and whether kops becomes responsible for tracking the max pods for each instance type.

rifelpet commented 9 months ago

Pruning the old CRDs will be a challenge because normally we skip CRD pruning:

https://github.com/kubernetes/kops/blob/master/upup/pkg/fi/cloudup/bootstrapchannelbuilder/pruning.go

rifelpet commented 9 months ago

/kind office-hours

rifelpet commented 9 months ago

Decision from office hours:

  1. Upgrade to v0.31.3 (the last pre-beta version that can be rolled back to if the beta upgrade fails) and cherry-pick that to kops 1.28.
  2. Upgrade to the latest Karpenter and include both alpha and beta CRDs. Mention in the kops 1.29 release notes that Karpenter users must first upgrade to the kops 1.28 release that includes v0.31.3 before upgrading to kops 1.29.
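The ordering in this decision could be expressed as a simple guard: a cluster whose installed Karpenter is older than v0.31.3 needs the intermediate step first. The Go sketch below assumes the installed version is available as a plain `major.minor.patch` string; the function names are hypothetical and nothing like this exists in kops today.

```go
package main

import (
	"fmt"
	"strings"
)

type version struct{ major, minor, patch int }

// parseVersion accepts "0.31.3" or "v0.31.3".
func parseVersion(s string) (version, error) {
	var v version
	_, err := fmt.Sscanf(strings.TrimPrefix(s, "v"), "%d.%d.%d", &v.major, &v.minor, &v.patch)
	return v, err
}

// less reports whether v sorts strictly before o.
func (v version) less(o version) bool {
	if v.major != o.major {
		return v.major < o.major
	}
	if v.minor != o.minor {
		return v.minor < o.minor
	}
	return v.patch < o.patch
}

// lastPreBeta is the rollback-safe version named in the decision above.
var lastPreBeta = version{0, 31, 3}

// needsIntermediateUpgrade reports whether the installed Karpenter is
// older than v0.31.3 and should be upgraded to it before moving to beta.
func needsIntermediateUpgrade(installed string) (bool, error) {
	v, err := parseVersion(installed)
	if err != nil {
		return false, fmt.Errorf("parsing Karpenter version %q: %w", installed, err)
	}
	return v.less(lastPreBeta), nil
}

func main() {
	for _, s := range []string{"v0.30.0", "v0.31.3", "v0.32.1"} {
		needs, err := needsIntermediateUpgrade(s)
		fmt.Println(s, needs, err)
	}
}
```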

We can optionally migrate our manifest template to use the new custom resources or delay this until a later kops release. Karpenter claims it supports both alpha and beta APIs for now but will drop alpha at some point in the future.

rifelpet commented 9 months ago

Kops depends on using externally-managed LaunchTemplates in Karpenter's AWSNodeTemplate (v1alpha1) or EC2NodeClass (v1beta1):

https://github.com/kubernetes/kops/blob/62e2d5ac7a979c365796f52801a61034fe1e9cbf/upup/models/cloudup/resources/addons/karpenter.sh/k8s-1.19.yaml.template#L1799-L1807

This allows kops to manage the LaunchTemplates based on instance group definitions and provide those to Karpenter.

Support for externally-managed LaunchTemplates has been removed, so we'll need to decide how to proceed.

@olemarkus you had strong opinions about this originally, any ideas?

olemarkus commented 9 months ago

Has it actually been removed now? Sad, since none of the limitations mentioned in the RFC apply to clusters maintained by installers such as kOps.

In order to support karpenter-managed launch templates we need to inject the kOps user data into the Karpenter CRs and then leave all of the node lifecycle up to Karpenter. I am guessing that, for example, rolling updates need to ignore Karpenter IGs and instead rely on Karpenter's mechanisms for that.

rifelpet commented 9 months ago

> Has it actually been removed now? Sad, since none of the limitations mentioned in the RFC apply to clusters maintained by installers such as kOps.

The docs mention that only v0.32.X supports both apiVersions and CRDs:

> Having different Kind names for v1alpha5 and v1beta1 allows them to coexist for the same Karpenter controller for v0.32.x.

All alpha references have been removed in v0.33.0: https://github.com/kubernetes-sigs/karpenter/pull/840
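For reference, the Kind renames between the alpha and beta APIs can be captured in a small lookup table. The Go sketch below uses hypothetical names (`gvk`, `betaFor`, `translate`) and covers only the apiVersion and Kind; it does not attempt the field-level translation described in the migration guide.

```go
package main

import "fmt"

// gvk pairs an apiVersion with a Kind.
type gvk struct{ APIVersion, Kind string }

// betaFor maps each alpha Karpenter kind to its v1beta1 replacement.
var betaFor = map[gvk]gvk{
	{"karpenter.sh/v1alpha5", "Provisioner"}:          {"karpenter.sh/v1beta1", "NodePool"},
	{"karpenter.sh/v1alpha5", "Machine"}:              {"karpenter.sh/v1beta1", "NodeClaim"},
	{"karpenter.k8s.aws/v1alpha1", "AWSNodeTemplate"}: {"karpenter.k8s.aws/v1beta1", "EC2NodeClass"},
}

// translate returns the beta equivalent of an alpha kind, or false
// if the input is not one of the known alpha kinds.
func translate(old gvk) (gvk, bool) {
	beta, ok := betaFor[old]
	return beta, ok
}

func main() {
	beta, ok := translate(gvk{"karpenter.k8s.aws/v1alpha1", "AWSNodeTemplate"})
	fmt.Println(beta.APIVersion, beta.Kind, ok)
}
```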

> In order to support karpenter-managed launch templates we need to inject the kOps user data into the Karpenter CRs and then leave all of the node lifecycle up to Karpenter. I am guessing that, for example, rolling updates need to ignore Karpenter IGs and instead rely on Karpenter's mechanisms for that.

That makes sense 👍🏻

evs-ops commented 8 months ago

Any news or an ETA on when this will be supported?

douglasquintanilha commented 6 months ago

Have the kops maintainers decided whether kops will eventually support Karpenter v0.33.0+? Or is it still under discussion whether it is even possible or worthwhile to handle all the changes needed after Karpenter dropped the unmanaged launch templates option?

k8s-triage-robot commented 3 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

/lifecycle stale

k8s-triage-robot commented 2 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

/lifecycle rotten

rifelpet commented 2 months ago

/remove-lifecycle rotten