aws / karpenter-provider-aws

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.
https://karpenter.sh
Apache License 2.0

Supporting AWS Launch Template #3369

Open DanielJuravski opened 1 year ago

DanielJuravski commented 1 year ago

Tell us about your request

To create a node template, I have to use the AWSNodeTemplate resource. The thing is that this template is extremely concise and doesn't support many fields that the 'classic' AWS launch template does. I know Karpenter used to support AWS launch templates (why not anymore?), but currently I can't specify all my node's requirements in AWSNodeTemplate. Is there another way or workaround to set my launch template ID?

Thanks.

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

Are you currently working around this issue?

No

Additional Context

#3127

Attachments

No response


bwagner5 commented 1 year ago

Thanks for opening this! We're trying to make sure that all the things you'd need to configure in a LaunchTemplate are included in the AWSNodeTemplate. EFA is on our radar from #3127. Curious what the use case is for the multiple network interfaces? Generally the CNI would create interfaces for pod networking. Do you require multiple node interfaces?

DanielJuravski commented 1 year ago

@bwagner5 I use the dl1.24xlarge instance type, which supports up to 4 EFA network interfaces for better performance (machine learning training processes). Any ETA for https://github.com/aws/karpenter/issues/3127?

bwagner5 commented 1 year ago

I don't have an ETA for EFA support. I think we'd also need to support placement groups within Karpenter to support EFAs as well.

snorlaX-sleeps commented 1 year ago

@DanielJuravski - afaik launch-templates are currently still supported? we are still using them anyway

DanielJuravski commented 1 year ago

@snorlaX-sleeps which Karpenter version do you use?

snorlaX-sleeps commented 1 year ago

> @snorlaX-sleeps which Karpenter version do you use?

Currently supporting 0.22.1

DanielJuravski commented 1 year ago

@snorlaX-sleeps Can you attach an example of using launch templates?

snorlaX-sleeps commented 1 year ago

  {{- if .Values.aws_node_template_name }}
  providerRef:
    name: {{ .Values.aws_node_template_name }}
  {{- else }}
  provider:
    launchTemplate: {{ .Values.launch_template_name }}
    subnetSelector:
      kubernetes.io/cluster/{{ .Values.cluster_name }}: '*'
      Service: {{ .Values.subnet_type }}
    {{- if gt (len .Values.subnet_names) 0 }}
      Name: {{ .Values.subnet_names }}
    {{- end }}
  {{- end }}

where providerRef is for AWSNodeTemplates and provider is for launch-templates
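
For illustration, a minimal sketch of what the `provider` branch above might render to. The values (`my-launch-template`, `my-cluster`, `private`) are hypothetical placeholders rather than anything from a real cluster, and `aws_node_template_name` and `subnet_names` are assumed to be unset:

```yaml
# Hypothetical rendered output of the `provider` branch above.
# Assumes aws_node_template_name is empty, so the else-branch is taken,
# and subnet_names is empty, so the Name selector is omitted.
provider:
  launchTemplate: my-launch-template         # placeholder launch template name
  subnetSelector:
    kubernetes.io/cluster/my-cluster: '*'    # placeholder cluster_name
    Service: private                         # placeholder subnet_type
```

If `aws_node_template_name` were set instead, only the `providerRef` block would be emitted.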

DanielJuravski commented 1 year ago

@snorlaX-sleeps I can't find it here https://github.com/aws/karpenter/tree/v0.22.1/charts, can you send a link to that yaml above?

snorlaX-sleeps commented 1 year ago

@DanielJuravski the current docs only go as far back as 0.22, when AWS Node Templates are the default. Here's the provider section in the CRD. Here it is in action in some really old docs: 0.12.1 (I just randomly picked an older version).

snorlaX-sleeps commented 1 year ago

The older version allowed you to pre-define a launch template, or create one in the provisioner. Obviously, creating one in the provisioner has been replaced with AWS Node Templates. Here are some more examples. I can't link you to where I pulled that example YAML from; it is from our own internal provisioner Helm chart. However, we do use it in production for many clusters. You have to pass in a launch template and tell Karpenter how to select subnets, functionality that has also moved to AWS Node Templates.

DaspawnW commented 1 year ago

Hi,

We came across this as well. We currently have an existing Kubernetes cluster where we can easily spin up new nodes via a launch template managed in Terraform. Now we would have to provision parts of our infrastructure via Terraform and other parts via Kubernetes YAML files.

It would be really great if this were supported, as it would allow us to use Terraform for IaC and simply reference the launch template in Kubernetes.

In general, wouldn't it also mean that every time AWS adds new features to launch templates, Karpenter has to be updated as well?

kedmison commented 1 year ago

> Curious what the use-case is for the multiple network interfaces? Generally the CNI would create interfaces for pod networking. Do you require multiple node interfaces?

Sorry for jumping in mid-thread, but this question goes exactly to my use case. Telco network workloads need Multus as a meta-CNI to allow multiple interfaces to be connected to the nodes that Karpenter creates. Telco workloads are specialized and need access to multiple network interfaces, or to SR-IOV capabilities of network interfaces.

jadnaim commented 1 year ago

@bwagner5 To build on what @kedmison said: Telco workloads need multiple interfaces (two at a minimum) for most mobile network workloads. As an example, if you consider the 5G core network, the AMF, SMF and UPF will typically need multiple interfaces because they are part of multiple networks. For example, the AMF needs an interface for N1/N2 and at least one for N8/N11/N12/N14/N17/N20/N22 (to name a few :)), and it's similar for the SMF. The UPF would typically need at least 3 interfaces in 3 different networks (N3, N4 and N6, potentially N9 as well). All to say that having the ability to launch worker nodes and attach multiple interfaces per pod (typically using Multus) would be an amazing feature in Karpenter. Is there a way to do that today? Or a workaround? I use a Lambda function developed by someone else in AWS to recognize the ASG create-instance event in an EKS cluster and attach Multus interfaces to the instance, but having a more native solution in a cluster autoscaler like Karpenter would really set it apart.

snorlaX-sleeps commented 1 year ago

@DaspawnW - I am not sure I understand, Karpenter still supports using launch templates so what issue are you facing?

phantasm66 commented 1 year ago

> @DaspawnW - I am not sure I understand, Karpenter still supports using launch templates so what issue are you facing?

It does not. It was removed. See: https://github.com/aws/karpenter/blob/d1d1371ae2e1552b8fdded7d343bf24ea18bee31/designs/v1beta1-full-changelist.md#remove-speclaunchtemplate

snorlaX-sleeps commented 1 year ago

Hi @phantasm66 - afaik that is still a WIP / design document for moving the provisioner CRD API from alpha into beta. The CRD deployed / supported as part of the Helm chart still uses v1alpha5 (see the CRD here), and the provider field still exists here. We are using v0.29.0 and still have launch templates in some clusters, but I do not know when the beta API is scheduled for release. For context, our Provisioner deployment YAML looks like the one below. This is part of a Helm chart we have for it (basically the whole Helm chart):

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: {{ .Values.name }}
  namespace: {{ .Values.namespace }}
spec:
  {{- if .Values.enable_consolidation }}
  consolidation:
    enabled: true
  {{- else if .Values.enable_node_expiration }}
  ttlSecondsUntilExpired: {{ .Values.ttl_seconds_until_expired }}
  {{- else }}
  ttlSecondsAfterEmpty: {{ .Values.ttl_seconds_after_empty }}
  {{- end }}
  {{- if gt (len .Values.annotations) 0 }}
  annotations:
{{ toYaml .Values.annotations | indent 4 }}
  {{- end }}
  labels:
    env: {{ .Values.env_label }}
    {{- if gt (len .Values.additional_labels) 0 }}
{{ toYaml .Values.additional_labels | indent 4 }}
    {{- end }}
    {{- if .Values.enable_pod_sgs }}
    vpc.amazonaws.com/has-trunk-attached: "false"
    {{- end }}
  requirements:
{{ toYaml .Values.requirements | indent 2 }}
  {{- if .Values.exclude_gpu_types }}
  limits:
    resources:
      nvidia.com/gpu: 0
      amd.com/gpu: 0
      aws.amazon.com/neuron: 0
  {{- end }}
  {{- if gt (len .Values.taints) 0 }}
  taints:
{{ toYaml .Values.taints | indent 2 }}
  {{- end }}
  {{- if .Values.aws_node_template_name }}
  providerRef:
    name: {{ .Values.aws_node_template_name }}
  {{- else }}
  provider:
    launchTemplate: {{ .Values.launch_template_name }}
    subnetSelector:
      kubernetes.io/cluster/{{ .Values.cluster_name }}: '*'
      Service: {{ .Values.subnet_type }}
    {{- if gt (len .Values.subnet_names) 0 }}
      Name: {{ .Values.subnet_names }}
    {{- end }}
  {{- end }}

benjimin commented 9 months ago

Launch templates and ASGs support using an SSM parameter name for the AMI field. This makes it easy to engineer systems that automate new AMI releases (and rollbacks). For example, there is already an Amazon SSM parameter that identifies the current recommended version of the stock EKS node AMI available in a particular region. (AWS has even promoted using Amazon EventBridge and EC2 Image Builder, their proprietary Packer alternative, to have SSM parameter updates trigger rebuilds of a custom derived AMI.)

If Karpenter doesn't support launch templates (and doesn't use ASGs), then does it permit specifying the AMI version indirectly (via an SSM parameter), or have some other mechanism to dynamically update the AMI without explicitly reconfiguring the Karpenter deployment for each AMI release?

jmdeal commented 9 months ago

Currently Karpenter has AMI selector terms. Karpenter will discover all AMIs that match the terms and select the latest one. There is an open request for custom SSM alias support (#3657), but it is not currently supported.
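
As a rough illustration of AMI selector terms, here is a minimal sketch of a v1beta1 `EC2NodeClass`. The resource name, role name, tag values, and AMI name pattern are hypothetical placeholders, and the available fields should be checked against the Karpenter version in use:

```yaml
# Sketch only: all names and tag values below are placeholders.
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: example
spec:
  amiFamily: AL2
  role: KarpenterNodeRole-my-cluster       # placeholder IAM role name
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster # placeholder discovery tag value
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  # AMIs matching any of these terms are discovered; per the comment above,
  # the latest matching AMI is selected.
  amiSelectorTerms:
    - tags:
        environment: prod                  # placeholder tag on custom AMIs
    - name: my-custom-ami-*                # placeholder AMI name pattern
```

With a setup like this, publishing a new AMI that carries the matching tag or name would be picked up without editing the resource, which covers part of the SSM parameter use case.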

jojonium commented 8 months ago

We have a use case where we have a launch template with a lengthy userdata script managed by Terraform. We used to be able to use v1alpha5 Provisioner's spec.provider.launchTemplate but it seems like the option to use an external launch template has been removed in the new NodePool/EC2NodeClass kinds.

I would like to see the return of an option to use unmanaged launch templates, or at least some way to reference an externally managed userdata script, instead of needing to include the entire thing as an inline string in the NodePool.

maxsargentdev commented 6 months ago

I would like to be able to disable hyperthreading on a node launched by Karpenter, via the EC2NodeClass resource.

Using Terraform you can create AWS launch templates that disable this feature:

https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/launch_template#cpu-options

I have achieved this previously with Karpenter using a custom launch template and a provisioner.

I am reposting this from another issue as suggested by a maintainer, plus one for this work 👍

mlschindler commented 4 months ago

We also have a use case in our org for supporting Launch Templates which enable CPU Options, specifically AMD-SEV-SNP for Attestation/Memory Encryption.

BigValen commented 3 months ago

We use launch templates for a number of things, like the cluster join token, which you need an external solution for, if you aren't running in EKS.

scotthesterberg commented 1 month ago

Dropping the ability to provide a launch template has broken things for us as well. Our AWS region does not allow configuring LaunchTemplate metadataOptions. Previously you could provide metadataOptions: {} under spec in AWSNodeTemplates, which fixed this issue. This is no longer possible, so Karpenter is completely broken for us. Either all AWS regions need to be fully supported (a tall order), or additional customization needs to be restored in EC2NodeClass. I recommend that those who can pressure their AWS support reps, and hopefully we can get through to the Karpenter project owners.
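
For reference, the earlier workaround described above would have looked roughly like the following sketch of a v1alpha1 `AWSNodeTemplate`. The resource name and selector tag values are hypothetical placeholders:

```yaml
# Sketch only: names and tag values are placeholders.
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: example
spec:
  subnetSelector:
    karpenter.sh/discovery: my-cluster   # placeholder discovery tag value
  securityGroupSelector:
    karpenter.sh/discovery: my-cluster
  # Empty metadataOptions, as described in the comment above.
  metadataOptions: {}
```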

adawalli commented 1 month ago

@bwagner5 - any comments on this? This also is an issue for us as there are certain features like AMD-SEV-SNP nodes that we can't use with Karpenter right now