eksctl-io / eksctl

The official CLI for Amazon EKS
https://eksctl.io

[Bug] preBootstrapCommands is not working in AL2023 #7903

Open xiangyanw opened 1 month ago

xiangyanw commented 1 month ago

What were you trying to accomplish?

I want to mount a data volume on an EKS node running AL2023 by using preBootstrapCommands.

What happened?

I configured preBootstrapCommands for a managed nodegroup in EKS version 1.30, but those commands were not added to the userdata.

Here is my preBootstrapCommands:

    preBootstrapCommands:
      - "sudo mkfs.xfs /dev/nvme1n1; sudo mkdir -p /var/lib/containerd ;sudo echo /dev/nvme1n1 /var/lib/containerd xfs defaults,noatime 1 2 >> /etc/fstab"
      - "sudo mount -a"

Here is the resulting userdata in the launchtemplate:

    MIME-Version: 1.0
    Content-Type: multipart/mixed; boundary=78e7aff85774192583069ede05ed2bd166f9168b5ca780bcb90184ac8c40

    --78e7aff85774192583069ede05ed2bd166f9168b5ca780bcb90184ac8c40
    Content-Type: text/x-shellscript
    Content-Type: charset="us-ascii"

    #!/bin/bash

    set -o errexit
    set -o pipefail
    set -o nounset

    touch /run/xtables.lock

    --78e7aff85774192583069ede05ed2bd166f9168b5ca780bcb90184ac8c40--
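For anyone who wants to check their own launch template the same way, here is a minimal Python sketch that parses decoded multipart userdata and reports which configured commands are absent. The function names `userdata_parts` and `missing_commands` are made up for illustration and are not part of eksctl. The sketch assumes the userdata has already been base64-decoded into a string.

```python
import email


def userdata_parts(userdata: str) -> list[str]:
    """Return the text body of every non-multipart part in MIME userdata."""
    msg = email.message_from_string(userdata)
    bodies = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # container part; its children are visited by walk()
        payload = part.get_payload()
        if isinstance(payload, str):
            bodies.append(payload)
    return bodies


def missing_commands(userdata: str, commands: list[str]) -> list[str]:
    """Return the commands that do not appear verbatim in any userdata part."""
    combined = "\n".join(userdata_parts(userdata))
    return [cmd for cmd in commands if cmd not in combined]
```

Run against the userdata above, `missing_commands` would return both entries of preBootstrapCommands, since the only script part contains just the `set -o` lines and the `touch`.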

How to reproduce it?

Use the following YAML to create a nodegroup for EKS 1.30. Execute command: eksctl create ng -f xxx.yaml

  - name: nodegroup
    instanceType: c6a.large
    minSize: 0
    desiredCapacity: 1
    maxSize: 2
    volumeSize: 30
    volumeType: 'gp3'
    privateNetworking: true
    preBootstrapCommands:
      - "sudo mkfs.xfs /dev/nvme1n1; sudo mkdir -p /var/lib/containerd ;sudo echo /dev/nvme1n1 /var/lib/containerd xfs defaults,noatime 1 2 >> /etc/fstab"
      - "sudo mount -a"
    additionalVolumes:
      - volumeName: '/dev/xvdb' # required
        volumeSize: 50
        volumeType: 'gp3'

Logs

    2024-07-29 03:13:13 [ℹ] nodegroup "xxxx-nodegroup" will use "" [AmazonLinux2023/1.30]
    2024-07-29 03:13:13 [ℹ] nodegroup "nodegroup" will use "" [AmazonLinux2023/1.30]
    2024-07-29 03:13:17 [ℹ] 1 existing nodegroup(s) (xxxx-nodegroup) will be excluded
    2024-07-29 03:13:17 [ℹ] 1 nodegroup (nodegroup) was included (based on the include/exclude rules)
    2024-07-29 03:13:17 [ℹ] will create a CloudFormation stack for each of 1 managed nodegroups in cluster "xxxx"
    2024-07-29 03:13:17 [ℹ]
    2 sequential tasks: { fix cluster compatibility, 1 task: { 1 task: { create managed nodegroup "nodegroup" } } }
    2024-07-29 03:13:17 [ℹ] checking cluster stack for missing resources
    2024-07-29 03:13:19 [ℹ] cluster stack has all required resources
    2024-07-29 03:13:21 [ℹ] building managed nodegroup stack "eksctl-xxxx-nodegroup-nodegroup"
    2024-07-29 03:13:22 [ℹ] deploying stack "eksctl-xxxx-nodegroup-nodegroup"
    2024-07-29 03:13:22 [ℹ] waiting for CloudFormation stack "eksctl-xxxx-nodegroup-nodegroup"
    2024-07-29 03:13:53 [ℹ] waiting for CloudFormation stack "eksctl-xxxx-nodegroup-nodegroup"
    2024-07-29 03:14:44 [ℹ] waiting for CloudFormation stack "eksctl-xxxx-nodegroup-nodegroup"
    2024-07-29 03:16:22 [ℹ] waiting for CloudFormation stack "eksctl-xxxx-nodegroup-nodegroup"
    2024-07-29 03:16:22 [ℹ] no tasks
    2024-07-29 03:16:22 [✔] created 0 nodegroup(s) in cluster "xxxx"
    2024-07-29 03:16:22 [✔] created 1 managed nodegroup(s) in cluster "xxxx"
    2024-07-29 03:16:24 [ℹ] checking security group configuration for all nodegroups
    2024-07-29 03:16:24 [ℹ] all nodegroups have up-to-date cloudformation templates

Anything else we need to know?

The same configuration works as expected when I use the AL2 AMI in the same cluster:

  - name: nodegroup2
    amiFamily: AmazonLinux2
    instanceType: c6a.large
    minSize: 0
    desiredCapacity: 1
    maxSize: 2
    volumeSize: 30
    volumeType: 'gp3'
    privateNetworking: true
    preBootstrapCommands:
      - "sudo mkfs.xfs /dev/nvme1n1; sudo mkdir -p /var/lib/containerd ;sudo echo /dev/nvme1n1 /var/lib/containerd xfs defaults,noatime 1 2 >> /etc/fstab"
      - "sudo mount -a"
    additionalVolumes:
      - volumeName: '/dev/xvdb' # required
        volumeSize: 50
        volumeType: 'gp3'

Versions

eksctl version: 0.187.0
kubectl version: v1.24.0
OS: linux

cPu1 commented 1 month ago

preBootstrapCommands is not supported for AL2023 nodegroups. This validation exists for self-managed nodegroups but is missing for managed nodegroups, so create nodegroup silently ignores that field rather than failing early with an error. We'll work on a fix soon.
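To make the "fail early" behavior concrete, here is a rough Python sketch of the kind of check described above. It is not eksctl's actual code (eksctl is written in Go); the function name `validate_managed_nodegroup` and the dict-based config are hypothetical. It assumes that a nodegroup without an explicit amiFamily resolves to AmazonLinux2023, as the logs in this issue show.

```python
# Assumption: an omitted amiFamily resolves to AL2023, as seen in this issue's logs.
DEFAULT_AMI_FAMILY = "AmazonLinux2023"


def validate_managed_nodegroup(nodegroup: dict) -> None:
    """Raise ValueError instead of silently dropping unsupported fields."""
    ami_family = nodegroup.get("amiFamily", DEFAULT_AMI_FAMILY)
    commands = nodegroup.get("preBootstrapCommands") or []
    if ami_family == "AmazonLinux2023" and commands:
        name = nodegroup.get("name", "<unnamed>")
        raise ValueError(
            f"nodegroup {name!r}: preBootstrapCommands is not supported "
            f"for {ami_family} nodegroups"
        )
```

With a check like this, the config from the report would be rejected before any CloudFormation stack is created, while the `amiFamily: AmazonLinux2` variant would pass.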

xiangyanw commented 1 month ago

What is the alternative if preBootstrapCommands is not supported for AL2023?

oekarlsson commented 4 weeks ago

> What is the alternative if preBootstrapCommands is not supported for AL2023?

I agree, what should we use instead? Perhaps the question should be: are there any plans to make something more or less equivalent to preBootstrapCommands available for AL2023? This is the one thing that stops us from using AL2023.

OlGe404 commented 3 weeks ago

We NEED preBootstrapCommands to work because we rely on it to install custom CA certificates so that nodes can pull container images from a private container registry.

jamieavins commented 3 weeks ago

> preBootstrapCommands is not supported for AL2023 nodegroups. This validation exists for self-managed nodegroups but is missing for managed nodegroups, so create nodegroup silently ignores that field rather than failing early with an error. We'll work on a fix soon.

AL2023 is now the default, so please understand this is going to affect a lot of customers without them even realizing it.