Closed flowinh2o closed 11 months ago
Does this example help at all? https://github.com/awslabs/data-on-eks/blob/8b756fce86c18c1a2c71b3d98d6db759c49b1904/ai-ml/trainium-inferentia/eks.tf#L179
closing with the example provided above
Description
When attempting to set up a managed node group containing an instance type that supports multiple NICs, such as a p4d.24xlarge, the launch template is set up incorrectly, resulting in nodes being unable to start.
Versions
Module version: 19.15.3
Terraform version: 1.5.0
Provider version(s): Terraform v1.5.0 on darwin_arm64
provider registry.terraform.io/hashicorp/aws v5.3.0
provider registry.terraform.io/hashicorp/cloudinit v2.3.2
provider registry.terraform.io/hashicorp/kubernetes v2.21.1
provider registry.terraform.io/hashicorp/time v0.9.1
provider registry.terraform.io/hashicorp/tls v4.0.4
Reproduction Code
I am using the https://github.com/terraform-aws-modules/terraform-aws-eks/tree/v19.15.3/examples/eks_managed_node_group example and have replaced all node groups with this config
Steps to reproduce the behavior:
Run the example above and then try to scale up the node group.
Expected behavior
The instance should be able to start up.
Actual behavior
Unable to launch an instance due to incorrect NIC configuration in the launch template.
Additional context
Here is a screenshot of what the network cards look like with the incorrect index
And for reference, here is what a working configuration looks like using eksctl, which supports EFA and multiple NICs.
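For readers without the screenshots, here is a rough sketch of what a per-card NIC layout might look like in the module's own terms. It assumes the node group's `network_interfaces` input is passed through to the launch template, and it follows the commonly documented EFA pattern: the primary interface uses device index 0 on network card 0, and each additional card gets one interface at device index 1. The group name `efa_example` and the sizes are hypothetical, and the four entries assume the four network cards of a p4d.24xlarge. This has not been verified against module version 19.15.3.

```hcl
# Fragment of the module "eks" block; other required inputs are omitted.
eks_managed_node_groups = {
  # Hypothetical group name and sizes, for illustration only.
  efa_example = {
    instance_types = ["p4d.24xlarge"]
    min_size       = 0
    max_size       = 1
    desired_size   = 0

    # Assumption: one EFA interface per network card.
    # Card 0 carries the primary interface (device_index = 0);
    # the remaining cards each use device_index = 1.
    network_interfaces = [
      { device_index = 0, network_card_index = 0, interface_type = "efa", delete_on_termination = true },
      { device_index = 1, network_card_index = 1, interface_type = "efa", delete_on_termination = true },
      { device_index = 1, network_card_index = 2, interface_type = "efa", delete_on_termination = true },
      { device_index = 1, network_card_index = 3, interface_type = "efa", delete_on_termination = true },
    ]
  }
}
```

After applying, the network card index shown for each interface in the generated launch template can be compared with the working eksctl configuration.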