aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

(aws-eks): Neuron device plugin is not installed when instance type is Trainium #29131

Closed: freschri closed this issue 8 months ago

freschri commented 9 months ago

Describe the bug

When the instance type is Trainium, the Neuron device plugin is not installed, although it should be.

Expected Behavior

When the instance type is Trainium, the Neuron device plugin is installed.

Current Behavior

When the instance type is Trainium, the Neuron device plugin is NOT installed.

Reproduction Steps

Add a node group that uses a Trainium instance type (for example, via addNodegroupCapacity).

Possible Solution

No response

Additional Information/Context

Instance types of the Trainium family were recently added here: https://github.com/aws/aws-cdk/blame/main/packages/aws-cdk-lib/aws-ec2/lib/instance-types.ts

But packages/aws-cdk-lib/aws-eks/lib/instance-types.ts does not include them:

    export const INSTANCE_TYPES = {
      gpu: ['p2', 'p3', 'g2', 'g3', 'g4'],
      inferentia: ['inf1', 'inf2'],
      graviton: ['a1'],
      graviton2: ['c6g', 'm6g', 'r6g', 't4g'],
      graviton3: ['c7g'],
    };

As a result, the check in packages/aws-cdk-lib/aws-eks/lib/cluster.ts does not recognize Trainium instance types, and the plugin is not installed:

    function nodeTypeForInstanceType(instanceType: ec2.InstanceType) {
      return INSTANCE_TYPES.gpu.includes(instanceType.toString().substring(0, 2)) ? NodeType.GPU :
        INSTANCE_TYPES.inferentia.includes(instanceType.toString().substring(0, 4)) ? NodeType.INFERENTIA :
          NodeType.STANDARD;
    }

    public addNodegroupCapacity(id: string, options?: NodegroupOptions): Nodegroup {
      const hasInferentiaInstanceType = [
        options?.instanceType,
        ...options?.instanceTypes ?? [],
      ].some(i => i && nodeTypeForInstanceType(i) === NodeType.INFERENTIA);

      if (hasInferentiaInstanceType) {
        this.addNeuronDevicePlugin();
      }
      ...
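One possible direction for a fix, shown as a standalone sketch rather than the actual CDK change: add a Trainium entry to the lookup table and treat it like Inferentia when deciding whether to install the plugin. The `trainium` key, the `NodeType.TRAINIUM` member, the enum string values, and the `needsNeuronDevicePlugin` helper are all hypothetical names introduced for illustration, and instance types are plain strings here instead of `ec2.InstanceType` so the snippet runs on its own.

```typescript
// Standalone sketch; not the code in aws-cdk-lib.
const INSTANCE_TYPES = {
  gpu: ['p2', 'p3', 'g2', 'g3', 'g4'],
  inferentia: ['inf1', 'inf2'],
  // Assumption: hypothetical new entry. 'trn1' also covers 'trn1n',
  // because only the first four characters are compared below.
  trainium: ['trn1'],
};

enum NodeType {
  STANDARD = 'Standard',
  GPU = 'GPU',
  INFERENTIA = 'INFERENTIA',
  TRAINIUM = 'TRAINIUM', // hypothetical member, not in the snippet quoted above
}

// Same prefix-matching approach as the quoted function, with one extra branch.
function nodeTypeForInstanceType(instanceType: string): NodeType {
  if (INSTANCE_TYPES.gpu.includes(instanceType.substring(0, 2))) {
    return NodeType.GPU;
  }
  if (INSTANCE_TYPES.inferentia.includes(instanceType.substring(0, 4))) {
    return NodeType.INFERENTIA;
  }
  if (INSTANCE_TYPES.trainium.includes(instanceType.substring(0, 4))) {
    return NodeType.TRAINIUM;
  }
  return NodeType.STANDARD;
}

// Hypothetical helper: true if any requested instance type needs the Neuron device plugin.
function needsNeuronDevicePlugin(instanceTypes: Array<string | undefined>): boolean {
  return instanceTypes.some(i => {
    if (!i) return false;
    const nodeType = nodeTypeForInstanceType(i);
    return nodeType === NodeType.INFERENTIA || nodeType === NodeType.TRAINIUM;
  });
}
```

In `addNodegroupCapacity`, a check along the lines of `needsNeuronDevicePlugin([options?.instanceType?.toString(), ...])` could then replace the Inferentia-only test, so that Trainium node groups also trigger `this.addNeuronDevicePlugin()`.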

CDK CLI Version

2.128.0

Framework Version

No response

Node.js Version

v21.6.1

OS

macOS Sonoma 14.3

Language

TypeScript

Language Version

No response

Other information

No response

pahud commented 9 months ago

Yeah, we could add it to the instance types. We welcome any PRs for this.
