Open rantoniuk opened 5 months ago
@rantoniuk Good afternoon, and thanks for opening the issue. The error is probably thrown here. Please refer to the Clusters section of the Amazon ECS Construct Library README, which states: "To use LaunchTemplate with AsgCapacityProvider, make sure to specify the userData in the LaunchTemplate." Does the error go away once you explicitly specify userData in the 2nd LaunchTemplate (as you did in the 1st LaunchTemplate)?
We also have an open issue, https://github.com/aws/aws-cdk/issues/26035#issuecomment-1600839939, to improve the error message when user data is missing from the launch template; however, we don't have an ETA as of now.
Thanks, Ashish
Yes.
If you look at the stack trace, it fails at this method:
AutoScalingGroup.addUserData
message: The provided launch template does not expose its user data.
And if you check here:
If launchTemplate is provided, it has to have a userData attribute.
Looking at your launchTemplateInf1, it's obviously missing the userData:
const launchTemplateInf1 = new ec2.LaunchTemplate(this, 'EcsClusterInf1', {
  machineImage: ec2.MachineImage.genericLinux({
    // aws ssm get-parameters --names /aws/service/ecs/optimized-ami/amazon-linux-2023/neuron/recommended
    'us-west-2': 'ami-00a3a4671e9889e76',
  }),
  instanceType: new ec2.InstanceType('inf1.2xlarge'),
  role: ltRole,
  securityGroup: gpuinstanceSecurityGroup,
  // blockDevices: [rootVolume],
  requireImdsv2: true,
});
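To make the failure mode concrete, here is a minimal sketch of the guard in plain TypeScript. It is a simplified model, not the aws-cdk-lib source: the names UserDataModel, LaunchTemplateModel and the sample cluster name are hypothetical, and only the error message is taken from the stack trace above. The idea is that the capacity provider needs to append commands to the instance user data, and there is nothing to append to when the launch template exposes none.

```typescript
// Simplified, hypothetical model of the check; these types are NOT aws-cdk-lib APIs.
interface UserDataModel {
  commands: string[];
}

interface LaunchTemplateModel {
  userData?: UserDataModel;
}

// Appends commands to the template's user data, failing with the message
// from the stack trace when the template exposes no user data at all.
function addUserData(template: LaunchTemplateModel, ...commands: string[]): void {
  if (template.userData === undefined) {
    throw new Error('The provided launch template does not expose its user data.');
  }
  template.userData.commands.push(...commands);
}

// Like the first launch template: userData was passed explicitly.
const withUserData: LaunchTemplateModel = { userData: { commands: [] } };
// Like launchTemplateInf1 above: userData was never set.
const withoutUserData: LaunchTemplateModel = {};

// 'my-cluster' is a placeholder cluster name for illustration.
addUserData(withUserData, 'echo ECS_CLUSTER=my-cluster >> /etc/ecs/ecs.config');

let failure = '';
try {
  addUserData(withoutUserData, 'echo ECS_CLUSTER=my-cluster >> /etc/ecs/ecs.config');
} catch (e) {
  failure = (e as Error).message;
}
console.log(failure);
```

In this model the fix is the same as in the real stack: give the second template a (possibly empty) user data object so there is something to append to.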
Yes, I confirm that fixes the issue:
const userDataInf1 = ec2.UserData.forLinux();
// GPU EC2 Launch Template
const launchTemplateInf1 = new ec2.LaunchTemplate(this, 'EcsClusterInf1', {
  machineImage: ec2.MachineImage.fromSsmParameter(
    '/aws/service/ecs/optimized-ami/amazon-linux-2023/neuron/recommended/image_id',
  ),
  instanceType: new ec2.InstanceType('inf1.2xlarge'),
  role: ltRole,
  userData: userDataInf1,
  securityGroup: gpuinstanceSecurityGroup,
  // blockDevices: [rootVolume],
  requireImdsv2: true,
});
However, let me ask a follow-up question: is this a CloudFormation requirement or a CDK requirement? If the latter, then I would say that instead of relying on the README, CDK should automatically add ec2.UserData.forLinux() unless otherwise defined.
Unrelated to the initial issue, but when I tried to use:
machineImage: ec2.MachineImage.genericLinux({
machineImage:
ec2.MachineImage.fromSsmParameter(
'/aws/service/ecs/optimized-ami/amazon-linux-2023/neuron/recommended',
),
}),
then CloudFormation complained that it can't find imageId. I had to use an undocumented suffix, i.e. '/aws/service/ecs/optimized-ami/amazon-linux-2023/neuron/recommended/image_id'. Maybe this is something to add to the documentation directly.
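A likely explanation for the suffix, as far as I understand the ECS-optimized AMI parameters: the .../recommended parameter holds a JSON document describing the AMI, while .../recommended/image_id holds only the AMI ID string, which is what a launch template needs. The sketch below illustrates the difference in plain TypeScript. The sample value is made up for illustration (only the AMI ID is the one quoted earlier in this thread, and image_name is a placeholder), and extractImageId and looksLikeAmiId are hypothetical helpers, not CDK or AWS SDK functions.

```typescript
// Illustrative stand-in for the value of the `.../recommended` parameter.
// Field names follow the ECS-optimized AMI metadata format as I understand it;
// the values are placeholders except the AMI ID quoted in this thread.
const recommendedValue: string = JSON.stringify({
  schema_version: 1,
  image_name: 'example-ecs-optimized-neuron-ami',
  image_id: 'ami-00a3a4671e9889e76',
});

// Hypothetical helper: pulls the AMI ID out of the JSON metadata document.
function extractImageId(parameterValue: string): string {
  const parsed: unknown = JSON.parse(parameterValue);
  if (typeof parsed === 'object' && parsed !== null && 'image_id' in parsed) {
    const imageId = (parsed as { image_id: unknown }).image_id;
    if (typeof imageId === 'string') {
      return imageId;
    }
  }
  throw new Error('Parameter value has no string image_id field');
}

// Hypothetical helper: a rough shape check, not an official validation rule.
const looksLikeAmiId = (value: string): boolean => /^ami-[0-9a-f]+$/.test(value);

// The whole JSON document is not usable as an image ID...
console.log(looksLikeAmiId(recommendedValue));
// ...but the image_id field inside it is, which is what the suffixed parameter returns directly.
console.log(looksLikeAmiId(extractImageId(recommendedValue)));
```

Under that assumption, pointing fromSsmParameter at the parameter without the suffix hands CloudFormation the whole JSON document instead of an AMI ID, which would match the complaint about imageId.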
Describe the bug
The code below works perfectly fine until the line ----- inf1, so with one gpuCapacityProvider. When trying to add an additional inf1CP capacity provider, with a new LaunchTemplate that does not mention anything about UserData, it errors out on cdk diff with:
which is specifically caused by this line:
Expected Behavior
-
Current Behavior
-
Reproduction Steps
-
Possible Solution
No response
Additional Information/Context
No response
CDK CLI Version
2.146.0 (build b368c78)
Framework Version
No response
Node.js Version
v20.13.1
OS
MacOS
Language
TypeScript
Language Version
"typescript": "~5.2.0"
Other information
No response