Closed: jagga13 closed this 3 months ago
I see the following corresponding error in the SSM agent logs on the instance, which might be a clue:
2024-07-18 01:31:08 INFO [ssm-agent-worker] [MessageService] [MGSInteractor] Got reply msg Id 95b23bb7-8d6b-45e2-825c-7ff1aedae581 for RunCommandResult aws.ssm.77068c10-b575-4d48-bd5b-69f726df3fdf.i-0ee71b5fa4a9c568b, starting reply thread
2024-07-18 01:31:08 INFO [ssm-agent-worker] [MessageService] [MGSInteractor] Got reply msg Id 4de7ad92-7487-42b1-8849-03c78b9e7c41 for RunCommandResult aws.ssm.77068c10-b575-4d48-bd5b-69f726df3fdf.i-0ee71b5fa4a9c568b, starting reply thread
2024-07-18 01:31:08 INFO [ssm-agent-worker] [MessageService] [MGSInteractor] started reply processing - 4de7ad92-7487-42b1-8849-03c78b9e7c41
2024-07-18 01:31:08 INFO [ssm-agent-worker] [MessageService] [MGSInteractor] Sending reply {
"additionalInfo": {
"agent": {
"lang": "en-US",
"name": "amazon-ssm-agent",
"os": "",
"osver": "1",
"ver": ""
},
"dateTime": "2024-07-18T01:31:08.161Z",
"runId": "",
"runtimeStatusCounts": {
"Failed": 1
}
},
"documentStatus": "Failed",
"documentTraceOutput": "",
"runtimeStatus": {
"aws:runShellScript": {
"status": "Failed",
"code": 1,
"name": "aws:runShellScript",
"output": "Waiting for Cloud-init to initialize ...\nURL 'https://ec2imagebuilder-toe-us-west-2-prod.s3.us-west-2.amazonaws.com/bootstrap_scripts/bootstrap.sh' returned HTTP status '200'\n/var/lib/amazon/ssm/i-0ee71b5fa4a9c568b/document/orchestration/77068c10-b575-4d48-bd5b-69f726df3fdf/awsrunShellScript/0.awsrunShellScript/_script.sh: line 62: /tmp/imagebuilder/TaskOrchestratorAndExecutor/bootstrap.sh: Permission denied\n{\"failureMessage\":\"Unable to bootstrap TOE\"}\n\n----------ERROR-------\nfailed to run commands: exit status 1",
"startDateTime": "2024-07-18T01:31:07.755Z",
"endDateTime": "2024-07-18T01:31:08.160Z",
"outputS3BucketName": "",
"outputS3KeyPrefix": "",
"stepName": "",
"standardOutput": "Waiting for Cloud-init to initialize ...\nURL 'https://ec2imagebuilder-toe-us-west-2-prod.s3.us-west-2.amazonaws.com/bootstrap_scripts/bootstrap.sh' returned HTTP status '200'\n/var/lib/amazon/ssm/i-0ee71b5fa4a9c568b/document/orchestration/77068c10-b575-4d48-bd5b-69f726df3fdf/awsrunShellScript/0.awsrunShellScript/_script.sh: line 62: /tmp/imagebuilder/TaskOrchestratorAndExecutor/bootstrap.sh: Permission denied\n{\"failureMessage\":\"Unable to bootstrap TOE\"}\n",
"standardError": "failed to run commands: exit status 1"
}
}
}
2024-07-18 01:31:08 INFO [ssm-agent-worker] [MessageService] [MGSInteractor] successfully sent reply message id: 4de7ad92-7487-42b1-8849-03c78b9e7c41
2024-07-18 01:31:08 INFO [ssm-agent-worker] [MessageService] [MGSInteractor] started reply processing - 95b23bb7-8d6b-45e2-825c-7ff1aedae581
2024-07-18 01:31:11 INFO [ssm-document-worker] [77068c10-b575-4d48-bd5b-69f726df3fdf] Stop the cloudwatchlogs publisher
2024-07-18 01:31:08 INFO [ssm-agent-worker] [MessageService] [MGSInteractor] Sending reply {
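Please disregard. This turned out to be a documented issue with /tmp being mounted with the noexec option, which is why running /tmp/imagebuilder/TaskOrchestratorAndExecutor/bootstrap.sh failed with "Permission denied". After fixing /tmp, I was able to build the AMI successfully.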
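For anyone who hits the same "Permission denied" on bootstrap.sh, here is a rough sketch of how to check whether a path sits on a noexec mount. It assumes a Linux mount table in the /proc/mounts format (device, mount point, filesystem type, options, ...). The function names `mount_options` and `is_noexec` are made up for this example and are not part of ParallelCluster, Image Builder, or the SSM agent.

```python
def mount_options(path, mounts_text):
    """Return the mount options of the filesystem that contains `path`.

    `mounts_text` is the text of a mount table in /proc/mounts format.
    The longest matching mount point wins; on a tie, the later entry wins,
    since a later mount shadows an earlier one at the same mount point.
    """
    best_point, best_opts = "", []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, opts = fields[1], fields[3].split(",")
        # Match on whole path components so "/tmp" does not cover "/tmpfoo".
        prefix = mount_point.rstrip("/") + "/"
        covers = path == mount_point or path.startswith(prefix)
        if covers and len(mount_point) >= len(best_point):
            best_point, best_opts = mount_point, opts
    return best_opts


def is_noexec(path, mounts_text):
    """True if `path` lives on a filesystem mounted with the noexec option."""
    return "noexec" in mount_options(path, mounts_text)
```

On the builder instance (or on the parent AMI before starting the build), you could read /proc/mounts and call `is_noexec("/tmp", contents)`. If it returns True, scripts under /tmp cannot be executed directly, which matches the failure in the log above.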
Hello,
I am trying to create a new ParallelCluster AMI based on a custom AMI. The build keeps failing even though I have granted it full IAM access to S3/KMS/SSM. Here are the details:
AWS ParallelCluster version 3.10.1
Image config:
I see the following errors in CloudFormation:
I see the following errors in CloudWatch:
The EC2 builder instance seems to come up in a healthy state, but it is terminated after the failed step above. I also can't disable rollback on failure by passing in the option, possibly because the failure happens too early in the build process. Any help would be appreciated!
Thanks!