Open laurentdroin opened 5 months ago
@laurentdroin, can you share your ClusterConfig file or CLI args used to create the cluster? Are you able to consistently reproduce this?
@cPu1 - Yes, I can reproduce consistently. Here is the command I used yesterday (I obfuscated the IDs of the private subnet and my email address):
eksctl create cluster --name ldr-test-eksctl-june-3 --region us-east-1 --version 1.29 --vpc-private-subnets subnet-XXX,subnet-XXX --tags Owner=me@address.com,blah=bleh --node-type t3.xlarge -N 3 --node-volume-size 20 --instance-prefix ldr-eksctl-june-3-
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
Commenting in order to remove the stale label.
I got this too with eksctl 0.188. Not sure if it's relevant, but I updated my IAM permissions to the latest Minimum Permissions on the doc site before trying eksctl 0.188.
@edmonl, can you share your ClusterConfig file and the output from running the command?
I got the same issue with version 0.191, using an EC2 instance running Amazon Linux 2023 with an IAM role attached.
IAM policy
AmazonEC2ContainerRegistryFullAccess
AmazonEKSClusterPolicy
AmazonEKSFargatePodExecutionRolePolicy
AmazonEKSServicePolicy
AmazonEKSVPCResourceController
AmazonEKSWorkerNodePolicy
Inline Policy
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Statement1",
      "Effect": "Allow",
      "Action": [
        "eks:*",
        "cloudformation:*",
        "iam:CreateRole",
        "iam:AttachRolePolicy",
        "iam:PassRole",
        "iam:TagRole",
        "iam:DetachRolePolicy",
        "iam:DeleteRole"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
My Cluster config yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: daniel-test
  region: ap-northeast-1
  version: "1.30"
vpc:
  subnets:
    private:
      ap-northeast-1a:
        id: "subnet-xx"
      ap-northeast-1c:
        id: "subnet-xx"
      ap-northeast-1d:
        id: "subnet-xx"
    public:
      ap-northeast-1a:
        id: "subnet-xx"
      ap-northeast-1c:
        id: "subnet-xx"
      ap-northeast-1d:
        id: "subnet-xx"
fargateProfiles:
  - name: daniel-test-fargate
    selectors:
      - namespace: daniel-test
Then I executed the command eksctl create cluster -f ./cluste.yaml
and got this error:
2024-10-09 09:19:13 [ℹ] building cluster stack "eksctl-daniel-test-cluster"
2024-10-09 09:19:13 [ℹ] deploying stack "eksctl-daniel-test-cluster"
2024-10-09 09:19:43 [ℹ] waiting for CloudFormation stack "eksctl-daniel-test-cluster"
2024-10-09 09:19:43 [✖] unexpected status "ROLLBACK_IN_PROGRESS" while waiting for CloudFormation stack "eksctl-daniel-test-cluster"
2024-10-09 09:19:43 [✖] unexpected status "ROLLBACK_IN_PROGRESS" while waiting for CloudFormation stack "eksctl-daniel-test-cluster"
2024-10-09 09:19:43 [ℹ] fetching stack events in attempt to troubleshoot the root cause of the failure
2024-10-09 09:19:43 [!] AWS::EC2::SecurityGroup/ClusterSharedNodeSecurityGroup: DELETE_IN_PROGRESS
2024-10-09 09:19:43 [!] AWS::EC2::SecurityGroup/ControlPlaneSecurityGroup: DELETE_IN_PROGRESS
2024-10-09 09:19:43 [!] AWS::IAM::Role/ServiceRole: DELETE_IN_PROGRESS
2024-10-09 09:19:43 [!] AWS::EC2::SecurityGroupIngress/IngressInterNodeGroupSG: DELETE_IN_PROGRESS
2024-10-09 09:19:43 [!] AWS::IAM::Role/FargatePodExecutionRole: DELETE_IN_PROGRESS
2024-10-09 09:19:43 [✖] AWS::EKS::Cluster/ControlPlane: CREATE_FAILED – "Attribute 'Arn' does not exist"
2024-10-09 09:19:43 [!] 1 error(s) occurred and cluster hasn't been created properly, you may wish to check CloudFormation console
2024-10-09 09:19:43 [ℹ] to cleanup resources, run 'eksctl delete cluster --region=ap-northeast-1 --name=daniel-test'
2024-10-09 09:19:43 [✖] ResourceNotReady: failed waiting for successful resource state
I tried adding the --cfn-disable-rollback flag and retrying the CloudFormation stack, and that works.
Am I missing something?
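In the log above, the only line that carries the actual cause is the CREATE_FAILED one, and it is easy to miss among the DELETE_IN_PROGRESS noise. Here is a minimal sketch in Python for pulling it out of saved eksctl output. The function name `failed_resources` is hypothetical, and the line pattern is inferred only from the log in this comment, not from eksctl's source, so it may need adjusting for other versions.

```python
import re

# Pattern inferred from the log lines in this thread (an assumption):
#   <timestamp> [marker] <ResourceType>/<LogicalId>: <STATUS> <separator> "<reason>"
FAILED_LINE = re.compile(
    r'(?P<type>AWS::[\w:]+)/(?P<logical_id>\w+): '
    r'(?P<status>\w+_FAILED)\s+\S\s+"(?P<reason>.*)"'
)


def failed_resources(log_text):
    """Return (type, logical_id, status, reason) for each *_FAILED resource line."""
    results = []
    for line in log_text.splitlines():
        match = FAILED_LINE.search(line)
        if match:
            results.append(match.group("type", "logical_id", "status", "reason"))
    return results
```

Feeding it the output above should return a single entry for AWS::EKS::Cluster/ControlPlane with the reason "Attribute 'Arn' does not exist". Lines without a quoted reason are skipped.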
I added the permissions below to the role's trust relationships:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::${account_id}:role/daniel-test",
        "Service": [
          "ec2.amazonaws.com",
          "eks-fargate-pods.amazonaws.com",
          "eks.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
Then I recreated the cluster, and that worked. Could the console output show more detailed error messages? Thanks.
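For anyone who wants to compare their own role against the trust policy above, here is a minimal sketch in Python that reports which service principals are not allowed to call sts:AssumeRole. The function name `missing_service_principals` is hypothetical, and the list of required services is simply copied from the policy in this comment; it is an assumption that all three are needed in every setup. Wildcard actions such as sts:* are not handled.

```python
import json


def missing_service_principals(trust_policy, required):
    """Return the required service principals the policy does not let assume the role."""
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]

    allowed = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRole" not in actions:
            continue
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):  # e.g. "Principal": "*"
            continue
        services = principal.get("Service", [])
        if isinstance(services, str):
            services = [services]
        allowed.update(services)

    return sorted(set(required) - allowed)


# Service principals taken from the trust policy in this comment (an assumption).
REQUIRED = ["ec2.amazonaws.com", "eks-fargate-pods.amazonaws.com", "eks.amazonaws.com"]

# Hypothetical "before" policy that only trusts EC2.
example = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"Service": "ec2.amazonaws.com"},
      "Action": "sts:AssumeRole"
    }
  ]
}
""")
print(missing_service_principals(example, REQUIRED))
```

An empty list means every listed service can already assume the role. The policy document itself can be exported from the IAM console and loaded with `json.load`.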
Attempting to create an EKS cluster with eksctl 0.179.0 fails.
It fails during the stack creation:
This seems to be a timing issue: if the eksctl command is run with the '--cfn-disable-rollback' flag to prevent the stack rollback, I can see that even though the stack creation failed, the role "ServiceRole" was created properly and its ARN is available. In fact, if I "retry" the stack, it completes and the cluster is created fine.