Closed: @StevenACoffman closed this issue 3 years ago.
@StevenACoffman we've been trying to explore this as part of #118, and #252 is currently the standing issue for this.
It should be relatively easy to add the logic here:
The zone ID (unlike the zone name) is consistent across accounts, so could we just avoid use1-az3?
I would rather avoid hard-coding that in any case, but let's try to test it and see what works.
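For reference, the name-to-ID mapping can be looked up per account. A minimal sketch (the function name is mine, not part of eksctl; assumes the AWS CLI is configured with valid credentials):

```shell
# zone_id_for AZ_NAME REGION
# Prints the account-independent zone ID (e.g. use1-az3) for an AZ name.
zone_id_for() {
  aws ec2 describe-availability-zones \
    --region "$2" --zone-names "$1" \
    --query 'AvailabilityZones[0].ZoneId' --output text
}

# e.g. zone_id_for us-east-1e us-east-1
```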
Is there any further interest in this proposal? I've found that using --zones=us-east-1a,us-east-1b,us-east-1d just happens to be the magic incantation of AZs that works reliably for me. However, something dynamic with retry logic would be ideal.
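Pending built-in support, a retry could be approximated outside eksctl. A rough sketch (the cluster name and zone lists are placeholders, and each failed attempt still needs its rolled-back stack deleted, as noted later in this thread):

```shell
# retry_with_zones ZONES [ZONES...]
# Tries 'eksctl create cluster' once per comma-separated zone list,
# cleaning up the rolled-back stack between attempts.
retry_with_zones() {
  for zones in "$@"; do
    if eksctl create cluster --name my-cluster --zones="$zones"; then
      return 0
    fi
    echo "zones $zones failed; deleting rolled-back stack and retrying" >&2
    eksctl delete cluster --name my-cluster || true
  done
  return 1
}

# e.g. retry_with_zones us-east-1a,us-east-1b,us-east-1d us-east-1a,us-east-1c,us-east-1f
```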
Hi @mrichman, did you find a way to use --zones in the YAML file? It looks like the option is not allowed there.
I added a "zones" field to the YAML file below, but I got this error: json: unknown field "zones".
https://github.com/weaveworks/eksctl/blob/master/examples/01-simple-cluster.yaml
metadata:
  name: cluster-1
  region: us-east-1
  zones: ["us-east-1a", "us-east-1b"]
Thank you for your help.
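For what it's worth, in the v1alpha5 schema availability zones are not set under metadata; they go at the top level of the config. A hedged sketch (double-check against the eksctl docs for your version):

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: cluster-1
  region: us-east-1
availabilityZones: ["us-east-1a", "us-east-1b"]
```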
It would be nice to see this retry feature. I hit this problem about once a week spinning up sandbox clusters for testing. It's quite annoying because you have to wait for the stack to roll back, then go manually delete the rolled-back stack.
If I understand correctly, @StevenACoffman's trick (i.e. aws ec2 describe-reserved-instances-offerings) addresses a different problem: not being able to create or expand ASGs in an existing cluster. Would it necessarily resolve the issue @rdubya16 raises about not being able to create the cluster in the first place (which I am also seeing)? Here is my error:
AWS::EKS::Cluster/ControlPlane: CREATE_FAILED – "Cannot create cluster 'my-cluster' because us-east-1e, the targeted availability zone, does not currently have sufficient capacity to support the cluster. Retry and choose from these availability zones: us-east-1a, us-east-1b, us-east-1c, us-east-1d, us-east-1f (Service: AmazonEKS; Status Code: 400; Error Code: UnsupportedAvailabilityZoneException; Request ID: 9783591e-a9f4-4511-b142-fcd8ba0f08a7)"
(I did not specify availability zones when calling eksctl create cluster.)
Is there currently a workaround for the cluster creation issue?
Hi,
I am still facing similar issues.
cloud_user:~/eks $ eksctl version
[ℹ] version.Info{BuiltAt:"", GitCommit:"", GitTag:"0.11.1"}
cloud_user:~/eks $ cat cluster.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: basic-cluster
  region: us-east-1
nodeGroups:
  - name: ng-1
    instanceType: t2.micro
    desiredCapacity: 2
  - name: ng-2
    instanceType: t2.micro
    desiredCapacity: 2
cloud_user:~/eks $ eksctl create cluster -f cluster.yaml
[ℹ] eksctl version 0.11.1
[ℹ] using region us-east-1
[ℹ] setting availability zones to [us-east-1b us-east-1e]
[ℹ] subnets for us-east-1b - public:192.168.0.0/19 private:192.168.64.0/19
[ℹ] subnets for us-east-1e - public:192.168.32.0/19 private:192.168.96.0/19
[ℹ] nodegroup "ng-1" will use "ami-0392bafc801b7520f" [AmazonLinux2/1.14]
[ℹ] nodegroup "ng-2" will use "ami-0392bafc801b7520f" [AmazonLinux2/1.14]
[ℹ] using Kubernetes version 1.14
[ℹ] creating EKS cluster "basic-cluster" in "us-east-1" region with un-managed nodes
[ℹ] 2 nodegroups (ng-1, ng-2) were included (based on the include/exclude rules)
[ℹ] will create a CloudFormation stack for cluster itself and 2 nodegroup stack(s)
[ℹ] will create a CloudFormation stack for cluster itself and 0 managed nodegroup stack(s)
[ℹ] if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=us-east-1 --cluster=basic-cluster'
[ℹ] CloudWatch logging will not be enabled for cluster "basic-cluster" in "us-east-1"
[ℹ] you can enable it with 'eksctl utils update-cluster-logging --region=us-east-1 --cluster=basic-cluster'
[ℹ] Kubernetes API endpoint access will use default of {publicAccess=true, privateAccess=false} for cluster "basic-cluster" in "us-east-1"
[ℹ] 2 sequential tasks: { create cluster control plane "basic-cluster", 2 parallel sub-tasks: { create nodegroup "ng-1", create nodegroup "ng-2" } }
[ℹ] building cluster stack "eksctl-basic-cluster-cluster"
[ℹ] deploying stack "eksctl-basic-cluster-cluster"
[✖] unexpected status "ROLLBACK_IN_PROGRESS" while waiting for CloudFormation stack "eksctl-basic-cluster-cluster"
[ℹ] fetching stack events in attempt to troubleshoot the root cause of the failure
[✖] AWS::EC2::SubnetRouteTableAssociation/RouteTableAssociationPrivateUSEAST1E: CREATE_FAILED – "Resource creation cancelled"
[✖] AWS::EC2::NatGateway/NATGateway: CREATE_FAILED – "Resource creation cancelled"
[✖] AWS::EC2::SubnetRouteTableAssociation/RouteTableAssociationPublicUSEAST1B: CREATE_FAILED – "Resource creation cancelled"
[✖] AWS::EC2::SubnetRouteTableAssociation/RouteTableAssociationPrivateUSEAST1B: CREATE_FAILED – "Resource creation cancelled"
[✖] AWS::EC2::SubnetRouteTableAssociation/RouteTableAssociationPublicUSEAST1E: CREATE_FAILED – "Resource creation cancelled"
[✖] AWS::EKS::Cluster/ControlPlane: CREATE_FAILED – "Cannot create cluster 'basic-cluster' because us-east-1e, the targeted availability zone, does not currently have sufficient capacity to support the cluster. Retry and choose from these availability zones: us-east-1a, us-east-1b, us-east-1c, us-east-1d, us-east-1f (Service: AmazonEKS; Status Code: 400; Error Code: UnsupportedAvailabilityZoneException; Request ID: be715977-a421-4ad6-9dba-b4e907ca1ce8)"
[ℹ] 1 error(s) occurred and cluster hasn't been created properly, you may wish to check CloudFormation console
[ℹ] to cleanup resources, run 'eksctl delete cluster --region=us-east-1 --name=basic-cluster'
[✖] waiting for CloudFormation stack "eksctl-basic-cluster-cluster": ResourceNotReady: failed waiting for successful resource state
[✖] failed to create cluster "basic-cluster"
cloud_user:~/eks $
From CloudFormation for "eksctl-basic-cluster-cluster"
2019-12-19 18:12:08 UTC+0800 ControlPlane CREATE_FAILED Cannot create cluster 'basic-cluster' because us-east-1e, the targeted availability zone, does not currently have sufficient capacity to support the cluster. Retry and choose from these availability zones: us-east-1a, us-east-1b, us-east-1c, us-east-1d, us-east-1f (Service: AmazonEKS; Status Code: 400; Error Code: UnsupportedAvailabilityZoneException; Request ID: be715977-a421-4ad6-9dba-b4e907ca1ce8)
Hi,
I am facing similar issues. Trying to create a cluster at region us-east-1 but CloudFormation is rolling back due to: "us-east-1e, the targeted availability zone, does not currently have sufficient capacity to support the cluster."
$ aws ec2 describe-reserved-instances-offerings --filters 'Name=scope,Values=Availability Zone' --no-include-marketplace --instance-type m5.large | jq -r '.ReservedInstancesOfferings[].AvailabilityZone' | sort | uniq
us-east-1a
us-east-1b
us-east-1c
us-east-1d
us-east-1f
$ eksctl version
0.15.0
....
[ℹ] using Kubernetes version 1.14
[ℹ] creating EKS cluster "minimum-cluster" in "us-east-1" region with un-managed nodes
[ℹ] 1 nodegroup (ng-1) was included (based on the include/exclude rules)
[ℹ] will create a CloudFormation stack for cluster itself and 1 nodegroup stack(s)
[ℹ] will create a CloudFormation stack for cluster itself and 0 managed nodegroup stack(s)
[ℹ] if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=us-east-1 --cluster=minimum-cluster'
[ℹ] CloudWatch logging will not be enabled for cluster "minimum-cluster" in "us-east-1"
[ℹ] you can enable it with 'eksctl utils update-cluster-logging --region=us-east-1 --cluster=minimum-cluster'
[ℹ] Kubernetes API endpoint access will use default of {publicAccess=true, privateAccess=false} for cluster "minimum-cluster" in "us-east-1"
[ℹ] 2 sequential tasks: { create cluster control plane "minimum-cluster", create nodegroup "ng-1" }
[ℹ] building cluster stack "eksctl-minimum-cluster-cluster"
[ℹ] deploying stack "eksctl-minimum-cluster-cluster"
[✖] unexpected status "ROLLBACK_IN_PROGRESS" while waiting for CloudFormation stack "eksctl-minimum-cluster-cluster"
[ℹ] fetching stack events in attempt to troubleshoot the root cause of the failure
[✖] AWS::EC2::SubnetRouteTableAssociation/RouteTableAssociationPublicUSEAST1F: CREATE_FAILED – "Resource creation cancelled"
[✖] AWS::EC2::NatGateway/NATGateway: CREATE_FAILED – "Resource creation cancelled"
[✖] AWS::EC2::SubnetRouteTableAssociation/RouteTableAssociationPublicUSEAST1E: CREATE_FAILED – "Resource creation cancelled"
[✖] AWS::EC2::SubnetRouteTableAssociation/RouteTableAssociationPrivateUSEAST1F: CREATE_FAILED – "Resource creation cancelled"
[✖] AWS::EC2::SubnetRouteTableAssociation/RouteTableAssociationPrivateUSEAST1E: CREATE_FAILED – "Resource creation cancelled"
[✖] AWS::EKS::Cluster/ControlPlane: CREATE_FAILED – "Cannot create cluster 'minimum-cluster' because us-east-1e, the targeted availability zone, does not currently have sufficient capacity to support the cluster. Retry and choose from these availability zones: us-east-1a, us-east-1b, us-east-1c, us-east-1d, us-east-1f (Service: AmazonEKS; Status Code: 400; Error Code: UnsupportedAvailabilityZoneException; Request ID: e941b719-a19d-4861-b39e-e80dbb40d593)"
[ℹ] 1 error(s) occurred and cluster hasn't been created properly, you may wish to check CloudFormation console
[ℹ] to cleanup resources, run 'eksctl delete cluster --region=us-east-1 --name=minimum-cluster'
[✖] waiting for CloudFormation stack "eksctl-minimum-cluster-cluster": ResourceNotReady: failed waiting for successful resource state
For those stuck on setting the AZ in the config file: it belongs under the nodegroup.
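Presumably something like this (a hedged sketch; the nodegroup name and instance type are placeholders, so check the eksctl schema for your version):

```yaml
nodeGroups:
  - name: ng-1
    instanceType: t2.micro
    desiredCapacity: 2
    availabilityZones: ["us-east-1a", "us-east-1b"]
```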
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.
Why do you want this feature? Better user experience.
What feature/behavior/change do you want? It is possible to programmatically discover the availability zone in us-east-1 where EKS will fail to launch nodes. It would be good to avoid potential failures automatically, or at least to warn people that what they are trying to do will fail. You can list which instance types are available in each AZ, and the odd one out becomes very obvious.
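As a sketch of that discovery step (the function name is mine; assumes the AWS CLI is configured with valid credentials), describe-instance-type-offerings lists the AZs that offer a given instance type, and the missing zone is the one to avoid:

```shell
# list_supported_azs INSTANCE_TYPE REGION
# Prints, one per line, the AZs in REGION that offer INSTANCE_TYPE.
list_supported_azs() {
  aws ec2 describe-instance-type-offerings \
    --location-type availability-zone \
    --filters "Name=instance-type,Values=$1" \
    --region "$2" \
    --query 'InstanceTypeOfferings[].Location' --output text \
    | tr '\t' '\n' | sort -u
}

# e.g. eksctl create cluster --zones="$(list_supported_azs m5.large us-east-1 | paste -sd, -)"
```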