open-infrastructure-labs / ops-issues


ACM hits Elastic IPs quota on AWS #28

Open tumido opened 3 years ago

tumido commented 3 years ago

I tried spinning up a "test" cluster in AWS via ACM: a small cluster of 1 master node and 3 worker nodes. The provisioning failed because I was deploying to a region where the Elastic IPs quota was set to the default of 5.

failed to fetch Cluster: failed to fetch dependency of \"Cluster\": failed to generate asset \"Platform Quota Check\": error(MissingQuota): ec2/L-0263D0A3 is not available in us-east-1 because The required number of resources (6) is more than the limit of 5

Later on we discovered that another region in our account has this quota set to 10, so we redid the provisioning in that region with the same cluster settings. The cluster deployed successfully, though only 3 Elastic IPs were allocated.

image

Why does it require a quota of 6 if it allocates only 3 Elastic IPs?

cc @ipolonsk @idanlv @cdoan1

cdoan1 commented 3 years ago

This should be a requirement of the OpenShift IPI installation:

https://docs.openshift.com/container-platform/4.6/installing/installing_aws/installing-aws-account.html#installation-aws-limits_installing-aws-account

"Each private subnet requires a NAT Gateway, and each NAT gateway requires a separate elastic IP." "The cluster deploys one NAT gateway in each availability zone."

So, 3 for the nodes and 3 for the NAT gateways.

tumido commented 3 years ago

@cdoan1 Not sure how you deduced 3 for nodes + 3 for gateways. The docs say:

image

I deduce this mapping from it: 1 private subnet -> 1 NAT Gateway -> 1 elastic IP

And since we're deploying to 3 avail zones, that is 3x1=3. And 3 are actually being allocated, 3 are being billed. Why do you think it should be 6?

Also the second highlighted sentence says:

To install a cluster in a region with more than five availability zones, you must increase the EIP limit.

Which also supports my finding, because the default limit is 5. Therefore if you install into more than 5 avail zones, you need more than 5 on quota.

I don't see anything in the docs that specifies 1 additional EIP per node or a multiplication by 2... Where do you read that?
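The mapping argued in this comment (1 private subnet -> 1 NAT Gateway -> 1 Elastic IP, per availability zone) can be written out as a small sketch. The function names are hypothetical and the one-private-subnet-per-zone assumption comes from this reading of the docs, not from the installer's actual quota check:

```python
DEFAULT_EIP_LIMIT = 5  # default Elastic IP quota mentioned in this thread


def eips_for_nat_gateways(availability_zones: int) -> int:
    """Elastic IPs needed if each zone has one private subnet and one NAT gateway."""
    private_subnets = availability_zones  # assumption: one private subnet per zone
    nat_gateways = private_subnets        # each private subnet requires a NAT gateway
    return nat_gateways                   # each NAT gateway requires its own Elastic IP


def fits_quota(required: int, limit: int = DEFAULT_EIP_LIMIT) -> bool:
    """True when the required number of Elastic IPs is within the quota."""
    return required <= limit


for zones in (3, 5, 6):
    needed = eips_for_nat_gateways(zones)
    print(f"{zones} zones -> {needed} EIPs, fits default quota: {fits_quota(needed)}")
```

Under this reading, a required count of 6 would correspond to six zones rather than three; whether the quota check counts zones that way is not confirmed in this thread.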

ipolonsk commented 3 years ago

Hey, I also tried to create a test cluster in AWS via ACM with my credentials and ran into the same issue as @tumido. On the first try I used "us-east-1", which is the default in the list, and got the same error:

error(MissingQuota): ec2/L-0263D0A3 is not available in us-east-1 because the required number of resources (6) is more than the limit of 5

Out of curiosity I changed the region to "eu-west-2", expecting the same result, but this time the required number of resources was different:

\"Platform Quota Check\": error(MissingQuota): ec2/L-1216C47A is not available in eu-west-2 because the required number of resources **(24)** is more than the limit of 5"

I didn't change the default number of masters and workers. @cdoan1 how can the required number of resources change?
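When comparing the two failures, it may help to pull the fields out of each message. The sketch below parses the MissingQuota text as it is quoted in this thread; the regular expression and the `parse_missing_quota` helper are illustrative assumptions, not part of the installer or ACM:

```python
import re

# Pattern built from the error text quoted in this thread (an assumption,
# not an official message format).
MISSING_QUOTA = re.compile(
    r"error\(MissingQuota\): (?P<service>\w+)/(?P<code>L-[0-9A-F]+) "
    r"is not available in (?P<region>[\w-]+) because the required number "
    r"of resources \((?P<required>\d+)\) is more than the limit of (?P<limit>\d+)",
    re.IGNORECASE,
)


def parse_missing_quota(message: str) -> dict:
    """Extract service, quota code, region, required count and limit."""
    match = MISSING_QUOTA.search(message)
    if match is None:
        raise ValueError("not a MissingQuota message")
    fields = match.groupdict()
    fields["required"] = int(fields["required"])
    fields["limit"] = int(fields["limit"])
    return fields


us_east = parse_missing_quota(
    "error(MissingQuota): ec2/L-0263D0A3 is not available in us-east-1 because "
    "the required number of resources (6) is more than the limit of 5"
)
eu_west = parse_missing_quota(
    "error(MissingQuota): ec2/L-1216C47A is not available in eu-west-2 because "
    "the required number of resources (24) is more than the limit of 5"
)

print(us_east["code"], us_east["required"])
print(eu_west["code"], eu_west["required"])
print("same quota code:", us_east["code"] == eu_west["code"])
```

The two messages name different quota codes (L-0263D0A3 and L-1216C47A), so the figures 6 and 24 may not be counting the same kind of resource.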