Closed ccourtoyaxis closed 3 years ago
I don't think it is related to the AWS support plan but it sometimes helps to ask AWS support directly if they can see something on their end.
From the code point of view, I don't see anything unusual. I recommend reducing the number of resources to make the issue easier to isolate, and checking AWS service limits (the EIP limit is often the reason).
Try setting single_nat_gateway = true so that you do not hit the EIP limit.
If the problem is reproducible and is related to this module, please provide a small piece of code that triggers the problem, along with the console output.
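For reference, a minimal sketch of what that setting looks like in a module call. The VPC name, CIDR ranges, and availability zones are placeholders and not taken from the reporter's configuration; only enable_nat_gateway and single_nat_gateway matter here.

```hcl
module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  # Placeholder values for illustration only.
  name = "example-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["eu-west-1a", "eu-west-1b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]

  # One shared NAT gateway (and therefore one Elastic IP)
  # instead of one per private subnet.
  enable_nat_gateway = true
  single_nat_gateway = true
}
```

With a single NAT gateway, all private subnets route through it, so this trades availability-zone redundancy for fewer Elastic IPs.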
OK, thanks for the tip. I am far from my Elastic IP quota, though. Also, the gateways do get created in the end, so either Terraform is unable to retrieve the resource status or they just take too long to create. Is there a way to change the timeouts in the module?
Regardless, I will implement a condition to limit the number of Elastic IPs in our feature development deployments, as opposed to staging and production.
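One way such a condition might look, as a sketch only. The environment variable and its allowed values are hypothetical names invented for this example, and the VPC name, CIDR ranges, and zones are placeholders.

```hcl
# Hypothetical variable: the deployment tier this stack belongs to.
variable "environment" {
  description = "Deployment tier: feature, staging, or production"
  type        = string
  default     = "feature"
}

locals {
  # Use a single shared NAT gateway everywhere except staging and production.
  use_single_nat_gateway = !contains(["staging", "production"], var.environment)
}

module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  # Placeholder values for illustration only.
  name = "example-${var.environment}"
  cidr = "10.0.0.0/16"

  azs             = ["eu-west-1a", "eu-west-1b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]

  enable_nat_gateway = true
  single_nat_gateway = local.use_single_nat_gateway
}
```

Feature deployments would then consume one Elastic IP each, while staging and production keep one NAT gateway per private subnet.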
We had this kind of issue with someone who was recreating VPC resources in a CI/CD pipeline multiple times a day, but that was in 2018 and it has not come up since. The solution (timeouts { create = "5m" }) has been added to this module since then.
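For readers unfamiliar with the timeouts block, here is a rough sketch of how it attaches to a resource, using aws_route as the example. This is not copied from the module source; the resource name and the two input variables are hypothetical.

```hcl
# Hypothetical inputs so the fragment stands on its own.
variable "route_table_id" {
  type = string
}

variable "nat_gateway_id" {
  type = string
}

resource "aws_route" "example_private_nat" {
  route_table_id         = var.route_table_id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = var.nat_gateway_id

  # Wait up to five minutes for the route to be created
  # before Terraform reports a failure.
  timeouts {
    create = "5m"
  }
}
```

Because the block lives inside the module's resources, it can only be changed from outside if the module exposes an input for it.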
I found a related issue and its fix in a recent release of the Terraform AWS provider: https://github.com/hashicorp/terraform-provider-aws/issues/19985 https://github.com/hashicorp/terraform-provider-aws/pull/21161
Could you try the previous or the latest version of the Terraform AWS provider to see if the problem is fixed? It is likely related to the provider and not to the module.
I found the following open issue with the provider. I think this is the root cause of the problem:
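Trying a specific provider version can be done by pinning it in the root configuration. The sketch below uses a placeholder version number; it is not the release that contains the fix, so substitute whichever version you want to test.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"

      # Placeholder constraint for illustration only.
      # Replace it with the provider release you want to test.
      version = "= 3.50.0"
    }
  }
}
```

After changing the constraint, running terraform init -upgrade lets Terraform select and install the provider version that matches it.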
https://github.com/hashicorp/terraform-provider-aws/issues/21032
Yes, looks like it. There is already #701 which can be extended to have longer timeouts for other resources, too. Will you be able to chime in and update that PR?
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
Description
We currently deploy and destroy our VPCs regularly to test the deployment. It fails randomly but frequently (I would say more than 50% of the time). The problem is that when I re-run terraform apply to continue where it left off, it complains that the resource already exists! I have to destroy the VPC and start from scratch again (praying that it won't fail this time).
Versions
Reproduction
All you need to do is deploy and destroy regularly.
Notes:
Code Snippet to Reproduce
VPC Terraform File
The full Terraform definition is attached:
vpc_terraform.tar.gz
Expected behavior
I expect to be able to deploy without a glitch. Failures can happen, but when they do, the deployment should be idempotent, and I should be able to continue the execution simply by re-running terraform apply.
Actual behavior
Deployment fails
Terminal Output Screenshot(s)
Additional context
We use the basic AWS support plan (i.e. no support). Could this mean that the Service Level Agreement allows latencies that are outside of the module's requirements? In any case, it takes a very long time to create Route Tables.