Open ktrenchev opened 2 weeks ago
Hey @ktrenchev, thank you for taking the time to raise this! While we understand Terraform configurations can get pretty complicated, it's often quite difficult to reproduce scenarios like this without any logging or configuration samples. Since you indicated you can't share a configuration, are you able to provide debug logs (redacted as necessary) instead?
One thing that came up while taking a quick look at this during triage was the `force_destroy` argument of the `aws_rds_global_cluster` resource, which I believe is meant to help with this scenario. Are you able to confirm whether that argument has been configured?
Greetings @justinretzolk,
Unfortunately I'm unable to provide debug logs. I did play around with the `force_destroy` argument of the RDS Global Cluster resource, but it had no effect. I dug around `cluster.go` myself and my best estimation is:
1) Either the destruction of the RDS Global Cluster and associated RDS Clusters at one go is intentionally unsupported (AWS docs state something along the lines of "there is no 'one button push' deletion process as RDSs are usually mission critical").
2) The timeout in `waitDBClusterDelete()` (called in `resourceClusterDelete()`) is insufficient, because earlier in `resourceClusterDelete()` `RemoveFromGlobalClusterWithContext()` is called on the replica and a promotion is triggered.
I'd be happy with a confirmation that deleting an RDS Global Cluster and its associated RDS Clusters in one go is supported (meaning there is something wrong with my setup, which, unfortunately, is not unlikely).
Thanks for the additional information here @ktrenchev. Completely understand re: logging and configuration samples. I'll let someone from the team or community speak to the specifics here.
Edit: I had a thought that using a later provider version may help, given that we've migrated most of the provider to use AWS SDK for Go V2. While looking into that, I noticed the following in the release notes for 5.24.0:
- resource/aws_rds_cluster: Avoid an error on delete related to unexpected state 'scaling-compute' (https://github.com/hashicorp/terraform-provider-aws/issues/34187)
It may be worth upgrading to at least provider version 5.24.0 and testing again to see if that bug fix resolves your particular issue.
@justinretzolk do you know what was changed? I'm still using the same AWS provider version, 5.0.0, as before, but since October 13 it has started failing. I can't upgrade to a newer provider version because that would require a lot of changes to my Terraform infrastructure.
Steps to reproduce: 1) Create an AWS global database. 2) Add a cluster in the west region with one instance. 3) Add a replica in the east region with one instance. 4) Try to restack the complete cluster using a snapshot.
You can see that it starts by deleting the instance in the east region. Then, when the east cluster is being promoted out of the global database, it doesn't wait for the promotion to finish and starts the deletion directly, so it fails with:
Error: waiting for RDS Cluster (xxxx-dr-global-region-us-east-2-cluster) delete: unexpected state 'promoting', wanted target ''. last error: %!s(
@justinretzolk I already have `force_destroy` set on the `aws_rds_global_cluster` resource and it still happens.
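The ordering described in this comment (deletion starting before the promotion has finished) could in principle be avoided by sequencing the steps by hand outside Terraform. The following is a rough sketch under several assumptions, not a verified workaround: `detach_then_delete` is a hypothetical helper, `rds` is expected to behave like a boto3 RDS client, the cluster's instances are assumed to be gone already, and treating the `available` status as "promotion finished" is an assumption (the status may still read `available` for a moment right after the detach call).

```python
import time


def detach_then_delete(rds, global_id, cluster_id, cluster_arn,
                       poll_seconds=30, max_polls=60, sleep=time.sleep):
    """Detach a replica cluster, wait for its promotion, then delete it.

    Hypothetical helper. `rds` is any object exposing the three methods
    used below with boto3-style keyword arguments.
    """
    # Detaching the replica is what triggers the promotion.
    rds.remove_from_global_cluster(
        GlobalClusterIdentifier=global_id,
        DbClusterIdentifier=cluster_arn,
    )

    # Poll until the cluster reports "available" (assumed end of promotion).
    for _ in range(max_polls):
        clusters = rds.describe_db_clusters(
            DBClusterIdentifier=cluster_id
        )["DBClusters"]
        if clusters[0]["Status"] == "available":
            break
        sleep(poll_seconds)
    else:
        raise TimeoutError(f"{cluster_id} did not finish promoting in time")

    # Only now request the delete; skipping the final snapshot is an assumption.
    rds.delete_db_cluster(
        DBClusterIdentifier=cluster_id,
        SkipFinalSnapshot=True,
    )
```

With boto3 installed, the `rds` argument would typically be `boto3.client("rds", region_name=...)` for the replica's region. Passing the client in, rather than creating it inside the function, keeps the sketch testable with a fake object.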
Terraform Core Version
0.13.7
AWS Provider Version
4.53.0
Affected Resource(s)
aws_rds_global_cluster aws_rds_cluster
Expected Behavior
I want to be able to delete both the RDS Global Cluster and the associated RDS Clusters with a single `terraform destroy` invocation.
Actual Behavior
When `terraform destroy` is called:
1) It detaches the replica RDS Cluster from the RDS Global Cluster, thus triggering a promotion.
2) Terraform waits for the replica RDS Cluster to be deleted, but times out: the replica must first be promoted and then deleted, and the promotion takes longer than the timeout.
3) The replica RDS Cluster is eventually deleted from AWS, but the `terraform destroy` operation fails to delete the other RDS Cluster and the RDS Global Cluster.
4) A second run of `terraform destroy` deletes the leftover RDS Global Cluster and RDS Cluster.
Relevant Error/Panic Output Snippet
Terraform Configuration Files
N/A, setup is way too complicated to extract the exact configuration.
Steps to Reproduce
1) Create a new RDS Global Cluster.
2) Attach an RDS Cluster (primary).
3) Attach an RDS Cluster (replica).
4) Run `terraform destroy`.
Debug Output
No response
Panic Output
No response
Important Factoids
No response
References
No response
Would you like to implement a fix?
None