cloudfoundry-community-attic / aws-nat-bastion-bosh-cf

Set up a best practices Cloud Foundry with just a few commands.

make destroy runs into dependency violations, may leave leftovers #38

Closed jahio closed 8 years ago

jahio commented 8 years ago

A known issue with make destroy is that it occasionally times out after 5 minutes and reports a "DependencyViolation", meaning that it couldn't delete a resource because other resources that depend on it have to be deleted first. For example:

Error applying plan:

3 error(s) occurred:

* aws_security_group.cf: DependencyViolation: resource sg-1a15d27c has a dependent object
    status code: 400, request id: c14c318d-c0e6-4b4d-8946-bc6155b83493
* aws_subnet.cfruntime-2b: Error deleting subnet: timeout while waiting for state to become '[destroyed]'
* aws_subnet.microbosh: Error deleting subnet: timeout while waiting for state to become '[destroyed]'

Supposedly, deleting an object that has dependencies should cascade, removing the dependent objects before the "main" object, but this isn't always the case. It looks like we have a few problems here:

In the first case, make destroy needs to figure out for itself what the dependent objects are, delete them, then come back to the original object and try again. It needs to do this recursively: if A has dependent object B, B has dependent object $N, and $N has dependent object X, it should walk down the chain, delete X, then crawl back up the chain: $N, then B, and finally A.
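A rough sketch of that walk, in Python purely for illustration (the project itself drives Terraform through make). The `dependents` mapping and the `deletion_order` helper are hypothetical names; how the dependency graph would actually be discovered from AWS is not covered here.

```python
from typing import Dict, List, Set


def deletion_order(dependents: Dict[str, List[str]], root: str) -> List[str]:
    """Return resources in the order they must be deleted, deepest dependents first."""
    order: List[str] = []
    visited: Set[str] = set()
    in_progress: Set[str] = set()

    def visit(resource: str) -> None:
        if resource in visited:
            return
        if resource in in_progress:
            # A cycle can never be resolved by ordering alone, so fail loudly.
            raise ValueError(f"dependency cycle at {resource}")
        in_progress.add(resource)
        # Go down the chain first...
        for child in dependents.get(resource, []):
            visit(child)
        in_progress.discard(resource)
        visited.add(resource)
        # ...then record this resource on the way back up.
        order.append(resource)

    visit(root)
    return order


# Hypothetical graph matching the A -> B -> $N -> X example above.
graph = {"A": ["B"], "B": ["N"], "N": ["X"]}
print(deletion_order(graph, "A"))  # ['X', 'N', 'B', 'A']
```

Each resource is only recorded after everything that depends on it, so deleting in the returned order should avoid asking AWS to remove something that still has dependents.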

However, during any of those deletion calls the AWS API may take longer than the default timeout of 5 minutes to carry out a given request. If I say "Hey AWS, go delete object X" and it takes more than 5 minutes, in theory this could be what triggers a DependencyViolation from AWS: even though the task to delete X is "in the queue", it hasn't been done yet, so any attempt to delete $N fails with a DependencyViolation because we're still waiting on AWS to actually delete X like we told it to.
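One way to tolerate that lag is to retry the delete until a longer deadline passes instead of giving up on the first DependencyViolation. The sketch below is only an illustration in Python: the `DependencyViolation` class is a stand-in for the AWS error, and `delete_with_retry`, its 15-minute default timeout, and its 10-second interval are all assumptions, not anything make destroy or Terraform does today.

```python
import time
from typing import Callable


class DependencyViolation(Exception):
    """Stand-in for the AWS DependencyViolation error (hypothetical class)."""


def delete_with_retry(
    delete: Callable[[], None],
    timeout: float = 900.0,   # assumed overall budget, longer than the 5-minute default
    interval: float = 10.0,   # assumed pause between attempts
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Call delete() until it succeeds or the deadline passes; return the attempt count."""
    deadline = clock() + timeout
    attempts = 0
    while True:
        attempts += 1
        try:
            delete()
            return attempts
        except DependencyViolation:
            # The dependent object may still be queued for deletion; wait and retry,
            # unless the next attempt would land past the deadline.
            if clock() + interval > deadline:
                raise
            sleep(interval)


# Usage with a fake delete that fails twice before succeeding.
calls = {"n": 0}


def flaky_delete() -> None:
    calls["n"] += 1
    if calls["n"] < 3:
        raise DependencyViolation("resource sg-1a15d27c has a dependent object")


print(delete_with_retry(flaky_delete, sleep=lambda seconds: None))  # 3
```

Passing `sleep` and `clock` in as parameters keeps the retry loop testable without real waiting.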

Finally, we've seen cases where things not created by Terraform aren't cleaned up by make destroy. For example, the bosh/0 instance: Terraform didn't create it, so either it isn't aware of it or it otherwise refuses to delete it.

All of these cases should be gracefully handled by make destroy.

7hunderbird commented 8 years ago

Closing in favor of existing issue #26.