hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

Terraform apply destroys resources that are still referenced from other resources #31309

Open prashantv opened 2 years ago

prashantv commented 2 years ago

Terraform Version

Terraform v1.2.3
on darwin_arm64

Terraform Configuration Files

This configuration relies on a dummy provider; see github/prashantv/terraform-dep-order-repro

resource "tftest_notifier" "n1" {
  email = "n1@example.com"
}

resource "tftest_notifier" "n2" {
  email = "n2@example.com"
}

resource "tftest_policy" "p1" {
  notifier_ids = [
    tftest_notifier.n1.id,
    tftest_notifier.n2.id,
  ]
}


After applying the above, remove "n2" and the reference to it:

resource "tftest_notifier" "n1" {
  email = "n1@example.com"
}

resource "tftest_policy" "p1" {
  notifier_ids = [
    tftest_notifier.n1.id,
  ]
}

Applying this will fail as n2 is destroyed first, while there is still a policy referencing it.

Debug Output

https://gist.github.com/prashantv/45859f8607e690ff22990c310381f8c8

Expected Behavior

When the config removing "n2" is applied, the reference is removed first, and then n2 is removed successfully.

Actual Behavior

Terraform tries to remove "n2" before updating the policy reference, and the remove fails since the notifier is still referenced by a policy.

│ Error: failed to delete Notifier: cannot delete notifier, as policies [policy-174441.453993] still refer to it

Steps to Reproduce

Full steps for reproduction are in the repo.

Summary is:

--- a/repro/main.tf
+++ b/repro/main.tf
@@ -11,13 +11,13 @@ resource "tftest_notifier" "n1" {
   email = "n1@example.com"
 }

-resource "tftest_notifier" "n2" {
-  email = "n2@example.com"
-}
+# resource "tftest_notifier" "n2" {
+# email = "n2@example.com"
+# }

 resource "tftest_policy" "p1" {
   notifier_ids = [
     tftest_notifier.n1.id,
-    tftest_notifier.n2.id,
+    # tftest_notifier.n2.id,
   ]
 }

Additional Context

There is a workaround which causes the destroy to happen after the update: setting the lifecycle meta-argument create_before_destroy. This has to be set on the resource and applied before the delete operation. However, this workaround has some issues:

Ideally the solution would be:

References

There are a few existing issues which are almost all closed:

create_before_destroy is recommended in many of the above, but as mentioned in "Additional Context", it's a workaround with drawbacks rather than a solution to this problem.
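
For readers unfamiliar with the workaround, a minimal sketch of it against the repro configuration might look like the following. It assumes the lifecycle block is placed on the notifier that will later be removed (n2 here) and that this change is applied on its own before the removal; the two-step sequence in the comments is inferred from the description under "Additional Context" rather than taken from the repro repo.

```hcl
# Step 1: add create_before_destroy to the resource that will be removed later,
# then run `terraform apply` while n2 and its reference in p1 still exist,
# so that the setting is recorded in state.
resource "tftest_notifier" "n2" {
  email = "n2@example.com"

  lifecycle {
    create_before_destroy = true
  }
}

# Step 2 (a separate, later change): delete the n2 block above and the
# tftest_notifier.n2.id entry in tftest_policy.p1, then apply again.
# With the setting already applied, the policy update should be ordered
# before the destroy of n2.
```

Because the setting has to be applied before the removal, this takes two applies instead of one.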

jbardin commented 2 years ago

Hi @prashantv,

Thanks for filing the issue, and the extensive references you've collected! As you've mentioned, the only solution presented by Terraform to obtain this order of operations is create_before_destroy. The additional enhancements proposed here are not new ideas, but have not been implemented for a couple of primary reasons:

Unfortunately this means as of now the problem falls mostly on the providers. Any resource which requires this sort of "registration pattern" should be designed to allow create_before_destroy, at least optionally, by using auto-generated identifiers. The need to put create_before_destroy in the configuration is up to the provider to document, but also the user to discover, which I also agree is not optimal here.

Even in the case that a usable provider-forced option for a new ordering were possible, it would still be up to the providers to implement. Seeing the number of cases where create_before_destroy wasn't documented or allowed by resources which require it, it might not meaningfully reduce the number of overall bugs encountered.

While we're not opposed to the possibility of such a feature, there haven't yet been any proposals which could meet the two fundamental requirements: not imposing changes on resources outside of that provider's control, and not causing cycles in arbitrarily large or complex graphs. Since Terraform is working as designed here, I'm going to re-label this as an enhancement proposal. This can more or less serve as the core equivalent of https://github.com/hashicorp/terraform-plugin-sdk/issues/585, which is meant to cover the same usability aspects.

Thanks!