hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

prevent_destroy should let you succeed #3874

Open ketzacoatl opened 8 years ago

ketzacoatl commented 8 years ago

Call me crazy, but I'm willing to call the current implementation of prevent_destroy a bug. Here is why: the current implementation of this flag prevents you from using it for half of its use cases.

The net result is more frustration when trying to get Terraform to succeed without destroying your resources. prevent_destroy adds to the frustration more than it alleviates it.

prevent_destroy is for these two primary use cases, right?

1. You don't want this resource to be deleted, and you want to see errors when TF tries to do that.
2. You don't want this resource to be deleted, and you don't want to hear a peep out of TF: TF should skip over its usual prerogative to rm -rf on change.
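For reference, here is how the flag is declared today; a minimal sketch (the aws_s3_bucket resource and bucket name are just illustrative):

resource "aws_s3_bucket" "important" {
  bucket = "my-important-bucket"

  lifecycle {
    # Current behavior: any plan that would destroy this bucket fails
    # with an error (use case 1). There is no setting that makes
    # Terraform silently skip the destroy (use case 2).
    prevent_destroy = true
  }
}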

I see no reason why TF must return an error when prevent_destroy is used for the second use case; in doing so, TF is completely ignoring my utterly clear directive to let me get work done. As a user, I end up feeling as though TF is wasting my time: I am focused on simple end goals that I cannot attain while I spin my wheels begging TF to create more resources without destroying what exists.

You might say the user should update their plan to not be in conflict, and I would agree that is what you want to do in most cases. But, honestly, that is not always the right solution for the situation at hand when using a tool like TF in the real world. I believe in empowering users, and the current implementation of this flag prevents sensible use of the tool.

guangie88 commented 6 years ago

+1. I am a newbie to Terraform, and when I first encountered this issue my natural tendency was to temporarily change every prevent_destroy = true to prevent_destroy = false, without fully comprehending someone else's intent not to destroy the protected resource.

The alternative of specifying targets, on the other hand, is far too lengthy and manual. In the end, my team had to come up with a script just to make the destroy process skip these resources, which eventually became terraform-zap: it allows specifying external ignore files and acts as an extra terraform subcommand that skips protected resources and destroys all the others.

It would be much better if the above were baked into terraform destroy itself, like a zap mode. Either letting the prevent_destroy configuration take additional values that mark a resource to be skipped on destroy, or simply having a flag to skip all protected resources, would be good enough.

aboron commented 6 years ago

Maybe a different lifecycle option, such as only_create = true, could be added to signify that the only action this item will ever need is to ensure it has been created initially.
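A sketch of what that might look like; only_create is purely hypothetical and not an existing lifecycle argument, and the aws_ebs_volume resource is just an example:

resource "aws_ebs_volume" "data" {
  availability_zone = "us-east-1a"
  size              = 100

  lifecycle {
    # Hypothetical option: create the volume once, then never plan a
    # destroy or replacement for it, without erroring out the plan.
    only_create = true
  }
}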

ap1969 commented 6 years ago

Hi, another use case: mounted volumes in a Docker environment.

I've got some digitalocean block storage which, initially, I'd like to provision and attach to the droplet using the digitalocean module.

That block storage holds the persistent data for a MySQL database.

I absolutely DO NOT want that destroyed if I rebuild the server!
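Something like the following, a hedged sketch assuming the digitalocean provider's digitalocean_volume and digitalocean_volume_attachment resources (names and size are illustrative, and digitalocean_droplet.db is assumed to be defined elsewhere):

resource "digitalocean_volume" "mysql_data" {
  region = "fra1"
  name   = "mysql-data"
  size   = 100 # GiB

  lifecycle {
    # The MySQL data must survive droplet rebuilds.
    prevent_destroy = true
  }
}

resource "digitalocean_volume_attachment" "mysql_data" {
  droplet_id = digitalocean_droplet.db.id
  volume_id  = digitalocean_volume.mysql_data.id
}

Rebuilding the droplet only replaces the attachment, so the volume survives; the pain point is that a full terraform destroy of the stack then fails with an error instead of simply skipping the protected volume.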

Regards, Andy

mbklein commented 6 years ago

I'm going to chime in here with another use case. I use the aws_elasticbeanstalk_application_version resource. Terraform zips up my source bundle, uploads it to S3, creates a version out of it, and deploys that version to my beanstalk environments.

The next time I make a change to my source bundle and terraform apply, I get an error that it can't destroy the aws_elasticbeanstalk_application_version resource because it's in use (i.e., currently deployed). Which is true. And I don't want it to be destroyed. I want it to be retained and forgotten by Terraform, while a new version is created in its place.

Right now, I achieve that by executing terraform state rm aws_elasticbeanstalk_application_version.my_app before I plan or apply. But I'd love to have a resource flag similar to create_before_destroy that means create_and_then_forget_about_the_old_one_without_destroying_it.

Gary-Armstrong commented 6 years ago

I need to redeploy AWS instances but leave the EIP unmolested. I can't figure out how to do this.

gaui commented 6 years ago

Gary, taint?

Gary-Armstrong commented 6 years ago

Ah, yes, @gaui, I wish that worked for my case, but taint does not trigger local-exec provisioners using when = "destroy", which is documented in #13549. I use this local-exec to delete the node/client from my Chef server when I destroy instances. The reason is that I have static instance names for very persistent scientific computing instances; I usually forget to manually remove the node/client and then 20 minutes later I see my instances did not bootstrap. :) The local-exec trigger was 80% of the answer, but too bad it only works in some cases.

My wish: I would like to use the destroy command for instances only (using -target is OK I guess, but I'd rather do it via HCL), preserve the EIPs (it takes a LONG TIME for the US govt to adjust their firewall rules!), reattach them on redeployment (to the correct instances, of course), and not have to manually remove the Chef node/client when I perform a destroy.

Maybe there is a new way to do that which I don't know about? I am ready to learn.
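For the EIP half of this, a hedged sketch of keeping the address as its own protected resource and re-associating it on each deploy (the AMI, names, and instance type are illustrative):

resource "aws_eip" "research" {
  vpc = true

  lifecycle {
    # The address must outlive any individual instance.
    prevent_destroy = true
  }
}

resource "aws_instance" "compute" {
  ami           = "ami-0123456789abcdef0" # illustrative
  instance_type = "m5.large"
}

resource "aws_eip_association" "compute" {
  instance_id   = aws_instance.compute.id
  allocation_id = aws_eip.research.id
}

Replacing the instance only replaces the association, so the address (and the firewall rules pointing at it) is preserved; a full terraform destroy, however, still errors out on the protected EIP, which is exactly the behavior this issue is about.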

gaui commented 6 years ago

@Gary-Armstrong Use a null_resource with a triggers { } clause.
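A hedged sketch of that suggestion for the Chef cleanup case, assuming Terraform 0.12+ syntax (the knife commands, node name, and aws_instance.compute reference are illustrative; destroy-time provisioners can generally only reference self, so the values go through triggers):

resource "null_resource" "chef_cleanup" {
  # Replacing the instance forces this resource to be replaced too,
  # which runs the destroy-time provisioner below first.
  triggers = {
    instance_id = aws_instance.compute.id
    node_name   = "research-node-01"
  }

  provisioner "local-exec" {
    when    = destroy
    command = "knife node delete ${self.triggers.node_name} -y && knife client delete ${self.triggers.node_name} -y"
  }
}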

dguisinger commented 6 years ago

And another use case

I have a terraform script managing uniform build pipelines, setting up AWS CodeCommit, CodeBuild, CodePipeline and lambda functions.

I don't mind CodeBuild, CodePipeline or lambda getting destroyed. But you can't recover a CodeCommit Git repository if Terraform decides to remove it.

I want Terraform to remember that I specified prevent_destroy even if the CodeCommit resource is no longer listed inside the script.

2.5 years, guys, on an issue EVERYONE wants solved other than HashiCorp. Come on, friggin do something... or I'm going to move off Terraform to a platform I feel I can trust not to wipe out my hard work.

tdmalone commented 6 years ago

Potentially duplicated by #16392

edward2a commented 5 years ago

Another use case. Provider: google. Resources: google_storage_bucket, google_storage_bucket_acl. The user creating these is a project editor (terraformer). When applying an ACL, it is necessary to keep terraformer as the bucket owner so it can continue to manage the bucket. On deletion, the ACLs are removed first, and then the terraformer user has no access:

* google_storage_bucket_acl.my_test_bucket_acl: Error deleting entity user-access-tester@<some-project>.iam.gserviceaccount.com ACL: googleapi: Error 403: terraformer@<some-project>.iam.gserviceaccount.com does not have storage.buckets.get access to my-test-bucket-33., forbidden

The template:


data "google_project" "<some-project>" {}

resource "google_storage_bucket" "my_test_bucket" {
  name          = "my-test-bucket-33"
  location      = "<region>"

  force_destroy = true
  project       = "<some-project>"
  storage_class = "REGIONAL"
}

resource "google_storage_bucket_acl" "my_test_bucket_acl" {
  bucket = "${google_storage_bucket.my_test_bucket.name}"

  default_acl = "private"

  role_entity = [
    # project-owners need to match default_acl (bug?)
    "OWNER:project-owners-${data.google_project.<some-project>.number}",
    "OWNER:user-terraformer@<some-project>.iam.gserviceaccount.com",
    "READER:user-access-tester@<some-project>.iam.gserviceaccount.com"
  ]
}

I tried prevent_destroy, but that just breaks the destroy operation. I would expect a destroy_action = remove_state_only or something of the like, since the ACL will be lost with the actual resource removal anyway (the bucket, in this case).
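A sketch of what that could look like on the ACL resource above; destroy_action is purely hypothetical and not an existing lifecycle argument:

resource "google_storage_bucket_acl" "my_test_bucket_acl" {
  bucket = google_storage_bucket.my_test_bucket.name

  lifecycle {
    # Hypothetical: on destroy, drop the resource from state without
    # calling the provider's delete, since the ACL disappears together
    # with the bucket anyway.
    destroy_action = "remove_state_only"
  }
}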

JackDavidson commented 5 years ago

I'm chiming in because this is clearly needed and nothing seems to be being done about it.

Here is yet another use case: I want to get a static IP, and I want to keep that IP even if I need to destroy and re-create my infrastructure.

Yes, I can do this by manually creating a static IP and using 'data' to reference it. But now there is a manual step, so I can't automatically spawn a copy of my infrastructure, or I need to use another tool in addition to Terraform.
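For example, with a reserved address on GCP, a hedged sketch of that data-source approach (the address name and region are illustrative):

data "google_compute_address" "static" {
  # Created once by hand, which is exactly the manual step being
  # complained about, or by a separate Terraform stack.
  name   = "app-static-ip"
  region = "europe-west3"
}

output "static_ip" {
  value = data.google_compute_address.static.address
}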

Xyrodileas commented 5 years ago

I ran into the same issue as another use case here: I have 2 VPCs, one created by Terraform, the other imported with terraform import. I do not wish to delete the imported VPC, and terraform destroy fails because it cannot delete it (the default VPC in AWS). I'm probably going to try to hack my way around it.

bbakersmith commented 5 years ago

@Xyrodileas I believe for your use case you should use a data source, rather than an imported resource, since the vpc is managed entirely outside of terraform.

https://www.terraform.io/docs/providers/aws/d/vpc.html

tdmalone commented 5 years ago

@Xyrodileas As well as the data source mentioned above, you might want to look at https://www.terraform.io/docs/providers/aws/r/default_vpc.html and use that resource for managing the default VPC. Note in the docs how it mentions that because Terraform can't destroy it, it will just remove it from your state file instead.
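Concretely, the two options mentioned look roughly like this (a hedged sketch; the tags are illustrative):

# Read-only reference to the default VPC; Terraform never manages it.
data "aws_vpc" "default" {
  default = true
}

# Or adopt it as a managed resource; on destroy, Terraform removes it
# from state instead of deleting the VPC itself.
resource "aws_default_vpc" "default" {
  tags = {
    Name = "default"
  }
}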

Xyrodileas commented 5 years ago

Thank you very much, I will definitely take a look at both of your suggestions! EDIT: As I might need to peer with a VPC other than the default VPC in the future, I will work with a data source instead of an imported resource. I'll keep aws_default_vpc in mind for similar use cases though, thanks again!

edward2a commented 5 years ago

Hi, just referencing here as this might be related: https://github.com/hashicorp/terraform/issues/20065

abeluck commented 5 years ago

Is this being addressed in terraform 0.12?

clemenspeters commented 5 years ago

My current workaround is to set up two bash scripts.

The setup.sh adds existing resources which are protected with

  lifecycle {
    prevent_destroy = true
  }

to the terraform state (using terraform import). setup.sh:

#!/bin/sh
touch ./terraform.log                       # Create log file
export TF_LOG_PATH=./terraform.log          # Set log path
export TF_LOG=DEBUG                         # Activate logging / set log level

# Define variables
KEY_NAME="my-crypto-key"
KEY_TF_RESOURCE="google_kms_crypto_key.my_crypto_key"
REGION="europe-west3"
RING_NAME="my-key-ring"
RING_TF_RESOURCE="google_kms_key_ring.my_key_ring"
# Import resources to terraform state (which already exist in gcp)
# Comment the following two lines if you need to create the keys (first time)
terraform import $RING_TF_RESOURCE $REGION/$RING_NAME
terraform import $KEY_TF_RESOURCE  $REGION/$RING_NAME/$KEY_NAME

# Run Terraform
terraform plan -out terraformPlan
terraform apply "terraformPlan"

The destroy.sh script removes them from the terraform state, so they don't throw the error (using terraform state rm). destroy.sh:

#! /bin/sh

## Define variables
KEY_TF_RESOURCE="google_kms_crypto_key.my_crypto_key"
RING_TF_RESOURCE="google_kms_key_ring.my_key_ring"
## Remove resources from terraform state (they should not be deleted from gcp)
terraform state rm $KEY_TF_RESOURCE
terraform state rm $RING_TF_RESOURCE

# Run Terraform
terraform plan -out terraformPlan
terraform destroy

mattclegg commented 5 years ago

For the purpose of debugging this problem:

---
 terraform/eval_check_prevent_destroy.go | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/terraform/eval_check_prevent_destroy.go b/terraform/eval_check_prevent_destroy.go
index 4dff0c84d..fd8fcf48c 100644
--- a/terraform/eval_check_prevent_destroy.go
+++ b/terraform/eval_check_prevent_destroy.go
@@ -9,6 +9,7 @@ import (

    "github.com/hashicorp/terraform/addrs"
    "github.com/hashicorp/terraform/configs"
+   "github.com/hashicorp/terraform/helper/logging"
    "github.com/hashicorp/terraform/tfdiags"
 )

@@ -29,13 +30,19 @@ func (n *EvalCheckPreventDestroy) Eval(ctx EvalContext) (interface{}, error) {
    change := *n.Change
    preventDestroy := n.Config.Managed.PreventDestroy

+   // Show the plan if debugging
+   logLevel := logging.LogLevel()
+   if logLevel == "INFO" {
+       return nil, nil
+   }
+
    if (change.Action == plans.Delete || change.Action.IsReplace()) && preventDestroy {
        var diags tfdiags.Diagnostics
        diags = diags.Append(&hcl.Diagnostic{
            Severity: hcl.DiagError,
            Summary:  "Instance cannot be destroyed",
            Detail: fmt.Sprintf(
-               "Resource %s has lifecycle.prevent_destroy set, but the plan calls for this resource to be destroyed. To avoid this error and continue with the plan, either disable lifecycle.prevent_destroy or reduce the scope of the plan using the -target flag.",
+               "Resource %s has lifecycle.prevent_destroy set, but the plan calls for this resource to be destroyed. To avoid this error and continue with the plan, either disable lifecycle.prevent_destroy or reduce the scope of the plan using the -target flag. If this is unexpected you can find out what is wrong using TF_LOG=INFO.",
                n.Addr.Absolute(ctx.Path()).String(),
            ),
            Subject: &n.Config.DeclRange,
@@ -46,4 +53,4 @@ func (n *EvalCheckPreventDestroy) Eval(ctx EvalContext) (interface{}, error) {
    return nil, nil
 }

-const preventDestroyErrStr = `%s: the plan would destroy this resource, but it currently has lifecycle.prevent_destroy set to true. To avoid this error and continue with the plan, either disable lifecycle.prevent_destroy or adjust the scope of the plan using the -target flag.`
+const preventDestroyErrStr = `%s: the plan would destroy this resource, but it currently has lifecycle.prevent_destroy set to true. To avoid this error and continue with the plan, either disable lifecycle.prevent_destroy or adjust the scope of the plan using the -target flag. If this is unexpected you can find out what is wrong using TF_LOG=INFO.`

mercuriete commented 5 years ago

I have the same use case as @steve-gray. I need something like terraform destroy --skip-protected.

I have some IP addresses that I don't want to be deleted, so I have prevent_destroy on those addresses, and I wanted to save money on test environments by destroying everything except the IP addresses.

Heiko-san commented 4 years ago

Is there any "real" solution to this problem yet?

Actually, I wouldn't even want a "--skip-protected" flag; maybe there should be another lifecycle key besides "prevent_destroy", say "skip_destroy".

My use case is the rancher2 provider. I create a cluster and then create a project and namespace. If I delete the cluster, the namespace is automatically deleted as well, but Terraform tries to destroy the namespace first, which should theoretically work the same. In practice, though, the namespace deletion gets stuck in the "removing" state that way. So I end up clicking delete in the UI and rm-ing my whole state file, which is ... well ... not so good ... if there are more resources in it than just the cluster itself.

Maybe terraform state rm ... would work, but up to now the way I described above has been the less painful one. As the OP correctly stated: this is about productivity, i.e. not wasting time managing the management system.
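A sketch of how the proposed key might read for that namespace, assuming the rancher2 provider's rancher2_namespace resource (and a rancher2_project defined elsewhere); skip_destroy is purely hypothetical:

resource "rancher2_namespace" "app" {
  name       = "app"
  project_id = rancher2_project.app.id # assumed to be defined elsewhere

  lifecycle {
    # Hypothetical: on destroy, silently leave this namespace alone
    # (it disappears with the cluster anyway) instead of erroring out.
    skip_destroy = true
  }
}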

edward2a commented 4 years ago

@Heiko-san I was just thinking of a similar use case, where a lifecycle action could be 'destroy_state_only'.

Heiko-san commented 4 years ago

I feel the name "skip_destroy" would be more intuitive, but in the end it doesn't matter; the effect would be the same.

erickellerek1 commented 4 years ago

  on main.tf line 37:
  37: resource "openstack_blockstorage_volume_v2" "vol" {

Resource openstack_blockstorage_volume_v2.vol[0] has lifecycle.prevent_destroy
set, but the plan calls for this resource to be destroyed. To avoid this error
and continue with the plan, either disable lifecycle.prevent_destroy or reduce
the scope of the plan using the -target flag.

It seems to be a dead end with

  lifecycle {
    prevent_destroy = true
  }

if one does not exclude the appropriate targets when destroying, as explained by @mengesb. Is there still no option to skip this error message?

RobbieMcKinstry commented 4 years ago

This is a real pain for me.

dansali commented 4 years ago

Will this ever get implemented? The logic seems pretty straightforward to me: just add skip_destroy = true.

eleni-salamani commented 4 years ago

+1

christhomas commented 4 years ago

Almost five years and not a single enhancement that would solve this problem. Guys, please, get a grip. Just keep the current force_destroy=(bool) arrangement but add an extra value, force_destroy=(bool | continue), and if a resource would block the plan, check for "continue" and remove that resource from the graph.

We have had five years of discussing this problem and we've come up with good solutions. But what are you guys at Terraform going to do about it? Nothing, it seems.

mercuriete commented 4 years ago

Another solution:

1. Implement a way to list objects without prevent_destroy:

terraform state list --no-prevent-destroy

2. Implement multiple destroy targets:

terraform destroy -targets="target1 target2"

as an alias of:

terraform destroy -target=target1 -target=target2

3. Then I could do this:

terraform destroy -targets="$(terraform state list --no-prevent-destroy)"

This is a lazy way of doing it with the current console API, or something similar. What do you think?

christhomas commented 4 years ago

No, that wouldn't make me happy. I want a configuration option. Not a command line thing.

JonWebster91 commented 4 years ago

No, that wouldn't make me happy. I want a configuration option. Not a command line thing.

I am also after a config option, not command line.

mercuriete commented 4 years ago

@JonWebster91 I totally agree.

Makeshift commented 4 years ago

I ran into this problem today as a Terraform newbie, assuming that prevent_destroy would act as described in the OP. I'm pretty disappointed to find that it just errors out instead of letting me continue without destroying the labelled resources.

What's the suggested workaround for this? I can see four options at the moment:

christhomas commented 4 years ago

Resources that you want to persist across multiple ups and downs should be put into their own stack, where you manage their lifecycle independently from your application. Then, if you pull the application down and recreate it, the databases and other things you're saving from destruction will remain in place.

That's the easiest way I have seen to do this so far.
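A hedged sketch of that pattern (resource types, bucket name, and file layout are illustrative): the long-lived resources get their own root module and state, and the application stack only reads them through data sources.

# stacks/persistent/main.tf  (applied and destroyed on its own schedule)
resource "aws_s3_bucket" "uploads" {
  bucket = "myapp-uploads"

  lifecycle {
    prevent_destroy = true
  }
}

# stacks/app/main.tf  (safe to destroy and recreate freely)
data "aws_s3_bucket" "uploads" {
  bucket = "myapp-uploads"
}

# Application resources then reference data.aws_s3_bucket.uploads.arn
# and friends, and a destroy of the app stack never touches the bucket.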

dandill-eng commented 4 years ago

Same issue; I could use a config flag or fix to ignore resources on destroy. My use case was generating a key pair for my VM using tls_private_key, which I'd like to write back to my existing native cloud vault. The problem is that I just want to write the generated keys to the vault; I don't need to include the vault itself as a state-managed resource, as destroying it would cause problems.

stefankaufmann commented 3 years ago

Same here. I want to create a volume (if not present) and attach it to an instance, but the volume should not be deleted on destroy since I may reuse it. Missing this feature reduces usability a lot.

igarciaes commented 3 years ago

You can create an independent module for your stateful resources with their own lifecycle and define a data source for them in your main module

stefankaufmann commented 3 years ago

You can create an independent module for your stateful resources with their own lifecycle and define a data source for them in your main module

True, but it's pretty annoying since I need to keep them in sync manually. I am a big fan of having one configuration that handles everything. Especially when working in a team, everyone would need to be aware that the scripts have to be executed in the right order.

mengesb commented 3 years ago

You can create an independent module for your stateful resources with their own lifecycle and define a data source for them in your main module

True, but it's pretty annoying since I need to keep them in sync manually. I am a big fan of having one configuration that handles everything. Especially when working in a team, everyone would need to be aware that the scripts have to be executed in the right order.

What @igarciaes says is the correct "Terraform" path for your situation, @stefankaufmann. The aws_instance resource can use an EBS volume and has constructs to attach it as the root device or as an additional device; delete_on_termination is an attribute that can be set on the aws_instance block devices. Controlling the lifecycle of the ebs_volume itself is a different concern (as you expressed), and that may involve prevent_destroy as a lifecycle {} attribute, but what you want to keep is the volume and not the instance (per what you've expressed). This is not the intended use case of my feature request, and it greatly complicates things, since, combined, you want to keep a piece of a resource that you don't want to manage independently. prevent_destroy is not an aws_instance parameter for "parts" of the resource; it applies to the resource as a whole.

The idea of the feature I'd like to see implemented is to allow us to set a prevent_destroy attribute and have it gracefully "fail forward": a warning is emitted that Terraform would have done the delete (WARN), but the attribute prevents it, AND the plan is allowed to continue (walk the DAG, continue operations, and finish). Ideally, this means that if the graph has a hard dependency (like depends_on), then that portion of the tree remains (and must remain), because the code declares it must remain and the prevent_destroy attribute plus the DAG still require it. Ultimately I'll accept any exit code as a result, but I'd still like to see it exit 0 and not 1 (error); if anything, add exit codes so that we can understand the results (2 = OK with warnings, or something).

stefankaufmann commented 3 years ago

@mengesb Thanks for clarifying your request. I agree that my use case complicates things, since I only want to partly remove a resource and it is not clear what to do with the dependencies of that resource. Maybe I just had the wrong expectations of the prevent_destroy feature: since I can add it to any resource in my configuration, I assumed that adding it to the volume only would prevent destruction of the volume, while not adding it to the instance would let the instance be destroyed. But I totally agree that terraform destroy should do whatever is possible and print warnings instead of failing. Just a side note, I am working with OpenStack and not AWS ;-)

ketzacoatl commented 3 years ago

Hi @AzySir! I didn't realize this was so easy, though it sounds like you see how to make the update? Are you frustrated that no one else has figured it out or put in the work to create a PR for the task? I'm no gopher, but maybe you can explain what updates to make, and where, and someone like me could help out? Or maybe you are capable of and interested enough in this issue to make your own PR?

ofbeaton commented 3 years ago

I really hope we get this sometime soon... though there's not much hope after so many years.

I hear people wanting a configuration option, but for me, just having two CLI flags would work great.

handlerbot commented 3 years ago

I came here from https://github.com/hashicorp/terraform/issues/3116 intending to post the same suggestion as @ofbeaton, and I am delighted to see I was beaten to it; I realized this week that a command-line flag that disabled prevent_destroy would handle all of my use cases and remove all of my annoyances with not being able to interpolate a false value into it dynamically.

(I literally yesterday added instructions to run perl -pi -e 's/prevent_destroy.*/prevent_destroy = false # NOCOMMIT/' filenames.tf, run Terraform to destroy the resources in question, and then revert that change. 🙃)

Could we get a comment from Terraform core folks about whether this approach passes baseline muster as an idea to pursue? I'm glad to write the code if there's a reasonable chance it or something approximating it would be accepted.

christhomas commented 3 years ago

The question I haven't yet gotten an answer to, and perhaps the Terraform folks can chip in here, is: in what scenario would this lead to negative consequences? Would skipping "prevent_destroy" resources while successfully deleting the other resources lead to a bad situation where I could have problems?

If I destroy an app but skip the S3 bucket destruction for obvious reasons, and the destroy otherwise completes successfully, what problems am I going to have?

It seems that there is push-back on this solution for some reason, so what I'd like to know is: why do the Terraform devs think that skipping resources labelled as prevent_destroy is a bad solution?

mercuriete commented 3 years ago

@christhomas I think the solution they want us to use is something like a two-repository setup:

  1. a first repository for all the important objects (DBs, networks);
  2. a second repository for non-important objects that point to the first one using data sources.

This solution is not suitable for automatic per-branch environments, but it covers 99% of use cases. That is the idea I came up with after all these years without a clear response from the Terraform team.

christhomas commented 3 years ago

@mercuriete I think you're right; maybe that is a good solution. But you're also right that this doesn't really work well in automated environments.

ketzacoatl commented 3 years ago

There is also quite a bit of tension between the need to break up groups of resources into different projects for cases like this, and the need to group resources together for other purposes (such as operator sanity, setting up CI, updating variables, tracking repos, etc).

RobbieMcKinstry commented 3 years ago

@phinze can we get an update?

eechava66 commented 2 years ago

Any update with this?