Allow overriding `lifecycle.prevent_destroy` with environment variable

hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.

https://www.terraform.io/

Allow overriding `lifecycle.prevent_destroy` with environment variable #30957

Open 0xch4z opened 2 years ago

0xch4z commented 2 years ago

Current Terraform Version

v1.1.8

Use-cases

It's reasonable to want to protect important resources by marking them with prevent_destroy = true. This is often a useful guard when using some automated deployment tool like Atlantis -- it gives you peace of mind that your very important resource won't get destroyed due to a bug.

The trouble is when you do want to destroy the resource and it's located in a nested module. The lifecycle block has a constraint: the value of prevent_destroy must be a literal that Terraform knows before it evaluates any expressions, so it cannot come from a variable. But how can you convey at runtime that you want to destroy a resource whose configuration you can't edit?

Attempted Solutions

- Assigning prevent_destroy to a variable. This results in the following errors:

╷
│ Error: Unsuitable value type
│
│   on .terraform/modules/some_module/main.tf line 23, in resource "some_resource" "something":
│   23:         prevent_destroy = !local.disable_dns_protection
│
│ Unsuitable value: value must be known
╵
╷
│ Error: Variables not allowed
│
│   on .terraform/modules/some_module/main.tf line 23, in resource "some_resource" "something":
│   23:         prevent_destroy = !local.disable_dns_protection
│
│ Variables may not be used here.
╵

- Adding a destroy-time provisioner which exits with a non-zero code if a variable is not set. This works but is obviously not ideal.

### Proposal
I'm bad at naming, but perhaps we could have an environment variable listing resource paths that *can* be destroyed regardless of their `prevent_destroy` value.
`TF_ALLOW_DESTROY=module.my-module.provider.resource.my-special-resource,module.my-other-module.* terraform destroy`

This would override the `prevent_destroy` for:
- `resource.my-special-resource` in module `my-module`
- any resource in the `my-other-module` module
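
To make the matching semantics concrete, here is a minimal shell sketch of how such a check could behave. Terraform does not read `TF_ALLOW_DESTROY` today; the function name `destroy_allowed` and the glob-style handling of `*` are assumptions made purely for illustration.

```shell
# Hypothetical sketch: Terraform has no TF_ALLOW_DESTROY variable today.
# Succeeds if the resource address matches any comma-separated pattern
# in TF_ALLOW_DESTROY, where `*` matches any run of characters.
# The body runs in a subshell so the IFS and `set -f` changes stay local.
destroy_allowed() (
  address=$1
  IFS=,
  set -f  # keep `*` in patterns from being expanded against file names
  for pattern in ${TF_ALLOW_DESTROY:-}; do
    case $address in
      $pattern) exit 0 ;;  # unquoted, so it is matched as a shell pattern
    esac
  done
  exit 1  # no pattern matched, or the variable is unset or empty
)
```

With the value from the proposal, `destroy_allowed 'module.my-other-module.aws_eip.nat'` would succeed, while an address in any unlisted module would fail. Addresses containing `[` (for example counted instances) would need escaping under this sketch, because shell patterns treat brackets specially.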

crw commented 2 years ago

Thanks for taking the time to fill out this enhancement request! I've added it to our list of Lifecycle enhancements.

Future viewers, please use the 👍 reaction on the original post to support this issue.

dbarvitsky commented 2 years ago

My $0.02 with some motivation.

This becomes a notable pain in the neck when you deal with third-party Terraform modules that chose to protect resources with prevent_destroy (by the way, if you author those, please never ever do this; it is a terrible idea).

Real-life situation: we just did a trial with a company, deployed their tooling with Terraform, decided not to go forward with them, and now have to reverse-engineer their TF modules to clean things up.

While searching for a way to do this, I came across an article on "parameterizing" the prevent_destroy, which is recommending a rather grotesque way of doing the same thing, arguably reducing the value of prevent_destroy to 0. IMHO, also not a great approach.

I personally would have been much happier if prevent_destroy resources were just skipped during tf destroy and actually nuked if they are specified through tf destroy --target my.precious.thing with some flags along the lines of "yes I really mean it, Terraform".

Meanwhile, I'd argue against prevent_destroy use altogether, unless it is in a top-level module and is managed by you exclusively.

burkesbi commented 2 years ago

The TF_ALLOW_DESTROY environment variable with explicit naming would be great, or an additional target on the command line such as --force-destroy-target or something.

I'm currently tearing down a lot of old systems we've replaced, and there are prevent_destroy settings everywhere for EIPs and S3 buckets. Having to dig up the relevant Terraform module and patch it with = false is both tedious and error-prone.

Just some way of saying "I'm a human who actually had to think about this" would be good. :)
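
As a small aid for the "dig up the relevant module" step, the sketch below only locates the protected resources and patches nothing. It assumes the modules were downloaded by `terraform init` into `.terraform/modules` (the path that also appears in the error output earlier in this thread); the function name `find_prevent_destroy` is made up for this example.

```shell
# List every `prevent_destroy = true` line in the .tf files under a directory.
# Defaults to .terraform/modules, where `terraform init` stores downloaded modules.
find_prevent_destroy() {
  dir=${1:-.terraform/modules}
  # /dev/null is passed as an extra file so grep always prints the file name.
  find "$dir" -type f -name '*.tf' \
    -exec grep -nE 'prevent_destroy[[:space:]]*=[[:space:]]*true' /dev/null {} +
}
```

Each output line has the form `path:line:text`, which at least narrows down which files would need the manual `= false` patch.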

teneko commented 2 years ago

A workaround that only works for resources and environments having access to bash:

provisioner "local-exec" {
  interpreter = ["bash", "-c"]
  when        = destroy
  command     = "if [ \"$${TF_VAR_allow_destroy,,}\" = \"true\" ]; then exit 0; else echo 'Destruction has been prevented. Requires \"TF_VAR_allow_destroy=true\" environment variable to proceed!'; exit 1; fi"
  on_failure  = fail
}

⚠️ Commenting this out has the same effect as commenting out lifecycle { prevent_destroy = true }.

Some fun facts:

- "TF_VAR_allow_destroy" can be any name you want.
- $${} escapes Terraform's ${}.
- "$${TF_VAR_allow_destroy,,}" makes bash convert the value to lowercase.
- You should make other resources dependent on the resource having this provisioner that you don't want to have removed, otherwise Terraform will happily proceed to remove other resources that do not have this provisioner.
- Keep in mind that provisioner order matters: provisioners defined earlier are executed first.
- Because it is a destroy-time provisioner, you cannot reference ANY variables outside of the resource.

For more details about destroy-time provisioners: https://www.terraform.io/language/resources/provisioners/syntax#destroy-time-provisioners
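
For trying the check outside Terraform, here is a standalone variant of the same logic. It is a sketch, not the provisioner itself: the function name `allow_destroy_guard` is made up, the single `$` replaces Terraform's `$$` escape, and `tr` is swapped in for bash's `,,` expansion so that it also runs in a plain POSIX `sh`.

```shell
# Standalone variant of the destroy guard for experimenting outside Terraform.
# Assumption: `tr` replaces bash's ${var,,} so this also works in POSIX sh.
allow_destroy_guard() {
  # Lowercase the value so TRUE, True and true are all accepted.
  value=$(printf '%s' "${TF_VAR_allow_destroy:-}" | tr '[:upper:]' '[:lower:]')
  if [ "$value" = "true" ]; then
    return 0
  fi
  echo 'Destruction has been prevented. Requires "TF_VAR_allow_destroy=true" environment variable to proceed!' >&2
  return 1
}
```

With the provisioner in place, a run such as `TF_VAR_allow_destroy=true terraform destroy` passes the check, while a run without the variable fails it, because local-exec commands inherit the environment of the Terraform process.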

ausfestivus commented 1 year ago

In my use case, I want to have prevent_destroy = true when var.landscape = prod. Otherwise, it should be prevent_destroy = false.

If prevent_destroy were just a standard attribute on all resources, instead of being nested under the lifecycle meta-argument block, it would be easy for me to implement this using normal conditional attribute syntax.

windonis commented 1 year ago

Amazing idea! Thanks

gorecki-k commented 1 year ago

Hi, I have an additional use case. Some resources have flags like force_destroy, for instance google_storage_bucket, where the destroy is forced even though the bucket is not empty. Without the flag, Terraform attempts the destroy and returns an error that the bucket can't be destroyed, leaving the environment in an unstable state that requires manual intervention. So I would like Terraform to not even try to destroy the bucket unless that flag is set to true, by tying it to prevent_destroy:

resource "google_storage_bucket" "bucket" {
  name                        = var.name
  project                     = var.project_id
  location                    = var.location
  force_destroy               = var.force_destroy
  lifecycle {
    prevent_destroy = !var.force_destroy # prevent destroy enabled if force destroy is disabled
  }
}

patrickherrera commented 1 year ago

Thanks for sharing that idea. I've just implemented it with a few tweaks that people might benefit from (until the underlying issue is actually fixed, of course). This requires Terraform 1.4, but maybe it could be done with a null_resource too. The idea is to establish the check as something that needs to run before everything else that you want to protect from deletion (ideally that is everything, otherwise some things get deleted before the specifically protected ones). By making everything a dependency of this resource, this resource gets destroyed first when you come to destroy, so it will "fail fast" if the flag is not set.

resource "terraform_data" "check_prevent_destroy" {
  provisioner "local-exec" {
    interpreter = ["bash", "-c"]
    when        = destroy
    on_failure  = fail
    # Destruction is blocked unless TF_VAR_prevent_destroy is explicitly "false".
    # $${...} escapes Terraform interpolation so that bash sees ${...}.
    command     = <<-EOF
      if [ "$${TF_VAR_prevent_destroy:-true}" != "false" ]; then
        echo 'Destruction has been prevented. Set `TF_VAR_prevent_destroy` (as an external ENV variable) to `false` to enable'
        exit 1
      fi
    EOF
  }

  # Add all the things you want to protect here, which ensures that this resource is executed first when destroying
  depends_on = [
    aws_msk_cluster.msk_cluster,
    aws_msk_configuration.cluster,

    aws_kms_key.msk_encryption_key,
    aws_kms_alias.msk_encryption_key_alias,

    # etc...
  ]
}

I haven't done extensive testing, but can confirm that on a destroy this gets run first.

ncrothe commented 1 year ago

Related use case: we set up a set of resources and flag one or more with prevent_destroy to prevent accidental deletion. Then we deploy that set to multiple environments. Later, we want to remove one of those environments and destroy all related resources, which fails due to the protection. We would very much like a "Yes, I really, really want to destroy stuff" option.

And like dbarvitsky says: skipping protected resources instead of failing the whole action would also be very helpful.

simonebenati commented 10 months ago

Hello @patrickherrera, could you share more information on how to use your workaround? I copy-pasted it into my code, but the terraform_data resource isn't being parsed at all and therefore isn't preventing resource deletion.

patrickherrera commented 10 months ago

Hi @simonebenati, what was the specific operation you were performing? This only works for a terraform destroy, because that is the only circumstance under which the "destroy" provisioner of the terraform_data resource is called. @teneko's solution has the advantage that it will also run to prevent a destructive change (i.e. the protected resource needs to be destroyed and recreated in order to effect a change), and it will also cope with changes like reducing a count value on a resource. I don't think there is any way to support resources being removed entirely from the TF source files, as TF no longer knows about the provisioners in the first place.