hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

on-destroy provisioners not being executed #13549

Closed: IOAyman closed this issue 4 months ago

IOAyman commented 7 years ago

Terraform Version

v0.9.2

Affected Resource(s)

digitalocean_droplet

Terraform Configuration Files

...
resource "digitalocean_droplet" "kr_manager" {
  name     = "${var.do_name}"
  image    = "${var.do_image}"
  region   = "${var.do_region}"
  size     = "${var.do_size}"
  ssh_keys = [XXX]

  provisioner "local-exec" {
    command = "echo ${digitalocean_droplet.kr_manager.ipv4_address} >> hosts"
  }
  provisioner "remote-exec" {
    inline = ["dnf install -y python python-dnf"]
    connection {
      type        = "ssh"
      user        = "${var.ssh_user}"
      private_key = "${file(var.ssh_key)}"
      timeout     = "1m"
    }
  }
  provisioner "local-exec" {
    command = "ansible-playbook ${var.play}"
  }
  provisioner "local-exec" {
    command = "docker-machine create --driver generic --generic-ip-address ${digitalocean_droplet.kr_manager.ipv4_address} --generic-ssh-key ${var.ssh_key} ${var.do_name}"
  }
  provisioner "local-exec" {
    when    = "destroy"
    command = "rm hosts"
  }
  provisioner "local-exec" {
    when    = "destroy"
    command = "docker-machine rm -f ${var.do_name}"
  }
}
...

Debug Output

https://gist.github.com/IOAyman/3e86d9c06d03640786184c1429376328

Expected Behavior

It should have run the on-destroy provisioners

Actual Behavior

It did not run the on-destroy provisioners

Steps to Reproduce

  1. terraform apply -var-file=infrasecrets.tfvars
  2. terraform destroy -var-file=infrasecrets.tfvars


mkrakowitzer commented 4 years ago

I have this issue too inside a module.

  lifecycle {
    create_before_destroy = true
  }
  provisioner "local-exec" {
    when    = destroy
    command = "triton-docker exec -i ${self.name} consul leave"
  }

This works when run with terraform destroy, but not with terraform apply when the resources are destroyed and replaced with new versions.

If I remove the create_before_destroy = true, then terraform apply works as expected and executes the local-exec, destroys the resource, and creates a new resource, but I don't want to set this to false.

kidmis commented 3 years ago

Guys, is there any update? Can we expect some additional values for the when argument, something like tainted or on_destroy?

Additionally, these could be combined in lists for best effort, e.g.: when = [ tainted, on_destroy ]

shebang89 commented 3 years ago

@kidmis I wouldn't expect this to happen anytime soon, as teamterraform said in 2019:

Hi everyone,

This issue is labelled as an enhancement because the initial design of the destroy-time provisioners feature intentionally limited the scope to run only when the configuration is still present, to allow this feature to be shipped without waiting for a more complicated design that would otherwise have prevented the feature from shipping at all.

We're often forced to make compromises between shipping a partial feature that solves a subset of problems vs. deferring the whole feature until a perfect solution is reached, and in this case we decided that having this partial functionality was better than having no destroy-time provisioner functionality at all.

The limitations are mentioned explicitly in the documentation for destroy-time provisioners, and because provisioners are a last resort we are not prioritizing any development for bugs or features relating to provisioners at this time. We are keeping this issue open to acknowledge and track the use-case, but do not expect to work on it for the foreseeable future.

Please note also that our community guidelines call for kindness, respect, and patience. We understand that it is frustrating for an issue to not see any updates for a long time, and we hope this comment helps to clarify the situation and allow you all to look for alternative approaches to address the use-cases you were hoping to meet with destroy-time provisioners.

I suggest you use terraform destroy -target=RESOURCE, as it triggers the destroy-time provisioners. If resource dependencies are chained correctly, triggering this on a top-level resource in the dependency tree can do the job for the entire tree. This can be used for automated deployments like this:

  1. terraform destroy -target=RESOURCE
  2. terraform plan
  3. terraform apply -auto-approve
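
A minimal sketch of that sequence as a shell script, assuming a placeholder resource address null_resource.example for the top-level resource (substitute your own):

#!/bin/sh
set -e

# 1. Destroy only the targeted resource so its destroy-time provisioners run
terraform destroy -target=null_resource.example -auto-approve

# 2. Review what will be recreated
terraform plan

# 3. Recreate the destroyed resources
terraform apply -auto-approve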

ganniterix commented 3 years ago

I want to add another use case: creating instances with the vSphere provider inside a module. The decommissioning script does not get executed when the parent module is removed from the workspace.

resource "null_resource" "decomission" {
  lifecycle {
    create_before_destroy = true
  }

  triggers = {
    id          = vsphere_virtual_machine.instance.id
    user        = var.vm_config.configuration_profile.provisioning_host_user
    private_key = var.vm_config.configuration_profile.provisioning_host_key
    host        = split("/", local.vm_config.network_layout.nics[0].ipaddresses[0])[0]

    bastion_host        = var.vm_config.configuration_profile.provisioning_bastion_use ? var.vm_config.configuration_profile.provisioning_bastion_host : null
    bastion_user        = var.vm_config.configuration_profile.provisioning_bastion_use ? var.vm_config.configuration_profile.provisioning_bastion_user : null
    bastion_private_key = var.vm_config.configuration_profile.provisioning_bastion_use ? var.vm_config.configuration_profile.provisioning_bastion_key : null
    bastion_use         = var.vm_config.configuration_profile.provisioning_bastion_use
  }

  provisioner "file" {
    when = destroy

    destination = "/tmp/decommision.sh"
    content     = templatefile("${path.module}/scripts/decommision.tpl", {})

    connection {
      type        = "ssh"
      user        = self.triggers.user
      private_key = self.triggers.private_key
      host        = self.triggers.host

      bastion_host        = self.triggers.bastion_use ? self.triggers.bastion_host : null
      bastion_user        = self.triggers.bastion_use ? self.triggers.bastion_user : null
      bastion_private_key = self.triggers.bastion_use ? self.triggers.bastion_private_key : null
    }
  }

  provisioner "remote-exec" {
    when = destroy

    inline = [
      "bash -x /tmp/decommision.sh"
    ]

    connection {
      type        = "ssh"
      user        = self.triggers.user
      private_key = self.triggers.private_key
      host        = self.triggers.host

      bastion_host        = self.triggers.bastion_use ? self.triggers.bastion_host : null
      bastion_user        = self.triggers.bastion_use ? self.triggers.bastion_user : null
      bastion_private_key = self.triggers.bastion_use ? self.triggers.bastion_private_key : null
    }
  }
}

Running this with anything other than terraform apply is not really an option, since this is meant to run in a VCS-managed Terraform Enterprise workspace.

kayman-mk commented 2 years ago

Any news here? terraform destroy is not an option, as the problem occurs in a 3rd-party module. Since I do not know its internals, running the destroy command makes no sense. It would really be better if this worked with terraform apply.

stonefield commented 2 years ago

It is difficult to understand the Terraform team's reasoning here. This ticket has been open for more than 4 years, and I believe anyone with this need does not understand why when = destroy is tied to the terraform destroy command. When a module has been removed, Terraform's output clearly states that the resource will be destroyed, hence this is a bug and not an enhancement. I firmly believe that anything related to destroy-time triggers should be stored in the state file.

ganniterix commented 2 years ago

It would be great if something could be done about this.

arbourd commented 2 years ago

I'm confused.

when = destroy (in my case, #31266) works fine with terraform apply as long as the resource is not using the create_before_destroy lifecycle hook.

If this is the case: shouldn't this issue be out-of-date, closed, and replaced with something more accurate? @jbardin?

jbardin commented 2 years ago

@arbourd, Sorry I'm not sure what you mean. This issue is tracking the cases where destroy provisioners can't currently be executed, with deposed resources from create_before_destroy being one of those cases.

arbourd commented 2 years ago

Right, but this issue is very old and has a bunch of complaints about it not working on "apply" at all.

Are the two major issues right now:

Is that correct?

jbardin commented 2 years ago

@arbourd, unfortunately any longstanding issues tend to accumulate large amounts of unrelated or unnecessary comments. Destroy provisioners were implemented with some fundamental shortcomings which are difficult to incorporate into the normal resource lifecycle. The summary above is still representative of the current situation.

arbourd commented 2 years ago

For those of us using ssh provisioners with remote-exec, there is a resource that works pretty much the same: https://github.com/loafoe/terraform-provider-ssh

when = "destroy" support was just added
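
For illustration, a minimal sketch of a destroy-time cleanup using that provider; the attribute names (host, user, private_key, commands, when) follow my reading of the provider's README and the variables are placeholders, so verify the exact schema in the registry before relying on it:

terraform {
  required_providers {
    ssh = {
      source = "loafoe/ssh"
    }
  }
}

# Assumed schema; check the loafoe/ssh provider documentation.
resource "ssh_resource" "cleanup" {
  when = "destroy" # run the commands when this resource is destroyed

  host        = var.instance_ip   # placeholder
  user        = var.ssh_user      # placeholder
  private_key = file(var.ssh_key) # placeholder

  commands = [
    "consul leave",
  ]
}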

TomKulakov commented 1 year ago

Can you at least inherit from or clone null_resource into a new kind of resource (custom_resource is taken, so maybe undetermined_resource) and implement the proper behaviour: an update/change handler for when the resource is being replaced, and actually running the destroy provisioner when the resource is destroyed, not only when terraform destroy is executed? I'm not a specialist here, just throwing out an idea based on my experience with Terraform, what I've read here and in other similar topics, and my own judgement. ;)

Now for some more confirmation spam: I can confirm the same issue here in v1.3.7.

Apply does not trigger the destroy provisioner, but the output claims the resource was destroyed:

module.export_data.null_resource.export1: Destroying... [id=12345678901234567890]
module.export_data.null_resource.export1: Destruction complete after 0s

And when I used terraform destroy, the behaviour was correct:

module.export_data.null_resource.export1: Destroying... [id=12345678901234567890]
module.export_data.null_resource.export1: Provisioning with 'local-exec'...
module.export_data.null_resource.export1: (local-exec): Executing: ["/bin/sh" "-c" "echo \"DESTROYING  XO XO XO XO XO\"\n"]
module.export_data.null_resource.export1: (local-exec): DESTROYING  XO XO XO XO XO
module.export_data.null_resource.export1: Destruction complete after 0s

Also, an apply that causes a replacement does trigger the destroy provisioner as well. It would be better to have an update/change handler instead.

Provisioner definition:

    provisioner "local-exec" {
        command = <<EOT
            echo "DESTROYING XO XO XO XO XO"
            EOT
        on_failure = fail
        when = destroy
    }

Right now, to bypass this problem, I'm creating a purging null_resource which has to be removed from the code after some time. It's extremely uncomfortable and unexpected for someone using my modules (I now have two of those) for the first time. Alternatively, the resource could be removed manually, if someone has the permissions of course, but then it's not following IaC.
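
As a guess at what that temporary purge pattern might look like (the resource name and cleanup command are purely illustrative): add the resource, apply once so its create-time provisioner performs the cleanup the skipped destroy provisioner should have done, then remove it from the configuration again.

resource "null_resource" "purge_old_export" {
  provisioner "local-exec" {
    # Runs on apply, when this temporary resource is created
    when    = create
    command = "echo 'clean up whatever the skipped destroy provisioner left behind'"
  }
}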

artuross commented 1 year ago

I am surprised this issue is still open after all this time. I have the simplest case: I want to recreate my k8s control planes and for obvious reasons, I first need new servers to be created before I can destroy old servers.

Here's an example to illustrate:

locals {
  hash = "change this to whatever you want and reapply"
}

resource "random_string" "node" {
  length  = 3
  lower   = true
  special = false
  numeric = false
  upper   = false

  keepers = {
    user_data = sha256(local.hash)
  }
}

resource "null_resource" "create" {
  triggers = {
    hash = local.hash
    node = random_string.node.id
  }

  provisioner "local-exec" {
    when    = create
    command = "echo create ${self.triggers.hash} ${self.triggers.node}"
  }

  lifecycle {
    create_before_destroy = true
  }

  depends_on = [
    random_string.node,
  ]
}

resource "null_resource" "destroy" {
  triggers = {
    hash = local.hash
    node = random_string.node.id
  }

  provisioner "local-exec" {
    when    = destroy
    command = "echo destroy ${self.triggers.hash} ${self.triggers.node}"
  }

  lifecycle {
    # comment line below to see the difference
    create_before_destroy = true
  }

  depends_on = [
    random_string.node,
    null_resource.create,
  ]
}

And the output:

random_string.node: Creating...
random_string.node: Creation complete after 0s [id=kjc]
null_resource.create: Creating...
null_resource.create: Provisioning with 'local-exec'...
null_resource.create (local-exec): Executing: ["/bin/sh" "-c" "echo create change this to whatever you want and reapply kjc"]
null_resource.create (local-exec): create change this to whatever you want and reapply kjc
null_resource.create: Creation complete after 0s [id=3429720091763885346]
null_resource.destroy: Creating...
null_resource.destroy: Creation complete after 0s [id=948398361774632729]
null_resource.destroy (deposed object d889f24e): Destroying... [id=5541652494173857564]
null_resource.destroy: Destruction complete after 0s
null_resource.create (deposed object ad4d07d0): Destroying... [id=169389285284865921]
null_resource.create: Destruction complete after 0s
random_string.node (deposed object 892cf9f8): Destroying... [id=jtq]
random_string.node: Destruction complete after 0s

The server (null_resource) is first created, and only after that is the previous server destroyed. Too bad the destroy command does not run.

On the other hand, if I comment out create_before_destroy = true in null_resource.destroy, my server is destroyed:

null_resource.destroy: Destroying... [id=948398361774632729]
null_resource.destroy: Provisioning with 'local-exec'...
null_resource.destroy (local-exec): Executing: ["/bin/sh" "-c" "echo destroy change this to whatever you want and reapply kjc"]
null_resource.destroy (local-exec): destroy change this to whatever you want and reapply kjc
null_resource.destroy: Destruction complete after 0s
random_string.node: Creating...
random_string.node: Creation complete after 0s [id=xtp]
null_resource.create: Creating...
null_resource.create: Provisioning with 'local-exec'...
null_resource.create (local-exec): Executing: ["/bin/sh" "-c" "echo create change this to whatever you want and reapply2 xtp"]
null_resource.create (local-exec): create change this to whatever you want and reapply2 xtp
null_resource.create: Creation complete after 0s [id=2523110283929799885]
null_resource.destroy: Creating...
null_resource.destroy: Creation complete after 0s [id=5905504673339125500]
null_resource.create (deposed object fabd7d77): Destroying... [id=3429720091763885346]
null_resource.create: Destruction complete after 0s
random_string.node (deposed object e916dd0f): Destroying... [id=kjc]
random_string.node: Destruction complete after 0s

Too bad I've lost all my data.

How is that not a valid use case?

shizzit commented 1 year ago

It seems absolutely bonkers that this has been an issue for so long. Something like what @kidmis suggested would be perfect here, IMO. I'm experiencing this on the latest release (v1.5.4 as of writing).

akamac commented 11 months ago

We need this to safely update instance types for EKS managed node groups, which cannot be done in-place and forces re-creation.

VickyPenkova commented 11 months ago

Is there any update here? Using the latest version of Terraform, I am hitting the same issue: the resource I'm changing gets created with the new config, but the old one is never destroyed. This happens with Route 53 records, and it's essential functionality for us, as we end up with duplicate records.

giner commented 8 months ago

What if, instead of supporting only destroy for provisioners, there were a separate resource supporting the whole resource lifecycle, similar to how the external provider works for data sources? I guess this would cover most if not all cases.

jbardin commented 8 months ago

@giner, yes a separate managed resource is often the preferred solution here. Either a custom one which suits your use case, or something more generic like “external” which runs different commands at various points in the resource instance lifecycle. I think there are some existing examples in the public registry.
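
One such example from the public registry (my own illustration, not a recommendation made in the thread) is the community scottwinkler/shell provider, whose shell_script resource runs user-supplied commands at each lifecycle stage; the schema below is from memory, so check the provider docs before use:

terraform {
  required_providers {
    shell = {
      source = "scottwinkler/shell"
    }
  }
}

# Assumed schema; verify against the provider documentation.
resource "shell_script" "lifecycle_hooks" {
  lifecycle_commands {
    create = file("${path.module}/scripts/create.sh") # e.g. register the node
    delete = file("${path.module}/scripts/delete.sh") # e.g. deregister / "consul leave" on destroy
  }

  environment = {
    NODE_NAME = "example" # placeholder
  }
}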

github-actions[bot] commented 3 months ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.