hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

Ignore the fact that an object has been deleted #29245

Open BrandonALXEllisSS opened 3 years ago

BrandonALXEllisSS commented 3 years ago

Terraform Version:

 Terraform v0.15.0-beta1

Terraform Configuration Files

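# Region lookup referenced by the provisioner below
data "aws_region" "current" {}

# Instance that deletes itself
resource "aws_instance" "script_instance" {
  # Required arguments (ami, instance_type) and the provisioner's
  # connection settings are not shown in this report.

  provisioner "remote-exec" {
    inline = [
      "aws ec2 terminate-instances --instance-ids ${self.id} --region ${data.aws_region.current.name}"
    ]
  }

  lifecycle {
    ignore_changes = all
  }
}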

Expected Behavior

When the instance is deleted outside of Terraform, subsequent Terraform runs should not try to bring the resource back.

Actual Behavior

Subsequent Terraform runs plan to recreate the deleted resource.

Steps to Reproduce

  1. terraform init
  2. terraform apply
  3. Wait about 2 minutes
  4. 'terraform plan' -- will try to recreate the instance

If this truly is intended behavior, it would be nice to have a lifecycle option that implements the behavior I expected.

apparentlymart commented 3 years ago

Hi @BrandonALXEllisSS! Thanks for sharing this.

ignore_changes is designed to ignore future changes in the configuration, so that Terraform does not plan to update the remote system to match them. An unintended but inevitable consequence is that Terraform also ignores cases where the remote system has changed and no longer matches the configuration. Terraform has no third artifact to compare both the configuration and the remote system against in order to decide which one changed, so when they differ it simply assumes that the configuration changed.

However, ignore_changes can never apply to removing a resource, because removing a resource from the configuration also removes its ignore_changes argument, which effectively disables it.

It sounds like you've interpreted ignore_changes as "ignore changes in the remote system", which is a common reading of a historically badly named option. It is behaving as intended here, though: what you've encountered is a change in the remote system that you want to ignore, not a change in your configuration that you want to ignore.
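To illustrate the distinction, here is a minimal sketch of a case that ignore_changes does cover. It assumes a configured AWS provider; the resource name, AMI ID, instance type, and tag values are placeholders chosen for illustration and do not come from the original report.

```hcl
resource "aws_instance" "example" {
  ami           = "ami-00000000000000000" # placeholder; substitute a real AMI ID
  instance_type = "t3.micro"              # placeholder

  tags = {
    Name = "example"
  }

  lifecycle {
    # Terraform will not plan an update when the tags in this
    # configuration and the tags on the remote instance differ.
    ignore_changes = [tags]
  }
}
```

If this instance is terminated outside of Terraform, the next plan still proposes creating it again, because ignore_changes only suppresses updates to the listed attributes of an object that exists.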

With that said, I'm going to relabel this as an enhancement so we can treat it as a discussion of a new use case for which we might find a new design.

With that in mind, I'd love to hear a little more about the underlying reason for the behavior you're looking for here. As far as I know, we've not seen any previous request for Terraform to ignore that an object doesn't exist, so I'd like to understand what sorts of situations it would be useful in, which might then help us design a feature to meet that need.

In particular, one problematic part of what you described here is the case where the object never existed in the first place. If we had some way to say "ignore this object not existing" then on the first terraform apply Terraform would presumably not plan to create that object in the first place, and so it would never exist and thus it would be pointless to have it in the configuration. I assume you have something more subtle in mind than just "ignore the object not existing", which I think would be easier to understand in terms of an underlying need that this would be one possible solution to.

Thanks!

BrandonALXEllisSS commented 3 years ago

My use case is as follows: I am building a VPC which, for security reasons, is not accessible from the outside by any means other than Amazon WorkSpaces. However, I provisioned a service in the VPC that I need to bootstrap through various API calls... somehow from my local machine...

I can't just "local-exec" some calls to the VPC, so I need some other way of reaching the resources within.

My original approach was to create something like an instance or Lambda function, which I would SSH into or trigger once with a local-exec and then delete, since it's no longer needed. (Though now that I think of it, an instance wouldn't work if there's no SSH endpoint.) (And keeping Lambda functions hanging around doesn't exactly add up costs either...)

Anyway, the idea is basically to make a resource act like a null_resource with a local-exec provisioner and no triggers, where the resource is deployed once and doesn't change until it is removed from the state. Essentially, take CRUD, remove the "RUD" part, and keep just the "C". It really breaks a lot of Terraform's principles, now that I think about it. The same functionality could be replicated with null_resource local-execs that create, trigger, and destroy these resources. But it is handy to have Terraform's providers create the resource for you with proper error handling.
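For reference, the null_resource pattern described above looks roughly like the following sketch. It assumes the hashicorp/null provider is available; the resource name and the script path are hypothetical.

```hcl
resource "null_resource" "bootstrap_once" {
  # No "triggers" argument, so nothing in the configuration ever
  # forces this resource to be replaced.

  provisioner "local-exec" {
    # Creation-time provisioner: runs only when the resource is created.
    command = "./bootstrap.sh" # hypothetical script
  }
}
```

Because the object exists only in the Terraform state, there is nothing in the remote system that could be deleted, so later plans show no changes until the resource is tainted or removed from the state.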

apparentlymart commented 3 years ago

Thanks for sharing those details, @BrandonALXEllisSS.

Indeed, it does seem like what you are aiming at here is outside of Terraform's typical scope. That doesn't mean we won't consider some possibilities for dealing with it, but I want to be up front that we're unlikely to prioritize it in the near future because our resources are limited and so of course we tend to prioritize requests that fit more within Terraform's intended use.

If I'm understanding your scenario correctly, I think you could in principle get the result you need with today's Terraform by splitting your problem into two Terraform configurations. The first one would deal with the "transient" resources that you only need during your bootstrapping process, while the second one would deal with the longer-lived resources. You could then either manually run or automate a sequence of steps to get bootstrapped:

  1. Apply the transient configuration
  2. Apply the long-lived configuration
  3. Destroy the transient configuration

Then in principle moving forward you can just work with the long-lived configuration, although that'll only be true if that configuration is written in such a way that it can tolerate the transient objects being absent after initial creation. The details of how exactly that could work I'm not sure about, since I think that gets into the specifics of your situation, but it might involve an extra step "1a" in the sequence above where something outside of Terraform captures some relevant output values from the transient configuration and saves them somewhere more persistent (e.g. SSM Parameter Store) so that they can outlive the transient configuration that generated them.
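A rough sketch of how the two configurations and the extra step "1a" might fit together is shown below. Everything named here is an assumption made for illustration: the directory layout, the Lambda function name, the output name, and the parameter path /bootstrap/result are all hypothetical, and the Lambda function is assumed to be reachable by the AWS provider.

```hcl
# --- transient/main.tf (hypothetical layout) ---
# Invokes a bootstrap Lambda function once; the function name is a placeholder.
resource "aws_lambda_invocation" "bootstrap" {
  function_name = "vpc-bootstrap" # hypothetical
  input         = jsonencode({ action = "bootstrap" })
}

output "bootstrap_result" {
  value = aws_lambda_invocation.bootstrap.result
}

# Step 1a, run outside Terraform before destroying this configuration:
#   terraform output -raw bootstrap_result
#   aws ssm put-parameter --name /bootstrap/result --type String --value "<that output>"

# --- long-lived/main.tf (hypothetical layout) ---
# Reads the persisted value, so this configuration does not depend on
# the transient objects still existing.
data "aws_ssm_parameter" "bootstrap_result" {
  name = "/bootstrap/result"
}

locals {
  bootstrap_result = data.aws_ssm_parameter.bootstrap_result.value
}
```

The parameter is written outside Terraform on purpose: if the transient configuration managed it as a resource, destroying that configuration in step 3 would delete the parameter as well.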

venkivijay commented 1 year ago

This would be useful when using aws_eip. My use case is as follows: I create an AWS EC2 instance with an AWS Elastic IP (EIP). I also have a start/stop system on my infrastructure which stops the EC2 instance at night. Since a disassociated EIP is charged, I have a Lambda that removes the EIP before shutting down the instance. The same Lambda also creates a new EIP when the instance is started in the morning. Since the EIP created by Terraform was removed outside of Terraform, Terraform will create a replacement resource and attach it to the instance on the next run. I cannot avoid creating the EIP via Terraform, as that would introduce manual work after the initial run.

abrockmeyer-govtact commented 9 months ago

I also have a case for this. I am deploying a blue/green CodeDeploy deployment with an Auto Scaling group (ASG) behind a load balancer. When the blue/green deployment is finished, it terminates the old ASG. When I run a new pipeline to deploy the code again (we use Jenkins), Terraform tries to recreate the resource. This is not what I want to happen. Rather than making additional updates to the code or managing something outside of Terraform, it would be nice to have ignore_changes = ["deleted"] or something similar to indicate that I don't care if this resource no longer exists after the first creation event.