hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

Error when instance changed that has EBS volume attached #2957

Closed: bloopletech closed this issue 7 years ago

bloopletech commented 9 years ago

This is the specific error I get from terraform:

aws_volume_attachment.admin_rundeck: Destroying...
aws_volume_attachment.admin_rundeck: Error: 1 error(s) occurred:

* Error waiting for Volume (<vol id>) to detach from Instance: <instance id>
Error applying plan:

3 error(s) occurred:

* Error waiting for Volume (<vol id>) to detach from Instance: <instance id>
* aws_instance.admin_rundeck: diffs didn't match during apply. This is a bug with Terraform and should be reported.
* aws_volume_attachment.admin_rundeck: diffs didn't match during apply. This is a bug with Terraform and should be reported.

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

We are building out some infrastructure in EC2 using terraform (v0.6.0). I'm currently working out our persistent storage setup. The strategy I'm planning is to have the root volume of every instance be ephemeral, and to move all persistent data to a separate EBS volume (one persistent volume per instance). We want this to be as automated as possible of course.

Here is a relevant excerpt from our terraform config:

resource "aws_instance" "admin_rundeck" {
  ami = "${var.aws_ami_rundeck}"
  instance_type = "${var.aws_instance_type}"
  subnet_id = "${aws_subnet.admin_private.id}"
  vpc_security_group_ids = ["${aws_security_group.base.id}", "${aws_security_group.admin_rundeck.id}"]
  key_name = "Administration"

  root_block_device {
    delete_on_termination = false
  }

  tags {
    Name = "admin-rundeck-01"
    Role = "rundeck"
    Application = "rundeck"
    Project = "Administration"
  }
}

resource "aws_ebs_volume" "admin_rundeck" {
  size = 500
  availability_zone = "${var.default_aws_az}"
  snapshot_id = "snap-66fc2258"
  tags = {
    Name = "Rundeck Data Volume"
  }
}

resource "aws_volume_attachment" "admin_rundeck" {
  device_name = "/dev/xvdf"
  instance_id = "${aws_instance.admin_rundeck.id}"
  volume_id = "${aws_ebs_volume.admin_rundeck.id}"

  depends_on = "aws_route53_record.admin_rundeck"

  connection {
    host = "admin-rundeck-01.<domain name>"
    bastion_host = "${aws_instance.admin_jumpbox.public_ip}"
    timeout = "1m"
    key_file = "~/.ssh/admin.pem"
    user = "ubuntu"
  }

  provisioner "remote-exec" {
    script = "mount.sh"
  }

  provisioner "remote-exec" {
    inline = [
      "sudo mkdir -m 2775 /data/rundeck",
      "sudo mkdir /data/rundeck/data /data/rundeck/projects && sudo chown -R rundeck:rundeck /data/rundeck",
      "sudo service rundeckd restart"
    ]
  }
}

And mount.sh:

#!/bin/bash

while [ ! -e /dev/xvdf ]; do sleep 1; done

fstab_string='/dev/xvdf /data ext4 defaults,nofail,nobootwait 0 2'
if ! grep -q -F "$fstab_string" /etc/fstab; then
  echo "$fstab_string" | sudo tee -a /etc/fstab
fi

sudo mkdir -p /data && sudo mount -t ext4 /dev/xvdf /data

As you can see, this creates the instance, creates the EBS volume from a snapshot, attaches the volume to the instance, and then mounts it at /data via the provisioners.

This works fine the first time it's run. The problem appears whenever we make a change that forces the aws_instance to be re-created.

Terraform then tries to detach the extant volume from the instance, and this task fails every time. I believe this is because you are meant to unmount the ebs volume from inside the instance before detaching the volume. The problem is, I can't work out how to get terraform to unmount the volume inside the instance before trying to detach the volume.

It's almost like I need a provisioner to run before the resource is created, or a provisioner to run on destroy (obviously https://github.com/hashicorp/terraform/issues/386 comes to mind).

This feels like it would be a common problem for anyone working with persistent EBS volumes using terraform, but my googling hasn't really found anyone even having this problem.

Am I simply doing it wrong? I'm not worried about how I get there specifically; I would just like to be able to provision persistent EBS volumes, and then attach and detach those volumes to and from my instances in an automated fashion.

jarias commented 9 years ago

Having the same issue here.

febbraro commented 9 years ago

I'm also having this issue. I have to detach the volume manually in the AWS Console for Terraform to complete my apply operation.

tobyclemson commented 9 years ago

I too am having this problem. Would it be enough to destroy the instance rather than trying to destroy the volume association?

danabr commented 9 years ago

We're also having the same issue.

One solution is to stop the instance that has mounted the volume before running terraform apply. From the AWS CLI documentation: "Make sure to unmount any file systems on the device within your operating system before detaching the volume. Failure to do so results in the volume being stuck in a busy state while detaching."

This might be what we are seeing here.
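
In practice the workaround looks roughly like this (the instance ID is a placeholder; stop whichever instance has the volume mounted):

# stop the instance so the filesystem is no longer mounted, then apply
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0
terraform apply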

james-s-nduka commented 9 years ago

This bug has become quite critical to us. Is anyone looking into this currently?

Pryz commented 9 years ago

Same issue here. Any update ? Thanks

danabr commented 9 years ago

One solution would be to stop the associated instance before removing the volume attachment. Perhaps this is too intrusive to do automatically, though.

ryedin commented 9 years ago

same issue... and I don't think udev helps here (does udev publish an event when a device is attempting to detach?)

EDIT: tried adding force_detach option... no dice
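
For reference, the force_detach argument on aws_volume_attachment is what was tried here; a minimal sketch, reusing the resource names from the original report:

resource "aws_volume_attachment" "admin_rundeck" {
  device_name  = "/dev/xvdf"
  instance_id  = "${aws_instance.admin_rundeck.id}"
  volume_id    = "${aws_ebs_volume.admin_rundeck.id}"

  # ask AWS to forcibly detach on destroy; as noted above, this still
  # hangs in practice while the filesystem is mounted inside the instance
  force_detach = true
}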

bitoiu commented 9 years ago

Same issue here :cry:

JesperTerkelsen commented 9 years ago

I guess Terraform should terminate instances before removing attachments by default on a full terraform destroy?

simonluijk commented 9 years ago

@JesperTerkelsen As long as your application can shutdown gracefully within the 20 seconds given by AWS that makes sense.

nimbusscale commented 9 years ago

Me too!

j0nesin commented 9 years ago

I also needed to persist ebs volumes between instance re-creates and experienced this problem when trying to use volume_attachments. My workaround solution is to drop the "aws_volume_attachment"s and have each instance use the aws cli at bootup time to self-attach the volume it is paired with. When the instance is re-created terraform first destroys the instance which detaches the volume and makes it available for the next instance coming up.

In the instance user-data include the following template script elasticsearch_mount_vol.sh

INSTANCE_ID=`curl http://169.254.169.254/latest/meta-data/instance-id`

# wait for ebs volume to be attached
while :
do
    # self-attach ebs volume
    aws --region us-east-1 ec2 attach-volume --volume-id ${volume_id} --instance-id $INSTANCE_ID --device ${device_name}

    if lsblk | grep ${lsblk_name}; then
        echo "attached"
        break
    else
        sleep 5
    fi
done

# create fs if needed
if file -s ${device_name} | grep "${device_name}: data"; then
    echo "creating fs"
    mkfs -t ext4 ${device_name}
fi

# mount it
mkdir ${mount_point}
echo "${device_name}       ${mount_point}   ext4    defaults,nofail  0 2" >> /etc/fstab
echo "mounting"
mount -a

resource "aws_ebs_volume" "elasticsearch_master" {
    count = 3
    availability_zone = "${lookup(var.azs, count.index)}"
    size = 8
    type = "gp2"
    tags {
        Name = "elasticsearch_master_az${count.index}.${var.env_name}"
    }
}

resource "template_file" "elasticsearch_mount_vol_sh" {
    filename = "${path.module}/elasticsearch_mount_vol.sh"
    count = 3
    vars {
        volume_id = "${element(aws_ebs_volume.elasticsearch_master.*.id, count.index)}"
        lsblk_name = "xvdf"
        device_name = "/dev/xvdf"
        mount_point = "/esvolume"
    }
}

resource "aws_instance" "elasticsearch_master" {
    count = 3
    ...
    user_data = <<SCRIPT
#!/bin/bash

# Attach and Mount ES EBS volume
${element(template_file.elasticsearch_mount_vol_sh.*.rendered, count.index)}

SCRIPT
}

jimconner commented 8 years ago

Same issue here - would be nice if terraform had support for 'deprovisioners' so that we could execute some steps (such as a shutdown -h now) before machine destruction is attempted. We did find that if we did a terraform taint on the instance before terraform destroy then the destruction is completed successfully, so we'll use that as a workaround for now.
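
For reference, the workaround is roughly the following, using the instance resource name from the original report:

# mark the instance so it is destroyed (and the volume detached) cleanly
terraform taint aws_instance.admin_rundeck
terraform destroy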

jniesen commented 8 years ago

I have a related issue with an instance and EBS volume. I think a solution to my problem may fix this as well. With version 0.6.3, when destroying, it seems that the volume attachment is always destroyed before the instance.

consul_keys.ami: Refreshing state... (ID: consul)
aws_security_group.elb_sg: Refreshing state... (ID: sg-xxxx)
aws_ebs_volume.jenkins_master_data: Refreshing state... (ID: vol-xxxx)
aws_security_group.jenkins_sg: Refreshing state... (ID: sg-xxxx)
aws_instance.jenkins_master: Refreshing state... (ID: i-xxxx)
aws_elb.jenkins_elb: Refreshing state... (ID: jniesen-jenkins-master-elb)
aws_volume_attachment.jenkins_master_data_mount: Refreshing state... (ID: vai-xxxx)
aws_route53_record.jenkins: Refreshing state... (ID: xxxx)
aws_volume_attachment.jenkins_master_data_mount: Destroying...
aws_route53_record.jenkins: Destroying...
aws_route53_record.jenkins: Destruction complete
aws_elb.jenkins_elb: Destroying...
aws_elb.jenkins_elb: Destruction complete
Error applying plan:

1 error(s) occurred:

* aws_volume_attachment.jenkins_master_data_mount: Error waiting for Volume (vol-xxxx) to detach from Instance: i-xxxx

I thought that I could get around this by having a systemd unit stop the process using the mounted ebs volume and then unmount whenever the instance receives a halt or shutdown. The problem is that doesn't ever happen before the EBS volume destroy is attempted. I think if the order could be forced, and I could have the instance destroyed before the volume, things would go more smoothly.

j0nesin commented 8 years ago

If you use 'depends_on' in the instance definition to depend on the ebs volume, then the destroy sequence will destroy the instance before trying to destroy the volume.
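
A minimal sketch of that arrangement (resource names and values here are just placeholders):

resource "aws_ebs_volume" "data" {
  availability_zone = "us-east-1a"
  size              = 500
}

resource "aws_instance" "app" {
  ami           = "ami-123456"
  instance_type = "t2.medium"

  # explicit dependency on the volume: on destroy Terraform reverses the
  # order, so the instance is terminated before the volume is touched
  depends_on = ["aws_ebs_volume.data"]
}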

jniesen commented 8 years ago

The error comes when destroying the volume_attachment, which would just cause the volume to detach. I misspoke in my last paragraph. I can't make the instance depend on the attachment explicitly, because the attachment already depends on the instance implicitly, since I'm referencing the instance's id.

james-masson commented 8 years ago

+1 agree with @jniesen

A persistent data disk, separate from OS/instance would be a great feature, if it worked!

Creation of related aws_ebs_volume, aws_instance and aws_volume_attachment resources work fine.

Any apply that involves the re-creation of the aws_instance hangs, as the aws_volume_attachment implicitly depends on the aws_instance it references and is destroyed first, causing the volume detach to hang.

For this to work in an elegant fashion, the VM would have to be destroyed first, to get a clean unmount.

opokhvalit commented 8 years ago

Got the same problem. The workaround with taint + destroy works fine, thanks @jimconner.

ghost commented 8 years ago

+1 to a fix. If the attached EBS volume is in use by the OS, say by a daemon process (e.g., Docker), then some mechanism has to be provided by Terraform to allow OS-level calls for a clean service stop and umount. Some of the ideas listed herein are possible workarounds, but not tenable long-term solutions.

sudochop commented 8 years ago

+1 Same problem here. Thanks for the workaround @jimconner

arthurschreiber commented 8 years ago

I'm also running into this issue. If both the aws_instance as well as the linked aws_volume_attachment are scheduled to be deleted, the instance needs to be deleted first.

arthurschreiber commented 8 years ago

See #4643 for a similar problem, and the feature request in #622 which would provide an easy fix for this.

phinze commented 8 years ago

Hey folks, thanks for the reports and apologies for the trouble.

Restating and summarizing the above thread:

This is an interesting declarative modelling problem: we separated out aws_volume_attachment as its own resource, a strategy we've consistently taken to declaratively model links between two resources (in this case aws_instance and aws_ebs_volume).

Terraform's dependency graph currently includes the assumption that destroy order should be strictly the reverse of create order.

So as you all have noted the create order (along with the equivalent AWS API calls) is:

  1. Create EC2 instance (ec2.RunInstances)
  2. Create EBS volume (ec2.CreateVolume)
  3. Attach EBS volume to instance (ec2.AttachVolume)

And the destroy order is reversed:

  1. Detach EBS volume from instance (ec2.DetachVolume)
  2. Destroy EBS volume (ec2.DeleteVolume)
  3. Destroy EC2 instance (ec2.TerminateInstances)

Generally speaking this is what you'd want, and it works well for other resources, but in this case since the volume is mounted in the instance we end up with problems calling ec2.DetachVolume.

Options for solutions include:

  A. Add core support for selectively re-ordering the dependency graph, so the instance is destroyed before the attachment
  B. Rework the ec2.AttachVolume handling back inside the aws_instance resource (per #622)
  C. Support provisioners that run on destroy so the attachment can be unmounted as it is destroyed (#386)

So (A) would involve a bunch of tricky core work that I'm worried would be too delicate to be worthwhile, and (B) loses the benefits we get from having a separate resource, so I think the best move here is going to be (C): supporting "deprovisioners" in the config.

Does that sound like a reasonable approach?
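
To make (C) concrete, a destroy-time provisioner on the attachment might look roughly like this (the syntax is illustrative only, not a committed design):

resource "aws_volume_attachment" "admin_rundeck" {
  device_name = "/dev/xvdf"
  instance_id = "${aws_instance.admin_rundeck.id}"
  volume_id   = "${aws_ebs_volume.admin_rundeck.id}"

  # hypothetical destroy-time provisioner: unmount the filesystem inside
  # the instance before Terraform calls ec2.DetachVolume
  provisioner "remote-exec" {
    when   = "destroy"
    inline = [
      "sudo umount /data"
    ]
  }
}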

holybit commented 8 years ago

@phinze Agree C is best so long as deprovision is flexible (e.g., stop service, umount, etc.).

opokhvalit commented 8 years ago

@phinze Disagree with C, because freeing the volume from inside the instance may be too complicated. I think "deprovisioners" are a good idea in themselves, but they may not be an appropriate solution in this case.

arthurschreiber commented 8 years ago

I think solution A will be the most flexible long term. I can imagine similar cases might crop up in other terraform resources as well, even if adding the feature initially will be quite a lot of work.

tamsky commented 8 years ago

A failure of ec2.DetachVolume followed by a failure of ec2.DeleteVolume does not feel like a critical failure, assuming the intent is to run ec2.TerminateInstances -- an action which should succeed regardless of state of the volume attachment.

Would there be any benefit to this arrangement:

• failures in DetachVolume are ignored
• failures in DeleteVolume throw an exception
• the exception is caught in TerminateInstances, which causes DeleteVolume to be called a 2nd time iff TerminateInstances succeeds

james-masson commented 8 years ago

C. is unworkable in many environments, as not all VMs can be contacted by direct provisioners. Eg. VMs in isolated subnets. Don't assume that Terraform can directly contact every VM it creates.

Personally I'd prefer A or B.

redbaron commented 8 years ago

@phinze, that is an interesting observation. Based on that, there can be another solution which doesn't require a deep rework of the dependency and start/stop logic. It seems that all that is needed is to add a "run EC2 instance" step last in the sequence. The only change required is for the plan executor to be able to detect that an op is a no-op and not try to execute it (like starting an instance which is already running).

  1. Create EC2 instance (ec2.RunInstances)
  2. Create EBS volume (ec2.CreateVolume)
  3. Attach EBS volume to instance (ec2.AttachVolume)
  4. Run EC2 instance (since it is already running should render in no-op operation)

Then reverse of this sequence will be:

  1. Stop EC2 instance
  2. Detach volume
  3. Destroy volume
  4. Destroy EC2 instance

danabr commented 8 years ago

@redbaron: That would be quite an elegant solution indeed.

cconstantine commented 8 years ago

I'm running into this problem too, and I would much prefer not C. B is not great; I really like that ebs volumes are separate resources. My preference would be for A or @redbaron's suggestion.

deviscalio commented 8 years ago

Do you have a plan to fix this issue? For us it is blocking.

yakaas commented 8 years ago

Having the same issue; at the moment I manually terminate/stop the instance before applying any changes. If we get a strategy sorted out, I'd be happy to send a PR.

dbatwa commented 8 years ago

This is a big problem for me too. What's the go?

dvianello commented 8 years ago

Same problem here, we're mitigating the issue with the taint trick for the time being...

c4milo commented 8 years ago

Running into this one too.

mrwacky42 commented 8 years ago

:+1: to the approach suggested by @tamsky. If Terraform's plan is to destroy all these things, then DetachVolume/DeleteVolume failures are not fatal results.

rgabo commented 8 years ago

@tamsky's approach would work for our use case too where the persistent EBS volumes survive when the instance is recycled. We resort to stopping or terminating the instance manually before applying the Terraform plan.

jtopper commented 8 years ago

I'm still bumping up against this. Use of an EBS volume to hold state whilst instances themselves can be recycled is a really common pattern for us.

Gary-Armstrong commented 8 years ago

It is interesting that I can make it go away by simply tainting the instance.

Jonnymcc commented 8 years ago

Interesting, I've upgraded to Terraform v0.6.15 and the taint-the-instance workaround no longer works. Even when the instance is marked as tainted, terraform still tries to remove the aws_volume_attachment before destroying the instance.

My thoughts on solutions: I believe it would not be OK to ignore a failure to detach a volume; if a resource failed to be deleted, I would like it to error and let me know ASAP. A provisioner that allows executing code on the resource before destroying sounds neat, but I wouldn't want to set that up for every aws_instance resource that has EBS volumes attached to it. I like @phinze's option A; maybe it doesn't need to involve reworking the dependency graph, but it would be nice to at least be able to say: if I'm destroying resource A, destroy resources [B, C, D] first.

Another option that I think hasn't been suggested is adding a core retry attribute to resources. If volume_attachment resource A had this attribute set to true, it would try to detach and time out (because the aws_instance was not deleted first); that timeout would be ignored and attachment A added to a cleanup queue. After all other resources have been successfully destroyed, Terraform would then go through the cleanup queue and try destroying those resources again.

So, vol A (times out, gets added to queue) -> destroy instance B (which A was attached to) -> read the queue, find A and try destroying it again (which should succeed if B was destroyed).

maxenglander commented 8 years ago

I agree with the view that, as voiced by @tamsky, a good way to deal with this issue is to change how aws_volume_attachment behaves during the destroy phase, as long as doing so is the user's explicit intent.

However, rather than configuring aws_volume_attachment to silently ignore failures when invoking the AWS DetachVolume API call during the destroy phase, I would instead suggest adding an optional attribute that prevents this API call from being invoked at all.

Given that the termination of AWS instances implies the detachment of any associated EBS volumes (subtly hinted, I think, by @tobyclemson), and that attempting to explicitly detach volumes is the very thing preventing users affected by this issue from achieving either, it seems to me that an option to disable explicit detachment of volumes in aws_volume_attachment is a succinct and non-invasive way to unblock the termination (and re-creation) of instances.

Implementation-wise, this could be accomplished by adding a boolean attribute named explicit_detach (defaults to true) to aws_volume_attachment such that, when set to false, the destroy phase merely removes the resource from Terraform state without attempting an AWS DetachVolume API call. (It would also make sense to change the create phase to not attempt to attach V to I when V is already attached to I.)

This approach would resolve this issue without breaking workflows for users who rely on the current behavior, without the aws_volume_attachment resource needing to be aware of the context in which it is being destroyed, without silently ignoring any API call failures, without requiring any change to core, and without having to introduce de-provisioners.
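
Concretely, usage would look something like this (explicit_detach is the proposed attribute, not something that exists today):

resource "aws_volume_attachment" "admin_rundeck" {
  device_name = "/dev/xvdf"
  instance_id = "${aws_instance.admin_rundeck.id}"
  volume_id   = "${aws_ebs_volume.admin_rundeck.id}"

  # proposed attribute: when false, destroying this resource only removes it
  # from Terraform state and never calls ec2.DetachVolume, so the detach
  # happens implicitly when the instance is terminated
  explicit_detach = false
}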

LeslieCarr commented 8 years ago

As another data point, on 0.7.0, tainting a resource no longer works.

Has anyone found a workaround for 0.7?

Jonnymcc commented 8 years ago

@maxenglander if explicit_detach was set to true and the volume was not being deleted but instead reattached to another instance what would happen? If the volume would be successfully attached to the new instance I'm wondering why we need attachment delete at all. To keep track of a partially detached state in the state file?

charity commented 8 years ago

I pretty strongly feel that terraform is the wrong way to solve this problem.

I wouldn't use terraform for cfg management -- it's not built for that, you're just gonna get into trouble. It's also not built for keeping track of the state in stateful services and editing or revising existing resources -- it's not built for that either. I would not want to run terraform to detach and reattach EBS volumes for a whole pile of reasons. I would create the EC2 instances with TF -- probably in an ASG so I can control rolling them -- and probably create the EBS volumes with TF too, so they get tagged and tracked correctly.

But anything that happens past that -- formatting, installing software, etc should really be done by something else. I guess I generally think that reusing volumes in this way is an antipattern period -- I prefer to start with fresh resources and have a way of syncing / initializing data -- but if you are reusing them in this way, I feel like it should be done by an offline process. Probably a script that doesn't save any kind of state but just takes arguments (whether that's instance ids, roles, whatever) and does all the detach, retries, reattachment actions. Something with this much ordering belongs in a script, not a tfstate.

There are plenty of things that you can do with TF that will just be really painful and timeouty and errory and will lead you to hate your life, I think this is one of them. :)

LeslieCarr commented 8 years ago

I have to partially agree and partially disagree.

I don't think that terraform should format, install, mount, etc. volumes. I do need to have some persistent storage (and an EBS volume is perfect) for a number of instances. We're talking non-root attached EBS volumes. Terraform has the resource of aws_volume_attachment to attach the volume, and it would be nice if, after a volume is attached, I could delete the instance that the volume is attached to and reuse the volume.

If I can't, I'd have to have a completely separate system (maybe cloudformation? homegrown shell scripts?) for any machines which attach to an EBS volume and that system would have to store the state of the machines, and state and information of the ebs volumes... just like terraform.

I think if the resource aws_volume_attachment exists, it needs to be fixed (you have to be able to delete the instance). If this isn't going to be fixed, I think we should remove aws_volume_attachment, so that it's obvious that you can't attach a volume and then delete an instance.

luckymike commented 8 years ago

I agree with @LeslieCarr.

While I personally think that long-lived EBS volumes that move between instances are a painful compromise solution that rarely addresses the root issues of data persistence, to the extent that there are many scenarios where people choose to do this, I absolutely expect my provisioner to be capable of managing them.

maxenglander commented 8 years ago

@Jonnymcc In my clone, I ended up calling this skip_detach (the inverse of explicit_detach).

if explicit_detach was set to true and the volume was not being deleted but instead reattached to another instance what would happen?

When skip_detach is not set, or is set to false (the equivalent of explicit_detach being true), what happens is no different than the current behavior: AWS fails to detach the volume due to the mount, and the Terraform execution fails. When set to true, this problem goes away.

If the volume would be successfully attached to the new instance I'm wondering why we need attachment delete at all. To keep track of a partially detached state in the state file?

I honestly don't know why the current behavior is useful for anyone, but I didn't feel comfortable saying that it isn't or couldn't be for some users. So, I thought that the best solution was to add a field (skip_detach) that would not, by default, have any impact on these speculative users, and (when set to true) a lot of beneficial impact for users like me who are nettled by the current behavior.

In my imagination, there's some Terraformer out there who regularly detaches and re-attaches volumes between running instances as part of some unusual warehousing process, and would, rightly, not want the default value of skip_detach to change the current destroy behavior of aws_volume_attachment.

charity commented 8 years ago

A provisioner is not a manager. It's actually not reasonable to expect your provisioner to handle a bunch of ordered multiphase state changes between multiple components. Esp when persistent state and mounted volumes are involved.

And I super specifically sketched out a solution that did not involve saving another state elsewhere, because I agree, that would be dumb. I would probably use tags, defined by tf, and a rolling script to resolve/pair those tags by detaching / reattaching.

I can see where you're coming from conceptually: if tf has an aws_volume_attachment, you kinda expect it to have an explicit aws_volume_detachment command. But detaching is harder and has a different set of prerequisites that TF can't carry out to protect your data from corruption and keep your nodes from hanging. TF can't check to make sure no processes are holding files open or writing to the volume, safely unmount it, etc.

I was asked to comment on this but I think I'm giving advice that's more abstract/best practicesy than is appropriate for a TF github ticket, so I can back out of the convo. :)

Just saying, if I was a TF maintainer, I would be cringing and deprioritizing this just thinking of all the tickets that are gonna be opened in panic and anger if I wrote this. It's just a bad way to manage state, shit's gonna get stuck and corrupted a lot without host-level visibility and safeguards and retries and exception handling, which tf can't and shouldn't try to do.