Autoscaling group already exists after failure due to AWS limits

dmikalova commented 8 years ago

Terraform Version

0.7.4

Affected Resource(s)

aws_autoscaling_group

Terraform Configuration Files

resource "aws_launch_configuration" "mod" {
  name_prefix = "${var.tags["product"]}-${var.tags["env"]}-${var.tags["service"]}-${var.tags["component"]}-${var.lc_ami_timestamp}-"

  image_id             = "${var.lc_ami_id}"
  instance_type        = "${var.lc_instance_type}"
  iam_instance_profile = "${var.lc_iam_instance_profile_id}"
  key_name             = "${var.lc_key_name}"
  security_groups      = ["${split(",", var.lc_security_groups)}"]

  user_data         = "${var.lc_user_data_file}"
  enable_monitoring = false

  root_block_device {
    volume_size = "${var.lc_root_block_device_volume_size}"
    volume_type = "${var.lc_root_block_device_volume_type}"
  }

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "mod" {
  # This causes LCs and ASGs to stay in sync.
  name = "${aws_launch_configuration.mod.name}"

  availability_zones   = ["${var.asg_availability_zones}"]
  vpc_zone_identifier  = ["${var.asg_subnet_ids}"]
  launch_configuration = "${aws_launch_configuration.mod.id}"
  min_size             = "${var.asg_min_size}"
  max_size             = "${var.asg_max_size}"

  termination_policies = ["${split(",", var.asg_termination_policies)}"]
  load_balancers       = ["${split(",", var.asg_elb_names)}"]

  lifecycle {
    create_before_destroy = true
  }
}

Debug Output

* aws_autoscaling_group.mod: Error creating AutoScaling Group: AlreadyExists: AutoScalingGroup by this name already exists - A group with the name once-dev-playback-nginxplus-20161005Z031807-20161011214752196348605jbs already exists

Important Factoids

Both resources use create_before_destroy, and the ASG name is taken from the launch configuration. The desired count is never reached because the AWS instance-type limit for this region was being hit.

Expected Behavior

First run:

1. Terraform successfully creates LC.
2. Terraform creates ASG, and stores the fact that it was created.
3. Terraform waits for ASG to reach desired count.
4. Terraform fails because desired count is never reached.

Second run:

1. Terraform removes the ASG that was created if it still has not reached desired count.
2. Terraform repeats steps 2-3 of the first run.
3. If the limit was lifted, the run succeeds; if not, it fails again.

No manual intervention is necessary.

Actual Behavior

First run:

1. Terraform successfully creates LC.
2. Terraform creates the ASG.
3. Terraform waits for ASG to reach desired count.
4. Terraform fails because desired count is never reached.

Second run:

Terraform fails because it attempts to create another ASG with the same name as above.

Even if the limit is lifted, manual intervention is necessary to remove the old ASG: Terraform did not record the ASG that it created, so the ASG is left behind as cruft.

Steps to Reproduce

Run terraform apply to create a create_before_destroy ASG that has the same name as its LC, under conditions that prevent the desired count from being reached.
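A minimal sketch of a configuration that might reproduce this, written in the same 0.7-style syntax as the report. Everything named `repro`, the AMI ID, the instance type, the availability zone, the sizes, and the shortened timeout are placeholders assumed for illustration, not values from the original configuration. `wait_for_capacity_timeout` is a documented argument of aws_autoscaling_group; it is lowered here only so the failure arrives sooner.

```hcl
# Hypothetical reproduction sketch; all values below are placeholders.
resource "aws_launch_configuration" "repro" {
  name_prefix   = "repro-"
  image_id      = "ami-00000000" # placeholder AMI ID
  instance_type = "m4.large"     # assumed: a type whose regional limit is already exhausted

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "repro" {
  # Same naming scheme as the report: the ASG name follows the LC name.
  name                 = "${aws_launch_configuration.repro.name}"
  availability_zones   = ["us-east-1a"] # placeholder zone
  launch_configuration = "${aws_launch_configuration.repro.id}"

  # Assumed to exceed what the account can currently launch.
  min_size = 2
  max_size = 2

  # Give up waiting for capacity after two minutes instead of the default.
  wait_for_capacity_timeout = "2m"

  lifecycle {
    create_before_destroy = true
  }
}
```

If the behavior described above holds, the first apply should fail while waiting for capacity, and a second apply should then hit the AlreadyExists error.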

tamsky commented 7 years ago

I have encountered the same outcome as reported in this bug, but in terraform v0.8.8. Mine occurs without bumping against AWS quotas or limits.

In my case it occurs after the wait_for_elb_capacity phase of ASG creation times out and terraform apply fails. An ASG with the conflicting name remains.
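For reference, the capacity wait is controlled by arguments on aws_autoscaling_group. One possible workaround, which nobody in this thread has confirmed, is to skip the wait entirely so that creation cannot fail on a timeout. The fragment below is a sketch under that assumption. It gives up Terraform's check that instances actually came up healthy, so it may not suit every setup.

```hcl
# Fragment only: the remaining arguments stay as in the existing configuration.
resource "aws_autoscaling_group" "mod" {
  name                 = "${aws_launch_configuration.mod.name}"
  launch_configuration = "${aws_launch_configuration.mod.id}"

  # Documented behavior: "0" makes Terraform skip all capacity waiting.
  # Assumption: with no wait there is no timeout failure after the ASG is created.
  wait_for_capacity_timeout = "0"
}
```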

I'm using the same naming scheme:

resource "aws_autoscaling_group" "mod" {
  name = "${aws_launch_configuration.mod.name}"
...
resource "aws_launch_configuration" "mod" {
  name_prefix =  "something"
...

I plan to try this same setup, but in v0.9.x. Will report back here.


hashicorp / terraform