hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

aws_autoscaling_group: ELB healthy instances not counted #10645

Closed by nicelyc 7 years ago

nicelyc commented 7 years ago

Terraform Version

v0.7.13

Affected Resource(s)

aws_autoscaling_group

Terraform Configuration Files

resource "aws_autoscaling_group" "foo" {
  name                 = "foo"
  min_size             = "${var.min_server_count}"
  max_size             = "${var.max_server_count}"
  desired_capacity     = "${var.desired_server_count}"
  launch_configuration = "${aws_launch_configuration.foo.name}"
  target_group_arns    = ["${aws_alb_target_group.foo.arn}"]

  # Associate ASG with load balancer so it can count running ECS tasks.
  load_balancers       = ["${aws_elb.foo.name}"]
  min_elb_capacity     = "${var.min_server_count}"

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_elb" "foo" {
  name                        = "foo"
  connection_draining         = true
  connection_draining_timeout = 400
  internal                    = false
  subnets                     = ["${split(",", var.elb_subnets)}"]
  cross_zone_load_balancing   = true
  security_groups             = ["${aws_security_group.foo.id}"]

  listener {
    instance_port      = 80
    instance_protocol  = "http"
    lb_port            = 443
    lb_protocol        = "https"
    ssl_certificate_id = "${var.elb_certificate_id}"
  }

  listener {
    instance_port     = 80
    instance_protocol = "http"
    lb_port           = 80
    lb_protocol       = "http"
  }

  health_check {
    healthy_threshold   = "${var.elb_health_check["healthy_threshold"]}"
    unhealthy_threshold = "${var.elb_health_check["unhealthy_threshold"]}"
    timeout             = "${var.elb_health_check["timeout"]}"
    target              = "${var.elb_health_check["target"]}"
    interval            = "${var.elb_health_check["interval"]}"
  }
}

Debug Output

https://gist.github.com/nicelyc/5d0ece28477116ecae8299d27bb19e96

Panic Output

n/a

Expected Behavior

Autoscaling group creation should have completed once the number of healthy instances reached the minimum ELB capacity. This appears to be a regression; it worked in 0.7.4.

Actual Behavior

Autoscaling group creation timed out and reported zero healthy instances in the ELB, even though the ELB shows two healthy instances.

$ aws elb describe-instance-health --load-balancer-name foo-staging
{
    "InstanceStates": [
        {
            "InstanceId": "i-c806d4df", 
            "ReasonCode": "N/A", 
            "State": "InService", 
            "Description": "N/A"
        }, 
        {
            "InstanceId": "i-4dcc23df", 
            "ReasonCode": "N/A", 
            "State": "InService", 
            "Description": "N/A"
        }, 
        {
            "InstanceId": "i-7f18a6e7", 
            "ReasonCode": "Instance", 
            "State": "OutOfService", 
            "Description": "Instance has failed at least the UnhealthyThreshold number of health checks consecutively."
        }
    ]
}

(note: the third failed instance is expected, since we only have two tasks running)

Steps to Reproduce

  1. terraform apply (note: the config above is partial)

Important Factoids

The autoscaling instances are ECS container instances. The health checks call an exposed port on the container instances. Tasks were running and responding on two of the three instances.

References

n/a

nicelyc commented 7 years ago

This was due to a target group that did not have any ALB listeners (and therefore no health checks) associated with it.
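For anyone landing here later, here is a minimal sketch of what that fix might look like in 0.7-era syntax, assuming an ALB sits in front of the target group. The `aws_alb.foo` and `aws_alb_listener.foo` resources, the body of `aws_alb_target_group.foo`, and `var.vpc_id` are not in the original configuration and are assumptions for illustration. The subnet and security group references are borrowed from the `aws_elb` block above. Once a listener forwards to the target group, the ALB health-checks its targets, so they can be reported as healthy.

```hcl
# Hypothetical ALB; not part of the configuration posted in this issue.
resource "aws_alb" "foo" {
  name            = "foo"
  internal        = false
  subnets         = ["${split(",", var.elb_subnets)}"]
  security_groups = ["${aws_security_group.foo.id}"]
}

# The target group referenced by target_group_arns in the ASG.
# Its body was not shown in the issue; these values are placeholders.
resource "aws_alb_target_group" "foo" {
  name     = "foo"
  port     = 80
  protocol = "HTTP"
  vpc_id   = "${var.vpc_id}" # hypothetical variable
}

# The missing piece: a listener that forwards traffic to the target group.
# Without one, the target group is not in use and its targets are not
# health-checked.
resource "aws_alb_listener" "foo" {
  load_balancer_arn = "${aws_alb.foo.arn}"
  port              = 80
  protocol          = "HTTP"

  default_action {
    target_group_arn = "${aws_alb_target_group.foo.arn}"
    type             = "forward"
  }
}
```

If the target group is not actually needed, removing `target_group_arns` from the autoscaling group is presumably an alternative, but that is not what the comment above describes.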

MartinCerny-awin commented 5 years ago

@nicelyc I am having the same issue. Could you add an example of how you solved it?

MartinCerny-awin commented 5 years ago

I found that my problem was an incorrect health check configuration. I tested the health check manually on the EC2 instance and it was not working as expected.
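The configuration in the original report reads its health check settings from a map variable, `var.elb_health_check`, whose definition is not shown. A sketch of what such a variable could look like in 0.7-era syntax follows. Every value is a placeholder assumption, not something taken from this thread.

```hcl
# Hypothetical definition of the map used by the aws_elb health_check block.
variable "elb_health_check" {
  type = "map"

  default = {
    healthy_threshold   = 2
    unhealthy_threshold = 5
    timeout             = 5          # seconds; must be less than interval
    target              = "HTTP:80/" # protocol, port, and path the instance must answer on
    interval            = 30         # seconds between checks
  }
}
```

The `target` is the part most worth double-checking: it has to match a protocol, port, and path that the instance really responds on, which is what a manual check on the instance would confirm.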

AndyBarnettCTM commented 5 years ago

I'm having the same prob...oh never mind fixed it like the others did
