terraform-aws-modules / terraform-aws-autoscaling

Terraform module to create AWS Auto Scaling resources 🇺🇦
https://registry.terraform.io/modules/terraform-aws-modules/autoscaling/aws
Apache License 2.0

depends_on for success when asg+vpc #255

Closed ezekieldas closed 4 months ago

ezekieldas commented 6 months ago

Description

This is a bare-bones demo that builds a VPC, ASG, and ALB using the modules noted below. In roughly 80% of runs, the EC2 instances launched by the ASG fail during user-data execution, presumably because the NAT Gateway is not fully available before the instances reach the Running state.
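One plausible reading of that ordering can be sketched with plain resources. The block below is a simplified stand-in for the networking a VPC module manages, not the module's actual source, and every resource name in it is an illustrative assumption. The point it tries to show: a consumer that reads only the private subnet IDs gets an implicit dependency on the subnets alone, so Terraform is free to create that consumer while the NAT gateway and its route are still being created.

```hcl
# Simplified stand-in for VPC networking. Names are illustrative only.
# The public route table is omitted to keep the ordering visible.
resource "aws_vpc" "this" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_internet_gateway" "this" {
  vpc_id = aws_vpc.this.id
}

resource "aws_subnet" "public" {
  vpc_id     = aws_vpc.this.id
  cidr_block = "10.0.10.0/24"
}

resource "aws_subnet" "private" {
  vpc_id     = aws_vpc.this.id
  cidr_block = "10.0.100.0/24"
}

resource "aws_eip" "nat" {
  domain = "vpc"
}

# NAT gateway creation is often the slowest step in this chain.
resource "aws_nat_gateway" "this" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public.id
  depends_on    = [aws_internet_gateway.this]
}

resource "aws_route_table" "private" {
  vpc_id = aws_vpc.this.id
}

# Private instances have no outbound path until this route exists.
resource "aws_route" "private_nat" {
  route_table_id         = aws_route_table.private.id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.this.id
}

resource "aws_route_table_association" "private" {
  subnet_id      = aws_subnet.private.id
  route_table_id = aws_route_table.private.id
}

# Anything that reads only this value depends on aws_subnet.private,
# not on the NAT gateway, the route, or the association.
output "private_subnets" {
  value = [aws_subnet.private.id]
}
```

Running terraform graph against a configuration like this should show whether the NAT gateway and route sit on the path between the subnets and whatever consumes the output.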

When depends_on is set against the vpc module, the EC2 targets come up healthy in 100% of runs. However, note the documentation for depends_on; I have not worked through all of its details or implications for my solution.
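A narrower alternative to depending on the whole vpc module might look like the sketch below. It assumes Terraform 1.4 or later (for terraform_data) and assumes the vpc module exposes the outputs natgw_ids, private_nat_gateway_route_ids, and private_route_table_association_ids; those names should be checked against the module's documented outputs for the version in use. The gate resource name private_egress_ready is hypothetical.

```hcl
# Gate that completes only after the NAT gateways, the private default
# routes through them, and the private route table associations exist.
# Output names are assumed; verify them against the vpc module docs.
resource "terraform_data" "private_egress_ready" {
  input = {
    nat_gateway_ids       = module.vpc.natgw_ids
    nat_route_ids         = module.vpc.private_nat_gateway_route_ids
    route_association_ids = module.vpc.private_route_table_association_ids
  }
}

module "asg" {
  source              = "terraform-aws-modules/autoscaling/aws"
  name                = "ddemo-asg"
  instance_type       = "t2.micro"
  image_id            = "ami-014d05e6b24240371"
  min_size            = 2
  max_size            = 2
  desired_capacity    = 2
  min_elb_capacity    = 2
  vpc_zone_identifier = module.vpc.private_subnets
  user_data           = base64encode(local.user_data)
  security_groups     = [module.alb.security_group_id]
  target_group_arns   = [for k, v in module.alb.target_groups : v.arn]

  # Wait for outbound connectivity only, not for every vpc resource.
  depends_on = [terraform_data.private_egress_ready]
}
```

This narrows what the ASG waits for, but it is still a module-level depends_on, so the caveats in the depends_on documentation about more conservative plans continue to apply to the asg module.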

While this may not be a bug per se, I'd like to understand the issue better and perhaps get an explanation that might help others who encounter it. Searching the issues here, I found similar, though more complex, reports.

Versions

alb/aws 9.4.0
autoscaling/aws 7.3.1
vpc/aws 5.4.0

Reproduction Code [Required]

Steps to reproduce the behavior: run terraform plan and terraform apply with depends_on commented out, then repeat with depends_on enabled.


module "vpc" {
  source          = "terraform-aws-modules/vpc/aws"
  version         = "~> 5.0"
  name            = "ddemo-vpc"
  cidr            = "10.0.0.0/16"
  azs             = ["us-west-1b", "us-west-1c"]
  public_subnets  = ["10.0.10.0/24", "10.0.20.0/24"]
  private_subnets = ["10.0.100.0/24", "10.0.200.0/24"]

  manage_default_route_table = true
  enable_nat_gateway         = true
  single_nat_gateway         = false
}

module "asg" {
  source              = "terraform-aws-modules/autoscaling/aws"
  name                = "ddemo-asg"
  instance_type       = "t2.micro"
  image_id            = "ami-014d05e6b24240371"
  min_size            = 2
  max_size            = 2
  desired_capacity    = 2
  min_elb_capacity    = 2
  vpc_zone_identifier = module.vpc.private_subnets
  user_data           = base64encode(local.user_data)
  security_groups     = [module.alb.security_group_id]
  target_group_arns   = [for k, v in module.alb.target_groups : v.arn]

  #
  # https://developer.hashicorp.com/terraform/language/meta-arguments/depends_on
  #
  # depends_on = [module.vpc]
}

module "alb" {
  source                     = "terraform-aws-modules/alb/aws"
  vpc_id                     = module.vpc.vpc_id
  subnets                    = module.vpc.public_subnets
  enable_deletion_protection = false
  security_group_ingress_rules = {
    all_http_8080 = {
      from_port   = 8080
      to_port     = 8080
      ip_protocol = "tcp"
      description = "HTTP_8080"
      cidr_ipv4   = "0.0.0.0/0"
    }
    all_http = {
      from_port   = 80
      to_port     = 80
      ip_protocol = "tcp"
      description = "HTTP"
      cidr_ipv4   = "0.0.0.0/0"
    }
  }
  security_group_egress_rules = {
    all = {
      ip_protocol = "-1"
      cidr_ipv4   = "0.0.0.0/0"
    }
  }

  listeners = {
    ddemo-http = {
      port     = 80
      protocol = "HTTP"
      forward = {
        target_group_key = "ddemo-tg"
      }
    }
  }
  target_groups = {
    ddemo-tg = {
      name_prefix       = "ddemo-"
      protocol          = "HTTP"
      port              = 8080
      target_type       = "instance"
      create_attachment = false
      health_check = {
        port    = 8080
        path    = "/"
        matcher = "200"
      }
    }
  }
}

locals {
  user_data = <<-EOT
    #!/bin/bash
    set -e
    /usr/bin/apt-get update
    /usr/bin/apt-get install -y nginx
    sed -i 's/listen 80 default_server;/listen 8080 default_server;/g' /etc/nginx/sites-available/default
    echo "hello how are you." > /var/www/html/index.html
    /usr/bin/systemctl restart nginx
    echo "all done here!"
  EOT
}
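Independently of resource ordering, the user-data script itself could tolerate a short window without outbound connectivity. The sketch below is a workaround under stated assumptions, not a fix for the ordering: the local name user_data_with_retry and the limits of 30 attempts and 10 seconds are hypothetical. It retries the package steps together because apt-get update can exit successfully even when some fetches fail, whereas the install step fails if the package cannot be downloaded. User-data scripts run only on first boot by default, so an instance that fails here does not recover on its own.

```hcl
locals {
  # Hypothetical variant of user_data that retries until packages install.
  user_data_with_retry = <<-EOT
    #!/bin/bash
    set -e
    attempts=0
    # Commands in an until condition do not trigger set -e on failure.
    until /usr/bin/apt-get update && /usr/bin/apt-get install -y nginx; do
      attempts=$((attempts + 1))
      if [ "$attempts" -ge 30 ]; then
        echo "package install still failing after 30 attempts" >&2
        exit 1
      fi
      sleep 10
    done
    sed -i 's/listen 80 default_server;/listen 8080 default_server;/g' /etc/nginx/sites-available/default
    echo "hello how are you." > /var/www/html/index.html
    /usr/bin/systemctl restart nginx
    echo "all done here!"
  EOT
}
```

To try it, the asg module would reference base64encode(local.user_data_with_retry) in place of local.user_data.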

github-actions[bot] commented 5 months ago

This issue has been automatically marked as stale because it has been open 30 days with no activity. Remove the stale label or comment, or this issue will be closed in 10 days.

github-actions[bot] commented 4 months ago

This issue was automatically closed because of stale in 10 days

github-actions[bot] commented 3 months ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.