telia-oss / terraform-aws-ecs-fargate

Terraform module which creates Fargate ECS resources on AWS.
https://registry.terraform.io/modules/telia-oss/ecs-fargate/aws
MIT License

Tasks come up in unhealthy state due to lack of Ingress security group rules #57

Open aric49 opened 2 years ago

aric49 commented 2 years ago

Bug report

What is the problem? Perhaps I am missing something, but when I deploy Fargate services using this module, the tasks come up in an "unhealthy" state because the target group health checks time out. On further investigation, the security group that the module creates for the Fargate service has no ingress rules, only an egress rule:

From main.tf:

# ------------------------------------------------------------------------------
# Security groups
# ------------------------------------------------------------------------------
resource "aws_security_group" "ecs_service" {
  vpc_id      = var.vpc_id
  name        = "${var.name_prefix}-ecs-service-sg"
  description = "Fargate service security group"
  tags = merge(
    var.tags,
    {
      Name = "${var.name_prefix}-sg"
    },
  )
}

resource "aws_security_group_rule" "egress_service" {
  security_group_id = aws_security_group.ecs_service.id
  type              = "egress"
  protocol          = "-1"
  from_port         = 0
  to_port           = 0
  cidr_blocks       = ["0.0.0.0/0"]
  ipv6_cidr_blocks  = ["::/0"]
}

I think this can be resolved by creating an ingress security group rule for the container port (task_container_port), so that the load balancer's health checks can reach the tasks.
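Until the module handles this itself, a rule along the following lines could be added next to the module call. This is a minimal sketch, not the module's documented approach. It assumes the module exposes the service security group ID as an output named service_sg_id, and it uses a hypothetical variable lb_security_group_id for the load balancer's security group; neither name appears in this report.

```hcl
# Hypothetical input: the security group attached to the load balancer.
variable "lb_security_group_id" {
  description = "Security group ID of the load balancer in front of the service (assumed name)."
  type        = string
}

# Allow the load balancer to reach the task on the container port.
# With health_check.port = "traffic-port", health checks use this same port.
resource "aws_security_group_rule" "lb_to_service" {
  # Assumption: the module outputs the service security group ID under this name.
  security_group_id        = module.fargate-service.service_sg_id
  type                     = "ingress"
  protocol                 = "tcp"
  from_port                = var.application_port
  to_port                  = var.application_port
  source_security_group_id = var.lb_security_group_id
}
```

If the load balancer has no security group attached (for example, a Network Load Balancer created without one), the rule would need cidr_blocks covering the load balancer subnets instead of source_security_group_id.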

Steps to reproduce

Please post the relevant parts of the failing terraform code here (remember to remove sensitive information):

module "fargate-service" {
  source  = "telia-oss/ecs-fargate/aws"
  version = "5.2.0"

  name_prefix          = "${terraform.workspace}-${var.container_service_name}"
  vpc_id               = data.aws_vpc.primary_vpc.id
  private_subnet_ids   = data.aws_subnet_ids.private.ids
  lb_arn               = data.aws_lb.primary_public.arn
  cluster_id           = data.aws_ecs_cluster.primary_ecs_cluster.arn
  task_container_image = var.container_image_tag
  desired_count        = var.application_instance_count[terraform.workspace]

  task_container_assign_public_ip = false

  task_container_port = var.application_port

  task_definition_cpu = var.application_cpu[terraform.workspace]

  task_definition_memory = var.application_memory[terraform.workspace]

  service_registry_arn              = resource.aws_service_discovery_service.application-sd.arn
  with_service_discovery_srv_record = false

  deployment_circuit_breaker = { "enable" : true, "rollback" : true }
  wait_for_steady_state      = true

  task_container_environment = var.application_environment_variables[terraform.workspace]

  health_check = {
    port = "traffic-port"
    path = var.application_healthcheck_path
  }

  tags = {
    Environment = terraform.workspace
    Terraform   = "True"
  }
}

Terraform version

Run terraform version and post the output here:

Terraform v1.0.5
on linux_amd64
+ provider registry.terraform.io/hashicorp/aws v3.56.0
+ provider registry.terraform.io/hashicorp/null v3.1.0

larstobi commented 2 years ago

Hi, Aric! Thanks for taking the time to report this.

You're right, the module doesn't open up any ingress traffic. This has primarily been a security design decision: we don't want to open any ports unless that is explicitly requested.

However, I agree that the module should support opening ingress when it is explicitly configured to do so, while still defaulting to no ingress rules.

If you have time to make a Pull Request, I'll review it. If not, we will see when we have time to make it happen.
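One possible shape for such an opt-in change inside the module's main.tf is sketched below. It is an illustration only, not an agreed design: the variable ingress_source_security_group_ids is a hypothetical name, and the sketch assumes that var.task_container_port holds the port number, as the module call in the report suggests.

```hcl
# Hypothetical opt-in input. The empty default keeps the current behaviour:
# no ingress rules are created unless the caller asks for them.
variable "ingress_source_security_group_ids" {
  description = "Security group IDs allowed to reach the task container port."
  type        = list(string)
  default     = []
}

# One ingress rule per allowed source security group; zero rules by default.
resource "aws_security_group_rule" "ingress_service" {
  count = length(var.ingress_source_security_group_ids)

  security_group_id        = aws_security_group.ecs_service.id
  type                     = "ingress"
  protocol                 = "tcp"
  from_port                = var.task_container_port
  to_port                  = var.task_container_port
  source_security_group_id = var.ingress_source_security_group_ids[count.index]
}
```

Because the list defaults to empty, existing users of the module would see no change in their plans, which keeps the "no open ports unless stated" decision intact.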