cloudposse / terraform-aws-dynamic-subnets

Terraform module for public and private subnets provisioning in existing VPC
https://cloudposse.com/accelerate
Apache License 2.0

Auto-heal NAT Instances #96

Status: Closed (osterman closed this 2 years ago)

osterman commented 4 years ago

We should auto-heal our NAT instances. Also, we should probably move NAT to its own module and support gateways and instances interchangeably so we can reuse it across our modules.

# Restart dead NAT instances (add an SNS topic ARN to alarm_actions to also alert)
resource "aws_cloudwatch_metric_alarm" "nat_instance" {
  alarm_name          = "${module.label.id}-status-check-failed"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods  = 5
  metric_name         = "StatusCheckFailed_Instance"
  namespace           = "AWS/EC2"
  period              = 60
  statistic           = "Maximum"
  threshold           = 1

  # Terraform 0.12+ treats dimensions as a map argument, not a block
  dimensions = {
    InstanceId = aws_instance.nat_instance.id
  }

  # Reboot the instance after 5 consecutive failed one-minute checks
  alarm_actions = [
    "arn:aws:swf:${var.aws_region}:${local.aws_account_id}:action/actions/AWS_EC2.InstanceId.Reboot/1.0",
  ]
}
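
The alarm above reacts only to the instance status check and only reboots. A minimal sketch of a companion alarm, assuming the same `module.label`, `var.aws_region`, and `aws_instance.nat_instance` as above, plus a hypothetical SNS topic named `nat_alerts`: it watches the system status check, asks EC2 to recover the instance on healthy hardware, and sends a notification. Subscriptions to the topic are not shown.

```hcl
# Hypothetical topic for NAT alerts; add subscriptions (email, chat, pager) separately.
resource "aws_sns_topic" "nat_alerts" {
  name = "${module.label.id}-nat-alerts"
}

# Recover the instance and notify when the underlying host fails its system check.
resource "aws_cloudwatch_metric_alarm" "nat_instance_system" {
  alarm_name          = "${module.label.id}-system-check-failed"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods  = 2
  metric_name         = "StatusCheckFailed_System"
  namespace           = "AWS/EC2"
  period              = 60
  statistic           = "Maximum"
  threshold           = 1

  dimensions = {
    InstanceId = aws_instance.nat_instance.id
  }

  alarm_actions = [
    # EC2 recover action: restarts the instance on different hardware
    "arn:aws:automate:${var.aws_region}:ec2:recover",
    # Also publish the state change to the (assumed) alert topic
    aws_sns_topic.nat_alerts.arn,
  ]
}
```

The recover action is only available for some instance types and storage configurations, so it is worth confirming that the NAT instance type qualifies before relying on it.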

Nuru commented 2 years ago

@osterman NAT Instances are deprecated and do not support NAT64. I am therefore not inclined to add additional support for them. Can we close this, or do you want to pursue it further?

Nuru commented 2 years ago

The only reason to create NAT Instances instead of using NAT Gateways is to save money, currently on the order of $20/month. It is not worth the time and effort to further enhance this module's NAT Instance support beyond what is added in #159. Cloud Posse can create a separate NAT Instance module if there is demand for it, but it seems likely to me that the expense of supporting NAT Instances would outweigh the cost savings they bring.

Our recommended cost-saving solution going forward is to use a single NAT Gateway rather than one per Availability Zone. For a typical installation spanning 3 Availability Zones, that is approximately budget neutral (one NAT Gateway costs about the same as 3 t3.micro NAT instances), costing a total of about US $30/month.
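
As a rough sketch of that recommendation, assuming the `max_nats` input is available in the module version you use (check the module's README before relying on it), limiting the module to one NAT Gateway might look like this:

```hcl
module "dynamic_subnets" {
  source  = "cloudposse/dynamic-subnets/aws"
  version = "2.0.0"

  nat_gateway_enabled  = true
  nat_instance_enabled = false

  # Assumed input: caps the number of NAT devices the module creates.
  # With a single NAT Gateway, every private subnet routes through it,
  # trading availability (and some cross-AZ traffic cost) for a lower fixed cost.
  max_nats = 1
  # etc . . .
}
```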

For better NAT instance support, people can use an alternate module such as https://github.com/int128/terraform-aws-nat-instance (we have not vetted it, just noticed it) to create NAT instances and easily connect them to the private subnets. For example, disable this module's NAT resources and route the private subnets to your own instances:

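module "dynamic_subnets" {
  source  = "cloudposse/dynamic-subnets/aws"
  version = "2.0.0"

  # Disable the module's own NAT resources; NAT instances are created elsewhere
  nat_gateway_enabled  = false
  nat_instance_enabled = false
  # etc . . .
}

# Point each private route table's default route at a NAT instance.
# local.nat_instances is not defined here; it should hold the externally created NAT instances.
resource "aws_route" "private" {
  count = length(module.dynamic_subnets.private_route_table_ids)

  route_table_id         = module.dynamic_subnets.private_route_table_ids[count.index]
  destination_cidr_block = "0.0.0.0/0"
  # element() wraps around, so fewer NAT instances than route tables still works
  network_interface_id   = element(local.nat_instances[*].primary_network_interface_id, count.index)
}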
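
The snippet above leaves `local.nat_instances` undefined. One possible way to populate it, sketched with plain `aws_instance` resources rather than any particular NAT module: the variable `nat_ami_id` and the resource name `nat` are hypothetical, and the security group, IAM role, and NAT configuration inside the AMI are not shown.

```hcl
# Hypothetical input: an AMI already configured to forward and masquerade traffic.
variable "nat_ami_id" {
  type        = string
  description = "AMI ID of a NAT-capable image (assumption, not provided by the module)"
}

# One NAT instance per public subnet created by the dynamic_subnets module.
resource "aws_instance" "nat" {
  count = length(module.dynamic_subnets.public_subnet_ids)

  ami           = var.nat_ami_id
  instance_type = "t3.micro"
  subnet_id     = module.dynamic_subnets.public_subnet_ids[count.index]

  # Required for NAT: the instance forwards packets not addressed to itself
  source_dest_check = false
}

locals {
  # List of instance objects; each exposes primary_network_interface_id
  nat_instances = aws_instance.nat
}
```

If Terraform cannot determine `count` at plan time in your configuration, driving it from a known value such as the number of availability zones you pass to the module is a reasonable alternative.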