hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

Can not destroy a SG if that SG is added in the default VPC SG #12443

Open gchek opened 4 years ago

gchek commented 4 years ago

Terraform Version

Terraform v0.12.23

Affected Resource(s)

Terraform Configuration Files

resource "aws_default_security_group" "default" {

  vpc_id = aws_vpc.vpc1.id

  ingress {
    description = "Default SG for VPC1"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    self        = true
  }
  ingress {
    description     = "Include EC2 SG in VPC1 default SG"
    from_port       = 0
    to_port         = 0
    protocol        = "-1"
    security_groups = [aws_security_group.GC-SG-VPC1.id]
  }
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
  tags = {
    Name = "Default VPC1-SG"
  }
}

Expected Behavior

On terraform destroy, the rule referencing the SG named GC-SG-VPC1 above should be removed from the default SG.

Actual Behavior

Terraform gets stuck trying to delete GC-SG-VPC1 until I manually remove it from the default SG.

Steps to Reproduce

  1. terraform destroy

brett-patterson commented 4 years ago

I am also running into this issue. I'm aware of the workaround of removing the security group rule manually, but this is cumbersome in our environment where manual access to these rules is restricted.

I came across the revoke_rules_on_delete attribute of aws_security_group that seems to do exactly what we would want (revoke rules before deletion). Although aws_default_security_group lets you set this, it doesn't seem to respect it in my testing. I think respecting this flag in aws_default_security_group would be a great way to keep the same default behavior and still allow people to opt-in to revoking rules when destroying default security groups.
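For reference, a minimal sketch of how revoke_rules_on_delete is set on a regular aws_security_group. The resource name, the variable, and the SSH rule are hypothetical and only there to make the fragment complete. As far as the provider documentation describes it, the flag revokes the group's own attached ingress and egress rules before the group is deleted; it is an assumption of this sketch, not something confirmed in this thread, that it does not touch rules in other groups that reference this one.

```hcl
# Sketch only: names below are hypothetical, not taken from the issue.
variable "vpc_id" {
  type = string
}

resource "aws_security_group" "example" {
  name   = "example-sg"
  vpc_id = var.vpc_id

  # Ask the provider to revoke this group's own ingress and egress
  # rules before it deletes the group.
  revoke_rules_on_delete = true

  ingress {
    description = "Allow SSH"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```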

gchek commented 4 years ago

Terraform v0.12.26

I confirm that revoke_rules_on_delete is NOT applied on the default SG and that I need to MANUALLY remove the included SG. I tried adding a depends_on, but on destroy the default SG is flagged as "destroyed" while it is still alive in AWS. This is very inconsistent. The order is also the opposite of what the depends_on implies: the default SG is reported as destroyed first and the included SG only afterwards.

module.VPCs.aws_default_security_group.default: Destroying... [id=sg-07a7f3715f230ff20]
module.VPCs.aws_default_security_group.default: Destruction complete after 0s

module.VPCs.aws_security_group.GC-SG-VPC-test: Destroying... [id=sg-0d3e40e2bf08ccb0f]
module.VPCs.aws_security_group.GC-SG-VPC-test: Still destroying... [id=sg-0d3e40e2bf08ccb0f, 10s elapsed]
module.VPCs.aws_security_group.GC-SG-VPC-test: Still destroying... [id=sg-0d3e40e2bf08ccb0f, 20s elapsed]
module.VPCs.aws_security_group.GC-SG-VPC-test: Still destroying... [id=sg-0d3e40e2bf08ccb0f, 30s elapsed]
module.VPCs.aws_security_group.GC-SG-VPC-test: Still destroying... [id=sg-0d3e40e2bf08ccb0f, 40s elapsed]
module.VPCs.aws_security_group.GC-SG-VPC-test: Still destroying... [id=sg-0d3e40e2bf08ccb0f, 50s elapsed]
module.VPCs.aws_security_group.GC-SG-VPC-test: Still destroying... [id=sg-0d3e40e2bf08ccb0f, 1m0s elapsed]

and so on until manual destroy

gchek commented 4 years ago

Update: Terraform v0.12.29, provider.aws v3.3.0. Same issue.

[Screenshot 2020-08-25 at 10 04 23]

/*================
Create VPCs
Create respective Internet Gateways
Create subnets
Create route tables
create security groups
=================*/

variable "VPC-test_cidr" {}
variable "Subnet10-VPC-test" {}
variable "Subnet20-VPC-test" {}
variable "region" {}

/*================
VPCs
=================*/
resource "aws_vpc" "VPC-test" {
  cidr_block           = var.VPC-test_cidr
  enable_dns_support   = true
  enable_dns_hostnames = true
  tags = {
    Name = "VPC-test"
  }
}

/*================
IGWs
=================*/

resource "aws_internet_gateway" "VPC-test-igw" {
  vpc_id = aws_vpc.VPC-test.id
  tags = {
    Name = "VPC-test-IGW"
  }
}

/*================
Subnets in VPC-test
=================*/

/*Get Availability zones in the Region*/
data "aws_availability_zones" "AZ" {}

resource "aws_subnet" "Subnet10-VPC-test" {
  vpc_id                  = aws_vpc.VPC-test.id
  cidr_block              = var.Subnet10-VPC-test
  map_public_ip_on_launch = true
  availability_zone       = data.aws_availability_zones.AZ.names[0]
  tags = {
    Name = "Subnet10-VPC-test"
  }
}

/*================
default route table VPC-test
=================*/

resource "aws_default_route_table" "VPC-test-RT" {
  default_route_table_id = aws_vpc.VPC-test.default_route_table_id

  lifecycle {
    ignore_changes = [route] # ignore any manually or ENI added routes
  }

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.VPC-test-igw.id
  }

  tags = {
    Name = "RT-VPC-test"
  }
}

/*================
Route Table association
=================*/

resource "aws_route_table_association" "VPC-test_10" {
  subnet_id      = aws_subnet.Subnet10-VPC-test.id
  route_table_id = aws_default_route_table.VPC-test-RT.id
}

/*================
Security Groups
=================*/

resource "aws_security_group" "GC-SG-VPC-test" {
  revoke_rules_on_delete = true
  // lifecycle {
  //   create_before_destroy = true
  // }
  name   = "GC-SG-VPC-test"
  vpc_id = aws_vpc.VPC-test.id

  tags = {
    Name = "GCTF-SG-VPC-test"
  }

  # SSH, all PING and others

  ingress {
    description = "Allow SSH"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  ingress {
    description = "Allow all PING"
    from_port   = -1
    to_port     = -1
    protocol    = "icmp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  ingress {
    description = "Allow MySQL"
    from_port   = 3306
    to_port     = 3306
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  ingress {
    description = "Allow iPERF3"
    from_port   = 5201
    to_port     = 5201
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_default_security_group" "default" {

  revoke_rules_on_delete = true
  depends_on             = [aws_security_group.GC-SG-VPC-test]
  // lifecycle {
  //   create_before_destroy = true
  // }

  vpc_id = aws_vpc.VPC-test.id

  ingress {
    description = "Default SG for VPC-test"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    self        = true
  }
  ingress {
    description     = "Include EC2 SG in VPC-test default SG"
    from_port       = 0
    to_port         = 0
    protocol        = "-1"
    security_groups = [aws_security_group.GC-SG-VPC-test.id]
  }
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
  tags = {
    Name = "Default VPC-test-SG"
  }
}

/*================
S3 Gateway end point
=================*/

resource "aws_vpc_endpoint" "s3" {
  vpc_id          = aws_vpc.VPC-test.id
  service_name    = "com.amazonaws.${var.region}.s3"
  route_table_ids = [aws_default_route_table.VPC-test-RT.id]
}

gchek commented 4 years ago

Updated to Terraform 0.13. Output of terraform version: Terraform v0.13.2, provider registry.terraform.io/hashicorp/aws v3.5.0

Same issue: Terraform says "default SG is destroyed" but it is not (see the picture above).

gdavison commented 3 years ago

Hi @gchek and @brett-patterson, thanks for reporting this.

Unlike most Terraform resources, the Delete operation on the aws_default_security_group is to simply remove it from Terraform state management. We document that in the Usage section of the resource documentation. Can you give us some feedback on the organization of the documentation page?

The resource is also supposed to output a log message stating "Cannot destroy Default Security Group. Terraform will remove this resource from the state file, however resources may remain.", but for some reason the message is not being displayed.

The existence of the field revoke_rules_on_delete on aws_default_security_group is a holdover from when it shared the resource schema with aws_security_group. We separated the schema definitions in v3.2, but left the field since it would be a breaking change which we only make on major version changes. The field never did anything on aws_default_security_group so it was not included in documentation for the resource.

However, leaving the rules untouched is clearly causing errors when deleting resources that are linked to rules defined inline on the aws_default_security_group. We'll do some analysis on a good solution for this issue. We need to potentially leave the rules in place in situations where the user is simply removing the aws_default_security_group from Terraform management, but still allow deletion.
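Until a fix lands, one possible interim approach (an untested sketch, not a confirmed fix from the maintainers) is to make the cross-group rule removable through a normal apply, so that the rule is revoked while the default SG is still under Terraform management. The variable link_ec2_sg is hypothetical; aws_vpc.vpc1 and aws_security_group.GC-SG-VPC1 are the resources from the original report and are assumed to be defined elsewhere in the same configuration. The sketch also assumes that an update to aws_default_security_group revokes inline rules that were removed from the configuration.

```hcl
# Hypothetical toggle: set to false before destroying the stack.
variable "link_ec2_sg" {
  type    = bool
  default = true
}

resource "aws_default_security_group" "default" {
  vpc_id = aws_vpc.vpc1.id

  ingress {
    description = "Default SG for VPC1"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    self        = true
  }

  # Only emit the rule that references the EC2 SG while the toggle is on.
  dynamic "ingress" {
    for_each = var.link_ec2_sg ? [aws_security_group.GC-SG-VPC1.id] : []
    content {
      description     = "Include EC2 SG in VPC1 default SG"
      from_port       = 0
      to_port         = 0
      protocol        = "-1"
      security_groups = [ingress.value]
    }
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

With this in place, the intended sequence would be terraform apply -var="link_ec2_sg=false" to drop the rule, followed by terraform destroy. This still takes two steps, but it avoids editing the rule by hand in the AWS console.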

gchek commented 3 years ago

@gdavison - Thanks for looking into that. It's been a pain for a long time. As you can see in the output screen shot above, we really need to MANUALLY destroy the rule containing the SG so the destroy process can complete. Looking forward to your findings.