hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

Can't delete routes cleanly when same route exists via different targets #26804

Open bodgit opened 2 years ago

bodgit commented 2 years ago

Terraform CLI and Terraform AWS Provider Version

Terraform v1.1.9
on linux_amd64
+ provider registry.terraform.io/hashicorp/aws v4.17.1

Affected Resource(s)

Terraform Configuration Files

resource "aws_vpc" "example" {
  cidr_block = "172.16.0.0/16"
}

resource "aws_subnet" "example" {
  vpc_id     = aws_vpc.example.id
  cidr_block = "172.16.0.0/24"
}

resource "aws_vpn_gateway" "example" {
  vpc_id = aws_vpc.example.id
}

resource "aws_route_table" "example" {
  vpc_id = aws_vpc.example.id
}

resource "aws_vpn_gateway_route_propagation" "example" { # Assume this causes a 10.0.0.0/8 route to be propagated to the route table
  route_table_id = aws_route_table.example.id
  vpn_gateway_id = aws_vpn_gateway.example.id
}

resource "aws_ec2_transit_gateway_vpc_attachment" "example" {
  subnet_ids         = [aws_subnet.example.id]
  transit_gateway_id = "tgw-deadbeef"
  vpc_id             = aws_vpc.example.id
}

resource "aws_route" "example" { # Create an explicit 10.0.0.0/8 route via the Transit Gateway
  route_table_id         = aws_route_table.example.id
  destination_cidr_block = "10.0.0.0/8"
  transit_gateway_id     = "tgw-deadbeef"
}

Expected Behavior

The route should be removed and Terraform should exit cleanly.

Actual Behavior

The route is removed; however, Terraform times out waiting for it to disappear because the waiting code finds the propagated route with the same destination. The following error is logged:

Error: error waiting for Route in Route Table (rtb-c0ffee) with destination (10.0.0.0/8) to delete: timeout while waiting for resource to be gone (last state: 'ready', timeout: 5m0s)

Applying again can result in Terraform trying to delete the propagated route.

When Terraform reads or queries routes, it doesn't seem to include the route's target type (transit_gateway_id, etc.) in the search criteria, so if the same destination is reachable via multiple targets you get into trouble.
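To illustrate the suspected cause, here is a minimal sketch in Python (not the provider's actual Go code). The `Route` class, both finder functions, and the `vgw-example` ID are hypothetical names for illustration only. The origin strings are the values the EC2 API reports for explicitly created and VGW-propagated routes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    destination: str  # destination CIDR block
    target: str       # e.g. "tgw-deadbeef" or "vgw-example" (hypothetical ID)
    origin: str       # "CreateRoute" or "EnableVgwRoutePropagation"


def find_by_destination(routes: List[Route], destination: str) -> Optional[Route]:
    """Destination-only lookup: matches any route for the CIDR, whatever its target."""
    return next((r for r in routes if r.destination == destination), None)


def find_by_destination_and_target(
    routes: List[Route], destination: str, target: str
) -> Optional[Route]:
    """Lookup that also requires the target to match."""
    return next(
        (r for r in routes if r.destination == destination and r.target == target),
        None,
    )


# Route table contents after the explicit Transit Gateway route has been deleted:
# only the propagated route for the same destination remains.
remaining = [Route("10.0.0.0/8", "vgw-example", "EnableVgwRoutePropagation")]

# A waiter using the destination-only lookup keeps finding a route and never sees "gone".
still_there = find_by_destination(remaining, "10.0.0.0/8")

# A waiter that also checks the target sees that the deleted route is gone.
gone = find_by_destination_and_target(remaining, "10.0.0.0/8", "tgw-deadbeef")
```

If the waiter behaves like `find_by_destination`, that would explain the timeout. Matching on the target as well, or ignoring routes whose origin is propagation, would presumably avoid it.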

Steps to Reproduce

  1. terraform apply
  2. Remove the aws_route resource from the configuration
  3. terraform apply

Important Factoids

The example scenario is a bit complex to reproduce. I have multiple accounts that each currently have a VPN gateway with dedicated Direct Connect private interfaces for connectivity, and I'm in the process of switching to a Transit Gateway that each account is connected to. So for a period of time, whilst both types of connectivity are up, I potentially have the same routes available via both a vgw-XXXXX resource and a tgw-XXXXX resource. Eventually the VPN gateway will be destroyed, so there'll only be one of each route again.

gurpalw commented 1 year ago

Replicated on hashicorp/aws v5.13.1 when trying to remove and recreate routes to the same destination CIDR with a new peering connection.

Error: waiting for Route in Route Table (rtb-xxxxxxxxxx) with destination (10.80.0.0/16) delete: timeout while waiting for resource to be gone (last state: 'ready', timeout: 5m0s)

The above error appears even though I have confirmed via the AWS console that the route has actually been deleted and replaced by the new route. Running terraform plan again shows the following:

  # aws_route.development["rtb-xxxxxxxxxx"] will be destroyed
  # (because aws_route.development is not in configuration)
  - resource "aws_route" "development" {
      - destination_cidr_block    = "10.80.0.0/16" -> null
      - id                        = "r-rtb-xxxxxxxxxx" -> null
      - origin                    = "CreateRoute" -> null
      - route_table_id            = "rtb-xxxxxxxxxx" -> null
      - state                     = "active" -> null
      - vpc_peering_connection_id = "pcx-xxxxxxxxxx" -> null
    }

Running the apply deletes the newly created routes, and running the plan again shows the routes reappearing in the plan to be created.
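A rough Python model of why the second apply might remove the replacement route. It assumes the delete is keyed only by route table and destination CIDR, and that Terraform's state still holds the old route after the timeout. The `delete_route` helper and the `pcx-old` / `pcx-new` IDs are hypothetical.

```python
from typing import Dict


def delete_route(route_table: Dict[str, str], destination: str) -> Dict[str, str]:
    """Model of a delete keyed by destination only: removes whichever route has that CIDR."""
    return {cidr: target for cidr, target in route_table.items() if cidr != destination}


# Actual route table: the old peering route has already been replaced by the new one.
table = {"10.80.0.0/16": "pcx-new"}

# Terraform state after the timed-out delete: the old route is still tracked.
stale_state = {"10.80.0.0/16": "pcx-old"}

# The next apply destroys everything in the stale state, by destination.
for destination in stale_state:
    table = delete_route(table, destination)
```

In this model `table` ends up empty: the stale entry's destination matches the new route, so the new route is the one that gets deleted, and the following plan wants to create it again.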