hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0
9.73k stars, 9.09k forks

Terraform cannot destroy VPC Links to API Gateway because of a dependency issue #12195

Open christhomas opened 4 years ago

christhomas commented 4 years ago

I'm trying to destroy an API Gateway application that has a VPC link, and the destroy consistently fails. I can't find a way to automate this that works reliably. It appears that Terraform doesn't know how to destroy these resources.

Terraform version: v0.12.8
AWS Provider: v2.48.0

The error that I'm getting from the Terraform CLI is:

module.service.module.ecs_passthrough.aws_api_gateway_vpc_link.vpc_link: Refreshing state... [id=ox6a0g]
module.service.module.ecs_passthrough.aws_api_gateway_vpc_link.vpc_link: Destroying... [id=ox6a0g]

Error: error deleting API Gateway VPC Link (ox6a0g): BadRequestException: Cannot delete vpc link. Vpc link 'ox6a0g', is referenced in [ANY:kx106x:dev] in format of [Method:Resource:Stage].

I don't think I'm doing anything out of the ordinary with the API Gateway, but the VPC parts of it look like this:

resource "aws_api_gateway_vpc_link" "vpc_link" {
  name        = "${var.name}_vpc_link"
  target_arns = [var.load_balancer.arn]
}

resource "aws_api_gateway_integration" "request_method_integration" {
  rest_api_id = var.api_id
  resource_id = var.resource_id
  http_method = aws_api_gateway_method.request_method.http_method

  integration_http_method = aws_api_gateway_method.request_method.http_method

  type        = var.invoke_type
  uri         = "http://${var.load_balancer.dns_name}/{proxy}/"
  credentials = var.gateway_credentials

  request_parameters = var.integration_request_params

  connection_type = "VPC_LINK"
  connection_id   = aws_api_gateway_vpc_link.vpc_link.id
}

I don't know what I can do: if I can't reliably bring up and tear down an application, I'm stuck. Maybe I can force the order of the Terraform resources so that this starts working again?

dtelaroli commented 4 years ago

+1

christhomas commented 4 years ago

It appears the problem is that a deployed API Gateway is a "compiled" snapshot of what the console shows. I ran into this in the past: you change the API Gateway in the console, but the live API doesn't pick up the changes until you "deploy API", which takes your definition from the console, builds a distribution, and pushes it to CloudFront.

If you look in the CloudFront console, you won't see a distribution, because API Gateway's CloudFront distributions are hidden and only accessible through the API Gateway console.

So when Terraform tries to delete, recreate, or change a VPC Link, the deployed distribution still holds a reference to that "object", and this blocks modification of the VPC Link.

Even if you delete the resources in the API Gateway console but don't deploy, you still can't modify the VPC Link, because it is still referenced by the compiled, deployed version installed on CloudFront.

If you remove the methods and integrations from your API Gateway, remove the reference to the VPC Link, and then deploy the API, THEN you can delete the VPC Link.

Terraform apparently doesn't know how to do this, and I'm not aware of any mechanism other than scripting that would carry out the following sequence of actions:

  1. Remove the integration from API Gateway
  2. Deploy the API (breaking the online distribution temporarily)
  3. Delete/recreate/edit the VPC Link according to the Terraform definition
  4. Create any methods/integrations necessary, relinking the VPC Link to the API Gateway
  5. Deploy the new API

In theory this sequence of actions would work. However, I can't see any way to do it using Terraform primitives. Does anybody have any ideas how you'd do this in pure Terraform .tf files?

ghost commented 4 years ago

Unfortunately no ideas for you Chris. This problem does effectively make it impossible to automate the changes to API Gateway if it contains VPC links.

christhomas commented 4 years ago

I’m wondering if you could cleverly use null_resources, targeting a script that always deploys the api, and a clever use of target and depends_on to trigger a series of two deployments. First, when the link changes therefore severing the link to the rest api and the second execution to deploy the changed resources with the new link.

Do you understand what I mean?
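
To make the idea a bit more concrete, here is a rough, untested sketch of the second half of it (redeploying once the link has changed). It assumes the AWS CLI is installed where Terraform runs, and `var.stage_name` is a hypothetical input that does not appear in the config above. It does not cover the first deployment that severs the old reference; that would still need something like a separate `terraform apply -target=...` run.

```hcl
# Hypothetical sketch: push a fresh deployment whenever the VPC Link ID changes.
# var.stage_name is an assumed input; var.api_id is the same variable used above.
resource "null_resource" "redeploy_api" {
  # Re-run the provisioner every time the link is replaced.
  triggers = {
    vpc_link_id = aws_api_gateway_vpc_link.vpc_link.id
  }

  provisioner "local-exec" {
    command = "aws apigateway create-deployment --rest-api-id ${var.api_id} --stage-name ${var.stage_name}"
  }
}
```

Because the provisioner only runs after the new link exists, this can at best re-sync the deployed API with the new definition; it cannot remove the old reference before Terraform tries to delete the old link.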

ghost commented 4 years ago

Unfortunately I don't, Chris. I'm quite new to Terraform. Nevertheless, this is (magically, frustratingly) no longer an issue for me.

christhomas commented 4 years ago

Oh you’re new to this? First time? (Movie quote bonus! +1)

Don’t worry, the problem will disappear and then come back, at least that’s what it does with me :)

miguelgmalpha commented 3 years ago

It appears the problem is that a deployed API Gateway is a "compiled" snapshot of what the console shows. I ran into this in the past: you change the API Gateway in the console, but the live API doesn't pick up the changes until you "deploy API", which takes your definition from the console, builds a distribution, and pushes it to CloudFront.

I think this is the real problem here. I've been doing some testing and found that the crossed dependencies make it nearly impossible to change parameters that trigger a recreation of the resource.

While trying to find a way past the VPC_LINK problem, I tried changing the integration_type of the integration that references that VPC_LINK to MOCK. This change triggers a recreation of the integration object, but it fails because the integration is a required dependency of the route that uses it.

Error: error deleting API Gateway v2 VPC Link (l4er5c): BadRequestException: Cannot delete vpc link. Vpc link 'l4er5c', is referenced in [ANY:qtfgl2b] in format of [Method:Resource].

Error: error deleting API Gateway v2 integration: ConflictException: Cannot delete Integration because it is referenced by the following Routes with Ids: [qtfgl2b]

I'm trying to play with Terraform's depends_on meta-argument, but without luck so far :(

EDIT: To make things even worse, you cannot define two different vpc-links to the same subnets even with different security-groups. This is an AWS API limitation:

aws apigatewayv2 create-vpc-link --name mgm-test --subnet-ids subnet-aaaaaaa subnet-bbbbbbb subnet-ccccccc  --security-group-ids sg-aaaaaaaaaaaaaa sg-bbbbbbbbbbbbb

An error occurred (InternalServerException) when calling the CreateVpcLink operation (reached max retries: 2): Internal server error

Since the VPC link creates ENIs attached to those subnets, it seems that only one can be attached per subnet.

So you cannot create a new one with the configuration you need and then switch the integration to the new one so you don't have downtime.

yli186 commented 3 years ago

I have faced this issue and was able to create a second VPC link and flip over to it to avoid downtime. Not sure if something has changed on the AWS end, but this worked for me.

dacahill7 commented 2 years ago

Figured out a solution! The aws_api_gateway_deployment resource documentation has a note talking about adding a create_before_destroy lifecycle block. This allows the VPC Link to connect to the new API Gateway Deployment first before destroying the old API Gateway Deployment, so at all times the VPC Link is attached and prevents the downtime.

lifecycle { create_before_destroy = true }
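
For anyone unsure where the block goes, here is a minimal sketch of a deployment resource using it. The resource names (`example`) are placeholders, and the `triggers` argument is an assumption about your provider version: it is only available in AWS provider releases newer than the v2.48.0 mentioned at the top of this issue, so drop it if your version rejects it.

```hcl
# Sketch only: resource names are placeholders, not taken from the original report.
resource "aws_api_gateway_deployment" "example" {
  rest_api_id = aws_api_gateway_rest_api.example.id

  # Assumed to be supported by your provider version: forces a new deployment
  # whenever the integration's VPC Link reference or URI changes.
  triggers = {
    redeployment = sha1(jsonencode([
      aws_api_gateway_integration.example.connection_id,
      aws_api_gateway_integration.example.uri,
    ]))
  }

  lifecycle {
    # Create the replacement deployment before destroying the old one.
    create_before_destroy = true
  }
}
```

Without something like `triggers`, a change to the integration alone does not produce a new deployment, so the deployed snapshot can keep pointing at the old VPC Link.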

justinretzolk commented 2 years ago

Hi all 👋 Thank you for taking the time to file this issue, and for the continued discussion! It looks like the documentation link provided above should sort this issue out. Can anyone else who was previously experiencing this behavior confirm that the lifecycle block resolves the issue for you as well?

edobrb commented 2 years ago

Hi, I have the same problem, and to me it looks like a dependency issue.

My configuration for the API Gateway is something like this:

resource "aws_vpc" "default" { //this was changed
  cidr_block = "10.0.0.0/16"
}

//... subnets definitions

data "aws_subnet_ids" "all" {
  vpc_id = aws_vpc.default.id
}

resource "aws_apigatewayv2_vpc_link" "default" {
  name               = "test"
  security_group_ids = [module.alb.security_group_id]
  subnet_ids         = data.aws_subnet_ids.all.ids
}

resource "aws_apigatewayv2_api" "default" {
  name          = "test"
  protocol_type = "HTTP"
}

resource "aws_apigatewayv2_integration" "default" {
  api_id             = aws_apigatewayv2_api.default.id
  integration_type   = "HTTP_PROXY"
  integration_uri    = aws_lb_listener.alb_80.arn
  integration_method = "ANY"
  connection_type    = "VPC_LINK"
  connection_id      = aws_apigatewayv2_vpc_link.default.id
}

I changed the VPC and applied the modifications. I would expect aws_apigatewayv2_integration to be destroyed as well, since it depends on aws_apigatewayv2_vpc_link, which in turn depends on aws_vpc. Instead I get this error:

error deleting API Gateway v2 VPC Link (xxxxxx): BadRequestException: Cannot delete vpc link. Vpc link 'xxxxxx', is referenced in [ANY:yyyyyyy] in format of [Method:Resource].

Thanks in advance.
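
One possible way to get the behaviour expected here is to tell Terraform explicitly that the integration must be replaced whenever the VPC link is replaced. This is an untested sketch: it assumes Terraform 1.2 or later (where `replace_triggered_by` is available), and it may still fail with the ConflictException shown earlier in this thread if a route references the integration.

```hcl
resource "aws_apigatewayv2_integration" "default" {
  api_id             = aws_apigatewayv2_api.default.id
  integration_type   = "HTTP_PROXY"
  integration_uri    = aws_lb_listener.alb_80.arn
  integration_method = "ANY"
  connection_type    = "VPC_LINK"
  connection_id      = aws_apigatewayv2_vpc_link.default.id

  lifecycle {
    # Assumption: Terraform >= 1.2. Replacing the VPC link then also plans a
    # replacement of this integration, so it is destroyed before the old link.
    replace_triggered_by = [aws_apigatewayv2_vpc_link.default.id]
  }
}
```

The reasoning is that a changed connection_id is otherwise an in-place update, which Terraform applies only after it has already tried to delete the old VPC link.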

steczu commented 2 years ago

Hi,

I also stumbled across this error. I tried the suggestion to use the lifecycle hook "create_before_destroy", but unfortunately without success. After some research, I came across a note in the AWS docs (https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-known-issues.html) which says that the VPC link should not be referenced directly; instead, a stage variable should be used as the reference. I set this up and can now reliably delete VPC links.

For an API Resource or Method entity with a private integration, you should delete it after removing any hard-coded reference of a VpcLink. Otherwise, you have a dangling integration and receive an error stating that the VPC link is still in use even when the Resource or Method entity is deleted. This behavior does not apply when the private integration references VpcLink through a stage variable.

resource "aws_api_gateway_deployment" "example" {
  rest_api_id = aws_api_gateway_rest_api.example.id
  stage_name  = var.env

  variables = {
    "vpc_link_id" = aws_api_gateway_vpc_link.example.id
  }

  depends_on = [
    aws_api_gateway_vpc_link.example,
    // other resources
  ]

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_api_gateway_vpc_link" "example" {
  name        = "${var.project}-example-${var.env}"
  target_arns = [var.loadbalancer_arn]
}

resource "aws_api_gateway_integration" "example" {
  integration_http_method = "ANY"
  type                    = "HTTP_PROXY"
  connection_type         = "VPC_LINK"
  connection_id           = "$${stageVariables.vpc_link_id}" //weird, but correct syntax!
  // ...
}
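
A note on the `$${...}` syntax: in Terraform, `$$` escapes the interpolation sequence, so API Gateway receives the literal string `${stageVariables.vpc_link_id}` and resolves it at request time. If you manage the stage as its own resource rather than through `stage_name` on the deployment, the same variable can be set there instead. A minimal sketch, assuming the deployment above does not also create the stage:

```hcl
# Sketch: the stage owns the variable instead of the deployment.
# Assumes stage_name and variables are removed from the deployment resource.
resource "aws_api_gateway_stage" "example" {
  rest_api_id   = aws_api_gateway_rest_api.example.id
  deployment_id = aws_api_gateway_deployment.example.id
  stage_name    = var.env

  variables = {
    vpc_link_id = aws_api_gateway_vpc_link.example.id
  }
}
```

Either way, the integration itself never holds a hard-coded VPC link ID, which is the condition the AWS note describes.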

akierstein-insider commented 2 years ago

Just wanted to chime in that after re-creating my aws_api_gateway_integration resources with @StefanCzubek's stage variable suggestion above, I was able to remove the aws_api_gateway_vpc_link successfully.

Great find @StefanCzubek !!

jovestefanovski commented 1 year ago

I had a dependency from the API Gateway integration, and this was my Terraform message:

module.api_gateway.aws_apigatewayv2_vpc_link.http_api_vpc_link: Destroying... [id=abcd0f]

Error: error deleting API Gateway v2 VPC Link (abcd0f): BadRequestException: Cannot delete vpc link. Vpc link 'abcd0f', is referenced in [ANY:vwx8yz] in format of [Method:Resource].

In my case, @steczu's suggestion didn't work either. What I did was manually detach the integration [ANY:vwx8yz] in the console and rerun terraform apply. After that, everything in the deployment went well.

baromojm commented 1 year ago

The solution provided by dacahill7 worked fine for me. I added lifecycle { create_before_destroy = true } to the "aws_api_gateway_deployment" resource.

mvn-bachhuynh-dn commented 10 months ago

The stage-variable solution from steczu's comment above works like a charm! Thank you so much!

a-macgillivray-fnc commented 6 days ago

The stage-variable solution from steczu's comment above wasn't super convenient for me because of my module layout, but I found that adding a second

lifecycle {
   create_before_destroy = true
}

on the VPC link itself fixed things for me.
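
For reference, a minimal sketch of what that looks like, reusing the resource names from steczu's example (an assumption, since my own config is laid out differently):

```hcl
resource "aws_api_gateway_vpc_link" "example" {
  name        = "${var.project}-example-${var.env}"
  target_arns = [var.loadbalancer_arn]

  lifecycle {
    # When the link must be replaced, Terraform creates the new link first,
    # updates resources that reference it, and only then destroys the old one.
    create_before_destroy = true
  }
}
```

This relies on AWS allowing the old and new links to exist side by side for a moment. Earlier comments in this thread disagree on whether that works for HTTP API (v2) links on the same subnets, so treat it as something to verify in your own account.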

Hope this is helpful for anybody else who finds this thread in future :)