hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0
9.83k stars 9.18k forks

Can't destroy aws_batch_compute_environment associated with an aws_batch_job_queue: Cannot delete, found existing JobQueue relationship #13221

Closed dpedu closed 1 year ago

dpedu commented 4 years ago

Terraform Version

v0.12.12

Affected Resource(s)

aws_batch_compute_environment, aws_batch_job_queue

Terraform Configuration Files

provider "aws" {
  region  = "us-east-1"
  profile = "foobar"
  version = "~> 2.60"
}

variable "num_envs" {
  description = "number of compute environments"
  default     = 2
}

resource "aws_batch_compute_environment" "ondemand" {
  count = var.num_envs

  compute_environment_name_prefix = "TfCE"
  service_role                    = "..."
  state                           = "ENABLED"
  type                            = "MANAGED"

  compute_resources {
    allocation_strategy = "BEST_FIT"
    desired_vcpus       = 0
    ec2_key_pair        = "..."
    image_id            = "..."
    instance_role       = "..."
    instance_type = [
      "m4.large",
    ]
    max_vcpus          = 1000
    min_vcpus          = 0
    security_group_ids = ["..."]
    subnets            = ["...", "..."]
    tags = {
      "Name" = "foobar"
    }
    type = "EC2"
  }
  lifecycle {
    ignore_changes        = [compute_resources[0].desired_vcpus]
    create_before_destroy = true
  }
}

resource "aws_batch_job_queue" "demand_queue" {
  compute_environments = [for i in range(0, var.num_envs) : aws_batch_compute_environment.ondemand[i].arn]
  name                 = "TfJQ"
  priority             = 10
  state                = "ENABLED"
}

output "queue" {
  value = aws_batch_job_queue.demand_queue.arn
}

Debug Output

N/A

Expected Behavior

Terraform successfully deletes the aws_batch_compute_environment.

Actual Behavior

Terraform hits an error:

aws_batch_compute_environment.ondemand[1]: Destroying... [id=TfCEXXXXXXXX]

Error: error deleting Batch Compute Environment (TfCEXXXXXXXX): : Cannot delete, found existing JobQueue relationship

Steps to Reproduce

  1. terraform apply - using the above code
  2. terraform apply -var num_envs=1 - reduce the number of compute environments and apply again

Important Factoids

Updating the aws_batch_compute_environment works correctly, even when the modification forces Terraform to delete and recreate the resource (e.g. changing instance_type). In that case Terraform does this:

  1. Creates the new resource
  2. Modifies the relevant aws_batch_job_queue, associating the new environment and disassociating the old one
  3. Deletes the old aws_batch_compute_environment

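For deletes, Terraform needs to perform steps 2 and 3 in that order: disassociate the environment from the aws_batch_job_queue first, then delete the aws_batch_compute_environment.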
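
Until the ordering is handled automatically, one possible way to force steps 2 and 3 to happen in sequence is to split the change across two applies. The following is an untested sketch, not a confirmed fix: num_queue_envs is a hypothetical variable that is not part of the original configuration, and it assumes the aws_batch_compute_environment.ondemand resource stays exactly as shown above.

```hcl
# Untested sketch of a two-phase workaround.
# "num_queue_envs" is a hypothetical variable introduced only for illustration.

variable "num_envs" {
  description = "number of compute environments"
  default     = 2
}

variable "num_queue_envs" {
  description = "number of compute environments attached to the job queue (must not exceed num_envs)"
  default     = 2
}

# aws_batch_compute_environment.ondemand is unchanged from the configuration above.

resource "aws_batch_job_queue" "demand_queue" {
  # Attach only the first num_queue_envs environments, so the queue can be
  # detached from an environment before that environment is destroyed.
  compute_environments = slice(aws_batch_compute_environment.ondemand[*].arn, 0, var.num_queue_envs)
  name                 = "TfJQ"
  priority             = 10
  state                = "ENABLED"
}
```

Under these assumptions, terraform apply -var num_queue_envs=1 would perform step 2 (the queue drops the second environment), and a following terraform apply -var num_queue_envs=1 -var num_envs=1 would perform step 3 (the now-unreferenced environment is deleted).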

References

This sounds similar but is not the same as:

justinretzolk commented 3 years ago

Hey @dpedu 👋 Thank you for taking the time to file this issue. Based on a related issue (#15512), it looks like this may have been solved in a later release. Can you confirm whether you're still experiencing this?

tadejsv commented 3 years ago

@justinretzolk This happened to me just now, latest version of terraform and aws provider

justinretzolk commented 3 years ago

Thank you for confirming @tadejsv! I'll get this marked as a bug so that we can take a look into it as soon as time allows.

rememberlenny commented 2 years ago

FWIW, I also had this problem with AWS provider 3.70.0 and Terraform CLI 1.1.2. The workaround for the time being was to run a destroy and then rerun the setup plan.

rmarable commented 2 years ago

This is still an issue with terraform-1.2.3 and aws-cli/2.7.11.

enc-wmatern commented 1 year ago

This is still an issue with terraform-1.5.2 and provider hashicorp/aws 5.8.0

stanpalatnik commented 1 year ago

@justinretzolk Any updates on this?

justinretzolk commented 1 year ago

Hey @stanpalatnik 👋 Thank you for taking the time to check on this! This has been prioritized, so should be looked at by someone on the team this quarter 🎉

johnsonaj commented 1 year ago

Thank you for the update on this issue! Terraform itself is responsible for generating the graph that determines order of operations, and doesn't currently have a way for providers to supply additional information regarding ordering. That said, you can control this to some degree with create_before_destroy (this issue in the Terraform Core repository has quite a bit more information that I found helpful when brushing up on this particular pattern).

I used the following configuration and was able to create and destroy aws_batch_compute_environment resources without getting an error:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "5.13.1"
    }
  }
}

provider "aws" {
  region = "us-west-2"
}

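# The referenced IAM role, instance profile, policy attachment, security group,
# and subnet resources are not shown in this comment.
resource "aws_batch_compute_environment" "test" {
  count = 2

  compute_environment_name_prefix = "testing"
  service_role                    = aws_iam_role.test.arn
  type                            = "MANAGED"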

  compute_resources {
    instance_role      = aws_iam_instance_profile.ecs_instance_role.arn
    instance_type      = ["c5", "m5", "r5"]
    max_vcpus          = 1
    min_vcpus          = 0
    security_group_ids = [aws_security_group.test.id]
    subnets            = [aws_subnet.test.id]
    type               = "EC2"
  }

  depends_on = [aws_iam_role_policy_attachment.test]

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_batch_job_queue" "test" {
  compute_environments = [for v in aws_batch_compute_environment.test : v.arn]
  name                 = "test-job-queue"
  priority             = 10
  state                = "ENABLED"

  tags = {
    key1 = "value12"
  }
}

Given the info above, I'll close this one out. If you feel I've done this in error, please do let me know.

dpedu commented 1 year ago

The code in the original issue also uses create_before_destroy = true on the aws_batch_compute_environment.

I'm unable to re-test the original code 1:1, because it uses aws provider version 2.60 and that version does not support SSO, which has since become a requirement in my environment. That being said, with the version specification changed to use either 3.76.1 or 5.14.0, I can no longer reproduce the issue either 🎉

I'm using terraform 1.5.
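
For anyone re-running the original reproduction on a current Terraform release, the version change described above might look roughly like the sketch below. It assumes the rest of the original configuration is left untouched; the region and profile values are the placeholders from the original report, and moving the constraint into a required_providers block is an assumption about how the pin was applied, not something confirmed in this thread.

```hcl
# Sketch only: pins the provider to one of the versions reported as working.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "5.14.0" # 3.76.1 was also reported as no longer reproducing the issue
    }
  }
}

provider "aws" {
  region  = "us-east-1"
  profile = "foobar"
}
```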

github-actions[bot] commented 1 year ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.