Can't destroy aws_batch_compute_environment associated with an aws_batch_job_queue: Cannot delete, found existing JobQueue relationship

dpedu commented 4 years ago

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

v0.12.12

Affected Resource(s)

aws_batch_compute_environment
aws_batch_job_queue

Terraform Configuration Files

provider "aws" {
  region  = "us-east-1"
  profile = "foobar"
  version = "~> 2.60"
}

variable "num_envs" {
  description = "number of compute environments"
  default     = 2
}

resource "aws_batch_compute_environment" "ondemand" {
  count = var.num_envs

  compute_environment_name_prefix = "TfCE"
  service_role                    = "..."
  state                           = "ENABLED"
  type                            = "MANAGED"

  compute_resources {
    allocation_strategy = "BEST_FIT"
    desired_vcpus       = 0
    ec2_key_pair        = "..."
    image_id            = "..."
    instance_role       = "..."
    instance_type = [
      "m4.large",
    ]
    max_vcpus          = 1000
    min_vcpus          = 0
    security_group_ids = ["..."]
    subnets            = ["...", "..."]
    tags = {
      "Name" = "foobar"
    }
    type = "EC2"
  }
  lifecycle {
    ignore_changes        = [compute_resources[0].desired_vcpus]
    create_before_destroy = true
  }
}

resource "aws_batch_job_queue" "demand_queue" {
  compute_environments = [for i in range(0, var.num_envs) : aws_batch_compute_environment.ondemand[i].arn]
  name                 = "TfJQ"
  priority             = 10
  state                = "ENABLED"
}

output "queue" {
  value = aws_batch_job_queue.demand_queue.arn
}

Debug Output

N/A

Expected Behavior

Terraform successfully deletes the aws_batch_compute_environment.

Actual Behavior

Terraform hits an error:

aws_batch_compute_environment.ondemand[1]: Destroying... [id=TfCEXXXXXXXX]

Error: error deleting Batch Compute Environment (TfCEXXXXXXXX): : Cannot delete, found existing JobQueue relationship

Steps to Reproduce

1. terraform apply - using the above code
2. terraform apply -var num_envs=1 - reduce the number of compute environments and apply again

Important Factoids

If I attempt to update the aws_batch_compute_environment - even if I do modifications that cause terraform to delete and recreate the resource (e.g. changing instance_type) - this works correctly. Terraform does this:

1. Creates the new resource
2. Modifies the relevant aws_batch_job_queue, associating the new environment and disassociating the old one
3. Deletes the old aws_batch_compute_environment

For deletes, terraform needs to do steps 2 and 3.
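
As a possible interim workaround, steps 2 and 3 can be forced into separate applies by decoupling how many compute environments exist from how many are attached to the queue. This is an untested sketch, not a confirmed fix: the num_attached variable is a hypothetical addition that is not part of the original configuration, and the block assumes the aws_batch_compute_environment.ondemand resource and the num_envs variable defined above. It would replace the demand_queue definition shown earlier.

```hcl
# Hypothetical variable (not in the original configuration): how many of the
# compute environments are attached to the job queue.
# Should be between 1 and var.num_envs; slice() fails if the end index
# exceeds the number of environments.
variable "num_attached" {
  description = "number of compute environments attached to the job queue"
  default     = 2
}

resource "aws_batch_job_queue" "demand_queue" {
  # Attach only the first var.num_attached environments, in index order.
  compute_environments = slice(aws_batch_compute_environment.ondemand[*].arn, 0, var.num_attached)
  name                 = "TfJQ"
  priority             = 10
  state                = "ENABLED"
}
```

Under these assumptions, running terraform apply -var num_attached=1 first should only update the queue (step 2), and a following terraform apply -var num_envs=1 -var num_attached=1 should then delete the detached environment (step 3).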

References

This sounds similar but is not the same as:

#2044

justinretzolk commented 3 years ago

Hey @dpedu 👋 Thank you for taking the time to file this issue. Based on a related issue (#15512), it looks like this may have been solved in a later release. Can you confirm whether you're still experiencing this?

tadejsv commented 3 years ago

@justinretzolk This happened to me just now, latest version of terraform and aws provider

justinretzolk commented 3 years ago

Thank you for confirming @tadejsv! I'll get this marked as a bug so that we can take a look into it as soon as time allows.

rememberlenny commented 2 years ago

FWIW - I also had this problem with AWS 3.70.0 and Terraform CLI 1.1.2. The solution for the time being was to run a destroy and then rerun the setup plan.

rmarable commented 2 years ago

This is still an issue with terraform-1.2.3 and aws-cli/2.7.11.

enc-wmatern commented 1 year ago

This is still an issue with terraform-1.5.2 and provider hashicorp/aws 5.8.0

stanpalatnik commented 1 year ago

@justinretzolk Any updates on this?

justinretzolk commented 1 year ago

Hey @stanpalatnik 👋 Thank you for taking the time to check on this! This has been prioritized, so should be looked at by someone on the team this quarter 🎉

johnsonaj commented 1 year ago

Thank you for the update on this issue! Terraform itself is responsible for generating the graph that determines order of operations, and doesn’t currently have a way for providers to supply additional information regarding ordering. That said, you can control this to some degree with create_before_destroy (this issue in the Terraform Core repository has quite a bit more information that I found helpful when brushing up on this particular pattern).

I used the following configuration and was able to create and destroy aws_batch_compute_environment resources without getting an error:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "5.13.1"
    }
  }
}

provider "aws" {
  region = "us-west-2"
}

resource "aws_batch_compute_environment" "test" {
  count = 2

  compute_environment_name_prefix = "testing"
  service_role                    = aws_iam_role.test.arn
  type                            = "MANAGED"

  compute_resources {
    instance_role      = aws_iam_instance_profile.ecs_instance_role.arn
    instance_type      = ["c5", "m5", "r5"]
    max_vcpus          = 1
    min_vcpus          = 0
    security_group_ids = [aws_security_group.test.id]
    subnets            = [aws_subnet.test.id]
    type               = "EC2"
  }

  depends_on = [aws_iam_role_policy_attachment.test]

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_batch_job_queue" "test" {
  compute_environments = [for v in aws_batch_compute_environment.test: v.arn]

  name                 = "test-job-queue"
  priority             = 10
  state                = "ENABLED"

  tags = {
    key1 = "value12"
  }
}

Given the info above, I’ll close this one out. If you feel I’ve done this in error, please do let me know.

dpedu commented 1 year ago

The code in the original issue also uses create_before_destroy = true on the aws_batch_compute_environment.

I'm unable to re-test the original code 1:1, because it uses aws provider version 2.60, and that version does not support SSO, which has since become a requirement in my environment. That being said, with the version specification changed to use either 3.76.1 or 5.14.0, I likewise can no longer reproduce the issue 🎉

I'm using terraform 1.5.

hashicorp / terraform-provider-aws