Imported ECS Task Definition being destroyed with skip_destroy to true

katsew commented 2 years ago

Hi,

I use aws_ecs_task_definition with existing Task Definition(TD). I imported the TD, then I applied my changes, the revision of the imported TD was deactivated,
even if I set skip_destroy to true. After applying the first changes, it keeps the revision correctly. It seems that skip_destroy is not actually become true while applying the changes for the first time.

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform AWS Provider Version

Terraform v1.1.6
on linux_amd64
+ provider registry.terraform.io/hashicorp/aws v4.14.0
+ provider registry.terraform.io/hashicorp/null v3.1.1

Affected Resource(s)

aws_ecs_task_definition

Terraform Configuration Files

Please include all Terraform configurations required to reproduce the bug. Bug reports without a functional reproduction may be closed without investigation.

resource "aws_ecs_task_definition" "self" {
  family                   = "app"
  skip_destroy             = true
  container_definitions = jsonencode([{
    name              = "app"
    image             = var.image
    cpu               = 1024
    memoryReservation = 128
    essential         = true
    entryPoint        = ["/sbin/entrypoint.sh"]
  }])
}

Expected Behavior

terraform should always respect the value of skip_destroy.

Actual Behavior

terraform ignores the value of skip_destroy while applying the changes for the first time.

Steps to Reproduce

terraform state rm address.to.my.task_definition
terraform import address.to.my.task_definition task_def_arn
set skip_destroy to true
terraform apply

References

258
22269
23353
23295
30553

keshavixlayer commented 2 years ago

I am also facing the same.

skip_destroy = false

cb-dcc commented 2 years ago

Has there been any movement on this? Encountering this issue as well.

nevir commented 1 year ago

Also have run into this.

manugarri commented 1 year ago

still an issue, is there a workaround available?

manugarri commented 1 year ago

I think i might have found a workaround to this. I would love if someone (@nevir maybe?) tried to replicate this behavior to see if it actually works or is just some state issue im having.

My initial task was defined like this:

resource "aws_ecs_task_definition" "operator" {
  family = "IMAGE_NAME-${terraform.workspace}"
  task_role_arn = resource.aws_iam_role.ecs_task_role.arn
  execution_role_arn = ".../ecsTaskExecutionRole"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu       = 256
  memory    = 512
  container_definitions = jsonencode([
    {
      name      = "IMAGE_NAME"
      image     = "AWS_ACCOUNT_ID.dkr.ecr.us-west-2.amazonaws.com/IMAGE_NAME:IMAGE_VERSION"
      essential = true
    }
  ])
  # doesnt work
  skip_destroy = true
}

Updating the task (basically updating the image tag) made terraform deregister the older tasks.

Then I wanted to add log options, that I dont see defined in the resource. So I changed the container definition to json (so no jsonencode and added the logstream.

Now the task looks like this:

resource "aws_ecs_task_definition" "operator" {
  family = "${var.operator_name}-${terraform.workspace}"
  task_role_arn = resource.aws_iam_role.ecs_task_role.arn
  execution_role_arn = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/ecsTaskExecutionRole"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu       = 256
  memory    = 512
  container_definitions = <<TASK_DEFINITION
[
  {
      "name": "IMAGE_NAME",
      image     = "AWS_ACCOUNT_ID.dkr.ecr.us-west-2.amazonaws.com/IMAGE_NAME:IMAGE_VERSION"
      "cpu": 0,
      "essential": true,
      "logConfiguration": {
          "logDriver": "awslogs",
          "options": {
              "awslogs-group": "taskgroup",
              "awslogs-region": "us-west-2",
              "awslogs-stream-prefix": "/aws/ecs"
          },
          "secretOptions": []
      }
        }
]
TASK_DEFINITION
  # maybe it works somehow?
  skip_destroy = true

}

And lo an behold, the task versions do not get destroyed anymore!

lockwobr commented 1 year ago

I have struggled with this as well, this (skip_destroy) feature seems to work in some cases, maybe most tbh, and have for sure seen where it says it's going to destroy, and it does in fact destroy it, which from what I can tell in my testing it seems related to import. Because of this some times working and not, and the plan says it's going to do it, which really leads to confusion, which seems related to this. I far as I can tell the imported task will get deregister on first time a change is applied and not after that. I think the issue is because when you import the resource skip_destroy is not a ECS Task thing, so it never gets into the statefile, and therefore when you apply your first change via terraform this imported task is deregister. I verified this by looking at the statefile of a task that was just imported and on that has had a change applied and noticed this setting is missing.

terraform state show 'target name'
...
    revision                 = 14
    skip_destroy             = true
    tags                     = {}
...

vs

terraform state show 'different target name'
...
    revision                 = 10
    tags                     = {}
...

Also when debugging you can track this by looking for Retaining ECS Task in the debug logs. You will never see this log on the first change after an import, but on the second and subsequent you will see this.

TF_LOG=debug terraform apply 
...
2023-02-28T17:30:45.394-0800 [DEBUG] provider.terraform-provider-aws_v4.56.0_x5: [DEBUG] Retaining ECS Task Definition Revision "tasks name redacted"
...

lockwobr commented 1 year ago

I was trying to come up with a workaround, but it seems like skip_destroy will not trigger a change whether it's true or false after an import. It seemed like you could add a tag to trigger the update, and not a destroy, at which port the statefile will be updated. However, that also does not trigger either which is very odd too IMHO. The only workaround might be to manually update the statefile with skip_destroy = true via terraform state push which in my mind is not a workaround.

Seem to always get:

No changes. Your infrastructure matches the configuration.

lockwobr commented 1 year ago

So I ended up having to do my workaround, where you pull the statefile (terraform state pull > state.json), find and replace "skip_destroy": null, with "skip_destroy": true, bump serial up 1 (part of how locking works), and push (terraform state push state.json). It did the trick, sharing for others who come across this, or for if someone wants to fix this. Seems difficult to fix, and is a fundamental issue with how import works if I had to guess.

manugarri commented 1 year ago

@lockwobr have you read my comment above? I have been running it for some time now, and the historical ecs versions are kept without having to do any funky state business.

lockwobr commented 1 year ago

@manugarri I did try your workaround and it doesn't work. However, it did help me figure out the issue. It will always deregister the task you import and not any subsequent updates tasks. I am pretty sure it's because import only imports ecs attributes and this one is not so it never makes it into the statefile.

manugarri commented 1 year ago

oh really? which terraform version and provider are you using? because if the reason is the state is missing, im not sure how changing from a struct to a json string would make it work.

hankwallace commented 12 months ago

we ran into this today as well. the previous plan/apply set skip_destroy to true and was applied in-place, but the next plan/apply which created a new task definition marked the previous one as inactive. this is a big problem for us because our process for reverting a bad deploy is to just redeploy the previous task definition.

plain5 commented 2 months ago

Faced the same issue. Still relevant.

hashicorp / terraform-provider-aws