hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

Terraform forcing replacement of ECS task definition - (only on task definitions with mount points) #11526

Open ghost opened 4 years ago

ghost commented 4 years ago

This issue was originally opened by @im-lcoupe as hashicorp/terraform#23780. It was migrated here as a result of the provider split. The original body of the issue is below.


Summary

Hi there,

So this only seems to have become a problem since upgrading my code to the latest version - and strangely it only seems to happen on the task definitions with mount points (however, their format hasn't changed...)

Terraform will constantly try to replace the two task definitions regardless of whether any changes have been made to them...

Any guidance on this would be greatly appreciated, as it means the task definition revision changes on every run (which of course is not ideal)...

Terraform Version

0.12.18

Terraform Configuration Files - first problematic task definition, followed by the second

[
  {
    "name": "${container_name_nginx}",
    "image": "${container_image_nginx}",
    "memory": ${container_memory},
    "cpu": ${container_cpu},
    "networkMode": "awsvpc",
    "volumesFrom": [],
    "essential": true,
    "portMappings": [
      {
        "containerPort": 443,
        "hostPort": 443,
        "protocol": "tcp"
      }
    ],
    "mountPoints": [
     {
    "readOnly": false,
    "containerPath": "/var/www/symfony/var/log",
    "sourceVolume": "shared_symfony_logs"
     },
     {
     "readOnly": false,
     "containerPath": "/var/log/nginx",
     "sourceVolume": "shared_nginx_logs"
     }
    ],
    "environment": [
      {
        "name": "${env_var_1_name}",
        "value": "${env_var_value_1}"
      },
      {
        "name": "${env_var_2_name}",
        "value": "${env_var_value_2}"
      },
      {
        "name": "${env_var_3_name}",
        "value": "${env_var_value_3}"
      }
    ],
  "logConfiguration" : {
    "logDriver" : "awslogs",
    "options" :{
      "awslogs-create-group": "true",
      "awslogs-group": "${container_name_nginx}",
      "awslogs-region": "${platform_region}",
      "awslogs-stream-prefix": "ecs"
    }
  }
},

{
  "name": "${container_name_php}",
  "image": "${container_image_php}",
  "memory": ${container_memory},
  "cpu": ${container_cpu},
  "networkMode": "awsvpc",
  "volumesFrom": [],
  "essential": true,
  "portMappings": [
    {
      "containerPort": 9000,
      "hostPort": 9000,
      "protocol": "tcp"
    }
  ],
  "mountPoints": [
   {
  "readOnly": false,
  "containerPath": "/var/www/symfony/var/log",
  "sourceVolume": "shared_symfony_logs"
   },
   {
   "readOnly": false,
   "containerPath": "/var/log/nginx",
   "sourceVolume": "shared_nginx_logs"
   }
  ],
  "environment": [
    {
      "name": "${env_var_4_name}",
      "value": "${env_var_value_4}"
    },
    {
      "name": "${env_var_5_name}",
      "value": "${env_var_value_5}"
    },
    {
      "name": "${env_var_6_name}",
      "value": "${env_var_value_6}"
    },
    {
      "name": "${env_var_7_name}",
      "value": "${env_var_value_7}"
    },
    {
      "name": "${env_var_8_name}",
      "value": "${env_var_value_8}"
    },
    {
      "name": "${env_var_10_name}",
      "value": "${env_var_value_10}"
    },
    {
      "name": "${env_var_11_name}",
      "value": "${env_var_value_11}"
    },
    {
      "name": "${env_var_12_name}",
      "value": "${env_var_value_12}"
    },
    {
      "name": "${env_var_13_name}",
      "value": "${env_var_value_13}"
    },
    {
      "name": "${env_var_14_name}",
      "value": "${env_var_value_14}"
    },
    {
      "name": "${env_var_15_name}",
      "value": "${env_var_value_15}"
    },
    {
      "name": "${env_var_16_name}",
      "value": "${env_var_value_16}"
    },
    {
      "name": "${env_var_17_name}",
      "value": "${env_var_value_17}"
    },
    {
      "name": "${env_var_18_name}",
      "value": "${env_var_value_18}"
    },
    {
      "name": "${env_var_19_name}",
      "value": "${env_var_value_19}"
    },
    {
      "name": "${env_var_20_name}",
      "value": "${env_var_value_20}"
    },
    {
      "name": "${env_var_21_name}",
      "value": "${env_var_value_21}"
    },
    {
      "name": "${env_var_22_name}",
      "value": "${env_var_value_22}"
    },
    {
      "name": "${env_var_1_name}",
      "value": "${env_var_value_1}"
    },
    {
      "name": "${env_var_28_name}",
      "value": "${env_var_value_28}"
    },
    {
      "name": "${env_var_29_name}",
      "value": "${env_var_value_29}"
    },
    {
      "name": "${env_var_30_name}",
      "value": "${env_var_value_30}"
    },
    {
      "name": "${env_var_31_name}",
      "value": "${env_var_value_31}"
    },
    {
      "name": "${env_var_32_name}",
      "value": "${env_var_value_32}"
    },
    {
      "name": "${env_var_33_name}",
      "value": "${env_var_value_33}"
    },
    {
      "name": "${env_var_34_name}",
      "value": "${env_var_value_34}"
    },
    {
      "name": "${env_var_35_name}",
      "value": "${env_var_value_35}"
    },
    {
      "name": "${env_var_36_name}",
      "value": "${env_var_value_36}"
    },
    {
      "name": "${env_var_37_name}",
      "value": "${env_var_value_37}"
    },
    {
      "name": "${env_var_38_name}",
      "value": "${env_var_value_38}"
    }
  ],
  "secrets":[
    {
      "name":"${sensitive_var_1}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_1}"
    },
    {
      "name":"${sensitive_var_2}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_2}"
    },
    {
      "name":"${sensitive_var_3}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_3}"
    },
    {
      "name":"${sensitive_var_4}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_4}"
    },
    {
      "name":"${sensitive_var_5}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_5}"
    },
    {
      "name":"${sensitive_var_6}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_6}"
    },
    {
      "name":"${sensitive_var_7}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_7}"
    },
    {
      "name":"${sensitive_var_8}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_8}"
    },
    {
      "name":"${sensitive_var_9}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_9}"
    },
    {
      "name":"${sensitive_var_10}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_10}"
    },
    {
      "name":"${sensitive_var_11}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_11}"
    },
    {
      "name":"${sensitive_var_12}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_12}"
    },
    {
      "name":"${sensitive_var_13}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_13}"
    }
  ],
"logConfiguration" : {
  "logDriver" : "awslogs",
  "options" :{
    "awslogs-create-group": "true",
    "awslogs-group": "${container_name_php}",
    "awslogs-region": "${platform_region}",
    "awslogs-stream-prefix": "ecs"
  }
}
},

{
  "name": "${container_name_logstash}",
  "image": "${container_image_logstash}",
  "memory": ${container_memory},
  "cpu": ${container_cpu},
  "networkMode": "awsvpc",
  "volumesFrom": [],
  "essential": true,
  "portMappings": [
    {
      "containerPort": 9600,
      "hostPort": 9600,
      "protocol": "tcp"
    }
  ],
  "mountPoints": [
   {
  "readOnly": false,
  "containerPath": "/var/www/symfony/var/log",
  "sourceVolume": "shared_symfony_logs"
   },
   {
   "readOnly": false,
   "containerPath": "/var/log/nginx",
   "sourceVolume": "shared_nginx_logs"
   }
 ],
 "environment": [
   {
     "name": "${env_var_23_name}",
     "value": "${env_var_value_23}"
   },
   {
     "name": "${env_var_1_name}",
     "value": "${env_var_value_1}"
   }
 ],
"logConfiguration" : {
  "logDriver" : "awslogs",
  "options" :{
    "awslogs-create-group": "true",
    "awslogs-group": "${container_name_logstash}",
    "awslogs-region": "${platform_region}",
    "awslogs-stream-prefix": "ecs"
    }
  }
}
]

Second Task definition

[
{
  "name": "${container_name_php}",
  "image": "${container_image_php}",
  "memory": ${container_memory},
  "cpu": ${container_cpu},
  "networkMode": "awsvpc",
  "volumesFrom": [],
  "essential": true,
  "command": ["/usr/bin/supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf"],
  "portMappings": [
    {
      "containerPort": 9001,
      "hostPort": 9001,
      "protocol": "tcp"
    }
  ],
  "mountPoints": [
   {
  "readOnly": false,
  "containerPath": "/var/www/symfony/var/log",
  "sourceVolume": "shared_worker_logs"
   }
 ],
  "environment": [
    {
      "name": "${env_var_4_name}",
      "value": "${env_var_value_4}"
    },
    {
      "name": "${env_var_5_name}",
      "value": "${env_var_value_5}"
    },
    {
      "name": "${env_var_6_name}",
      "value": "${env_var_value_6}"
    },
    {
      "name": "${env_var_7_name}",
      "value": "${env_var_value_7}"
    },
    {
      "name": "${env_var_8_name}",
      "value": "${env_var_value_8}"
    },
    {
      "name": "${env_var_10_name}",
      "value": "${env_var_value_10}"
    },
    {
      "name": "${env_var_11_name}",
      "value": "${env_var_value_11}"
    },
    {
      "name": "${env_var_12_name}",
      "value": "${env_var_value_12}"
    },
    {
      "name": "${env_var_13_name}",
      "value": "${env_var_value_13}"
    },
    {
      "name": "${env_var_14_name}",
      "value": "${env_var_value_14}"
    },
    {
      "name": "${env_var_15_name}",
      "value": "${env_var_value_15}"
    },
    {
      "name": "${env_var_16_name}",
      "value": "${env_var_value_16}"
    },
    {
      "name": "${env_var_17_name}",
      "value": "${env_var_value_17}"
    },
    {
      "name": "${env_var_18_name}",
      "value": "${env_var_value_18}"
    },
    {
      "name": "${env_var_19_name}",
      "value": "${env_var_value_19}"
    },
    {
      "name": "${env_var_20_name}",
      "value": "${env_var_value_20}"
    },
    {
      "name": "${env_var_21_name}",
      "value": "${env_var_value_21}"
    },
    {
      "name": "${env_var_22_name}",
      "value": "${env_var_value_22}"
    },
    {
      "name": "${env_var_1_name}",
      "value": "${env_var_value_1}"
    },
    {
      "name": "${env_var_28_name}",
      "value": "${env_var_value_28}"
    },
    {
      "name": "${env_var_29_name}",
      "value": "${env_var_value_29}"
    },
    {
      "name": "${env_var_30_name}",
      "value": "${env_var_value_30}"
    },
    {
      "name": "${env_var_31_name}",
      "value": "${env_var_value_31}"
    },
    {
      "name": "${env_var_32_name}",
      "value": "${env_var_value_32}"
    },
    {
      "name": "${env_var_33_name}",
      "value": "${env_var_value_33}"
    },
    {
      "name": "${env_var_34_name}",
      "value": "${env_var_value_34}"
    },
    {
      "name": "${env_var_35_name}",
      "value": "${env_var_value_35}"
    },
    {
      "name": "${env_var_36_name}",
      "value": "${env_var_value_37}"
    },
    {
      "name": "${env_var_37_name}",
      "value": "${env_var_value_37}"
    },
    {
      "name": "${env_var_38_name}",
      "value": "${env_var_value_38}"
    }
  ],
  "secrets":[
    {
      "name":"${sensitive_var_1}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_1}"
    },
    {
      "name":"${sensitive_var_2}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_2}"
    },
    {
      "name":"${sensitive_var_3}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_3}"
    },
    {
      "name":"${sensitive_var_4}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_4}"
    },
    {
      "name":"${sensitive_var_5}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_5}"
    },
    {
      "name":"${sensitive_var_6}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_6}"
    },
    {
      "name":"${sensitive_var_7}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_7}"
    },
    {
      "name":"${sensitive_var_8}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_8}"
    },
    {
      "name":"${sensitive_var_9}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_9}"
    },
    {
      "name":"${sensitive_var_10}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_10}"
    },
    {
      "name":"${sensitive_var_11}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_11}"
    },
    {
      "name":"${sensitive_var_12}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_12}"
    },
    {
      "name":"${sensitive_var_13}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_13}"
    }
  ],
"logConfiguration" : {
  "logDriver" : "awslogs",
  "options" :{
    "awslogs-create-group": "true",
    "awslogs-group": "${container_name_php}",
    "awslogs-region": "${platform_region}",
    "awslogs-stream-prefix": "ecs"
  }
}
},

{
  "name": "${container_name_logstash}",
  "image": "${container_image_logstash}",
  "memory": ${container_memory},
  "cpu": ${container_cpu},
  "networkMode": "awsvpc",
  "volumesFrom": [],
  "essential": true,
  "portMappings": [
    {
      "containerPort": 9600,
      "hostPort": 9600,
      "protocol": "tcp"
    }
  ],
  "mountPoints": [
   {
  "readOnly": false,
  "containerPath": "/var/www/symfony/var/log",
  "sourceVolume": "shared_worker_logs"
   }
 ],
 "environment": [
   {
     "name": "${env_var_23_name}",
     "value": "${env_var_value_23}"
   },
   {
     "name": "${env_var_1_name}",
     "value": "${env_var_value_1}"
   }
 ],
"logConfiguration" : {
  "logDriver" : "awslogs",
  "options" :{
    "awslogs-create-group": "true",
    "awslogs-group": "${container_name_logstash}",
    "awslogs-region": "${platform_region}",
    "awslogs-stream-prefix": "ecs"
    }
  }
}
]
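
For context, a rendered template like the ones above is typically wired into a task definition roughly as follows. This is a minimal sketch only; the resource, file, and placeholder values shown here are hypothetical rather than taken from the original report.

resource "aws_ecs_task_definition" "app" {
  family                   = "app"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = 512
  memory                   = 1024

  # Render the container-definition template with its placeholder values.
  container_definitions = templatefile("${path.module}/task-definition.json.tpl", {
    container_name_nginx  = "nginx"
    container_image_nginx = "nginx:latest"
    container_memory      = 1024
    container_cpu         = 512
    platform_region       = "eu-west-1"
    # ... remaining template variables ...
  })

  volume {
    name = "shared_symfony_logs"
  }

  volume {
    name = "shared_nginx_logs"
  }
}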

Example plan output - to me it isn't clear what exactly needs to change that requires a forced replacement: it looks like it removes vars that are already there and then re-adds them? In other places it seems to reorder them as well. It also looks to be adding the network mode (awsvpc), which is already defined in the task definition?

      ~ container_definitions    = jsonencode(
          ~ [ # forces replacement
              ~ {
                    cpu              = 512
                  ~ environment      = [
                      - {
                          - name  = "PHP_FPM_PORT"
                          - value = "9000"
                        },
                      - {
                          - name  = "PHP_FPM_HOST"
                          - value = "localhost"
                        },
                        {
                            name  = "APP_ENV"
                            value = "prod"
                        },
                      + {
                          + name  = "PHP_FPM_HOST"
                          + value = "localhost"
                        },
                      + {
                          + name  = "PHP_FPM_PORT"
                          + value = "9000"
                        },
                    ]
                    essential        = true
                    image            = "###########################:latest"
                    logConfiguration = {
                        logDriver = "awslogs"
                        options   = {
                            awslogs-create-group  = "true"
                            awslogs-group         = "###############"
                            awslogs-region        = "eu-west-1"
                            awslogs-stream-prefix = "ecs"
                        }
                    }
                    memory           = 1024
                    mountPoints      = [
                        {
                            containerPath = "/var/www/symfony/var/log"
                            readOnly      = false
                            sourceVolume  = "shared_symfony_logs"
                        },
                        {
                            containerPath = "/var/log/nginx"
                            readOnly      = false
                            sourceVolume  = "shared_nginx_logs"
                        },
                    ]
                    name             = "##################"
                  + networkMode      = "awsvpc"
                    portMappings     = [
                        {
                            containerPort = 443
                            hostPort      = 443
                            protocol      = "tcp"
                        },
                    ]
                    volumesFrom      = []
                } # forces replacement,

Expected Behavior

Terraform should not try and replace the task definitions on every plan.

Actual Behavior

Terraform forces replacement of the task definition on every plan.

Steps to Reproduce

Terraform plan

moyuanhuang commented 4 years ago

I'm having the same issue. I think it's reordering the custom environment variables, plus adding some default configurations if you don't already have them in your task definition. In my case I had to add these default options and reorder the environment variables according to the diff output.

It'd be nice if Terraform could

  1. compare the environment variables as a real hash (so that the order doesn't matter)
  2. avoid updating the task definitions because of the absence of some default variables.
reedflinch commented 4 years ago

Going off of @moyuanhuang, I also suspect the issue is the ordering of environment variables. I do NOT see the issue with secrets. One thing to note for my use case is that I am changing the image of the task definition, so I do expect a new task definition to be created with the new image, but I do not expect to see a diff for unchanging environment variables.

This makes evaluating diffs for task definitions extremely difficult.

I notice that the AWS API and CLI do return these arrays in a consistent order (from what I can see), so perhaps this is something that Terraform or the provider itself is doing.

LeComptoirDesPharmacies commented 4 years ago

Hi, I'm having the same issue without mount points. In addition to reordering custom environment variables, I have some variables set to null, which makes Terraform recreate the task definition.

Example with a Docker health check:

~ healthCheck      = {
                        command     = [
                            "CMD-SHELL",
                            "agent health",
                        ]
                        interval    = 15
                        retries     = 10
                        startPeriod = 15
                      - timeout     = 5 -> null
                    }
moyuanhuang commented 4 years ago

@LeComptoirDesPharmacies That probably means the default value for that particular setting is 5. However, because you don't specify it in your task definition, Terraform thinks you're trying to set it to null (which is never going to happen, since there is an enforced default). Add

timeout = 5

...to your task definition and you should be able to avoid Terraform recreating the task.
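
Based on the diff above, that means spelling the default out explicitly in the container definition, for example (a sketch using the values from the diff):

healthCheck = {
  command     = ["CMD-SHELL", "agent health"]
  interval    = 15
  retries     = 10
  startPeriod = 15
  timeout     = 5   # explicit, so the rendered definition matches what the API returns
}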

LeComptoirDesPharmacies commented 4 years ago

@moyuanhuang Yes, thanks. But I have the problem with custom environment variables too. I found this fix, which is waiting to be merged:

jonesmac commented 4 years ago

I found that alphabetizing my env variables by name seems to keep them out of the plan. I noticed that the ECS task definition stores them that way in the JSON output in the console.
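
A minimal sketch of one way to keep the list alphabetized automatically when the definitions are built with jsonencode; the names and values below are illustrative (loosely borrowed from the plan output earlier in this issue), not from any real configuration:

locals {
  app_env = {
    APP_ENV      = "prod"
    PHP_FPM_HOST = "localhost"
    PHP_FPM_PORT = "9000"
  }

  # Build the environment list sorted by variable name, so the rendered JSON
  # matches the order the ECS API stores and the diff stays empty.
  app_environment = [
    for name in sort(keys(local.app_env)) : {
      name  = name
      value = local.app_env[name]
    }
  ]
}

resource "aws_ecs_task_definition" "example" {
  family = "example"

  container_definitions = jsonencode([
    {
      name        = "app"
      image       = "nginx:latest"
      essential   = true
      environment = local.app_environment
    }
  ])
}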

Drewster727 commented 3 years ago

I can confirm that alphabetizing as @jonesmac mentioned, and adding in the items Terraform thinks have changed with their default values, will resolve this as a workaround.

aashitvyas commented 3 years ago

I have also been hit by this. The workarounds suggested in this thread (1. ensure environment variables are in alphabetical order, 2. ensure all the default values are filled in with their null/empty values) worked for us for now. I still believe this is a Terraform AWS provider issue and a bug that should be addressed.

gRizzlyGR commented 3 years ago

The same thing happened using FluentBit log router. After adding its container definition, Terraform was forcing a new plan each time, even without touching anything:

~ {
    - cpu = 0 -> null
    - environment           = [] -> null
    - mountPoints           = [] -> null
    - portMappings          = [] -> null
    - user= "0" -> null
    - volumesFrom           = [] -> null
      # (6 unchanged elements hidden)
  },

After setting explicitly these values in the code, no more changes. Here's the FluentBit container definition:

{
  essential = true,
  image = var.fluentbit_image_url,
  name  = "log_router",
  firelensConfiguration = {
    type = "fluentbit"
  },
  logConfiguration : {
    logDriver = "awslogs",
    options = {
      awslogs-group = "firelens-container",
      awslogs-region= var.region,
      awslogs-create-group  = "true",
      awslogs-stream-prefix = "firelens"
    }
  },
  memoryReservation = var.fluentbit_memory_reservation
  cpu   = 0
  environment   = []
  mountPoints   = []
  portMappings  = []
  user  = "0"
  volumesFrom   = []
}
justinretzolk commented 2 years ago

Hey y'all 👋 Thank you for taking the time to file this issue, and for the continued discussion around it. Given that there's been a number of AWS provider releases since the last update here, can anyone confirm if you're still experiencing this behavior?

camway commented 2 years ago

@justinretzolk I can tell you this is still an issue. We've been hitting it for almost a year now, and I've made a concerted effort over the last week or so to address it. These are our versions (should be the latest at the time of posting this):

Terraform v1.0.11
on linux_amd64
+ provider registry.terraform.io/hashicorp/aws v3.64.2
+ provider registry.terraform.io/hashicorp/null v3.1.0
+ provider registry.terraform.io/hashicorp/random v3.1.0
+ provider registry.terraform.io/hashicorp/time v0.7.2
+ provider registry.terraform.io/hashicorp/tls v3.1.0

Still actively going through the workarounds above trying to get this to work correctly, but so far no dice.

aashitvyas commented 2 years ago

@justinretzolk This issue still persists in the latest AWS provider version and definitely still needs to be addressed.

camway commented 2 years ago

It's been a little while since I last posted. I've attempted to fix this a few times since my last post, but I've had no success so far. To provide a little more information, this is one of the ECS tasks that's being affected (sorry for the sanitization):

-/+ resource "aws_ecs_task_definition" "task" {
      ~ arn                      = "arn:aws:ecs:AWS_REGION:AWS_ACCOUNT_ID:task-definition/AWS_ECS_TASK_NAME:317" -> (known after apply)
      ~ container_definitions    = jsonencode(
            [
              - {
                  - cpu                   = 0
                  - dockerLabels          = {
                      - traefik.frontend.entryPoints    = "https"
                      - traefik.frontend.passHostHeader = "true"
                      - traefik.frontend.rule           = "Host:MY_DNS_NAME"
                      - traefik.protocol                = "https"
                    }
                  - environment           = [
                      - {
                          - name  = "APP_PORT"
                          - value = "54321"
                        },
                    ]
                  - essential             = true
                  - image                 = "DOCKER_REPO_URL:DOCKER_TAG"
                  - logConfiguration      = {
                      - logDriver = "awslogs"
                      - options   = {
                          - awslogs-group         = "AWS_LOGS_GROUP"
                          - awslogs-region        = "AWS_REGION"
                          - awslogs-stream-prefix = "AWS_STREAM_PREFIX"
                        }
                    }
                  - mountPoints           = [
                      - {
                          - containerPath = "/PATH/IN/CONTAINER/"
                          - sourceVolume  = "EFS_NAME"
                        },
                      - {
                          - containerPath = "/PATH/IN/CONTAINER"
                          - sourceVolume  = "EFS_NAME"
                        },
                    ]
                  - name                  = "SERVICE_NAME"
                  - portMappings          = [
                      - {
                          - containerPort = 443
                          - hostPort      = 443
                          - protocol      = "tcp"
                        },
                    ]
                  - repositoryCredentials = {
                      - credentialsParameter = "arn:aws:secretsmanager:AWS_REGION:AWS_ACCOUNT_ID:secret:SECRET_VERSION"
                    }
                  - startTimeout          = 120
                  - stopTimeout           = 120
                  - volumesFrom           = []
                },
            ]
        ) -> (known after apply) # forces replacement

This is part of the plan output immediately after an apply. One attempt I made recently was focused just on getting 'cpu' above to stop appearing. Adding "cpu": 0 into the JSON for the container definition and reapplying had zero effect on the diff for future plans/applies.

Not sure what I'm doing wrong, but at this point we've begun dancing around the issue by using -target= during terraform applies so that it doesn't update all the time.

nutakkimurali commented 2 years ago

Hi Everyone,

While upgrading from Terraform 0.11.x to 1.0.10, we ran into a similar issue. Though alphabetizing and setting the default values to null/empty worked, it's a cumbersome process to refactor task definitions with so many parameters. I believe this behavior comes from provisioning the service with a task definition reference that has no revision number. Consequently, I made a few changes to the ECS service's task definition reference to capture the revision number, which helped resolve the issue.


ECS Task Definition JSON File

[
  {
    "secrets": [
      {
        "name": "NRIA_LICENSE_KEY",
        "valueFrom": "arn:aws:ssm:${xxxxxx}:${xxxxxx}:xxxxxx/portal/${xxxxx}/NewrelicKey"        
      }
    ],
    "portMappings": [],
    "cpu": 200,
    "memory": ${ram},
    "environment": [
      {
        "name": "NRIA_OVERRIDE_HOST_ROOT",
        "value": "/host"
      },
      {
        "name": "ENABLE_NRI_ECS",
        "value": "true"
      },
      {
        "name": "NRIA_PASSTHROUGH_ENVIRONMENT",
        "value": "ECS_CONTAINER_METADATA_URI,ENABLE_NRI_ECS"
      },
      {
        "name": "NRIA_VERBOSE",
        "value": "0"
      },
      {
        "name": "NRIA_CUSTOM_ATTRIBUTES",
        "value": "{\"nrDeployMethod\":\"downloadPage\"}"
      }
    ],
    "mountPoints": [
      {
        "readOnly": true,
        "containerPath": "/host",
        "sourceVolume": "host_root_fs"
      },
      {
        "readOnly": false,
        "containerPath": "/var/run/docker.sock",
        "sourceVolume": "docker_socket"
      }
    ],
    "volumesFrom": [],
    "image": "${image}",
    "essential": true,
    "readonlyRootFilesystem": false,
    "privileged": true,
    "name": "${name}",
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "${awslogs_group}",
        "awslogs-region": "${aws_region}",
        "awslogs-stream-prefix": "${name}"
      }
    }
  }
]

resource "aws_ecs_task_definition" "newrelic_infra_agent" {
  family                   = "${var.workspace}-newrelic-infra-${var.env}"
  requires_compatibilities = ["EC2"]
  network_mode             = "host"
  cpu                      = "256"
  memory                   = "512"
  execution_role_arn       = var.ecs_task_role_arn
  container_definitions    = data.template_file.newrelic_infra_agent.rendered
  #tags                     = "${local.tags}"

  volume  {
    name      = "host_root_fs"
    host_path = "/"
  }

  volume  {
    name      = "docker_socket"
    host_path = "/var/run/docker.sock"
  }

resource "aws_ecs_task_definition" "newrelic_infra_agent" {
  family                   = "${var.workspace}-newrelic-infra-${var.env}"
  requires_compatibilities = ["EC2"]
  network_mode             = "host"
  cpu                      = "256"
  memory                   = "512"
  execution_role_arn       = var.ecs_task_role_arn
  container_definitions    = data.template_file.newrelic_infra_agent.rendered

  volume  {
    name      = "host_root_fs"
    host_path = "/"
  }

  volume  {
    name      = "docker_socket"
    host_path = "/var/run/docker.sock"
  }

}
data "aws_ecs_task_definition" "newrelic_infra_agent" {
  task_definition = "${aws_ecs_task_definition.newrelic_infra_agent.family}"
  depends_on      = [aws_ecs_task_definition.newrelic_infra_agent]
}

resource "aws_ecs_service" "newrelic_infra_agent" {
  name = "${var.workspace}-newrelic-infra-${var.env}"
  cluster = aws_ecs_cluster.ecs-cluster.id
  task_definition = "${aws_ecs_task_definition.newrelic_infra_agent.family}:${max("${aws_ecs_task_definition.newrelic_infra_agent.revision}", "${data.aws_ecs_task_definition.newrelic_infra_agent.revision}")}"
  scheduling_strategy = "DAEMON"
  #tags = "${local.tags}"
  propagate_tags  = "TASK_DEFINITION"

  depends_on = [aws_ecs_task_definition.newrelic_infra_agent]
}
camway commented 2 years ago

Recently re-tested on the latest AWS provider 4.10. Issue still seems to be present.

I did find another workaround for this though. It's not great, but I think it's better than what we'd been doing. Essentially it boils down to this:

This way standard plan/apply never causes the containers to restart (Unless some other attribute changed). If you need to force a restart/redeploy, taint, and then reapply.

This is by no means a request for this issue, but I'm beginning to wish there were a way to override the check for resource replacement, so you could provide a SHA or something and the resource would only be replaced when that changes. It would make this a lot easier.
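
For reference, one way to get that behaviour (a hedged sketch, not necessarily exactly what was done here) is to ignore drift on the rendered definitions and only replace the resource deliberately:

resource "aws_ecs_task_definition" "task" {
  family                = "example"
  container_definitions = file("${path.module}/container-definitions.json")

  lifecycle {
    # Routine plans no longer propose replacing the task definition because of
    # diffs in container_definitions; run `terraform taint` (or `terraform apply
    # -replace=aws_ecs_task_definition.task`) when a redeploy is actually wanted.
    ignore_changes = [container_definitions]
  }
}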

eyalch commented 2 years ago

The same thing happened using FluentBit log router. After adding its container definition, Terraform was forcing a new plan each time, even without touching anything:

~ {
    - cpu = 0 -> null
    - environment           = [] -> null
    - mountPoints           = [] -> null
    - portMappings          = [] -> null
    - user= "0" -> null
    - volumesFrom           = [] -> null
      # (6 unchanged elements hidden)
  },

After setting explicitly these values in the code, no more changes. Here's the FluentBit container definition:

{
  essential = true,
  image = var.fluentbit_image_url,
  name  = "log_router",
  firelensConfiguration = {
    type = "fluentbit"
  },
  logConfiguration : {
    logDriver = "awslogs",
    options = {
      awslogs-group = "firelens-container",
      awslogs-region= var.region,
      awslogs-create-group  = "true",
      awslogs-stream-prefix = "firelens"
    }
  },
  memoryReservation = var.fluentbit_memory_reservation
  cpu   = 0
  environment   = []
  mountPoints   = []
  portMappings  = []
  user  = "0"
  volumesFrom   = []
}

For me, adding just user = "0" to the container definition resolved this. Here's the full container definition:

{
  essential = true
  image     = "public.ecr.aws/aws-observability/aws-for-fluent-bit:stable"
  name      = "log-router"

  firelensConfiguration = {
    type    = "fluentbit"
    options = {
      enable-ecs-log-metadata = "true"
      config-file-type        = "file"
      config-file-value       = "/fluent-bit/configs/parse-json.conf"
    }
  }

  logConfiguration = {
    logDriver = "awslogs"
    options   = {
      awslogs-group         = aws_cloudwatch_log_group.api_log_group.name
      awslogs-region        = local.aws_region
      awslogs-create-group  = "true"
      awslogs-stream-prefix = "firelens"
    }
  }

  memoryReservation = 50

  user = "0"
}
mo-saeed commented 2 years ago

I have the same issue with the latest AWS provider: 4.27.0

ghost commented 2 years ago

We also have this issue on Terraform 1.2.7 and AWS provider 4.31.0. The plan output only marks arn, container_definitions, id and revision with ~ ('known after apply'), but after container_definitions it says # forces replacement, even though the content has not changed at all. We tried sorting the JSON keys and adding default parameters, to no avail. Do we also need to format it exactly as the plan is showing? Because in container_definitions it's saying to delete all the JSON keys.

With redactions:

~ container_definitions    = jsonencode(
            [
              - {
                  - cpu               = 64
                  - environment       = [
                      - ...
                    ]
                  - essential         = true
                  - image             = ...
                  - logConfiguration  = {
                      - logDriver = ...
                      - options   = {
                          - ...
                        }
                    }
                  - memoryReservation = 64
                  - mountPoints       = []
                  - name              = ...
                  - portMappings      = [
                      - ...
                    ]
                  - volumesFrom       = []
                },
              - {
                  - cpu               = 512
                  - environment       = [
                      - ...
                    ]
                  - essential         = true
                  - image             = ...
                  - logConfiguration  = {
                      - logDriver = ...
                      - options   = {
                          - ...
                        }
                    }
                  - memoryReservation = ...
                  - mountPoints       = []
                  - name              = ...
                  - portMappings      = [
                      - ...
                    ]
                  - secrets           = [
                      - ...
                    ]
                  - volumesFrom       = []
                },
            ]

No ENV variables changed, no secrets changed and no other configuration keys have been changed. If there is any way for me to help with debugging, please let me know.

ghost commented 1 year ago

A follow-up to my previous comment: the replacement was actually not caused by Terraform getting the diff wrong. It was caused by a variable that depended on a file (data resource), which had a depends_on to a null_resource. Even the documentation for depends_on states that Terraform is more conservative and plans to replace more resources than may actually be needed. So in the end, ordering the variables and filling in all values with defaults did work.

Our null_resource had its trigger set to always run. This of course makes the null_resource 'dirty' on every Terraform run, and I suspect the dependent resources then also get tagged as dirty in a transitive fashion.
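
A minimal sketch of the pattern being described, with hypothetical names:

resource "null_resource" "always" {
  triggers = {
    # timestamp() changes on every run, so this resource is always "dirty".
    run = timestamp()
  }
}

data "local_file" "container_definitions" {
  filename   = "${path.module}/container-definitions.json"
  depends_on = [null_resource.always]
}

# Because the data source can only be read after the null_resource is
# re-created, anything derived from data.local_file.container_definitions.content
# becomes "(known after apply)" on every plan, which can force replacement
# of downstream resources such as the task definition.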

trallnag commented 1 year ago

Still an issue with 4.63.0. Setting values to the defaults or null helps.

OscarGarciaF commented 1 year ago

In my case I had:

 portMappings = [
    {
      containerPort = "27017"
      protocol      = "TCP"
      hostPort      = "27017"
    },
  ]

"TCP" was being evaluated as "tcp", and on the second run Terraform was not smart enough to recognize that "TCP" would again be evaluated as "tcp", so it kept trying to replace the definition. I changed "TCP" to "tcp" and it stopped trying to replace it.
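
In other words, matching the canonical form the API stores avoids the spurious diff, e.g.:

portMappings = [
  {
    containerPort = 27017
    hostPort      = 27017
    protocol      = "tcp"   # lowercase, matching how the ECS API stores it
  },
]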

adamdepollo commented 11 months ago

Still an issue with 5.16.1

AlbertCintas commented 5 months ago

In my case, the recreation was caused by the healthcheck definition. I added the default values for interval, retries and such to the config block, and the problem was solved:

healthcheck = {
        command     = ["CMD-SHELL", "curl -f http://127.0.0.1/ || exit 1"]
        interval    = 30
        retries     = 3
        startPeriod = 5
        timeout     = 5
      }
t0yv0 commented 1 month ago

I've submitted a PR for the healthcheck defaults normalization specifically: https://github.com/hashicorp/terraform-provider-aws/pull/38872