hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

default_tags always shows an update #18311

Closed acdha closed 1 year ago

acdha commented 3 years ago

Description

I have been looking forward to the default tagging support and tested it yesterday on a project which uses https://github.com/terraform-aws-modules/terraform-aws-vpc/. This immediately showed some tags on aws_vpc and aws_subnet resources as having been changed, but only if another unrelated change was also present.

Community Note

Terraform CLI and Terraform AWS Provider Version

Terraform v0.14.8
+ provider registry.terraform.io/hashicorp/aws v3.33.0
+ provider registry.terraform.io/hashicorp/dns v3.1.0
+ provider registry.terraform.io/hashicorp/http v2.1.0

Affected Resource(s)

Terraform Configuration Files

provider "aws" {
  region = var.region

  default_tags {
    tags = local.tags
  }
}

Debug Output

Expected Behavior

No changes would be displayed

Actual Behavior

If an unrelated resource triggers a diff, all of the subnet and VPC resources show an update in-place diff listing tags which are already present. Curiously, in my project it lists 5 tags which are already present as having been changed, but then displays "1 unchanged element hidden".

  # module.vpc.aws_subnet.public[0] will be updated in-place
  ~ resource "aws_subnet" "public" {
        id                              = "subnet-07620b925b0c70066"
      ~ tags                            = {
          + "Environment"        = "Development"
          + "Project"            = "…"
          + "ResponsibleParty"   = "…"
          + "Terraform"          = "true"
          + "TerraformWorkspace" = "…"
            # (1 unchanged element hidden)
        }
        # (10 unchanged attributes hidden)
    }

Steps to Reproduce

  1. terraform apply

References

anGie44 commented 3 years ago

Hi @acdha, thank you for raising this issue! Do you mind providing additional configuration details regarding the unrelated resource [that] triggers a diff (and if possible, what event triggered the diff) as well as the configuration for the aws_subnet resource?

acdha commented 3 years ago

Hi @acdha, thank you for raising this issue! Do you mind providing additional configuration details regarding the unrelated resource [that] triggers a diff (and if possible, what event triggered the diff) as well as the configuration for the aws_subnet resource?

Basically, if nothing else has changed, Terraform will show no changes and naturally no diff. If any other resource has changed, then when Terraform displays the plan for that resource it will also include the resources listed above.

The subnet configuration is here:

https://github.com/terraform-aws-modules/terraform-aws-vpc/blob/997cba4053bd8b4a5d2aed528073b8f02c013e93/main.tf#L367-L420

My invocation of that is somewhat complicated:

```hcl
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 2.69"

  name = local.deployment_id
  tags = local.tags

  cidr = local.vpc_cidr_allocation

  azs             = slice(data.aws_availability_zones.available.names, 0, 4)
  public_subnets  = cidrsubnets(local.public_cidr, 2, 2, 2, 2)
  private_subnets = cidrsubnets(local.private_cidr, 2, 2, 2, 2)

  enable_nat_gateway     = true
  single_nat_gateway     = false
  one_nat_gateway_per_az = true

  enable_dns_hostnames = true

  nat_eip_tags = { Name = "${local.deployment_id} NAT Gateway" }
  igw_tags     = { Name = "${local.deployment_id} Public Internet Gateway" }

  # We won't use the default network ACL but we want to make sure that it has
  # a safe configuration by default just in case new resources are launched in
  # it for any reason:
  manage_default_network_acl = true
  default_network_acl_name   = "${local.deployment_id}-default"

  # Leaving these as empty lists will result in no rules being applied, leaving
  # the AWS default deny-all rules:
  default_network_acl_ingress = []
  default_network_acl_egress  = []

  public_dedicated_network_acl   = true
  private_dedicated_network_acl  = true
  database_dedicated_network_acl = true

  public_inbound_acl_rules   = concat( … )
  public_outbound_acl_rules  = concat( … )
  private_inbound_acl_rules  = concat( … )
  private_outbound_acl_rules = concat( … )

  enable_s3_endpoint  = true
  enable_efs_endpoint = true
  enable_ses_endpoint = true

  efs_endpoint_private_dns_enabled = true
  efs_endpoint_security_group_ids  = [aws_security_group.vpc_endpoints.id]

  ses_endpoint_private_dns_enabled = true
  ses_endpoint_security_group_ids  = [aws_security_group.vpc_endpoints.id]
  ses_endpoint_subnet_ids          = data.aws_subnet_ids.private_subnets.ids
}
```

The resulting resources look like this:

resource "aws_subnet" "public" {
    arn                             = "arn:aws:ec2:us-east-1:…:subnet/subnet-…"
    assign_ipv6_address_on_creation = false
    availability_zone               = "us-east-1d"
    availability_zone_id            = "use1-az1"
    cidr_block                      = "…/26"
    id                              = "subnet-…"
    map_customer_owned_ip_on_launch = false
    map_public_ip_on_launch         = true
    owner_id                        = "…"
    tags                            = {
        "Name" = "…-public-us-east-1d"
    }
    tags_all                        = {
        "Environment"        = "Development"
        "Name"               = "…-public-us-east-1d"
        "Project"            = "…"
    }
    vpc_id                          = "vpc-…"
}

I notice this a lot because of #14892: this project also has some IAM policies which are defined using the ARNs of an ECS cluster, which means that those always show an update in-place even though all of the information is available at plan time.

  # module.collection_access.data.aws_iam_policy_document.ecs_update_service will be read during apply
  # (config refers to values not yet known)
 <= data "aws_iam_policy_document" "ecs_update_service"  {
      ~ id      = "83089849" -> (known after apply)
      ~ json    = jsonencode(
            {
              - Statement = [
                  - {
                      - Action   = "ecs:UpdateService"
                      - Effect   = "Allow"
                      - Resource = "arn:aws:ecs:us-east-1:…:service/…/collection-access"
                      - Sid      = ""
                    },
                ]
              - Version   = "2012-10-17"
            }
        ) -> (known after apply)
      - version = "2012-10-17" -> null

      ~ statement {
          - effect        = "Allow" -> null
          - not_actions   = [] -> null
          - not_resources = [] -> null
            # (2 unchanged attributes hidden)
        }
    }

  # module.collection_access.aws_iam_policy.ecs_update_service will be updated in-place
  ~ resource "aws_iam_policy" "ecs_update_service" {
        id     = "arn:aws:iam::630942203890:policy/…-ECS-updatecollection-access-service"
        name   = "…-ECS-updatecollection-access-service"
      ~ policy = jsonencode(
            {
              - Statement = [
                  - {
                      - Action   = "ecs:UpdateService"
                      - Effect   = "Allow"
                      - Resource = "arn:aws:ecs:us-east-1:630942203890:service/…/collection-access"
                      - Sid      = ""
                    },
                ]
              - Version   = "2012-10-17"
            }
        ) -> (known after apply)
        # (2 unchanged attributes hidden)
    }
bflad commented 3 years ago

Hi @acdha 👋 Does the difference disappear if you no longer declare tags = local.tags in the module block? I'm able to reproduce this behavior with the following:

# main.tf

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "3.36.0"
    }
  }
  required_version = ">= 0.14.10"
}

locals {
  tags = {
    sometagkey = "sometagvalue"
  }
}

provider "aws" {
  region = "us-east-2"

  default_tags {
    tags = local.tags
  }
}

module "test" {
  source = "./mod"

  providers = {
    aws = aws
  }

  name = "defaulttagstest"
  tags = local.tags
}

# mod/main.tf

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 3.33.0"
    }
  }
  required_version = ">= 0.14.10"
}

variable "name" {
  description = "Name to be used on all the resources as identifier"
  type        = string
  default     = ""
}

variable "tags" {
  description = "A map of tags to add to all resources"
  type        = map(string)
  default     = {}
}

resource "aws_vpc" "this" {
  cidr_block = "10.0.0.0/16"

  tags = merge(
    {
      "Name" = format("%s", var.name)
    },
    var.tags,
  )
}

resource "aws_subnet" "this" {
  cidr_block = cidrsubnet(aws_vpc.this.cidr_block, 8, 0)
  vpc_id     = aws_vpc.this.id

  tags = merge(
    {
      "Name" = format("%s", var.name)
    },
    var.tags,
  )
}

Applying twice:

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # module.test.aws_subnet.this will be updated in-place
  ~ resource "aws_subnet" "this" {
        id                              = "subnet-00939145cc72a2791"
      ~ tags                            = {
          + "sometagkey" = "sometagvalue"
            # (1 unchanged element hidden)
        }
        # (10 unchanged attributes hidden)
    }

  # module.test.aws_vpc.this will be updated in-place
  ~ resource "aws_vpc" "this" {
        id                               = "vpc-07ad2aaf41bd4fada"
      ~ tags                             = {
          + "sometagkey" = "sometagvalue"
            # (1 unchanged element hidden)
        }
        # (13 unchanged attributes hidden)
    }

Plan: 0 to add, 2 to change, 0 to destroy.

The plan difference is expected in this case: provider-level default tags are automatically removed from the underlying resource's tags value during refresh if the tag key and value exactly match. This happens because the tags configuration is still being passed via the module's tags variable, which is eventually merged into the resource block's tags argument, making the tags configuration exactly redundant. While we could theoretically detect and handle this situation, doing so would hide this type of configuration issue and require logic in every resource implementation.
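A minimal sketch of a non-redundant layout, reusing the names from the reproduction above. It assumes every resource inside the module already supports default_tags in the provider version in use; otherwise dropping the module-level tags would remove tags from the unsupported resources.

```hcl
locals {
  tags = {
    sometagkey = "sometagvalue"
  }
}

provider "aws" {
  region = "us-east-2"

  # Shared tags are declared once, at the provider level.
  default_tags {
    tags = local.tags
  }
}

module "test" {
  source = "./mod"

  name = "defaulttagstest"

  # Assumption: "tags = local.tags" is intentionally not passed here.
  # Passing the same key/value pairs again would make the resource-level
  # tags exactly redundant with default_tags, which is the situation that
  # produces the repeated in-place update described above.
}
```

With this layout the module still adds its own "Name" tag, which does not collide with any default tag key.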

acdha commented 3 years ago

I think that's correct. If I remove the tags option, I see diffs to remove the tags from resources which I guess do not yet support the provider default tags? The list notably does not include the aws_vpc or aws_subnet resources which are currently tagged like everything else.

aws_default_network_acl
aws_eip
aws_internet_gateway
aws_nat_gateway
aws_network_acl
aws_route_table
aws_vpc_endpoint
bflad commented 3 years ago

Correct, the most recent version only includes default_tags functionality for the aws_vpc and aws_subnet resources for experimentation. We are preparing the rest of the resources now: https://github.com/hashicorp/terraform-provider-aws/pulls?q=is%3Apr+is%3Aopen+%22support+default+tags%22. Overall progress can also be tracked in this issue: https://github.com/hashicorp/terraform-provider-aws/issues/7926

acdha commented 3 years ago

That makes sense. It's a bit messy during the transition period, but since what I already have works, the easiest thing is to leave it alone until default_tags supports all of the other resource types.

acdha commented 3 years ago

I'm inclined to say this can be closed with 3.38.0, since the support for all of the resource types allowed me to make a nice de-boilerplate commit. The only wart I ran into was aws_instance volume_tags (#19188).

devopsrick commented 3 years ago

The lack of support for volume_tags makes this a pretty painful problem for us.

We currently use the same variable passed in most modules to populate tags and volume_tags, so if we want volume_tags to be populated we will have overlapping tags/tags_all, which causes this messy output. Without a ton of module rewrites this means we cannot use default_tags at all. Either this bug or the fact that volume_tags doesn't inherit default_tags really should be addressed.

The benefits of default_tags are huge, but it will be very painful for us to adopt if one of those two bugs isn't addressed to ease the transition. We cannot have this constant messy plan output (the JSON output of 'unknown' values is the real pain point for us; it has even exposed secrets), and we cannot have volume_tags not populated, so we are stuck not using default_tags at all.

acdha commented 3 years ago

We currently use the same variable passed in most modules to populate tags and volume_tags, so if we want volume_tags to be populated we will have overlapping tags/tags_all, which causes this messy output. Without a ton of module rewrites this means we cannot use default_tags at all. Either this bug or the fact that volume_tags doesn't inherit default_tags really should be addressed.

I took that rewrite path, which wasn't especially terrible for a small project, but it meant that I had to keep passing that variable around. It would have been really nice to have a way to get the default_tags attribute from the provider so you could write that code generically.
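For reference, the aws_default_tags data source that appears later in this thread exposes the provider's default tags as a map. A rough sketch of using it to fill volume_tags, which does not inherit default_tags (#19188), follows. The resource name, the ami_id variable, and the instance type are placeholders, and the data source is only present in provider versions that ship it.

```hcl
# Reads the default_tags configured on the provider.
data "aws_default_tags" "current" {}

# Placeholder input, not part of any configuration in this thread.
variable "ami_id" {
  description = "AMI to launch"
  type        = string
}

resource "aws_instance" "example" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  # tags is left unset so the instance inherits default_tags.
  # volume_tags is populated explicitly from the same source of truth.
  volume_tags = data.aws_default_tags.current.tags
}
```

This keeps a single definition of the shared tags without threading a tags variable through every module.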

olenm commented 3 years ago

Still a major annoyance in aws v3.45.0 - I am surprised this is not marked as a bug? I'm seeing a default_tag being overwritten (with the same value in some cases) and TF wants to make changes even though the tag exists on the aws-resource and no changes should be applicable.

affected resources (so far):

kylelaverty commented 3 years ago

If I understand the situation correctly, this is caused by having a tag's name match in the default_tag and tag collections. This is something that is easily fixed if you own the module but might be impossible if you are making use of other people's modules. This should be something that shows up when validate is run as an info level message, just to let people know.

olenm commented 3 years ago

Well default-tags should be the first layer, any other tags being set should act as an override to a default-tag of the same name-key, and if the value is the same and set already, should not be displayed as a change (where it currently thinks a change is needed). If the default-tag is being overridden by an explicitly set tag, then yes show an info message of some sort.

The problem is enhanced in production environments where TF is claiming a change is to be made, but no changes occur - this is the scary part as the larger your project folder grows, the more potential noise TF shows as "changing" making it harder to see potential mistakes.

vinicius73 commented 3 years ago

Same situation here. Any news about that?

rhenning commented 3 years ago

@olenm observed:

... default-tags should be the first layer, any other tags being set should act as an override to a default-tag of the same name-key, and if the value is the same and set already, should not be displayed as a change (where it currently thinks a change is needed). If the default-tag is being overridden by an explicitly set tag, then yes show an info message of some sort.

on the surface what @olenm suggests feels like the most desirable behavior to me -- the implicit equivalent of merge(default_tags{}, resource_tags{}), with a tfplan diff only if that differs from what is in state/refresh. while this seems easy to reason about in the abstract, i recall seeing somewhere that this may be difficult to put into practice because of the way the two are merged in state. can someone shine some light on this?

and also:

The problem is enhanced in production environments where TF is claiming a change is to be made, but no changes occur ... the larger your project folder grows, the more potential noise TF shows as "changing" making it harder to see potential mistakes.

indeed, this is what we're struggling with at the moment. a bunch of reusable inner library modules, composed together by app engineering teams in root modules to build their own unique, opinionated infrastructures. we're in the transition phase of getting everyone on provider >~3.40, using default_tags, and updating the inner modules to remove the duplication, but the signal:noise ratio of tfplan is seriously degraded.

maybe worse, if i understand correctly, this isn't just noise. TF will call the AWS API and perform a CRUD action for every affected resource, despite the fact that nothing has changed. Yes, the modify should end up as a noop, but the behavior wastes network bandwidth, consumes API quota, slows down the apply phase, and increases the chances of something going wrong.

this contrived example below illustrates the issue, though IRL the resource-level tags might be managed within a module. the EC2 API is called with a modify each time plan/apply is executed, though nothing has changed.

provider "aws" {
  region = "us-east-1"

  default_tags {
    tags = {
      foo = "bar"
      baz = "biddy"
    }
  }
}

data "aws_vpc" "_" {
  default = true
}

resource "aws_security_group" "_" {
  vpc_id = data.aws_vpc._.id
  tags = {
    baz  = "biddy"
    beep = "boop"
  }
}
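
The "first layer plus override" semantics discussed in this comment can be written out with plain locals, using the tag values from the example above. This is only an illustration of the desired merge result, not a description of how the provider computes tags_all internally.

```hcl
locals {
  # Same values as the provider default_tags block above.
  default_tags = {
    foo = "bar"
    baz = "biddy"
  }

  # Same values as the aws_security_group tags above.
  resource_tags = {
    baz  = "biddy"
    beep = "boop"
  }

  # merge() gives precedence to later arguments, so resource-level tags
  # override default tags that share a key.
  effective_tags = merge(local.default_tags, local.resource_tags)
}

output "effective_tags" {
  # Evaluates to { baz = "biddy", beep = "boop", foo = "bar" }.
  value = local.effective_tags
}
```

Under the behavior requested here, a plan would show a tag diff only when this merged map differs from what is recorded in state.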
jakubigla commented 2 years ago

Is this being looked over? Absolutely annoying issue which is stopping me using default_tags...

brightshine1111 commented 2 years ago

Still seeing this with aws provider v3.60.0 and Terraform 0.15.3 for resources:

Even when I configure the provider to ignore all the tag keys, e.g.:

locals {
  tags = {
    foo = "bar"
    baz = "bop"
  }
}

provider "aws" {
  ignore_tags {
    keys = ["foo", "baz"]
  }
}

resource "aws_ssm_parameter" "this" {
  ...
  tags = local.tags
}

the plan still shows this:

Terraform will perform the following actions:

  # aws_ssm_parameter.this will be updated in-place
  ~ resource "aws_ssm_parameter" "this" {
      ~ tags        = {} -> (known after apply)
        # (8 unchanged attributes hidden)
    }
share-me commented 2 years ago

OK, there are no changes, but it's still unusable in production. Our operators go into panic mode when this occurs.

  # module.vault.aws_secretsmanager_secret.vault[0] has been changed
  ~ resource "aws_secretsmanager_secret" "vault" {
        id                             = "arn:aws:secretsmanager:eu-west-3:0000000000:secret:vault_prod-1d24E12"
        name                           = "vault_prod"
      + tags                           = {}
        # (6 unchanged attributes hidden)
    }
(...)
No changes. Your infrastructure matches the configuration.
(...)

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
DemiAHMS commented 2 years ago

Just found this bug when attempting to manage aws_iam_user.

$ terraform version
Terraform v1.0.8
on darwin_amd64
+ provider registry.terraform.io/hashicorp/aws v3.61.0

This is the entirety of the terraform config (for this set of resources, see notes at the bottom):

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0"
    }
  }
  backend "s3" {
    # The S3 bucket where state-files are kept
    bucket = "terraform-statefiles.example.com"

    # Path of the state file within the bucket
    key = "live/global/iam/terraform.tfstate"

    # DynamoDB table where the lock is kept
    dynamodb_table = "terraform-locks"

    region  = "us-east-1"
    encrypt = true
  }
}

provider "aws" {
  region = "us-east-1"

  default_tags {
    tags = {
      created_by   = "me"
      created_tool = "terraform"
      created_date = formatdate("YYYY-MM-DD", timestamp())
      team         = "..."
      owner        = "team-..."
    }
  }

  ignore_tags {
    keys = ["created_date"]
  }
}

### This is the account alias for the whole account
resource "aws_iam_account_alias" "alias" {
  account_alias = "...example..."
}

resource "aws_iam_user" "root_iam_user" {
  name = "root_iam_user"
}

### IAM group for the admins
resource "aws_iam_group" "root_accounts" {
  name = "root_accounts"
}
resource "aws_iam_group_membership" "group_root_accounts_membership" {
  name = "group_root_accounts_membership"

  users = [
    aws_iam_user.root_iam_user.name,
  ]

  group = aws_iam_group.root_accounts.name
}

resource "aws_iam_group_policy_attachment" "admin_access" {
  group      = aws_iam_group.root_accounts.name
  policy_arn = "arn:aws:iam::aws:policy/AdministratorAccess"
}
resource "aws_iam_group_policy_attachment" "billing_access" {
  group      = aws_iam_group.root_accounts.name
  policy_arn = "arn:aws:iam::aws:policy/job-function/Billing"
}

Gives output:

$ terraform fmt ; terraform plan

Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the last "terraform apply":

  # aws_iam_user.root_iam_user has been changed
  ~ resource "aws_iam_user" "root_iam_user" {
        id            = "root_iam_user"
        name          = "root_iam_user"
      ~ tags          = {
          + "created_by"   = "me"
          + "created_tool" = "terraform"
          + "owner"        = "team-..."
          + "team"         = "..."
        }
        # (5 unchanged attributes hidden)
    }

Unless you have made equivalent changes to your configuration, or ignored the relevant attributes using ignore_changes, the following plan may include actions to undo or respond to
these changes.

──────────────

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # aws_iam_user.root_iam_user will be updated in-place
  ~ resource "aws_iam_user" "root_iam_user" {
        id            = "root_iam_user"
        name          = "root_iam_user"
      ~ tags          = {
          + "created_date" = "2021-10-04"
            # (4 unchanged elements hidden)
        }
        # (5 unchanged attributes hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

This persists even after running terraform apply, answering yes, and immediately running it again.

Notes about my environment:

Update: This issue persists, even when I add ALL the tags to ignore_tags.

Update 2: I retried tfplan / tfapply on my other modules that had appeared fine, and now they show the same behavior mentioned here

Terraform detected the following changes made outside of Terraform since the last "terraform apply":

So it appears that the first time I set things up with default_tags / ignore_tags, it works as intended, and subsequent runs show the anomalous behavior.

Grasume commented 2 years ago

This does not just show up with ECS but with multiple different services, from ECS to load balancers and RDS databases.

rastakajakwanna commented 2 years ago

I've resolved constantly drifting tags for one project just to face it again with EKS managed node groups with launch template (in order to enforce security&compliance&cost-visibility tags propagation on volumes, network interfaces and instances).

So, let me join this group and say: I am facing this too and I had to fall back to per-resource tagging in order to avoid constant drift on tens of resources and constant updates to launch template and worker nodes replacements.

As the audience here, I would also expect precedence of per-resource tags if they have the same key as provider default_tags instead of the constant drift. "Default" means to me "set if unset".

Or, in order to satisfy the other use cases, tags priority number could be introduced. That would allow us to set our own precedence of the tag keys, while default behavior would be hierarchy based.

devopsrick commented 2 years ago

A very hacky way to stop the constant tag flapping caused by overlapping key/value pairs: tags = merge({ for k, v in var.tags : k => v if lookup(data.aws_default_tags.common_tags.tags, k, "") != v })

requires: data "aws_default_tags" "common_tags" {}
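
Spelled out in full, the workaround above might look like the sketch below. The variable, the local name, and the aws_vpc resource are placeholders chosen for illustration; only the data source and the filter expression come from the comment.

```hcl
data "aws_default_tags" "common_tags" {}

variable "tags" {
  description = "Tags passed into the module, possibly overlapping default_tags"
  type        = map(string)
  default     = {}
}

locals {
  # Keep only the tags whose value differs from the provider default for
  # the same key. Caveat: because the lookup fallback is "", a tag with an
  # empty-string value that is absent from default_tags is also dropped.
  non_default_tags = {
    for k, v in var.tags : k => v
    if lookup(data.aws_default_tags.common_tags.tags, k, "") != v
  }
}

resource "aws_vpc" "example" {
  cidr_block = "10.0.0.0/16"

  # Only non-redundant tags reach the resource; the rest arrive through
  # default_tags.
  tags = local.non_default_tags
}
```

The single-argument merge() in the original one-liner is not needed once the filtered map is assigned directly.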

dtiziani commented 2 years ago

I'm seeing the same issue (the change always appears as drift in state) for aws_s3_bucket.

SPSeanLong commented 2 years ago

Also seeing this issue. The original terraform plan shows that the eip has "tags_all" attribute with all the correct default tag values but when eip is actually created they are missing. These missing tags then lead to the next run of terraform apply triggering an update which will successfully apply the same tags.

ggriffin commented 2 years ago

We're also seeing something similar.

provider "aws" {
  profile = var.aws_profile
  region  = var.aws_region
  default_tags {
    tags = {
      Project     = var.project
      Environment = var.environment
      Owner       = var.owner
    }
  }
}

Output from the second terraform apply:

Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the last "terraform apply":

  # aws_ssm_parameter.db_host has been changed
  ~ resource "aws_ssm_parameter" "db_host" {
        id          = "/example/dev/us-east-2/host"
        name        = "/example/dev/us-east-2/host"
      + tags        = {}
        # (9 unchanged attributes hidden)
    }

After which, the default tags still exist on the resource.

mherrmann commented 2 years ago

I'm experiencing this as well. But only for resources that override the default_tags I set on the provider.

rhenning commented 2 years ago

has anyone tried to repro this to see if behavior is similar under tf v1.x with either the upcoming v4 release of the provider or a build from the head of trunk?

vladholubiev commented 2 years ago

I've solved this by duplicating my tags with tags_all on resources where I extend default_tags with custom ones

tags      = local.tags
+tags_all = local.tags
DerFels commented 2 years ago

has anyone tried to repro this to see if behavior is similar under tf v1.x with either the upcoming v4 release of the provider or a build from the head of trunk?

Just did another plan today with Terraform v1.0.11 and hashicorp/aws v3.47.0 and the problem still exists: Plan: 7 to add, 76 to change, 2 to destroy.

~ tags                  = {
          + "Team"        = "sys"
          + "Terraform"   = "true"
            # (3 unchanged elements hidden)
        }

sorry, I read too fast. Didn't try it with a head or trunk build from v4.

kingindanord commented 2 years ago

have the same issue, using Terraform v1.0.11 hashicorp/aws v3.66.0

DemiAHMS commented 2 years ago

Another report, this time, without ever using default_tags!

main.tf

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0"
    }
  }
}

provider "aws" {
  region = var.deploy_region
}

### VPC setup
resource "aws_vpc" "something" {
  cidr_block = "10.0.0.0/16"
  tags = {
    Name         = "something_vpc"
    created_by   = var.whoami
    created_tool = "terraform"
    created_date = local.todays_date
    team         = "systems"
    owner        = "team-systems"
  }
}

resource "aws_subnet" "something_subnet" {
  vpc_id     = aws_vpc.something.id
  cidr_block = "10.0.0.0/16"

  tags = {
    Name         = "something_subnet"
    created_by   = var.whoami
    created_tool = "terraform"
    created_date = local.todays_date
    team         = "systems"
    owner        = "team-systems"
  }
}

resource "aws_security_group" "something_sg" {
  name                   = "something_sg"
  description            = "For handling all connections to something"
  vpc_id                 = aws_vpc.something.id
  revoke_rules_on_delete = true

  ingress {
    description = "SSH"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description      = "From self"
    self             = true
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }

  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }

  tags = {
    Name         = "something_sg"
    created_by   = var.whoami
    created_tool = "terraform"
    created_date = local.todays_date
    team         = "systems"
    owner        = "team-systems"
  }
}

variables.tf

variable "deploy_region" {
  description = "The region to deploy the instance into. Override for testing."
  type        = string
  default     = "us-east-1"
}

variable "source_ami_id" {
  description = "The AMI id of the source image that we created via packer"
  type        = string
}

variable "whoami" {
  description = "My name, that is, the user launching this terraform run, for putting into the tags, e.g. first_last"
  type        = string
}

locals.tf

locals {
  todays_date = formatdate("YYYY-MM-DD", timestamp())
}
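
One detail that may matter here: the Terraform documentation for timestamp() notes that its result cannot be predicted during planning, so any value derived from it is unknown until apply. That would be consistent with tags showing as (known after apply) in the plan output for this configuration, although the thread does not confirm it as the cause. A sketch that takes the date from a hypothetical input variable instead:

```hcl
# Hypothetical variable, not part of the original configuration.
variable "created_date" {
  description = "Creation date to record in tags, e.g. 2021-10-04"
  type        = string
}

locals {
  # A plain string is known at plan time, unlike timestamp().
  todays_date = var.created_date
}
```

The value could be supplied with -var 'created_date=2021-10-04'. Passing a different date on a later run would still produce a genuine tag change, so this only removes the plan-time unknown, not date-driven updates.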

Here are all three outputs, all run immediately, no one else has access to any of this, nothing removed but the command prompt:

$ terraform init

Initializing the backend...

Initializing provider plugins...
- Finding hashicorp/aws versions matching "~> 3.0"...
- Installing hashicorp/aws v3.68.0...
- Installed hashicorp/aws v3.68.0 (signed by HashiCorp)

Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
$ terraform fmt . ; terraform apply -var 'whoami=foo_bar' -var 'deploy_region=us-east-2' -var 'source_ami_id=ami-0d8c0c76ee7a663db'

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # aws_security_group.something_sg will be created
  + resource "aws_security_group" "something_sg" {
      + arn                    = (known after apply)
      + description            = "For handling all connections to something"
      + egress                 = [
          + {
              + cidr_blocks      = [
                  + "0.0.0.0/0",
                ]
              + description      = ""
              + from_port        = 0
              + ipv6_cidr_blocks = [
                  + "::/0",
                ]
              + prefix_list_ids  = []
              + protocol         = "-1"
              + security_groups  = []
              + self             = false
              + to_port          = 0
            },
        ]
      + id                     = (known after apply)
      + ingress                = [
          + {
              + cidr_blocks      = [
                  + "0.0.0.0/0",
                ]
              + description      = "From self"
              + from_port        = 0
              + ipv6_cidr_blocks = [
                  + "::/0",
                ]
              + prefix_list_ids  = []
              + protocol         = "-1"
              + security_groups  = []
              + self             = true
              + to_port          = 0
            },
          + {
              + cidr_blocks      = [
                  + "0.0.0.0/0",
                ]
              + description      = "SSH"
              + from_port        = 22
              + ipv6_cidr_blocks = []
              + prefix_list_ids  = []
              + protocol         = "tcp"
              + security_groups  = []
              + self             = false
              + to_port          = 22
            },
        ]
      + name                   = "something_sg"
      + name_prefix            = (known after apply)
      + owner_id               = (known after apply)
      + revoke_rules_on_delete = true
      + tags                   = (known after apply)
      + tags_all               = (known after apply)
      + vpc_id                 = (known after apply)
    }

  # aws_subnet.something_subnet will be created
  + resource "aws_subnet" "something_subnet" {
      + arn                             = (known after apply)
      + assign_ipv6_address_on_creation = false
      + availability_zone               = (known after apply)
      + availability_zone_id            = (known after apply)
      + cidr_block                      = "10.0.0.0/16"
      + id                              = (known after apply)
      + ipv6_cidr_block_association_id  = (known after apply)
      + map_public_ip_on_launch         = false
      + owner_id                        = (known after apply)
      + tags                            = (known after apply)
      + tags_all                        = (known after apply)
      + vpc_id                          = (known after apply)
    }

  # aws_vpc.something will be created
  + resource "aws_vpc" "something" {
      + arn                            = (known after apply)
      + cidr_block                     = "10.0.0.0/16"
      + default_network_acl_id         = (known after apply)
      + default_route_table_id         = (known after apply)
      + default_security_group_id      = (known after apply)
      + dhcp_options_id                = (known after apply)
      + enable_classiclink             = (known after apply)
      + enable_classiclink_dns_support = (known after apply)
      + enable_dns_hostnames           = (known after apply)
      + enable_dns_support             = true
      + id                             = (known after apply)
      + instance_tenancy               = "default"
      + ipv6_association_id            = (known after apply)
      + ipv6_cidr_block                = (known after apply)
      + main_route_table_id            = (known after apply)
      + owner_id                       = (known after apply)
      + tags                           = (known after apply)
      + tags_all                       = (known after apply)
    }

Plan: 3 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

aws_vpc.something: Creating...
aws_vpc.something: Creation complete after 3s [id=vpc-0734d47fd181a51e1]
aws_subnet.something_subnet: Creating...
aws_security_group.something_sg: Creating...
aws_subnet.something_subnet: Creation complete after 1s [id=subnet-02592914dbc54fd51]
aws_security_group.something_sg: Creation complete after 2s [id=sg-04cee30281953dad4]

Apply complete! Resources: 3 added, 0 changed, 0 destroyed.
$ terraform fmt . ; terraform apply -var 'whoami=foo_bar' -var 'deploy_region=us-east-2' -var 'source_ami_id=ami-0d8c0c76ee7a663db'
aws_vpc.something: Refreshing state... [id=vpc-0734d47fd181a51e1]
aws_subnet.something_subnet: Refreshing state... [id=subnet-02592914dbc54fd51]
aws_security_group.something_sg: Refreshing state... [id=sg-04cee30281953dad4]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # aws_security_group.something_sg will be updated in-place
  ~ resource "aws_security_group" "something_sg" {
        id                     = "sg-04cee30281953dad4"
        name                   = "something_sg"
      ~ tags                   = {
          - "Name"         = "something_sg"
          - "created_by"   = "foo_bar"
          - "created_date" = "2021-12-08"
          - "created_tool" = "terraform"
          - "owner"        = "team-systems"
          - "team"         = "systems"
        } -> (known after apply)
      ~ tags_all               = {
          - "Name"         = "something_sg"
          - "created_by"   = "foo_bar"
          - "created_date" = "2021-12-08"
          - "created_tool" = "terraform"
          - "owner"        = "team-systems"
          - "team"         = "systems"
        } -> (known after apply)
        # (7 unchanged attributes hidden)
    }

  # aws_subnet.something_subnet will be updated in-place
  ~ resource "aws_subnet" "something_subnet" {
        id                              = "subnet-02592914dbc54fd51"
      ~ tags                            = {
          - "Name"         = "something_subnet"
          - "created_by"   = "foo_bar"
          - "created_date" = "2021-12-08"
          - "created_tool" = "terraform"
          - "owner"        = "team-systems"
          - "team"         = "systems"
        } -> (known after apply)
      ~ tags_all                        = {
          - "Name"         = "something_subnet"
          - "created_by"   = "foo_bar"
          - "created_date" = "2021-12-08"
          - "created_tool" = "terraform"
          - "owner"        = "team-systems"
          - "team"         = "systems"
        } -> (known after apply)
        # (9 unchanged attributes hidden)
    }

  # aws_vpc.something will be updated in-place
  ~ resource "aws_vpc" "something" {
        id                               = "vpc-0734d47fd181a51e1"
      ~ tags                             = {
          - "Name"         = "something_vpc"
          - "created_by"   = "foo_bar"
          - "created_date" = "2021-12-08"
          - "created_tool" = "terraform"
          - "owner"        = "team-systems"
          - "team"         = "systems"
        } -> (known after apply)
      ~ tags_all                         = {
          - "Name"         = "something_vpc"
          - "created_by"   = "foo_bar"
          - "created_date" = "2021-12-08"
          - "created_tool" = "terraform"
          - "owner"        = "team-systems"
          - "team"         = "systems"
        } -> (known after apply)
        # (12 unchanged attributes hidden)
    }

Plan: 0 to add, 3 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Nothing triggered this. Nothing could possibly have triggered this change, since no one else can access this test space.

Could this be the same problem as the unordered-lists-always-cause-drift issue (bug #11801)?

devopsrick commented 2 years ago

@DemiAHMS

I think in your particular example it comes from the timestamp() function in your local variable, whose value cannot be known before apply. If you don't want that value updated on every run (which I assume you don't, given the tag key), you should add something like lifecycle { ignore_changes = [ tags["created_date"] ] }
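A minimal sketch of that suggestion, assuming the created_date tag is built from timestamp() in a local value and set in the resource's own tags. The local name and the date format are guesses based on the plan output above, not the original configuration:

```hcl
locals {
  # Assumed reconstruction: timestamp() is only known at apply time,
  # so anything derived from it is "(known after apply)" during plan.
  created_date = formatdate("YYYY-MM-DD", timestamp())
}

resource "aws_vpc" "something" {
  cidr_block = "10.0.0.0/16"

  tags = {
    Name         = "something_vpc"
    created_date = local.created_date
  }

  lifecycle {
    # Keep the value written at creation time and ignore later changes to it.
    ignore_changes = [tags["created_date"]]
  }
}
```

If the timestamp-derived tag is supplied through the provider's default_tags instead, it shows up under tags_all, and this lifecycle rule alone may not be enough to quiet the plan.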

DemiAHMS commented 2 years ago

So why does it change if the computed values are the same? Why would the other static tags be shown to have drift?

devopsrick commented 2 years ago

Why all of them show as being removed and "{} -> (known after apply)" is definitely a good question, but not really related to the bug this issue is discussing. I would open a new issue for that particular example.

tovbinm commented 2 years ago

Yes, the problem is still there. No fixes were made to address this issue.

efernandes-dev-ops commented 2 years ago

Not sure if this helps anyone, but I was able to get around this issue by not defining the same tag both in default_tags and on the resource.

For example, I had an environment tag in the default_tags block and I was also setting an environment tag on a resource. I removed the environment tag from the resource and defined it only in the default_tags block, and that did the trick for me: terraform stopped picking up the change.
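As an illustration of that workaround, here is a sketch in which the shared tag lives only in default_tags and the resource lists only what is unique to it. The region, resource type, names, and tag values are placeholders, not taken from the original configuration:

```hcl
provider "aws" {
  region = "us-east-1" # placeholder region

  default_tags {
    tags = {
      environment = "dev"
    }
  }
}

resource "aws_sqs_queue" "example" {
  name = "example-queue"

  # No "environment" key here: it is already supplied by default_tags,
  # and repeating it with the same value is what produced the diff noise.
  tags = {
    Name = "example-queue"
  }
}
```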

2rs2ts commented 2 years ago

Not sure if this is related, but with terraform v0.12.31 and provider v3.73.0 we see that any resource that has one of the tags in default_tags also in its own tags will show a permadiff where tags_all is not changing, but tags is trying to add the redundant tag in question. This diff always shows, no matter what.

This problem goes away if we do not specify the redundant tag, but in a complex organization such as ours, that's a tough sell, because we have custom remote modules that we'd like to enforce these tags in, even if a consumer forgets to set up default tags.

~ tags                   = {
    + "Environment"               = "dev"
      "Name"                      = "node-identifier"
  }
  tags_all               = {
      "Environment"               = "dev"
      "Kubernetes"                = "true"
      "Name"                      = "node-identifier"
  }

olenm commented 2 years ago

@2rs2ts TF is supposed to error/warn when you have something like tags = { key = value } and default_tags = { key = value }. However, it will not consistently show the error, depending on how complex the map that creates each one is. I feel it's a very shallow check.
Testing the priority of such a case (I thought this was normal until I removed all variables for TF to interpret) is where we see "default_tags" win if there is a difference between tags and default_tags. In all reality it should perhaps be called enforced_tags instead, but we know that it does not properly tag all resources with tags, lol.

Your best solution is to use a data object to check which keys are in default_tags and remove them from your component's tags (based on whether the key names match), and to know that anything within default_tags will always take precedence if there is a difference in value.

@devopsrick handles it very well, here (good luck if you have more than one provider in a workspace that gets utilized; lots of code-copy-pasta): https://github.com/hashicorp/terraform-provider-aws/issues/18311#issuecomment-948381172
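The data-source approach described above could look roughly like this in a workspace with two provider configurations. The alias `west`, the regions, the tag values, and the local names are all hypothetical; the sketch relies only on the `aws_default_tags` data source and the built-in `keys` and `contains` functions:

```hcl
provider "aws" {
  region = "us-east-1" # placeholder region

  default_tags {
    tags = {
      Environment = "dev"
      Kubernetes  = "true"
    }
  }
}

provider "aws" {
  alias  = "west" # hypothetical second provider configuration
  region = "us-west-2"

  default_tags {
    tags = {
      Environment = "dev"
    }
  }
}

# One data source per provider configuration; each reads that provider's default_tags.
data "aws_default_tags" "default" {}

data "aws_default_tags" "west" {
  provider = aws.west
}

locals {
  # Hypothetical tags a component would like to set.
  component_tags = {
    Environment = "dev"
    Name        = "node-identifier"
  }

  # Drop every key that the matching provider already supplies via default_tags.
  tags_default = {
    for k, v in local.component_tags : k => v
    if !contains(keys(data.aws_default_tags.default.tags), k)
  }
  tags_west = {
    for k, v in local.component_tags : k => v
    if !contains(keys(data.aws_default_tags.west.tags), k)
  }
}
```

Each resource would then use the filtered map that matches its provider, which is where the copy-paste overhead mentioned above comes from.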

piersf commented 2 years ago

Is there any fix to this? We can't trust our terraform code anymore because of this bug basically.

I'm running Terraform version 1.1.4 and AWS Provider version 3.70.0 and still seeing this issue.

jdelStrother commented 2 years ago

I see a lot of people here struggling with tags vs default_tags, so I'm not sure if I'm hitting the same issue, but I seem to be able to reproduce it with something much simpler:

provider "aws" {
  region = "eu-west-2"
  assume_role {
    role_arn = "...."
  }
  default_tags {
    tags = {
      Environment = "Development"
      Provisioner = "Terraform"
    }
  }
}

resource "aws_sqs_queue" "ses_deliveries" {
  name = "ses-deliveries-queue"
}

using terraform 1.1.4 & aws-provider 3.74.0.

On first run, `terraform apply` creates the SQS queue with the 2 default tags.

```
  # aws_sqs_queue.ses_deliveries will be created
  + resource "aws_sqs_queue" "ses_deliveries" {
      + arn                               = (known after apply)
      + content_based_deduplication       = false
      + deduplication_scope               = (known after apply)
      + delay_seconds                     = 0
      + fifo_queue                        = false
      + fifo_throughput_limit             = (known after apply)
      + id                                = (known after apply)
      + kms_data_key_reuse_period_seconds = (known after apply)
      + max_message_size                  = 262144
      + message_retention_seconds         = 345600
      + name                              = "ses-deliveries-queue"
      + name_prefix                       = (known after apply)
      + policy                            = (known after apply)
      + receive_wait_time_seconds         = 0
      + tags_all                          = {
          + "Environment" = "Development"
          + "Provisioner" = "Terraform"
        }
      + url                               = (known after apply)
      + visibility_timeout_seconds        = 30
    }
```

Re-running `terraform apply` shows changes to `tags = {}`:

```
Terraform detected the following changes made outside of Terraform since the last "terraform apply":

  # aws_sqs_queue.ses_deliveries has changed
  ~ resource "aws_sqs_queue" "ses_deliveries" {
        id   = "https://sqs.eu-west-2.amazonaws.com/xxx/ses-deliveries-queue"
        name = "ses-deliveries-queue"
      + tags = {}
        # (12 unchanged attributes hidden)
    }
```

Are there any circumstances where default_tags don't result in 'fake' updates?

acdha commented 2 years ago

> Are there any circumstances where default_tags don't result in 'fake' updates?

I'm surprised to see a test case as simple as the one you have: in my usage, it's stable across applies as long as you don't have the same tag values present in the resource's tags (overrides are okay, it's just having the exact same value that causes diff noise).

jdelStrother commented 2 years ago

Sounds like the updates I'm seeing are unrelated to default_tags and somewhat expected behaviour - https://discuss.hashicorp.com/t/always-getting-dirty-state-after-a-resource-is-first-created/35185 - sorry for the noise.

piersf commented 2 years ago

@jdelStrother thank you for posting that link. I'm seeing a bit of light at the end of the tunnel now.

What I am seeing in terraform 1.1.4 and AWS Provider 3.70.0 is the following behavior:

I create a VPC, a route table, and a few route table routes. I apply the first time without issues and all resources are created. If I run a plan right after without changing anything at all, the output of the plan will show the following message at the top

Note: Objects have changed outside of Terraform

and then output continues to show that Terraform will attempt to add the routes to the route table (as if they're not already there). Example:

 # module.vpc.aws_route_table.private[2] has changed
  ~ resource "aws_route_table" "private" {
        id               = "rtb-0edkjdfe4a45623534546"
      ~ route            = [
          + {
              + carrier_gateway_id         = ""
              + cidr_block                 = "10.0.0.0/8"
              + destination_prefix_list_id = ""
              + egress_only_gateway_id     = ""
              + gateway_id                 = ""
              + instance_id                = ""
              + ipv6_cidr_block            = ""
              + local_gateway_id           = ""
              + nat_gateway_id             = ""
              + network_interface_id       = ""
              + transit_gateway_id         = "tgw-1234asdfg345gfghh"
              + vpc_endpoint_id            = ""
              + vpc_peering_connection_id  = ""
            },
        ]
        tags             = {
            "ManagedWith"        = "Terraform"
        }
        # (5 unchanged attributes hidden)
    }

But the routes already exist when I check in the VPC Console.

And at the end of the output it will have the below message:

No changes. Your infrastructure matches the configuration.

Your configuration already matches the changes detected above. If you'd like to update the Terraform state to match, create and apply a refresh-only plan:
  terraform apply -refresh-only

When I download the state file and open it, I can see that the route table does not actually have any routes in the state file (I see an empty list element: "route": []), which explains why Terraform thinks it needs to add the routes after the first apply.

Then if I run an apply again (second time), it will eventually show that no resources got modified and give the below message:

No changes. Your infrastructure matches the configuration.

Terraform has compared your real infrastructure against your configuration and found no differences, so no changes are needed.

And the state file is updated this time (with the second apply) and the routes are added to the above-mentioned route list element.

I've been looking at this thing for 3 days now trying to find what is causing it. So I wanted to confirm if this is the behavior you were seeing as well?

Thank you

jdelStrother commented 2 years ago

@piersf I am very much not a terraform expert, but that does seem like the same behaviour I'm seeing, yes.

piersf commented 2 years ago

@jdelStrother thank you! Reading through the link that you posted above, it seems that we just need to wait for the issue with the legacy SDK to be fixed.

2rs2ts commented 2 years ago

@olenm thanks for the explanation and advice. We indeed took a similar approach to the linked comment (https://github.com/hashicorp/terraform-provider-aws/issues/18311#issuecomment-948381172).

Whether or not terraform is supposed to emit a warning, it still seems like a bug for this to be a perma-diff. Seems it should be smart enough to realize that there is nothing changing, so nothing to show as a diff. I am not sure if this happens due to the same fundamental cause as the issues experienced by others in this thread, though.

kenfusion commented 2 years ago

I'm hitting this issue where I get + tags = {} when I'm using only default_tags at the provider level and no tags attribute at the resource level.

When I apply a -refresh-only plan, the issue goes away.

I haven't tested it with overriding tags at the resource level, but I'm wondering if someone here can give it a go.

1. terraform plan -refresh-only -out out.tfplan
2. Judge results
3. terraform apply out.tfplan

Also worth noting: -detailed-exitcode works as expected when I see this 'fake diff', and I'm using that to detect a 'true diff'.

Tested with Terraform 1.0.9, AWS Provider 4.6.0 and the aws_iam_role resource.

nicolai86 commented 2 years ago

We're seeing the same issue with VPC peering connections and default tags: when running with default_tags, terraform recreates the peering routes, stating that the route_table_id is different. Removing default_tags from the AWS provider resolves the issue, and terraform detects no change.

tomer-ds commented 2 years ago

So we are on: Terraform v1.1.9

Seeing the issue in 3 resource types, namely:

In all cases there are some tags defined at resource level, but never the same tags as defined at provider level. Is this a shorter list of unsupported resource types? Have the others been added? Are there plans to finish it off?

🤞🤞🤞

tculp commented 2 years ago

I use this to prevent a local.tags or var.tags from duplicating keys from default_tags: { for k, v in var.tags: k => v if ! contains(keys(data.aws_default_tags.current.tags), k) }
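For context, here is a sketch of how that expression might sit inside a module. The data source name `current` and `var.tags` follow the expression above; the variable declaration, the local name, and the example resource are assumptions added for illustration:

```hcl
variable "tags" {
  description = "Tags the caller wants on resources in this module."
  type        = map(string)
  default     = {}
}

# Reads the default_tags of whichever provider configuration the module uses.
data "aws_default_tags" "current" {}

locals {
  # Keep only the keys that the provider does not already supply via default_tags.
  resource_tags = {
    for k, v in var.tags : k => v
    if !contains(keys(data.aws_default_tags.current.tags), k)
  }
}

resource "aws_sqs_queue" "example" {
  name = "example-queue" # placeholder resource
  tags = local.resource_tags
}
```

If the consumer has not configured default_tags at all, the data source should return an empty map, so every key in var.tags passes through and the module still applies its tags.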

simnalamburt commented 2 years ago

Is there any workaround for this issue?