Terraform shows diff for policy due to different order in JSON

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform AWS Provider Version

Terraform v1.0.8
on linux_amd64
+ provider registry.terraform.io/hashicorp/aws v3.70.0
+ provider registry.terraform.io/hashicorp/google v3.90.1

Affected Resource(s)

aws_s3_bucket_policy

Terraform Configuration Files

Please include all Terraform configurations required to reproduce the bug. Bug reports without a functional reproduction may be closed without investigation.

locals {
  customers = data.terraform_remote_state.global.outputs.all_customers
}

resource "aws_s3_bucket" "customer_cloud_resources" {
  for_each = local.customers
  bucket   = "XXX-cloud-resources-${each.value}"
  acl      = "private"

  cors_rule {
    allowed_headers = ["*"]
    allowed_methods = ["GET", "HEAD"]
    allowed_origins = ["*"]
    expose_headers  = ["ETag"]
    max_age_seconds = 86400
  }

  tags = {
    Customer: each.value
  }
}

resource "aws_s3_bucket_ownership_controls" "customer_cloud_resources" {
  for_each = aws_s3_bucket.customer_cloud_resources
  bucket = each.value.id

  rule {
    object_ownership = "BucketOwnerPreferred"
  }

  depends_on = [
    aws_s3_bucket_policy.customer_cloud_resources
  ]
}

data "aws_iam_policy_document" "s3_public_html_access_policy" {
  for_each = local.customers
  statement {
    effect = "Allow"
    actions = [
      "s3:GetObject"
    ]
    resources = [
      "arn:aws:s3:::${aws_s3_bucket.customer_cloud_resources[each.value].bucket}/${local.public_html_path}/*"
    ]
    principals {
      type = "AWS"
      identifiers = [aws_cloudfront_origin_access_identity.customer_pool_s3_distribution_origin_access_identity[each.value].iam_arn]
    }
  }
  statement {
    effect = "Allow"
    actions = [
      "s3:ListBucket"
    ]
    resources = [
      "arn:aws:s3:::${aws_s3_bucket.customer_cloud_resources[each.value].bucket}"
    ]
    condition {
      test = "StringLike"
      variable = "s3:prefix"
      values = [ "${local.public_html_path}/*" ]
    }
    principals {
      type = "AWS"
      identifiers = [aws_cloudfront_origin_access_identity.customer_pool_s3_distribution_origin_access_identity[each.value].iam_arn]
    }
  }
}

data "aws_iam_policy_document" "customer_cssu_policy" {
  for_each = var.bucket_policy_allowed_accounts

  statement {
    effect = "Allow"
    actions = [
      "s3:List*"
    ]
    resources = [
      "arn:aws:s3:::XXX-cloud-resources-${each.key}"
    ]

    condition {
      test = "StringLike"
      variable = "s3:prefix"

      values = [
        "public_html/cssu/*"
      ]
    }

    dynamic "principals" {
      for_each = each.value
      content {
        type = principals.value.type
        identifiers = principals.value.identifiers
      }
    }
  }

  statement {
    effect = "Allow"
    actions = [
      "s3:Get*",
      "s3:List*",
      "s3:Put*",
      "s3:Delete*"
    ]
    resources = [
      "arn:aws:s3:::XXX-cloud-resources-${each.key}/public_html/cssu/*"
    ]

    dynamic "principals" {
      for_each = each.value
      content {
        type = principals.value.type
        identifiers = principals.value.identifiers
      }
    }
  }
}

data "aws_iam_policy_document" "customer_projects_policy" {
  for_each = var.project_folder_policy_allowed_accounts

  statement {
    effect = "Allow"
    actions = [
      "s3:List*"
    ]
    resources = [
      "arn:aws:s3:::XXX-cloud-resources-${each.key}"
    ]

    condition {
      test = "StringLike"
      variable = "s3:prefix"

      values = [
        "public_html/projects/*"
      ]
    }

    dynamic "principals" {
      for_each = each.value
      content {
        type = principals.value.type
        identifiers = principals.value.identifiers
      }
    }
  }

  statement {
    effect = "Allow"
      actions = [
        "s3:Get*",
        "s3:List*",
        "s3:Put*",
        "s3:Delete*"
      ]
      resources = [
        "arn:aws:s3:::XXX-cloud-resources-${each.key}/public_html/projects/*"
      ]

      dynamic "principals" {
        for_each = each.value
        content {
          type = principals.value.type
          identifiers = principals.value.identifiers
        }
      }
    }
}

data "aws_iam_policy_document" "customer_uploads_policy" {
  for_each = var.mappings_folder_policy_allowed_accounts

  statement {
    effect = "Allow"
    actions = [
      "s3:List*"
    ]
    resources = [
      "arn:aws:s3:::XXX-cloud-resources-${each.key}"
    ]

    condition {
      test = "StringLike"
      variable = "s3:prefix"

      values = [
        "uploads/*"
      ]
    }

    dynamic "principals" {
      for_each = each.value
      content {
        type = principals.value.type
        identifiers = principals.value.identifiers
      }
    }
  }

  statement {
    effect = "Allow"
    actions = [
      "s3:Get*",
      "s3:List*",
      "s3:Put*"
    ]
    resources = [
      "arn:aws:s3:::XXX-cloud-resources-${each.key}/uploads/*"
    ]

    dynamic "principals" {
      for_each = each.value
      content {
        type = principals.value.type
        identifiers = principals.value.identifiers
      }
    }
  }
}

locals {
  empty_aws_iam_policy_document = {"json": jsonencode({"Statement": []})}
}

resource "aws_s3_bucket_policy" "customer_cloud_resources" {
  for_each = local.customers
  bucket = aws_s3_bucket.customer_cloud_resources[each.value].id

  policy = jsonencode({
    "Statement": concat(
    jsondecode(data.aws_iam_policy_document.s3_public_html_access_policy[each.value].json)["Statement"],
    jsondecode(lookup(data.aws_iam_policy_document.customer_cssu_policy, each.value, local.empty_aws_iam_policy_document).json)["Statement"],
    jsondecode(lookup(data.aws_iam_policy_document.customer_projects_policy, each.value, local.empty_aws_iam_policy_document).json)["Statement"],
    jsondecode(lookup(data.aws_iam_policy_document.customer_uploads_policy, each.value, local.empty_aws_iam_policy_document).json)["Statement"]
    )
    "Version": "2012-10-17"
  })
}

Debug Output

https://gist.github.com/anthonyAdhese/861b3bba3dfe2f6e55b4da676af7600f

Panic Output

Expected Behavior

There shouldn't be any diff due to a difference in json order that is received from AWS.

Actual Behavior

For each policy we now get a diff that is probably bogus but isn't visible in the output when running Terraform plan.

Steps to Reproduce

terraform plan

Important Factoids

References

0000

Also an issue for aws_iam_openid_connect_provider.

The client_id_list in AWS, the "audiences", doesn't reflect what is in the code.

More details to come.

this has been driving me crazy lately, if it's the same thing - smells like the same thing.

I have an S3 bucket policy doing the same thing.
provider info:

2023-11-06T14:07:01.685-0800 [DEBUG] provider: starting plugin: path=.terraform/providers/registry.terraform.io/hashicorp/aws/5.23.1/darwin_arm64/ter
raform-provider-aws_v5.23.1_x5 args=[".terraform/providers/registry.terraform.io/hashicorp/aws/5.23.1/darwin_arm64/terraform-provider-aws_v5.23.1_x5"
]

When I execute terraform state show the resources show up in the same order as they do in the AWS console. In the output, the resources in the jsonencode block are

            Statement = [
                {
                    Action    = "s3:*"
                    Condition = {
                        Bool = {
                            "aws:SecureTransport" = "false"
                        }
                    }
                    Effect    = "Deny"
                    Principal = "*"
                    Resource  = [
                        "arn:aws:s3:::nomic-bastion-terraform-state/*",
                        "arn:aws:s3:::nomic-bastion-terraform-state",
                    ]
                    Sid       = "AllowSSLRequestsOnly"
                },

However, when I execute terraform apply, I see the following order for the Resource block, followed by removal of the existing policy. Is the plan somehow not passing the correct structures to deepEquals ?

 # module.backend_setup.data.aws_iam_policy_document.state_force_ssl will be read during apply
  # (depends on a resource or a module with changes pending)
 <= data "aws_iam_policy_document" "state_force_ssl" {
      + id   = (known after apply)
      + json = (known after apply)

      + statement {
          + actions   = [
              + "s3:*",
            ]
          + effect    = "Deny"
          + resources = [
              + "arn:aws:s3:::nomic-bastion-terraform-state",
              + "arn:aws:s3:::nomic-bastion-terraform-state/*",
            ]
          + sid       = "AllowSSLRequestsOnly"

          + condition {
              + test     = "Bool"
              + values   = [
                  + "false",
                ]
              + variable = "aws:SecureTransport"
            }

          + principals {
              + identifiers = [
                  + "*",
                ]
              + type        = "*"
            }
        }

When I pull the policy from a hardcoded file, it works as expected, i.e. policy = file("${path.module}/state_bucket_policy.json")

Also, from my perspective, this is not an "enhancement" - it is a bug, plain and simple. The ordering of resources is owned by the AWS API, and terraform should not cause users trouble (like a day of debugging time) by changing that order in any manner.

Also, from my perspective, this is not an "enhancement" - it is a bug, plain and simple. The ordering of resources is owned by the AWS API, and terraform should not cause users trouble (like a day of debugging time) by changing that order in any manner.

I have what I think is a slightly different view of the issue, and a proposed solution. Because the AWS API has control over whether and how it chooses to transform an IAM policy both when it is set as well as when it is read (i.e. for state refresh), there are only a couple of options for producing a clean diff/avoiding spurious updates of otherwise semantically equivalent policies.

The provider can attempt to transform the declared desired state to a canonical form that attempts to match what AWS will report when describing the current state of the resource. This assumes that there even is a canonical form, and even in that case is the worse of the two options in since AWS can change its canonical representation of the resource at any time. In the case of IAM policies I think this is not even an option... for example it seems when you specify a list of principals by ARN, that list doesn't necessarily stay in the order specified... I suspect AWS APIs may return these ordered lexicographically by the principal unique identifier (that is, ID not ARN). In a case like that, the provider simply will not have the information necessary in the declared desired state to mimic AWS' representation.
(proposed) The provider can transform BOTH the declared desired state AND the what's read from AWS to a "provider-internal" canonical form before incorporating them into its internal resource model for diffing. You can see that halfway happening with aws_iam_policy_document already, as it appears to sort AWS principal ARNs lexicographically (but reversed :roll_eyes:) when representing the declared desired state, what appears to be missing is a similar transformation when modeling the actual resource state read from AWS.

2. (proposed) The provider can transform BOTH the declared desired state AND the what's read from AWS to a "provider-internal" canonical form before incorporating them into its internal resource model for diffing. You can see that halfway happening with aws_iam_policy_document already, as it appears to sort AWS principal ARNs lexicographically (but reversed 🙄) when representing the declared desired state, what appears to be missing is a similar transformation when modeling the actual resource state read from AWS.

That makes good sense - insulating the terraform state from changes to API return ordering should solve this problem.

hashicorp / terraform-provider-aws