hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: aws_kms_key_policy resource creation times out during terraform apply #30232

Open micchickenburger opened 1 year ago

micchickenburger commented 1 year ago

Terraform Core Version

1.4.2

AWS Provider Version

4.59.0

Affected Resource(s)

aws_kms_key_policy

Expected Behavior

The resource should be created.

Actual Behavior

The resource does get created successfully, but terraform apply times out. Furthermore, running apply again determines that the policy still needs to be created even though I can see the policy correctly specified in the Key Management Service web console.

Relevant Error/Panic Output Snippet

╷
│ Error: attaching KMS Key policy (c34de4f8-f679-41e6-b610-ba8a57726bc5): updating policy: waiting for completion: timeout while waiting for state to become 'TRUE' (last state: 'FALSE', timeout: 10m0s)
│ 
│   with aws_kms_key_policy.ami,
│   on kms.tf line 21, in resource "aws_kms_key_policy" "ami":
│   21: resource "aws_kms_key_policy" "ami" {
│

Terraform Configuration Files

resource "aws_kms_key" "ami" {
  description = "ami-cmk"

  multi_region            = false
  enable_key_rotation     = true
  deletion_window_in_days = 30

  key_usage                = "ENCRYPT_DECRYPT"
  customer_master_key_spec = "SYMMETRIC_DEFAULT"

  tags = {
    Name = "ami-cmk"
  }
}

resource "aws_kms_alias" "ami" {
  name          = "alias/ami-cmk"
  target_key_id = aws_kms_key.ami.key_id
}

# Get current account ID
data "aws_caller_identity" "current" {}

resource "aws_kms_key_policy" "ami" {
  key_id = aws_kms_key.ami.key_id
  policy = jsonencode({
    Id = "ami-cmk-policy"
    Statement = [
      {
        Sid    = "Enable IAM User Permissions"
        Effect = "Allow"
        Principal = {
          AWS = ["arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"] # current account ID
        },
        Action   = ["kms:*"],
        Resource = "*"
      },
      {
        Sid    = "Allow another AWS account to use this KMS key"
        Effect = "Allow"
        Principal = {
          AWS = ["arn:aws:iam::${var.other_account}:root"] # other account ID
        },
        Action = [
          "kms:DescribeKey",
          "kms:ReEncrypt*",
          "kms:CreateGrant",
          "kms:Decrypt",
        ],
        Resource = "*"
      }
    ]
  })
}

Steps to Reproduce

terraform apply

Debug Output

No response

Panic Output

No response

Important Factoids

Runs in the latest terraform docker container in Gitlab CI.

References

No response

Would you like to implement a fix?

None

elixxx commented 1 year ago

Hey, I imported the policy by hand and got the diff that Terraform is struggling with. You have to remove the list brackets "[]" around the single-element AWS principal list and add a Version at the bottom. A workaround for sure, but the following example works for me:

resource "aws_kms_key_policy" "this" {
  key_id = aws_kms_key.this.key_id
  policy = jsonencode({
    Id = "ami-cmk-policy"
    Statement = [
      {
        Sid    = "Enable IAM User Permissions"
        Effect = "Allow"
        Principal = {
          AWS = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root" # current account ID
        },
        Action   = "kms:*",
        Resource = "*"
      },
      {
        Sid    = "Allow another AWS account to use this KMS key"
        Effect = "Allow"
        Principal = {
          AWS = "arn:aws:iam::${var.cross_account_grant_account_id}:root" # other account ID
        },
        Action = [
          "kms:Encrypt",
          "kms:Decrypt",
          "kms:ReEncrypt*",
          "kms:GenerateDataKey*",
          "kms:DescribeKey"
        ],
        Resource = "*"
      }
    ]
    Version = "2008-10-17"
  })
}

stromnet commented 1 year ago

The above workaround is not a workaround if you actually need multiple principals. I've got a Terraform job which never manages to finish without error (updating KMS Key (7aa315d1-xxxxxxxxxxx): updating policy: waiting for completion: timeout while waiting for state to become 'TRUE' (last state: 'FALSE', timeout: 10m0s)) and tries to re-modify every time.

aws provider 5.10.0 and terraform 1.3.7

The planned change is:

  # aws_kms_key.cicd_secrets will be updated in-place
  ~ resource "aws_kms_key" "cicd_secrets" {
        id                                 = "7aa315d1-xxxxxxxxxxxxxxxxx"
      ~ policy                             = jsonencode(
          ~ {
              ~ Statement = [
                    {
                        Action    = "kms:*"
                        Effect    = "Allow"
                        Principal = {
                            AWS = "arn:aws:iam::zzzzzzzzzzz:root"
                        }
                        Resource  = "*"
                        Sid       = "ManageSelf"
                    },
                  ~ {
                      ~ Principal = {
                          ~ AWS = [
                                "arn:aws:iam::1237xxxx:root",
                              - "arn:aws:iam::6753xxxx:root",
                              + "arn:aws:iam::6617xxxx:root",
                                "arn:aws:iam::6294xxxx:root",
                              - "arn:aws:iam::2405xxxx:root",
                              - "arn:aws:iam::0234xxxx:root",
                                "arn:aws:iam::2972xxxx:root",
                              - "arn:aws:iam::6617xxxx:root",
                              + "arn:aws:iam::2405xxxx:root",
                              + "arn:aws:iam::6753xxxx:root",
                                "arn:aws:iam::2258xxxx:root",
                                "arn:aws:iam::6433xxxx:root",
                              + "arn:aws:iam::0234xxxx:root",
                              + "arn:aws:iam::0234xxxx:root",
                            ]
                        }
                        # (5 unchanged elements hidden)
                    },
                ]
                # (1 unchanged element hidden)
            }
        )
        tags                               = {}
        # (10 unchanged attributes hidden)
    }

Since the list of principals is generated from another list, I cannot hard-code a particular order, and the order is seemingly random, so I cannot really sort it either.

The "always attempting to change order" part of the problem is the same with the aws_secretsmanager_secret policy and its principals, but that one at least finishes modifications within 1 second and does not block execution.

mkwtsec commented 3 months ago

Encountering this on AWS provider 5.47 with Terraform 1.7.5 as well.

jmpelaschier commented 2 months ago

I was facing this issue as well for a long time and after going through the aws provider code, I finally figured out what went wrong between what terraform is waiting for and what has been created by AWS.

In @elixxx's case, he has a list of principals in his Terraform plan that AWS converts into a plain string. Terraform then compares the plan with what AWS returns, and since the policies do not match (by its standards), it times out.

In @stromnet's case, which is also what happened to me, a role is being added twice to the principal list. You can tell because there are more +s than -s, which is very hard to catch. So in this case, Terraform actually notices a difference, namely that there is an additional role to add. However, AWS automatically removes duplicate roles when creating a policy. So once the policy is updated in AWS, Terraform still times out while waiting for the policy to propagate, since the one in AWS does not have the duplicate role but the one in the Terraform plan does.

In my case, I removed the duplicate role from my principal list and stopped getting this error. The code that the AWS provider uses to compare policies, which is how I figured this out, can be found here: https://github.com/hashicorp/awspolicyequivalence/blob/main/aws_policy_equivalence.go#L255
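
For a principal list that is generated from another list, one way to apply the deduplication described above is to normalize the list before it reaches the policy. This is only a sketch, not a confirmed fix for every case in this thread: the variable name `principal_account_ids` and the local `kms_principal_arns` are hypothetical, while `distinct()` and `sort()` are built-in Terraform functions.

```hcl
# Hypothetical input: account IDs that may contain duplicates
# and may arrive in any order.
variable "principal_account_ids" {
  type = list(string)
}

locals {
  # distinct() drops repeated entries, so the planned policy does not
  # contain a principal that AWS would silently remove.
  # sort() gives the list a stable lexical order between runs.
  kms_principal_arns = sort(distinct([
    for id in var.principal_account_ids : "arn:aws:iam::${id}:root"
  ]))
}
```

The policy statement would then reference the local, for example `AWS = local.kms_principal_arns`, instead of the raw generated list.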

w564791 commented 1 week ago

I had the same issue, on both TFE and community Terraform.

w564791 commented 1 week ago

resource "aws_kms_key_policy" "tfc-agent" {
  key_id = aws_kms_key.tfc-agent.id
  policy = jsonencode({
    Version = "2012-10-17"
    Id      = "tfc-agent"
    Statement = [
      {
        Sid    = "tfcAgent"
        Effect = "Allow"
        Principal = {
          // If variables are used here, the apply always times out, regardless of Terraform version (1.5.x to 1.9.x) or AWS provider version (4.x to 5.x).
          AWS = [concat(tolist(var.tokens[*].iam_role), ["arn:aws:iam::xxx:root"])]
        },
        Action   = "kms:*"
        Resource = "*"
      }
    ]
  })
}