hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: AWS KMS policy not updating properly #39788

Open flyingbeefhead opened 1 month ago

flyingbeefhead commented 1 month ago

Terraform Core Version

1.9.3

AWS Provider Version

5.72.1

Affected Resource(s)

aws_kms_key

Expected Behavior

aws_kms_key policy should update properly and/or not force update if no changes are made

Actual Behavior

plan forces an update of the KMS policy even if there are no changes. The policy is pushed to AWS, but validation of the policy fails because AWS returns the Principal and/or Action lists in a different order from the one submitted.

apply fails with:

Error: waiting for KMS Key (xxxxx) policy update: timeout while waiting for state to become 'TRUE' (last state: 'FALSE', timeout: 10m0s)

Relevant Error/Panic Output Snippet

No response

Terraform Configuration Files

Unable to provide code due to security issues.

Steps to Reproduce

run plan and apply

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

No

justinretzolk commented 1 month ago

Hey @flyingbeefhead 👋 Thank you for taking the time to raise this! Without a sample configuration that can be used to reproduce this or any sort of logging, it would be quite difficult for us to look into this. I definitely recognize and can relate to organization restrictions of providing specific configurations, but are you able to create a more agnostic sample configuration that you're able to share, or provide logging by encrypting it with our GPG key?

flyingbeefhead commented 1 month ago

I can provide some debugging output, but I would have to strip out any SBU/CUI information. Please let me know what options to add to the terraform plan and/or apply commands to gather the logs you need.

flyingbeefhead commented 1 month ago

Here are the plan and apply logs, stripped down to just the entries related to the affected key. I have obfuscated any SBU information. The substitutions are consistent between the two files to allow comparison.

kms plan.txt kms apply.txt

flyingbeefhead commented 1 month ago

I have looked at the Go code and I believe the issue is the failure of Line 723 in https://github.com/hashicorp/terraform-provider-aws/blob/main/internal/service/kms/key.go. awspolicy.PoliciesAreEquivalent() does not see the policy passed to AWS during the apply as being the same as the policy pulled back out to validate the change, even though the change was made, and made properly, as shown by looking at the policy through the AWS console.

The issue appears to be that the order of the Principal and/or Action lists is not the same once it is submitted to AWS and pulled out to validate. The order of the items appears to be changed by AWS and is inconsistent. I attempted to change the order of the items to match what I pull out of AWS console, but then AWS changes the order each time.

The code needs to be changed so that it checks that the components of the lists match without depending on the ORDER of the components.
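To make the request above concrete, here is a minimal sketch in Python of an order-insensitive policy comparison. It is not the provider's Go code and not the awspolicy library; the function names (`policies_match_ignoring_order` and its helpers) are hypothetical, and the sketch only assumes the standard IAM policy JSON shape (`Statement`, `Principal`, `Action`, `Resource`). Condition blocks are compared as-is.

```python
import json


def _as_set(value):
    """Treat a single string or a list of strings as an unordered set."""
    if isinstance(value, str):
        return frozenset([value])
    return frozenset(value)


def _normalize_principal(principal):
    # "Principal" is either the string "*" or a map such as {"AWS": [...]}.
    if isinstance(principal, str):
        return principal
    return {kind: _as_set(ids) for kind, ids in principal.items()}


def _normalize_statement(stmt):
    out = dict(stmt)
    for key in ("Action", "NotAction", "Resource", "NotResource"):
        if key in out:
            out[key] = _as_set(out[key])
    for key in ("Principal", "NotPrincipal"):
        if key in out:
            out[key] = _normalize_principal(out[key])
    return out


def _statements(policy_json):
    statements = json.loads(policy_json).get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    return [_normalize_statement(s) for s in statements]


def policies_match_ignoring_order(sent_json, returned_json):
    """True if both policies contain the same statements, ignoring list order."""
    sent = _statements(sent_json)
    remaining = _statements(returned_json)
    if len(sent) != len(remaining):
        return False
    for stmt in sent:
        if stmt not in remaining:
            return False
        remaining.remove(stmt)
    return True
```

Because the lists are converted to sets, this sketch also ignores repeated entries within a list, which may or may not be what a real fix should do.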

flyingbeefhead commented 1 month ago

I have attempted to run the same pipeline on multiple versions of the AWS provider in the 5.x series, the 4.x series, and 3.76.1. There is no difference. This issue appears to be focused on the comparison of the policy pulled from AWS and the policy created in the plan file, even after taking the policy in the plan file, pushing it to AWS, and then pulling it back to validate that it has been updated.

flyingbeefhead commented 3 weeks ago

What is the status of this ticket? Is there additional information you need? If you can point to where awspolicy.PoliciesAreEquivalent() is defined, I would be happy to help look at it, since it has been 2 weeks. We need to get this working.

thomasmuders-imtf commented 3 days ago

Hi all,

EDIT: I found the issue. The provider doesn't notice if you have duplicate principal entries in the terraform code and then it will never finish. Maybe the original poster has the same issue? AWS will just de-duplicate them and then it probably counts them differently than Terraform.

I came across this issue as I am facing the problem as well. Apparently it is triggered by a certain number of principals in the policy, as it was originally working and broke only later, after I added more principals in the code. To me it also seems that the code expects the same order, but that is clearly not preserved by AWS. You can see this easily in the AWS console: you update the policy, save it, and get the principals back in a random order. Just adding this to indicate that the original poster is not alone in experiencing this issue. AWS provider version 5.75.1.
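For anyone who wants to check their own policy for the duplicate-principal case described in the EDIT of this comment, a small Python sketch follows. It is a standalone helper, not part of Terraform or the provider; the function name `find_duplicate_principals` is made up for illustration, and it assumes the rendered policy JSON is available as a string.

```python
import json
from collections import Counter


def find_duplicate_principals(policy_json):
    """Return {(sid_or_index, principal_type): [duplicated entries]}.

    Statements without a Sid are keyed by their position in the policy.
    """
    statements = json.loads(policy_json).get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    found = {}
    for index, stmt in enumerate(statements):
        principal = stmt.get("Principal")
        if not isinstance(principal, dict):
            continue  # "*" or no Principal at all
        for kind, ids in principal.items():
            if isinstance(ids, str):
                continue  # a single entry cannot be duplicated
            dupes = sorted(v for v, n in Counter(ids).items() if n > 1)
            if dupes:
                found[(stmt.get("Sid", index), kind)] = dupes
    return found
```

An empty result means no principal is listed twice within any one statement; it says nothing about ordering or ARN rewriting.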

flyingbeefhead commented 3 days ago

I have done a lot more debugging on this and also found that AWS changes some of the ARNs that are submitted. I had a number of STS ARNs in my Principals lists, and AWS changed one of them to an IAM ARN when it was returned to be validated. I have no idea why AWS changed only one of them and left the rest. This caused failures for me as well. The only thing that changed was sts to iam in the ARN.
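A rough way to spot this kind of rewrite is to compare the submitted and returned principal lists and pair up entries that differ only in the service field of the ARN. The Python sketch below does that. It assumes only the general ARN layout `arn:partition:service:region:account-id:resource`; the function names are hypothetical, and it makes no claim about when or why AWS performs such a rewrite.

```python
def _arn_without_service(arn):
    """Split an ARN and return every field except the service, or None."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        return None
    # partition, region, account id, resource
    return (parts[1], parts[3], parts[4], parts[5])


def service_only_changes(sent, returned):
    """Pair principals that differ only in the ARN service field.

    Returns a list of (sent_arn, returned_arn) tuples.
    """
    sent_only = set(sent) - set(returned)
    returned_only = set(returned) - set(sent)
    by_key = {}
    for arn in returned_only:
        key = _arn_without_service(arn)
        if key is not None:
            by_key[key] = arn
    pairs = []
    for arn in sorted(sent_only):
        key = _arn_without_service(arn)
        if key is not None and key in by_key:
            pairs.append((arn, by_key[key]))
    return pairs
```

Any pair it reports is a principal that would make a strict string comparison fail even though every other field of the ARN is identical.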
