aws-solutions / aws-control-tower-customizations

The Customizations for AWS Control Tower solution combines AWS Control Tower and other highly-available, trusted AWS services to help customers more quickly set up a secure, multi-account AWS environment using AWS best practices.
https://docs.aws.amazon.com/controltower/latest/userguide/cfct-overview.html
Apache License 2.0

v2.5.2 - destroyed KMS key policy #153

Closed mbonig closed 1 year ago

mbonig commented 1 year ago

Describe the bug

Because of the CodeBuild deprecation warnings, I updated the CTC stack to v2.5.2 on March 8th. After this, all of our Control Tower CloudTrail logs stopped being written to the bucket. Reviewing the KMS key, I can see that during the deploy the CTC custom resource updated the key policy, overwriting the existing one, which had allowed Control Tower and other systems to use the key.

The CloudTrail record (abbr. and redacted):

"requestParameters": {
        "keyId": "5a0fc8fb-38db-4a5b-914b-redacted",
        "policyName": "default",
        "policy": "{\"Version\": \"2012-10-17\", \"Statement\": [{\"Action\": [\"kms:Create*\", \"kms:Describe*\", \"kms:Enable*\", \"kms:List*\", \"kms:Put*\", \"kms:Update*\", \"kms:Revoke*\", \"kms:Disable*\", \"kms:Get*\", \"kms:Delete*\", \"kms:ScheduleKeyDeletion\", \"kms:CancelKeyDeletion\"], \"Resource\": \"*\", \"Effect\": \"Allow\", \"Principal\": {\"AWS\": \"arn:aws:iam::redacted:root\"}, \"Sid\": \"Allow administration of the key\"}, {\"Action\": [\"kms:Encrypt\", \"kms:Decrypt\", \"kms:ReEncrypt*\", \"kms:GenerateDataKey*\", \"kms:DescribeKey\"], \"Resource\": \"*\", \"Effect\": \"Allow\", \"Principal\": {\"Service\": [\"events.amazonaws.com\"], \"AWS\": [\"arn:aws:iam::redacted:role/CustomControlTowerStateMachineLambdaRole\", \"arn:aws:iam::redacted:role/CustomControlTowerDeploymentLambdaRole\", \"arn:aws:iam::redacted:role/CustomControlTowerCodePipelineRole\", \"arn:aws:iam::redacted:role/control-tower-customizati-CustomControlTowerCodeBu-F5AEY217LUM1\", \"arn:aws:iam::redacted:role/control-tower-customizations-SCPCodeBuildRole-XMZ81ER4L3VF\", \"arn:aws:iam::redacted:role/control-tower-customizations-StackSetCodeBuildRole-1BLP2MJ5ROX7A\", \"arn:aws:iam::redacted:role/CustomControlTowerLELambdaRole\"]}, \"Sid\": \"Allow use of the key\"}], \"Id\": \"key-CustomControlTower-1\"}",
        "bypassPolicyLockoutSafetyCheck": true
    },

Notice here that the only principals with access are events.amazonaws.com and the CTC role ARNs. Removing the other principals and statements broke multiple systems.
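A quick way to confirm which principals a key policy grants access to is to parse the policy document and list the principals per statement. The sketch below is only an illustration: `principals_by_sid`, `has_service`, and `SAMPLE_POLICY` are hypothetical names, the sample is an abbreviated stand-in for the policy in the record above, and the account ID is a placeholder.

```python
import json


def principals_by_sid(policy_json: str) -> dict[str, list[str]]:
    """Map each statement's Sid to a sorted list of 'Type:value' principals."""
    statements = json.loads(policy_json).get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    result = {}
    for index, statement in enumerate(statements):
        sid = statement.get("Sid", f"statement-{index}")
        principal = statement.get("Principal", {})
        entries = []
        if isinstance(principal, str):  # e.g. "*"
            entries.append(principal)
        else:
            for kind, values in principal.items():
                if isinstance(values, str):
                    values = [values]
                entries.extend(f"{kind}:{value}" for value in values)
        result[sid] = sorted(entries)
    return result


def has_service(policy_json: str, service: str) -> bool:
    """Return True if any statement names the given service principal."""
    wanted = f"Service:{service}"
    return any(wanted in entries for entries in principals_by_sid(policy_json).values())


# Abbreviated stand-in for the policy above; 111122223333 is a placeholder account ID.
SAMPLE_POLICY = json.dumps({
    "Version": "2012-10-17",
    "Id": "key-CustomControlTower-1",
    "Statement": [
        {
            "Sid": "Allow administration of the key",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Action": ["kms:Create*", "kms:Describe*"],
            "Resource": "*",
        },
        {
            "Sid": "Allow use of the key",
            "Effect": "Allow",
            "Principal": {
                "Service": ["events.amazonaws.com"],
                "AWS": ["arn:aws:iam::111122223333:role/CustomControlTowerCodePipelineRole"],
            },
            "Action": ["kms:Encrypt", "kms:Decrypt"],
            "Resource": "*",
        },
    ],
})

if __name__ == "__main__":
    for sid, entries in principals_by_sid(SAMPLE_POLICY).items():
        print(sid, entries)
    print("CloudTrail allowed:", has_service(SAMPLE_POLICY, "cloudtrail.amazonaws.com"))
```

Run against the full policy string from the CloudTrail record, a check like this would make it obvious that no CloudTrail service principal is present.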

To restore our systems, I first tried using AWS Config to show previous configuration versions, but the recorder wasn't enabled for our root account. Instead, I had to go to CloudTrail, look for the previous PutKeyPolicy event on the key, and re-apply that policy to get things restored.
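The recovery described above can be sketched roughly as follows. This is not the exact procedure used here: `previous_policy` and `fetch_put_key_policy_events` are hypothetical helper names, and the sketch assumes the `keyId` in `requestParameters` has the same form (ID or ARN) as the value you pass in. Note that CloudTrail event history lookups only reach back 90 days.

```python
import json
from datetime import datetime, timezone


def previous_policy(events, key_id, before):
    """Return the policy string from the latest PutKeyPolicy event for
    key_id that happened strictly before `before`, or None if there is none.

    `events` is a list of dicts shaped like CloudTrail LookupEvents results:
    each has an 'EventTime' datetime and a 'CloudTrailEvent' JSON string.
    """
    candidates = []
    for event in events:
        if event["EventTime"] >= before:
            continue
        detail = json.loads(event["CloudTrailEvent"])
        params = detail.get("requestParameters") or {}
        if detail.get("eventName") == "PutKeyPolicy" and params.get("keyId") == key_id:
            candidates.append((event["EventTime"], params.get("policy")))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]


def fetch_put_key_policy_events(region):
    """Collect PutKeyPolicy events from CloudTrail event history (needs boto3 and credentials)."""
    import boto3  # imported lazily so the pure helper above works without it

    client = boto3.client("cloudtrail", region_name=region)
    paginator = client.get_paginator("lookup_events")
    events = []
    for page in paginator.paginate(
        LookupAttributes=[{"AttributeKey": "EventName", "AttributeValue": "PutKeyPolicy"}]
    ):
        events.extend(page["Events"])
    return events
```

The returned policy string could then be reviewed and, if it is the one you want, re-applied with the KMS `put_key_policy` call using the policy name `default`.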

To Reproduce

Our previous version of the CTC stack was deployed on 2022-04-28 and was likely version 2.2.0 or 2.3.0. We updated CT to version 3.0 after that deploy and before upgrading CTC to 2.5.2. Deploying version 2.5.2 changed the KMS key policy, which stopped the organization trail from writing to the bucket and broke some outside integrations that read files from that bucket.

Expected behavior

I expected the KMS key policy to remain intact, accounting for changes and for the other principals and services that needed access to it, rather than being completely overwritten.

adam-daily commented 1 year ago

Hey Matthew, thanks for reaching out to us about this. Just a couple questions for you so I can make sure I understand what's going on here:

  1. Did you add anything directly to the KMS key policy at any point before doing this upgrade? When the CloudFormation stack that represents CfCT updates, it will impose the KMS key policy details that are written in the CfCT template. Any changes under the hood (not through the CloudFormation template) will be overwritten. That's a behavior of CloudFormation rather than of CfCT specifically.

  2. Can you confirm that there is an event in the CloudFormation stack that holds CfCT that shows an update to the resource called CustomControlTowerConfigDeployer?
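Since a stack update replaces the key policy with what the template defines, one way to spot out-of-band statements before upgrading is to compare the live policy against the template's policy. The following is a minimal sketch, assuming both policies are available as JSON strings; `statements_outside_template` is a hypothetical helper, not part of CfCT or CloudFormation.

```python
import json


def _statements(policy_json):
    """Return the policy's statements as a list, wrapping a single statement."""
    statements = json.loads(policy_json).get("Statement", [])
    return [statements] if isinstance(statements, dict) else statements


def statements_outside_template(live_policy_json, template_policy_json):
    """Return live statements with no identical counterpart in the template.

    Comparison is exact after sorting object keys, so a statement whose
    action or principal list is merely reordered is reported as different.
    """
    template = {json.dumps(s, sort_keys=True) for s in _statements(template_policy_json)}
    return [s for s in _statements(live_policy_json) if json.dumps(s, sort_keys=True) not in template]
```

Any statement this returns would be lost on the next stack update unless it is added to the template or moved to a separate key.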

mbonig commented 1 year ago

> 1. Did you add anything directly to the KMS key policy at any point before doing this upgrade? When the CloudFormation stack that represents CfCT updates, it will impose the KMS key policy details that are written in the CfCT template. Any changes under the hood (not through the CloudFormation template) will be overwritten. That's a behavior of CloudFormation rather than of CfCT specifically.

Yes, there were other statements in the policy, but they weren't just additional statements we added, but also statements required for Control Tower/CloudTrail to work. I can send the entire policy that was in place beforehand if that helps.

> 2. Can you confirm that there is an event in the CloudFormation stack that holds CfCT that shows an update to the resource called CustomControlTowerConfigDeployer?

Yes, this is how I originally discovered that it was CfCT that destroyed the key policy. (I actually backtracked it from a CloudTrail event to the Lambda function to the stack's custom resource.)

adam-daily commented 1 year ago

Hey Matthew, appreciate you getting back to me. Given those factors, I think AWS Premium Support will be better equipped to help you out here. That team has better insight into your account and can see details about your KMS key policies and CloudFormation history that I'm not able to see from my side (and shouldn't view for security reasons). When you reach out to them, feel free to link this issue/correspondence to get them up to speed about what you're experiencing.

mbonig commented 1 year ago

> Hey Matthew, appreciate you getting back to me. Given those factors, I think AWS Premium Support will be better equipped to help you out here. That team has better insight into your account and can see details about your KMS key policies and CloudFormation history that I'm not able to see from my side (and shouldn't view for security reasons). When you reach out to them, feel free to link this issue/correspondence to get them up to speed about what you're experiencing.

What would you like to see? I have been able to review the logs of both my CFN stack and the lambda function that backs the CR in the stack. I can see in CloudTrail logs that doing the update to v2.5.2 of CfCT destroyed the existing KMS key policy.

After reviewing the configuration, I believe this problem occurred because our Control Tower setup is improperly using the CfCT KMS key for its overall encryption, and it shouldn't be.

Closing this ticket. Thanks for the assistance.