aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0
11.66k stars 3.92k forks

aws-redshift-alpha: User creation fails when using KMS key created from key ARN #27350

Open SamuelBucheliZ opened 1 year ago

SamuelBucheliZ commented 1 year ago

Describe the bug

When creating a user with the Redshift Alpha module, an encryption key can be provided (see https://docs.aws.amazon.com/cdk/api/v2/docs/aws-redshift-alpha-readme.html#creating-users and https://docs.aws.amazon.com/cdk/api/v2/docs/@aws-cdk_aws-redshift-alpha.User.html ).

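We have observed that the behavior differs depending on how the encryption key is created:

- If the key is created directly with the initializer / constructor (aws_kms.Key), user creation succeeds.
- If the key is imported from its ARN with fromKeyArn / from_key_arn, user creation fails.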

Expected Behavior

We would expect the user creation to succeed in the case of using an externally defined key the same way as it does for a key created directly with the initializer / constructor. The type signature of the User initializer / constructor specifies IKey as the expected type, and fromKeyArn / from_key_arn provides an object of type IKey. Therefore, we would expect it to work for any implementation of IKey provided by CDK.

It seems that CDK sets the correct permissions for the key created directly with the initializer / constructor, but does not do so for the imported key?

Current Behavior

The required permissions do not appear to be set on the KMS key. Creating the test stack (see Reproduction Steps) fails with the following error:

TestStack | 50/52 | 9:40:03 AM | CREATE_FAILED        | Custom::RedshiftDatabaseQuery               | TestRedshiftUser/Resource/Resource/Default (TestRedshiftUser7D35710B) Received response status [FAILED] from custom resource. Message returned: Access to KMS is not allowed

Logs: /aws/lambda/TestStack-QueryRedshiftDatabase3de5bea727da4796866-Y212ZkpZBkHh 

    at throwDefaultError (/var/runtime/node_modules/@aws-sdk/smithy-client/dist-cjs/default-error-handler.js:8:22)
    at deserializeAws_json1_1GetSecretValueCommandError (/var/runtime/node_modules/@aws-sdk/client-secrets-manager/dist-cjs/protocols/Aws_json1_1.js:580:51)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async /var/runtime/node_modules/@aws-sdk/middleware-serde/dist-cjs/deserializerMiddleware.js:7:24
    at async /var/runtime/node_modules/@aws-sdk/middleware-signing/dist-cjs/middleware.js:13:20
    at async StandardRetryStrategy.retry (/var/runtime/node_modules/@aws-sdk/middleware-retry/dist-cjs/StandardRetryStrategy.js:51:46)
    at async /var/runtime/node_modules/@aws-sdk/middleware-logger/dist-cjs/loggerMiddleware.js:6:22
    at async getPasswordFromSecret (/var/task/user.js:60:25)
    at async createUser (/var/task/user.js:36:22)
    at async handler (/var/task/user.js:14:9) (RequestId: f4d049d3-ad7e-49f4-9cb8-c784389b4f50)
TestStack | 50/52 | 9:40:04 AM | CREATE_FAILED        | AWS::CloudFormation::Stack                  | TestStack The following resource(s) failed to create: [TestRedshiftUser7D35710B]. 

Ideally, the CDK code should set the correct permissions on the key (which it seems to do in the case the key is not created from the ARN).

Reproduction Steps

The following test stack demonstrates the issue. Note that for the sake of having a minimal example here, we first create the secret_encryption_key and then use its ARN to create the key_from_arn. Of course, in this example we could directly use the secret_encryption_key (and then everything would work). However, in the real case this is based on, we only have the ARN to work with, i.e., the key will always be created using the from_key_arn method.

from aws_cdk import Stack, RemovalPolicy
from aws_cdk import aws_ec2, aws_iam, aws_kms, aws_redshift_alpha
from constructs import Construct

class TestStack(Stack):
    def __init__(
            self,
            scope: Construct,
            construct_id: str,
            **kwargs,
    ) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # --------- general setup: VPC, security groups ---------
        vpc = aws_ec2.Vpc(
            self,
            'TestVpc',
            ip_addresses=aws_ec2.IpAddresses.cidr("192.168.99.0/16")
        )
        vpc_subnets = vpc.select_subnets(subnet_type=aws_ec2.SubnetType.PUBLIC)

        redshift_security_group = aws_ec2.SecurityGroup(
            self,
            'TestRedshiftSecurityGroup',
            vpc=vpc,
            allow_all_outbound=True,
        )

        # --------- set up KMS keys ---------
        secret_encryption_key = aws_kms.Key(
            self,
            'TestSecretEncryptionKey',
        )

        # in the real example, we don't have direct access to secret_encryption_key, but only to its ARN
        key_from_arn = aws_kms.Key.from_key_arn(
            self,
            "TestSecretEncryptionKeyInterface",
            secret_encryption_key.key_arn,
        )

        cluster_encryption_key = aws_kms.Key(
            self,
            'TestClusterEncryptionKey',
        )

        # --------- setup Redshift ---------
        cluster_role = aws_iam.Role(
            self,
            'TestClusterRole',
            assumed_by=aws_iam.ServicePrincipal('redshift.amazonaws.com'),
        )

        database_name = 'test-db-name'
        cluster = aws_redshift_alpha.Cluster(
            self,
            'TestClusterIdentifier',
            master_user=aws_redshift_alpha.Login(
                master_username='admin',
                encryption_key=key_from_arn
            ),
            vpc=vpc,
            cluster_type=aws_redshift_alpha.ClusterType.MULTI_NODE,
            default_database_name=database_name,
            encrypted=True,
            encryption_key=cluster_encryption_key,
            enhanced_vpc_routing=True,
            node_type=aws_redshift_alpha.NodeType.RA3_XLPLUS,
            number_of_nodes=2,
            publicly_accessible=False,
            removal_policy=RemovalPolicy.DESTROY,
            roles=[cluster_role],
            security_groups=[redshift_security_group],
            vpc_subnets=aws_ec2.SubnetSelection(
                subnets=vpc_subnets.subnets),
        )

        # --------- create a user - the following part causes the problem ---------

        # here, we use the key_from_arn, which we created with from_key_arn
        # if we use the secret_encryption_key, the issue does not occur
        test_user = aws_redshift_alpha.User(
            self,
            'TestRedshiftUser',
            encryption_key=key_from_arn,
            username='test_user',
            cluster=cluster,
            database_name=database_name,
        )

Possible Solution

Unknown. The creation of the user works if secret_encryption_key is used instead of key_from_arn above. However, this is not possible in our setup (the secret_encryption_key is created elsewhere and we only have the ARN available). It is unclear whether there is a problem with the key construct itself or whether there is a problem with the way the key is handled in the Redshift Alpha user construct.
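
The stack trace shows the failure inside GetSecretValue, which suggests that the handler's role lacks kms:Decrypt on the key that encrypts the secret. If that reading is right, and if whoever owns the key can edit its key policy, a statement along the following lines might unblock the deployment. This is only a sketch under those assumptions: both ARNs are placeholders, the helper names are made up for illustration, and it has not been verified against this stack.

```python
import json


def region_from_key_arn(key_arn: str) -> str:
    """Return the region of an ARN shaped like arn:partition:kms:region:account:key/id."""
    parts = key_arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "kms":
        raise ValueError(f"not a KMS key ARN: {key_arn!r}")
    return parts[3]


def decrypt_via_secrets_manager_statement(key_arn: str, role_arn: str) -> dict:
    """Build a key policy statement letting role_arn decrypt through Secrets Manager only."""
    region = region_from_key_arn(key_arn)
    return {
        "Sid": "AllowDecryptViaSecretsManager",
        "Effect": "Allow",
        "Principal": {"AWS": role_arn},
        "Action": "kms:Decrypt",
        # In a key policy, "*" refers to the key the policy is attached to.
        "Resource": "*",
        "Condition": {
            "StringEquals": {
                "kms:ViaService": f"secretsmanager.{region}.amazonaws.com"
            }
        },
    }


# Placeholder ARNs: substitute the real key ARN and the role of the
# Lambda function that backs the Custom::RedshiftDatabaseQuery resource.
example_key_arn = "arn:aws:kms:eu-central-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"
example_role_arn = "arn:aws:iam::111122223333:role/ExampleQueryHandlerRole"

statement = decrypt_via_secrets_manager_statement(example_key_arn, example_role_arn)
print(json.dumps(statement, indent=2))
```

The kms:ViaService condition limits the grant to decryption requests made through Secrets Manager. For a key that lives in a different account, the role would presumably also need a matching identity-based policy.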

Additional Information/Context

Overall, this is probably somewhat related to discussions around the Liskov substitution principle ( https://en.wikipedia.org/wiki/Liskov_substitution_principle ).

We have observed similar issues with interfaces in CDK previously. For example, a Redshift cluster created with fromClusterAttributes provides an ICluster, but cannot be used interchangeably with an actual Cluster, even though most methods only specify ICluster in their types.

As far as we know, there may be some technical reasons for this (e.g., synth time vs. deploy time issues). However, this can make maintaining a larger CDK code base somewhat cumbersome, as it makes relying on typical software engineering best practices (e.g., the usage of interfaces) somewhat tricky. Therefore, in cases where this is not avoidable, it may be desirable to adjust the types used in the various methods (e.g., if only a specific implementation works, the method should only accept this specific implementation) or to at least further clarify this in the documentation.
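
The substitutability concern can be illustrated with a toy model in plain Python. It does not use or describe CDK internals: the classes IKey, OwnedKey and ImportedKey and the function create_user are hypothetical stand-ins, and the idea that an imported key silently ignores a grant is only an assumption made for the sake of the example.

```python
from abc import ABC, abstractmethod


class IKey(ABC):
    """Hypothetical stand-in for an interface such as IKey (not CDK code)."""

    @abstractmethod
    def grant_decrypt(self, principal: str) -> bool:
        """Return True if a permission was actually recorded."""


class OwnedKey(IKey):
    """Models a key defined in this stack, whose policy can be edited."""

    def __init__(self) -> None:
        self.policy: list[str] = []

    def grant_decrypt(self, principal: str) -> bool:
        self.policy.append(principal)
        return True


class ImportedKey(IKey):
    """Models a key known only by its ARN, whose policy lives elsewhere."""

    def __init__(self, arn: str) -> None:
        self.arn = arn

    def grant_decrypt(self, principal: str) -> bool:
        # Assumption: nothing can be written here, so the grant is a no-op.
        return False


def create_user(key: IKey, handler_role: str) -> str:
    """Accepts any IKey by type, but only succeeds if the grant took effect."""
    if not key.grant_decrypt(handler_role):
        return "FAILED: Access to KMS is not allowed"
    return "CREATED"


owned_result = create_user(OwnedKey(), "handler-role")
imported_result = create_user(
    ImportedKey("arn:aws:kms:eu-central-1:111122223333:key/example"),
    "handler-role",
)
print(owned_result)
print(imported_result)
```

Both calls type-check against IKey, yet only the first one succeeds. A narrower parameter type, or a grant that fails loudly at synth time, would surface the mismatch earlier than a failed deployment does.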

CDK CLI Version

2.95.0 (build cfa7e88)

Framework Version

2.96.2

Node.js Version

v18.17.1

OS

various (locally tested on Windows and macOS, in CI/CD pipeline with Linux images)

Language

Python

Language Version

3.10.13

Other information

No response

indrora commented 1 year ago

This is the result of the CDK role not being able to update KMS keys like this. The behavior is correct: the Lambda function that is being created probably doesn't have the right permission to update the key. It also probably shouldn't be using an imported key?

potu-srikanth commented 1 year ago

Further to what @SamuelBucheliZ mentioned, we also see similar behaviour when an "ICluster" (created with "from_cluster_attributes") is used for user creation (aws_redshift_alpha.User) instead of a "Cluster".