aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

core: permissions boundary not being applied to custom resource role #30179

Open adamtimmins opened 1 month ago

adamtimmins commented 1 month ago

Describe the bug

I'm deploying a stack through CDK pipelines and have a permissions boundary configured within cdk.json. Every role is being configured app wide with the permissions boundary apart from one which seems to be created by CDK itself for my AwsCustomResource.

Expected Behavior

I expect the permissions boundary to be applied app-wide, without missing any role deployed by the CDK application.

Current Behavior

The permissions boundary is not being added to the role that CDK itself creates.

Reproduction Steps

The stack is deployed through CDK Pipelines using bootstrapped roles with a custom qualifier, in an environment where the permissions boundary is required.

synth_object = cdk.DefaultStackSynthesizer(
    qualifier=config["cdk_synth_qualifier"],
)

AwsCustomResource

response = AwsCustomResource(
    self,
    "describe-enis",
    on_update={
        "service": "EC2",
        "action": "describeNetworkInterfaces",
        "output_paths": output_paths,
        "parameters": {"NetworkInterfaceIds": eni_ids},
        "physical_resource_id": PhysicalResourceId.of(str(random.random())),
    },
    policy=AwsCustomResourcePolicy.from_statements(
        statements=[
            iam.PolicyStatement(
                effect=iam.Effect.ALLOW,
                actions=["ec2:DescribeNetworkInterfaces"],
                resources=["*"],
            ),
        ],
    ),
)

The role that is not being given the permissions boundary is AWSCDKCfnUtilsProviderCustomResourceProviderRoleFE0EE867:

"AWSCDKCfnUtilsProviderCustomResourceProviderRoleFE0EE867": {
   "Type": "AWS::IAM::Role",
   "Properties": {
    "AssumeRolePolicyDocument": {
     "Version": "2012-10-17",
     "Statement": [
      {
       "Action": "sts:AssumeRole",
       "Effect": "Allow",
       "Principal": {
        "Service": "lambda.amazonaws.com"
       }
      }
     ]
    },
    "ManagedPolicyArns": [
     {
      "Fn::Sub": "arn:${AWS::Partition}:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
     }
    ]
   },
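
A quick way to confirm which roles were synthesized without the boundary is to scan the template itself. The following is a minimal sketch, not part of the original report: the function name `roles_missing_boundary` and the trimmed-down `sample` template (including the `AppRole1234ABCD` logical ID and the account number) are hypothetical, and it assumes the synthesized template has already been loaded into a dict, for example with `json.load` from the template file under cdk.out.

```python
def roles_missing_boundary(template: dict) -> list[str]:
    """Return logical IDs of AWS::IAM::Role resources lacking PermissionsBoundary."""
    missing = []
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::IAM::Role":
            continue
        properties = resource.get("Properties", {})
        if "PermissionsBoundary" not in properties:
            missing.append(logical_id)
    return missing


# Hypothetical, trimmed-down template used only for illustration.
sample = {
    "Resources": {
        "AppRole1234ABCD": {  # made-up logical ID
            "Type": "AWS::IAM::Role",
            "Properties": {
                "PermissionsBoundary": "arn:aws:iam::111111111111:policy/cdk-permissions-boundary-policy"
            },
        },
        "AWSCDKCfnUtilsProviderCustomResourceProviderRoleFE0EE867": {
            "Type": "AWS::IAM::Role",
            "Properties": {"ManagedPolicyArns": []},
        },
        "SomeBucket": {"Type": "AWS::S3::Bucket"},
    }
}

if __name__ == "__main__":
    print(roles_missing_boundary(sample))
```

Running a check like this against each synthesized template gives a list of roles to investigate before the deployment is attempted.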

The cdk.json config

"@aws-cdk/core:permissionsBoundary": {
      "name": "cdk-permissions-boundary-policy"
    }

Possible Solution

No response

Additional Information/Context

I have tried adding the permission boundary to the stack itself, as well as the custom resource itself following the documentation here: https://docs.aws.amazon.com/cdk/api/v1/python/aws_cdk.aws_iam/README.html#permissions-boundaries

I have also tried creating a custom aspect and adding it to the stack and the stage, but none of these worked either: https://github.com/aws/aws-cdk/issues/3242#issuecomment-561064190

CDK CLI Version

2.141.0

Framework Version

No response

Node.js Version

v22.1.0

OS

Sonoma 14.2.1

Language

Python

Language Version

3.12.3

Other information

No response

khushail commented 1 month ago

Hi @adamtimmins , thanks for reaching out. It seems like what you are mentioning here is quite similar to the bug described here and this reasoning and further explanation might be helpful to understand why.

Please feel free to reach out if it's not helpful or if it differs from what you are describing.

github-actions[bot] commented 1 month ago

This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

dliu864 commented 1 month ago

Hi,

Could we please have this bug fixed? Our organization is requiring that we have permission boundaries implemented on all Roles that we create and there is currently no way to add a boundary to AWSCDKCfnUtilsProviderCustomResourceProviderRoleFE0EE867 as mentioned above.

khushail commented 1 month ago

@adamtimmins, could you please share the complete repro code?

I also see there is a closed issue similar to the custom role mentioned above - https://github.com/aws/aws-cdk/issues/22972 and many more (https://github.com/aws/aws-cdk/issues/13310) and previous attempts have been made for such similar issues like this PR - https://github.com/aws/aws-cdk/pull/14754. However this still seems like an issue so I am marking this as P1 for the appropriate traction.

dliu864 commented 1 month ago

Hi Khurana,

I have added you to my repo so you can reproduce the bug. Please let me know how it goes.

Cheers,

David

adamtimmins commented 1 month ago

@khushail I have attempted the workarounds you linked, but none of them seem to work.

I have raised the issue and shared the code with AWS premium support and when I get a response I'll share it here.

mjvirt commented 1 month ago

It is also possible to patch using the Aspect as follows given that the type of object for the role is a CfnResource and it has a node path (example using Custom Resource to delete S3 bucket objects):

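import jsii
from aws_cdk import CfnResource, IAspect
from aws_cdk import aws_iam as iam


@jsii.implements(IAspect)
class IamPathFixer:

    def visit(self, node) -> None:
        # L1 IAM roles: set the IAM path directly
        if isinstance(node, iam.CfnRole):
            node.add_property_override("Path", "/approles/")
        # Role created by the S3 auto-delete custom resource provider
        elif isinstance(node, CfnResource):
            if "Custom::S3AutoDeleteObjectsCustomResourceProvider/Role" in node.node.path:
                node.add_property_override("Path", "/approles/")

adamtimmins commented 1 month ago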

@mjvirt have you tried using this solution when deploying the stack via CDK Pipelines? I've tried using a custom aspect before but it didn't work.

Also, can you please explain this line: add_property_override("Path", "/approles/")?

mjvirt commented 1 month ago

@adamtimmins actually above was just an example of getting to the resource you want to "patch" with an Aspect (in this case I assign an IAM path to isolate the "app" workload from other workloads). But you could equally do this as well (as an example)...

        elif isinstance(node, CfnResource) and node.cfn_resource_type == "AWS::IAM::Role":
            if re.match(r".+/Custom::.+CustomResourceProvider/Role$", node.node.path):
                node.add_property_override("PermissionsBoundary", f"arn:aws:iam::{node.stack.account}:policy/cdk-{qualifier}-customresource-permissions-boundary-{node.stack.account}-{node.stack.region}")
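
One caveat with the pattern above, sketched here under assumptions rather than taken from anyone's working code: the regular expression requires a `Custom::` segment in the node path. The auto-delete provider path quoted earlier contains one, but the path reported later in this thread for the AWSCDKCfnUtilsProvider role does not, so this pattern alone would skip that role even if the Aspect visited it. The stack name `SampleStack` and the helper name are hypothetical.

```python
import re

# Same pattern as in the snippet above.
PATTERN = re.compile(r".+/Custom::.+CustomResourceProvider/Role$")


def matches_custom_provider_role(path: str) -> bool:
    """Return True if the construct path matches the custom-provider role pattern."""
    return PATTERN.match(path) is not None


# Hypothetical stack name; the provider segment is the one quoted earlier.
s3_path = "SampleStack/Custom::S3AutoDeleteObjectsCustomResourceProvider/Role"

# Path as reported later in this thread for the role in question.
cfn_utils_path = (
    "CdkPipelines/ToolingDeployment/tooling-ApiGateway/"
    "AWSCDKCfnUtilsProviderCustomResourceProvider/Role"
)

if __name__ == "__main__":
    print(matches_custom_provider_role(s3_path))
    print(matches_custom_provider_role(cfn_utils_path))
```

If the goal is to cover both kinds of role, matching on the `AWS::IAM::Role` resource type alone, without the path filter, may be the simpler condition.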

As for your comment on CDK pipelines. Although having used CDK pipelines in the past I can't recall combining an Aspect like this with the pipelines. I don't see why it wouldn't work but then I haven't tried it. In the case of CDK pipelines I suspect that scope of the aspect is everything. As in: the CDK pipeline itself couldn't be subject to the permissions boundary as this would mean that it (and by it I mean the pipeline role) would be subject to the same "rules" (permissions boundary) as the stacks it's trying to deploy. Hence, you would probably - at a guess - just apply the Aspect to the "app stacks".

If I ever get a chance to combine the two I will let you know of the outcome...

Morten

dliu864 commented 1 month ago

Hi Morten,

I printed every node in my stack with an Aspect; however, AWSCDKCfnUtilsProviderCustomResourceProviderRoleFE0EE867 did not get printed even though it is an AWS::IAM::Role and appears in the CloudFormation template. This suggests the role is not part of the construct tree that the Aspect visits and cannot be referenced from within CDK. It's similar to how CDKMetadata is a resource in the CloudFormation template but is not in the node tree and cannot be referenced.

David

adamtimmins commented 1 month ago

@mjvirt appreciate your explanation here; it confirms what I suspected as well. I have already tried applying a custom aspect and escape hatches to the stack as well as to the CDK pipeline, just to see if it makes a difference, and nothing did.

@dliu864 this is exactly what I'm seeing as well. I'm printing all the node paths too, and I'm not seeing AWSCDKCfnUtilsProviderCustomResourceProviderRoleFE0EE867.

imduchy commented 1 month ago

@adamtimmins I was actually running into the same issue yesterday, even posted a comment here but then deleted it as I found the issue.

I was applying Aspects inside the eks.Cluster's scope instead of directly at the Stack's scope. The problem was that the custom resources, which I assumed were part of the eks.Cluster's scope, were actually created one level higher, at the Stack's scope. Not sure if that applies to you too, but it's worth trying. Make sure you're applying your Aspects at the Stack's scope level.

mjvirt commented 1 month ago

Hi David, I don't recognise the name of that custom resource role (logical id). What I can tell you is that the example I provided found and patched the custom resource role (found as a CfnResource with cfn_resource_type=AWS::IAM::Role) underpinning the S3 bucket auto_delete_objects=True flag (logical id: SampleBucketAutoDeleteObjectsCustomResourceAC99DCF6). Perhaps mileage varies, but I have not yet encountered a scenario like yours.

The Aspect was "attached" at stack level with Aspects.of(sample_stack).add(IamRolePermissionsBoundary()) where IamRolePermissionsBoundary is the Aspect class that adds a Permissions Boundary to roles.

Morten

adamtimmins commented 4 weeks ago

@imduchy when I print all the node paths at the stack level, I'm still not seeing the ID for the role. The same is true when I print all the node paths from the Stage. I've also tried adding the Aspect at the stack level to apply the permissions boundary to all roles, but the boundary still does not get added.

@mjvirt thanks for the explanation, but I'm still not seeing the node path at the stack level.

adamtimmins commented 3 weeks ago

I've managed to isolate where exactly the role is being created.

AWSCDKCfnUtilsProviderCustomResourceProviderRoleFE0EE867 and its corresponding Lambda AWSCDKCfnUtilsProviderCustomResourceProviderHandlerCF82AA57 are created when you reference a certain construct value within AwsCustomResource.

In my case this happens when I reference vpc_endpoint_network_interface_ids, as in the example below.

AwsCustomResource(
    self,
    "describe-enis",
    on_update={
        "service": "EC2",
        "action": "describeNetworkInterfaces",
        "output_paths": output_paths,
        "parameters": {"NetworkInterfaceIds": vpc_endpoint.vpc_endpoint_network_interface_ids },
        "physical_resource_id": PhysicalResourceId.of(str(random.random())),
    },
    policy=AwsCustomResourcePolicy.from_statements(
        statements=[
            iam.PolicyStatement(
                effect=iam.Effect.ALLOW,
                actions=["ec2:DescribeNetworkInterfaces"],
                resources=["*"],
            ),
        ],
    ),
)

If I replace the parameters value in the AwsSdkCall with an arbitrary list of strings and synth, the role AWSCDKCfnUtilsProviderCustomResourceProviderRoleFE0EE867 and the Lambda AWSCDKCfnUtilsProviderCustomResourceProviderHandlerCF82AA57 disappear.

AwsCustomResource(
    self,
    "describe-enis",
    on_update={
        "service": "EC2",
        "action": "describeNetworkInterfaces",
        "output_paths": output_paths,
        "parameters": {"NetworkInterfaceIds": ["w/e"]},
        "physical_resource_id": PhysicalResourceId.of(str(random.random())),
    },
    policy=AwsCustomResourcePolicy.from_statements(
        statements=[
            iam.PolicyStatement(
                effect=iam.Effect.ALLOW,
                actions=["ec2:DescribeNetworkInterfaces"],
                resources=["*"],
            ),
        ],
    ),
)

I tried looking for what this Lambda does but couldn't find anything. It seems to sit entirely outside the context of the CDK app, which would explain why adding a permissions boundary with a custom aspect or an escape hatch doesn't work.

I'm not sure of a workaround, since the only way to make the custom resource work is to add the interface IDs manually, which defeats the point of my use case.

dliu864 commented 2 weeks ago

Hi khushail,

Just wondering if you were able to reproduce the bug with my repo code?

Cheers,

David

khushail commented 2 weeks ago

Hi @dliu864, apologies for the delay in getting back. The code is in Java and I am not very familiar with it, although I have been trying to repro it using TypeScript in my account.

I have marked the issue as P1 for the appropriate traction by the team as it has been reported by many customers.

dliu864 commented 2 weeks ago

I found some TypeScript CDK code which reproduces the same bug, if that helps:

Same issue found here: my organisation doesn't allow role creation without a boundary attached.

Or this ghost-role (AWSCDKCfnUtilsProviderCustomResourceProviderRoleFE0EE867) is not amendable...

Error:

9:36:27 AM | CREATE_FAILED        | AWS::IAM::Role                 | AWSCDKCfnUtilsProv...oviderRoleFE0EE867
Resource handler returned message: "Encountered a permissions error performing a tagging operation, please add required tag permissions. See https://repost.aws/knowledge-center/cloudformation-tagging-permission-error for how to resolve. Resource handler returned message: "User: arn:aws:sts::299703555757:assumed-role/cf-app-noc-deployer-role/AWSCloudFormation is not authorized to perform: iam:CreateRole on resource: arn:aws:iam::299703555757:role/noc-dmo-appsync-vpce-dev--AWSCDKCfnUtilsProviderCus-5WrO4F59gAcl because no identity-based policy allows the iam:CreateRole action (Service: Iam, Status Code: 403, Request ID: 51f6419b-9d7a-4611-b0be-912935de5f0d)"" (RequestToken: 348afcbd-35f2-ecef-c439-617cf13f7225, HandlerErrorCode: UnauthorizedTaggingOperation)

Stack:

export class VpceStack extends BaseStack {

    constructor(scope: Construct, id: string, props: BaseStackProps) {
        super(scope, id, props);

        const policies = new LambdaPolicies(this.locals);

        const defaultRole = new Role(this, this.id('lambda-basic-role'), {
            roleName: this.name('basic-role'),
            assumedBy: new ServicePrincipal('lambda.amazonaws.com'),
            managedPolicies: policies.managedPolicies(this),
            permissionsBoundary: this.project.defaultLambdaBoundary(),
            inlinePolicies: policies.inlinePolicies(this.project.cmk().keyArn),
        });

        // Retrieve VPC Restricted
        const vpcRestricted = Vpc.fromLookup(this, this.id("vpc-restricted"), {
            isDefault: false,
            vpcId: "vpc-02de2f0d152316a2b"
        });

        // Security groups
        const vpceAppsyncSG = new ec2.SecurityGroup(
            this, this.id("vpce-appsync-sg"),
            {
                vpc: vpcRestricted,
                securityGroupName: this.name("vpce-appsync-sg"),
                description: "Allow Access to AppSync",
                allowAllOutbound: true,
            },
        );

        // Create VPC Endpoint for AppSync with SG previously created
        const vpceAppSync = vpcRestricted.addInterfaceEndpoint(this.id("vpc-endpoint-appsync"),
            {
                service: ec2.InterfaceVpcEndpointAwsService.APP_SYNC,
                privateDnsEnabled: true,
                subnets: { subnetType: ec2.SubnetType.PRIVATE_ISOLATED },
                securityGroups: [vpceAppsyncSG],
            },
        );

        // find IpAddresses of VPCEndpoint attach to a subnet
        const getEndpointPrivateIpAddress = (index: number) => {
            const privateIpAddressField = `NetworkInterfaces.${index}.PrivateIpAddress`;
            const resource = new AwsCustomResource(this, `GetEndpointIp${index}`, {
                role: defaultRole,
                onUpdate: {
                    service: "EC2",
                    action: "describeNetworkInterfaces",
                    outputPaths: [privateIpAddressField],
                    parameters: {
                        NetworkInterfaceIds: vpceAppSync.vpcEndpointNetworkInterfaceIds,
                    },
                    physicalResourceId: PhysicalResourceId.of(privateIpAddressField),
                },
                policy: AwsCustomResourcePolicy.fromSdkCalls({
                    resources: AwsCustomResourcePolicy.ANY_RESOURCE,
                }),
            });
            return resource.getResponseField(privateIpAddressField);
        };

        this.output(this.id('EndpointPrivateIpAddress'), getEndpointPrivateIpAddress(0));
    }
}

Originally posted by @nocquidant in https://github.com/aws/aws-cdk/issues/22972#issuecomment-2066024383

adamtimmins commented 2 days ago

Had a back and forth with AWS premium support and was recommended the following solution, which worked for us (thanks Greg!).

The solution for us was to use iam.Role.customize_roles (https://docs.aws.amazon.com/cdk/api/v2/python/aws_cdk.aws_iam/Role.html#aws_cdk.aws_iam.Role.customize_roles) which allowed us to specify custom roles for the roles created by CDK. The only drawback here is that all roles in the stack itself will be replaced by whatever is specified in customize_roles. Any role not specified in customize_roles will not be found in your CloudFormation template on synth and will display an error.

First thing to do is synth your application as normal and take a note of the path of all the roles in the stack. You can find the path under "Metadata": { "aws:cdk:path": "CdkPipelines/ToolingDeployment/tooling-ApiGateway/AWSCDKCfnUtilsProviderCustomResourceProvider/Role" }.
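
Collecting those paths by hand gets tedious with more than a few roles. A minimal sketch of pulling them out of a synthesized template, assuming the template has been loaded into a dict and that each role carries the `aws:cdk:path` metadata shown above; the function name `role_paths`, the `AppRole1234ABCD` logical ID, and its path are hypothetical:

```python
def role_paths(template: dict) -> dict[str, str]:
    """Map logical ID to aws:cdk:path for every AWS::IAM::Role in the template."""
    paths = {}
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::IAM::Role":
            continue
        path = resource.get("Metadata", {}).get("aws:cdk:path")
        if path is not None:
            paths[logical_id] = path
    return paths


# Hypothetical, trimmed-down template used only for illustration.
sample_template = {
    "Resources": {
        "AWSCDKCfnUtilsProviderCustomResourceProviderRoleFE0EE867": {
            "Type": "AWS::IAM::Role",
            "Metadata": {
                "aws:cdk:path": "CdkPipelines/ToolingDeployment/tooling-ApiGateway/AWSCDKCfnUtilsProviderCustomResourceProvider/Role"
            },
        },
        "AppRole1234ABCD": {  # made-up role and path
            "Type": "AWS::IAM::Role",
            "Metadata": {"aws:cdk:path": "CdkPipelines/ToolingDeployment/tooling-ApiGateway/AppRole/Resource"},
        },
        "SomeBucket": {"Type": "AWS::S3::Bucket"},
    }
}

if __name__ == "__main__":
    for logical_id, path in role_paths(sample_template).items():
        print(f"{logical_id}: {path}")
```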

Then at the top of the stack place your customize_roles with the specified paths and your own custom created role. Your own roles have to be created outside of said stack.

iam.Role.customize_roles(
    self,
    prevent_synthesis=True,
    use_precreated_roles={
        "CdkPipelines/ToolingDeployment/tooling-ApiGateway/AWSCDKCfnUtilsProviderCustomResourceProvider/Role": "custom_resource_role_name",
        "CdkPipelines/ToolingDeployment/tooling-ApiGateway/AWS679f53fac002430cb0da5b7982bd2287/ServiceRole": "custom_resource_role_name",
    },
)

It's important to set prevent_synthesis to True, or else the CloudFormation template will still contain the CDK-created roles. It will, however, error out if a role is not addressed in customize_roles.
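
Since a role left out of the mapping causes an error, it can help to compare the role paths noted earlier against the mapping before synthesizing. A minimal sketch, assuming the mapping keys are written exactly as the aws:cdk:path values (as in the example above); the helper name `unmapped_role_paths` and the third path are hypothetical:

```python
def unmapped_role_paths(role_paths: list[str], use_precreated_roles: dict[str, str]) -> list[str]:
    """Return the role paths that have no entry in the use_precreated_roles mapping."""
    return [path for path in role_paths if path not in use_precreated_roles]


prefix = "CdkPipelines/ToolingDeployment/tooling-ApiGateway"

# Mapping as passed to customize_roles in the example above.
mapping = {
    f"{prefix}/AWSCDKCfnUtilsProviderCustomResourceProvider/Role": "custom_resource_role_name",
    f"{prefix}/AWS679f53fac002430cb0da5b7982bd2287/ServiceRole": "custom_resource_role_name",
}

# Paths collected from the synthesized template; the last one is made up.
paths_in_template = [
    f"{prefix}/AWSCDKCfnUtilsProviderCustomResourceProvider/Role",
    f"{prefix}/AWS679f53fac002430cb0da5b7982bd2287/ServiceRole",
    f"{prefix}/SomeOtherRole/Resource",
]

if __name__ == "__main__":
    print(unmapped_role_paths(paths_in_template, mapping))
```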

Hope this helps!