aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0
11.65k stars 3.91k forks

aws-lambda: unable to upload function code asset to cdk asset bucket #23249

Closed csmcallister closed 1 year ago

csmcallister commented 1 year ago

Describe the bug

I've got a few CDK apps written in v1 as well as a few written in v2. Today, when creating a new v2 CDK project from scratch, I am unable to deploy due to an error with uploading the CDK asset for the Lambda function code. The relevant error snippet is:

Bucket named 'cdk-<random-identifier>-assets-<account-id>-<region>' exists, but not in account <account-id>. Wrong account?

I assure you the account ids in the bucket name as well as the second half of the error message are the same.

Expected Behavior

I expected cdk deploy to use the CDK bootstrap resources (the staging and assets S3 buckets) after a successful cdk bootstrap.

Current Behavior

The full error of my cdk deploy command, with my personal info, account number and region redacted:

Do you wish to deploy these changes (y/n)? y
CdkBugStack: deploying...
[0%] start: Publishing 6db6c214750fd7227d61d7accfde57987808698c67cb93a0fbd0328d10fc2926:<account-id>-<region>
[0%] start: Publishing 81ab06eacb598337b1c4663146a3ef12f7b208990c3f0fa96f2692235a73fdca:<account-id>-<region>
[50%] fail: Bucket named 'cdk-<random-id>-assets-<account-id>-<region>' exists, but not in account <account-id>. Wrong account?
[100%] fail: Bucket named 'cdk-<random-id>-assets-<account-id>-<region>' exists, but not in account <account-id>. Wrong account?

 ❌  CdkBugStack failed: Error: Failed to publish one or more assets. See the error messages above for more information.
    at publishAssets (/Users/<user>/.nvm/versions/node/v16.10.0/lib/node_modules/aws-cdk/lib/util/asset-publishing.ts:60:11)
    at CloudFormationDeployments.publishStackAssets (/Users/<user>//.nvm/versions/node/v16.10.0/lib/node_modules/aws-cdk/lib/api/cloudformation-deployments.ts:572:7)
    at CloudFormationDeployments.deployStack (/Users/<user>//.nvm/versions/node/v16.10.0/lib/node_modules/aws-cdk/lib/api/cloudformation-deployments.ts:419:7)
    at deployStack2 (/Users/<user>//.nvm/versions/node/v16.10.0/lib/node_modules/aws-cdk/lib/cdk-toolkit.ts:264:24)
    at /Users/<user>//.nvm/versions/node/v16.10.0/lib/node_modules/aws-cdk/lib/deploy.ts:39:11
    at run (/Users/<user>//.nvm/versions/node/v16.10.0/lib/node_modules/p-queue/dist/index.js:163:29)

 ❌ Deployment failed: Error: Stack Deployments Failed: Error: Failed to publish one or more assets. See the error messages above for more information.
    at deployStacks (/Users/<user>//.nvm/versions/node/v16.10.0/lib/node_modules/aws-cdk/lib/deploy.ts:61:11)
    at CdkToolkit.deploy (/Users/<user>//.nvm/versions/node/v16.10.0/lib/node_modules/aws-cdk/lib/cdk-toolkit.ts:338:7)
    at initCommandLine (/Users/<user>//.nvm/versions/node/v16.10.0/lib/node_modules/aws-cdk/lib/cli.ts:364:12)

Stack Deployments Failed: Error: Failed to publish one or more assets. See the error messages above for more information.

Reproduction Steps

I created a whole new CDK v2 project from scratch to reproduce this. Here are those steps:

First, install deps and create the new project

mkdir cdk-bug
cd cdk-bug
node --version  # v16.10.0
npm install -g aws-cdk@2.53.0
cdk --version  # 2.53.0 (build 7690f43)
cdk init app --language typescript
export CDK_DEFAULT_ACCOUNT=<account-id>
export CDK_DEFAULT_REGION=<account-region>

In ./lib/cdk-bug-stack.ts I added a Lambda function using lambda.Code.fromAsset():

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as path from 'path';

export class CdkBugStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    new lambda.Function(this, 'CdkBugFunc', {
      runtime: lambda.Runtime.PYTHON_3_8,
      handler: 'main.handler',
      code: lambda.Code.fromAsset(path.join(__dirname, '..', 'app')),
    });
  }
}

It references some Python code in ./app/main.py:

def handler(event, context):
    return

And in ./bin/cdk-bug.ts I added the env prop to set my AWS Account and Region:

#!/usr/bin/env node
import 'source-map-support/register';
import * as cdk from 'aws-cdk-lib';
import { CdkBugStack } from '../lib/cdk-bug-stack';

const app = new cdk.App();
new CdkBugStack(app, 'CdkBugStack', {
  env: {
    account: process.env.CDK_DEFAULT_ACCOUNT,
    region: process.env.CDK_DEFAULT_REGION,
  },
});

My AWS credentials are configured with a named profile, defined in ~/.aws/credentials:

[mfa]
aws_access_key_id = <aws_access_key_id>
aws_secret_access_key = <aws_secret_access_key>
aws_session_token = <aws_session_token>
region = <region>

The region for this profile is the same as the CDK_DEFAULT_REGION environment variable.

As hinted in the profile name, my organization requires MFA, so those values in my credentials file were created using the following:

aws sts get-session-token \
    --serial-number arn:aws-us-gov:iam::${CDK_DEFAULT_ACCOUNT}:mfa/<iam-user> \
    --token-code <otp-token>

Above, the <iam-user>'s programmatic access credentials can be found in the [default] profile within ~/.aws/credentials.
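
For anyone reproducing the MFA setup, here is a minimal sketch of turning the Credentials object printed by aws sts get-session-token into the profile block shown above. The helper names (toProfileBlock, isExpired) are made up for illustration, and nothing here calls AWS. It only formats text and checks the Expiration timestamp, since an expired session token is worth ruling out before debugging anything else.

```typescript
// Shape of the "Credentials" object in the JSON output of
// `aws sts get-session-token`.
interface SessionCredentials {
  AccessKeyId: string;
  SecretAccessKey: string;
  SessionToken: string;
  Expiration: string; // ISO 8601 timestamp
}

// Render a named profile block for ~/.aws/credentials.
// `profileName` and `region` are supplied by the caller (hypothetical helper).
function toProfileBlock(
  profileName: string,
  creds: SessionCredentials,
  region: string,
): string {
  return [
    `[${profileName}]`,
    `aws_access_key_id = ${creds.AccessKeyId}`,
    `aws_secret_access_key = ${creds.SecretAccessKey}`,
    `aws_session_token = ${creds.SessionToken}`,
    `region = ${region}`,
  ].join('\n');
}

// Temporary credentials expire; returns true once `now` is at or past Expiration.
function isExpired(creds: SessionCredentials, now: Date = new Date()): boolean {
  return new Date(creds.Expiration).getTime() <= now.getTime();
}
```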

I know this mfa profile works because of the following:

➜ aws sts get-caller-identity --profile mfa
{
    "UserId": "<string>",
    "Account": "<account-id>",
    "Arn": "arn:aws-us-gov:iam::<account-id>:user/<iam-user>"
}

And I can even use this named profile to list the contents of my CDK assets bucket (created by a previous CDK app) using the AWS CLI:

aws s3 ls s3://cdk-<random-identifier>-assets-${CDK_DEFAULT_ACCOUNT}-${CDK_DEFAULT_REGION} --profile mfa

Now here's where it gets weird.

If I bootstrap this project with the following:

cdk bootstrap aws://${CDK_DEFAULT_ACCOUNT}/${CDK_DEFAULT_REGION} --profile mfa

I get the following success message:

 ⏳  Bootstrapping environment aws://<account-id>/<region>...
Trusted accounts for deployment: <account-id>
Trusted accounts for lookup: <account-id>
Execution policies: arn:aws-us-gov:iam::aws:policy/AdministratorAccess
 ✅  Environment aws://<account-id>/<region> bootstrapped (no changes).

As you can see, the execution policy is AdministratorAccess. That is the same managed policy attached to my IAM user.

But then on deployment, using cdk deploy --profile mfa, I get the following error:

Do you wish to deploy these changes (y/n)? y
CdkBugStack: deploying...
[0%] start: Publishing 6db6c214750fd7227d61d7accfde57987808698c67cb93a0fbd0328d10fc2926:<account-id>-<region>
[0%] start: Publishing 81ab06eacb598337b1c4663146a3ef12f7b208990c3f0fa96f2692235a73fdca:<account-id>-<region>
[50%] fail: Bucket named 'cdk-<random-identifier>-assets-<account-id>-<region>' exists, but not in account <account-id>. Wrong account?
[100%] fail: Bucket named 'cdk-<random-identifier>-assets-<account-id>-<region>' exists, but not in account <account-id>. Wrong account?

I found others with a similar error message in #6808, but the root cause for them seemed to be a lack of permissions in the IAM execution policy used by the CDK. That is, they needed s3:GetObject, s3:PutObject, s3:ListBucket, and s3:GetBucketLocation on resource cdk-*-assets-*-*. This shouldn't be an issue for me since that AdministratorAccess policy covers everything:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "*",
            "Resource": "*"
        }
    ]
}

Just to test if my mfa profile does indeed have permissions, I tried uploading the python script myself to that assets bucket:

aws s3 cp \
    ./app/main.py  \
    s3://cdk-<random-identifier>-assets-${CDK_DEFAULT_ACCOUNT}-${CDK_DEFAULT_REGION}/main.py \
    --profile mfa

This works. I have also made all sorts of other AWS CLI calls using --profile mfa in the past.

Possible Solution

Not sure.

Additional Information/Context

Operating in AWS GovCloud (US).

CDK CLI Version

2.53.0 (build 7690f43)

Framework Version

No response

Node.js Version

16.10.0

OS

macOS Catalina 10.15.6

Language

TypeScript

Language Version

~3.9.7

Other information

No response

peterwoodworth commented 1 year ago

The error message looks a bit misleading, as it can occur for more reasons than the bucket being in the wrong account. This error will occur when you are getting either an AllAccessDisabled or AccessDenied error when making the getBucketLocation() call.
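
To illustrate how one message can end up covering several causes, here is a rough sketch of the kind of mapping described above. This is not the CDK's actual code. The type and function names are hypothetical, and only the two error codes named above plus S3's NoSuchBucket are handled. The point is that both "wrong account" and "the caller's role has no permission on the bucket" collapse into the same branch.

```typescript
// Hypothetical outcome of probing a bucket with getBucketLocation().
type OwnershipCheck = 'accessible' | 'does-not-exist' | 'no-access';

// Map the error code of a getBucketLocation() call (undefined on success)
// to a coarse outcome. Illustrative only, not taken from the CDK source.
function classifyBucketProbe(errorCode: string | undefined): OwnershipCheck {
  if (errorCode === undefined) return 'accessible';
  if (errorCode === 'NoSuchBucket') return 'does-not-exist';
  if (errorCode === 'AccessDenied' || errorCode === 'AllAccessDisabled') {
    // Wrong account and missing IAM permissions are indistinguishable here.
    return 'no-access';
  }
  throw new Error(`Unexpected error code: ${errorCode}`);
}

// Turn the outcome into a user-facing message shaped like the one in this issue.
function describeProbe(bucket: string, account: string, outcome: OwnershipCheck): string {
  switch (outcome) {
    case 'accessible':
      return `Bucket '${bucket}' is accessible from account ${account}.`;
    case 'does-not-exist':
      return `No bucket named '${bucket}'. Is account ${account} bootstrapped?`;
    case 'no-access':
      return `Bucket named '${bucket}' exists, but not in account ${account}. Wrong account?`;
  }
}
```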

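Here are a few things I think you can try:

  • Check CloudTrail to find the getBucketLocation API call that failed, and compare it to a successful one made with the CLI or JavaScript SDK. Maybe you can find something different that will help us figure out what's going on
  • Check the deploy-role to make sure it has GetBucket* permissions
  • Try setting your credentials as environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN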

csmcallister commented 1 year ago

The error message looks a bit misleading, as it can occur for more reasons than the bucket being in the wrong account. This error will occur when you are getting either an AllAccessDisabled or AccessDenied error when making the getBucketLocation() call.

Here are a few things I think you can try

  • Check CloudTrail to find the getBucketLocation API call that failed, and compare it to a successful one made with the CLI or JavaScript SDK. Maybe you can find something different that will help us figure out what's going on
  • Check the deploy-role to make sure it has GetBucket* permissions
  • Try setting your credentials as environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN

I'll check your first bulleted suggestion and report back when I can. Any tips on filtering the CloudTrail event logs for that event would be appreciated, as I'm a novice there. I assume I'll be able to use aws cloudtrail lookup-events ....

Is the deploy role you mentioned different than the CDK's execution policy? You can see in the bootstrap output that the execution policy is the AdministratorAccess managed policy. And the CDK docs here state that the execution policy is the "deployment role assumed by CloudFormation during deployment of your stacks."

I already tried setting those environment variables and removing the --profile mfa argument. Same error with cdk deploy, but other API calls from the AWS CLI succeed.

peterwoodworth commented 1 year ago

Is the deploy role you mentioned different than the CDK's execution policy? You can see in the bootstrap output that the execution policy is the AdministratorAccess managed policy. And the CDK docs here state that the execution policy is the "deployment role assumed by CloudFormation during deployment of your stacks."

Yes, there is a deployment role and a cloudformation-execution role. The former will be assumed by the CDK Toolkit (or in a pipeline) to perform standard and necessary deployment operations. The latter is the role which is passed directly to CloudFormation to use when CloudFormation is creating the stack. Since CloudFormation isn't the one making the getBucketLocation call, this means the execution role won't be the one trying to make this call.
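
As a quick reference, the role names that show up in this thread follow a common pattern, sketched below. The function name and the BootstrapEnv type are hypothetical, and only the three roles mentioned in this issue are listed. The comments reflect who uses each role as described or observed here.

```typescript
// Identifies one bootstrapped environment (hypothetical helper type).
interface BootstrapEnv {
  qualifier: string; // the <random-identifier> part of the names in this thread
  accountId: string;
  region: string;
}

// Build the role names seen in this issue from the bootstrap qualifier.
function bootstrapRoleNames(env: BootstrapEnv) {
  const suffix = `${env.accountId}-${env.region}`;
  return {
    // Assumed by the CDK Toolkit (or a pipeline) to drive the deployment.
    deploy: `cdk-${env.qualifier}-deploy-role-${suffix}`,
    // Seen in CloudTrail in this thread when the CLI publishes file assets.
    filePublishing: `cdk-${env.qualifier}-file-publishing-role-${suffix}`,
    // Passed to CloudFormation; carries the execution policies from bootstrap.
    cfnExec: `cdk-${env.qualifier}-cfn-exec-role-${suffix}`,
  };
}
```

Checking the policies attached to the first two roles, rather than the execution policy, is what matters for calls the CLI makes itself, such as getBucketLocation.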

csmcallister commented 1 year ago

Ok I think we're getting somewhere.

I ran the following to generate a successful getBucketLocation API call in CloudTrail:

aws s3api get-bucket-location \
    --bucket cdk-<random-id>-assets-<account-id>-<region> \
    --profile mfa

In CloudTrail, I can see the following for that successful API call:

"userIdentity": {
        "type": "IAMUser",
        "principalId": "<principal-id>",
        "arn": "arn:aws-us-gov:iam::<account-id>:user/<my-iam-user>",
        "accountId": "<account-id>",
        "accessKeyId": "<access-key-id>",
        "userName": "<my-iam-user>",
        "sessionContext": {
            "sessionIssuer": {},
            "webIdFederationData": {},
            "attributes": {
                "creationDate": "2022-12-07T21:22:37Z",
                "mfaAuthenticated": "true"
            }
        }
    },

Running cdk deploy --profile mfa again fails with the original error and I can see in CloudTrail that the userIdentity is now of the type AssumedRole and points to a role ARN of the form arn:aws-us-gov:sts::<account-id>:assumed-role/cdk-<random-identifier>-file-publishing-role-<account-id>-<region>/aws-cdk-<my-iam-user>.
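
When comparing the two CloudTrail records, it can help to pull the role name out of the assumed-role ARN programmatically. The following is a small sketch of that. The function name and return shape are hypothetical, and the regular expression assumes the arn:<partition>:sts::<account-id>:assumed-role/<role-name>/<session-name> layout seen above.

```typescript
// Parts of an STS assumed-role ARN (hypothetical helper type).
interface AssumedRole {
  partition: string;
  accountId: string;
  roleName: string;
  sessionName: string;
}

// Returns the parsed parts, or null if the ARN is not an assumed-role ARN
// (for example, the IAM user ARN from the successful CLI call).
function parseAssumedRoleArn(arn: string): AssumedRole | null {
  const m = /^arn:([^:]+):sts::(\d{12}):assumed-role\/([^/]+)\/(.+)$/.exec(arn);
  if (m === null) return null;
  return { partition: m[1], accountId: m[2], roleName: m[3], sessionName: m[4] };
}
```

The roleName returned here is the one whose IAM policy needs inspecting.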

Lo and behold, that role lacks the getBucketLocation action as well as the cdk-assets bucket resource.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:GetObject*",
                "s3:GetBucket*",
                "s3:GetEncryptionConfiguration",
                "s3:List*",
                "s3:DeleteObject*",
                "s3:PutObject*",
                "s3:Abort*"
            ],
            "Resource": [
                "arn:aws-us-gov:s3:::cdktoolkit-stagingbucket-cdk-bug",
                "arn:aws-us-gov:s3:::cdktoolkit-stagingbucket-cdk-bug/*"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "kms:Decrypt",
                "kms:DescribeKey",
                "kms:Encrypt",
                "kms:ReEncrypt*",
                "kms:GenerateDataKey*"
            ],
            "Resource": "arn:aws-us-gov:kms:<region>:<account-id>:key/AWS_MANAGED_KEY",
            "Effect": "Allow"
        }
    ]
}

Naturally, I added s3:GetBucketLocation to that first actions array and then added arn:aws-us-gov:s3:::cdk-*-assets-*-* to that first resources array. I also noticed that the bucket name in the resources array is cdktoolkit-stagingbucket-cdk-bug, but no such bucket exists in my account. I do, however, have buckets named like cdktoolkit-stagingbucket-<random-identifier>.
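
A rough way to see which part of the policy was the gap is to model IAM's wildcard matching. The sketch below is a heavy simplification: it handles only Allow statements with * and ? wildcards, and ignores conditions, explicit denies, and every other policy type. All names in it are made up for illustration. Under that model the existing s3:GetBucket* pattern already matches s3:GetBucketLocation, so the added resource ARN is probably what made the difference.

```typescript
// Minimal shape of an identity policy statement (conditions omitted).
interface Statement {
  Effect: 'Allow' | 'Deny';
  Action: string[];
  Resource: string[];
}

// Match `value` against an IAM-style pattern where * is any run of
// characters and ? is any single character.
function matchesPattern(pattern: string, value: string, ignoreCase: boolean): boolean {
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&');
  const source = '^' + escaped.replace(/\*/g, '.*').replace(/\?/g, '.') + '$';
  return new RegExp(source, ignoreCase ? 'i' : '').test(value);
}

// True if some Allow statement covers both the action and the resource.
// Action names are compared case-insensitively, resource ARNs case-sensitively.
function allows(statements: Statement[], action: string, resource: string): boolean {
  return statements.some(
    (s) =>
      s.Effect === 'Allow' &&
      s.Action.some((a) => matchesPattern(a, action, true)) &&
      s.Resource.some((r) => matchesPattern(r, resource, false)),
  );
}
```

Running allows() against the first statement above with the assets bucket ARN returns false until an ARN pattern such as arn:aws-us-gov:s3:::cdk-*-assets-*-* is added to Resource.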

At any rate, I tried cdk deploy --profile mfa after updating the policy and now see a new error (which I still count as progress):

Do you wish to deploy these changes (y/n)? y
CdkBugStack: deploying...
[0%] start: Publishing 6db6c214750fd7227d61d7accfde57987808698c67cb93a0fbd0328d10fc2926:395913319687-<region>
[0%] start: Publishing 81ab06eacb598337b1c4663146a3ef12f7b208990c3f0fa96f2692235a73fdca:395913319687-<region>
[50%] success: Published 81ab06eacb598337b1c4663146a3ef12f7b208990c3f0fa96f2692235a73fdca:395913319687-<region>
[100%] success: Published 6db6c214750fd7227d61d7accfde57987808698c67cb93a0fbd0328d10fc2926:395913319687-<region>
CdkBugStack: creating CloudFormation changeset...

 ❌  CdkBugStack failed: Error [ValidationError]: S3 error: Access Denied
For more information check http://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html
    at Request.extractError (/Users/me/.nvm/versions/node/v16.10.0/lib/node_modules/aws-sdk/lib/protocol/query.js:50:29)
    at Request.callListeners (/Users/me/.nvm/versions/node/v16.10.0/lib/node_modules/aws-sdk/lib/sequential_executor.js:106:20)
    at Request.emit (/Users/me/.nvm/versions/node/v16.10.0/lib/node_modules/aws-sdk/lib/sequential_executor.js:78:10)
    at Request.emit (/Users/me/.nvm/versions/node/v16.10.0/lib/node_modules/aws-sdk/lib/request.js:686:14)
    at Request.transition (/Users/me/.nvm/versions/node/v16.10.0/lib/node_modules/aws-sdk/lib/request.js:22:10)
    at AcceptorStateMachine.runTo (/Users/me/.nvm/versions/node/v16.10.0/lib/node_modules/aws-sdk/lib/state_machine.js:14:12)
    at /Users/me/.nvm/versions/node/v16.10.0/lib/node_modules/aws-sdk/lib/state_machine.js:26:10
    at Request.<anonymous> (/Users/me/.nvm/versions/node/v16.10.0/lib/node_modules/aws-sdk/lib/request.js:38:9)
    at Request.<anonymous> (/Users/me/.nvm/versions/node/v16.10.0/lib/node_modules/aws-sdk/lib/request.js:688:12)
    at Request.callListeners (/Users/me/.nvm/versions/node/v16.10.0/lib/node_modules/aws-sdk/lib/sequential_executor.js:116:18) {
  code: 'ValidationError',
  time: 2022-12-07T21:46:54.362Z,
  requestId: '7d48fd5d-288b-40cc-8a3d-504d2af8a7e0',
  statusCode: 400,
  retryable: false,
  retryDelay: 855.9092614166715
}

 ❌ Deployment failed: Error: Stack Deployments Failed: ValidationError: S3 error: Access Denied
For more information check http://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html
    at deployStacks (/Users/me/.nvm/versions/node/v16.10.0/lib/node_modules/aws-cdk/lib/deploy.ts:61:11)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at CdkToolkit.deploy (/Users/me/.nvm/versions/node/v16.10.0/lib/node_modules/aws-cdk/lib/cdk-toolkit.ts:338:7)
    at initCommandLine (/Users/me/.nvm/versions/node/v16.10.0/lib/node_modules/aws-cdk/lib/cli.ts:364:12)

Stack Deployments Failed: ValidationError: S3 error: Access Denied
For more information check http://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html

I literally have to walk out the door right now, but I imagine the next debugging step is to look up that requestId of 7d48fd5d-288b-40cc-8a3d-504d2af8a7e0, right?

peterwoodworth commented 1 year ago

Looking at that request will probably give you some more insight, yeah, but I'm confused as to why the file publishing role is being used here.

I'm also not quite sure why the resources in that policy have cdk-bug in the bucket name. This would indicate to me that you've bootstrapped using an old version, or at some point you supplied the FileAssetsBucketName option when bootstrapping and that is no longer relevant.

I can help investigate more once you get back to me with the findings of that new request 🙂

csmcallister commented 1 year ago

I found the failed deployment in CloudTrail as a CreateChangeSet event. In the Event Record, I can see the role arn:aws-us-gov:iam::<account-id>:role/cdk-<random-identifier>-deploy-role-<account-id>-<region> was used. Here's the policy attached to that role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "cloudformation:CreateChangeSet",
                "cloudformation:DeleteChangeSet",
                "cloudformation:DescribeChangeSet",
                "cloudformation:DescribeStacks",
                "cloudformation:ExecuteChangeSet",
                "cloudformation:CreateStack",
                "cloudformation:UpdateStack"
            ],
            "Resource": "*",
            "Effect": "Allow",
            "Sid": "CloudFormationPermissions"
        },
        {
            "Condition": {
                "StringNotEquals": {
                    "s3:ResourceAccount": "<account-id>"
                }
            },
            "Action": [
                "s3:GetObject*",
                "s3:GetBucket*",
                "s3:List*",
                "s3:Abort*",
                "s3:DeleteObject*",
                "s3:PutObject*"
            ],
            "Resource": "*",
            "Effect": "Allow",
            "Sid": "PipelineCrossAccountArtifactsBucket"
        },
        {
            "Condition": {
                "StringEquals": {
                    "kms:ViaService": "s3.<region>.amazonaws.com"
                }
            },
            "Action": [
                "kms:Decrypt",
                "kms:DescribeKey",
                "kms:Encrypt",
                "kms:ReEncrypt*",
                "kms:GenerateDataKey*"
            ],
            "Resource": "*",
            "Effect": "Allow",
            "Sid": "PipelineCrossAccountArtifactsKey"
        },
        {
            "Action": "iam:PassRole",
            "Resource": "arn:aws-us-gov:iam::<account-id>:role/cdk-<random-identifier>-cfn-exec-role-<account-id>-<region>",
            "Effect": "Allow"
        },
        {
            "Action": [
                "cloudformation:DescribeStackEvents",
                "cloudformation:GetTemplate",
                "cloudformation:DeleteStack",
                "cloudformation:UpdateTerminationProtection",
                "sts:GetCallerIdentity",
                "cloudformation:GetTemplateSummary"
            ],
            "Resource": "*",
            "Effect": "Allow",
            "Sid": "CliPermissions"
        },
        {
            "Action": [
                "s3:GetObject*",
                "s3:GetBucket*",
                "s3:List*"
            ],
            "Resource": [
                "arn:aws-us-gov:s3:::cdktoolkit-stagingbucket-cdk-bug",
                "arn:aws-us-gov:s3:::cdktoolkit-stagingbucket-cdk-bug/*"
            ],
            "Effect": "Allow",
            "Sid": "CliStagingBucket"
        },
        {
            "Action": [
                "ssm:GetParameter"
            ],
            "Resource": [
                "arn:aws-us-gov:ssm:us-gov-west-1:<account-id>:parameter/cdk-bootstrap/<random-id>/version"
            ],
            "Effect": "Allow",
            "Sid": "ReadVersion"
        }
    ]
}

A few things strike me as odd with that policy:

Before I change the buckets in that policy, I want to see what actually happened when I bootstrapped. So I re-ran cdk bootstrap aws://${CDK_DEFAULT_ACCOUNT}/${CDK_DEFAULT_REGION} --show-template --profile mfa >> template.yaml. Within template.yaml, I only see the staging bucket - without that cdk-bug suffix. There is no assets bucket:

  StagingBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName:
        Fn::If:
          - HasCustomFileAssetsBucketName
          - Fn::Sub: ${FileAssetsBucketName}
          - Fn::Sub: cdk-${Qualifier}-assets-${AWS::AccountId}-${AWS::Region}
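
Read as plain logic, the Fn::If above picks the bucket name roughly as in the sketch below. The function and parameter names are hypothetical, and it assumes that HasCustomFileAssetsBucketName simply means "a non-empty FileAssetsBucketName was supplied", which is not shown in the snippet above.

```typescript
// Inputs that the template gets from bootstrap parameters and pseudo parameters.
interface BucketNameInputs {
  qualifier: string;
  accountId: string;
  region: string;
  fileAssetsBucketName?: string; // custom name, if one was passed at bootstrap
}

// Mirror of the Fn::If: a custom name wins, otherwise the default pattern is used.
function resolveStagingBucketName(p: BucketNameInputs): string {
  const custom = p.fileAssetsBucketName;
  if (custom !== undefined && custom !== '') {
    return custom;
  }
  return `cdk-${p.qualifier}-assets-${p.accountId}-${p.region}`;
}
```

So the logical ID StagingBucket and the cdk-...-assets-... name refer to the same bucket, and a name like cdktoolkit-stagingbucket-cdk-bug could only come out of this template via a custom name.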

To make sure I didn't bootstrap with an older version of the CDK, I can verify that cdk --version still gives me 2.53.0 (build 7690f43). To take it one step further, I ran the following to see if that template.yaml differs from what was previously bootstrapped:

➜ cdk bootstrap --template template.yaml --profile mfa
Using bootstrapping template from template.yaml
 ⏳  Bootstrapping environment aws://<account-id>/<region>..
Trusted accounts for deployment: <account-id>
Trusted accounts for lookup: <account-id>
Execution policies: arn:aws-us-gov:iam::aws:policy/AdministratorAccess
 ✅  Environment aws://<account-id>/<region> bootstrapped (no changes).

That last note of (no changes) suggests to me that I didn't bootstrap with the wrong CDK version or inadvertently supply any custom bucket names.

Finally, replacing:

"Resource": [
      "arn:aws-us-gov:s3:::cdktoolkit-stagingbucket-cdk-bug",
      "arn:aws-us-gov:s3:::cdktoolkit-stagingbucket-cdk-bug/*"
  ],

with:

"Resource": [
    "arn:aws-us-gov:s3:::cdk-<random-identifier>-assets-<account-id>-<region>",
    "arn:aws-us-gov:s3:::cdk-<random-identifier>-assets-<account-id>-<region>/*"
],

in that IAM policy results in a successful cdk deploy --profile mfa:

✅  CdkBugStack

✨  Deployment time: 46.94s

🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉

peterwoodworth commented 1 year ago

I'm really glad to hear you got this working! Based on the existing policy you shared, it really looks like this stack was originally bootstrapped with Legacy CDK v1 bootstrapping. Why this would result in no changes on your next attempt, I'm not sure, but I'm glad you got this working!
