aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0
11.72k stars 3.94k forks source link

(@aws-cdk/aws-codedeploy): LambdaDeploymentGroup not able to use canary alarm #13922

Closed zacyang closed 1 year ago

zacyang commented 3 years ago
        const deploymentGroup = new codedeploy.LambdaDeploymentGroup(
            this,
            'BlueGreenDeployment',
            {
                application,
                alias: api.versionAlias,
                deploymentConfig:
                    codedeploy.LambdaDeploymentConfig.LINEAR_10PERCENT_EVERY_1MINUTE,
            }
        );

        const alarm = new cloudwatch.Alarm(this, 'CanaryAlarm', {
            metric: apiCanary.canary.metricFailed(),
            evaluationPeriods: 3,
            datapointsToAlarm: 2,
            threshold: 1,
            comparisonOperator:
                cloudwatch.ComparisonOperator.GREATER_THAN_OR_EQUAL_TO_THRESHOLD,
        });

        deploymentGroup.addAlarm(alarm);

Reproduction Steps

just create a simple app parse the above code would reproduce

What did you expect to happen?

on 1.93.0 I am trying to use blue green deployment for deploy new lambda, where if the canary test failed during deployment, I'd expect LambdaDeploymentGroup to pick it up and revert.

What actually happened?

on cdk deploy However the deployment reported Error [ValidationError]: Circular dependency between resources:

Environment

Other


This is :bug: Bug Report

skinny85 commented 3 years ago

Hey @zacyang ,

thanks for opening the issue. Can you show how you create api and apiCanary?

Thanks, Adam

zacyang commented 3 years ago

Thanks for your quick response,

for api it's just a Construct creates and expose apigateway.LambdaRestApi and lambda.Alias to the outside

apiCanary is a synthetics.Canary executing test against the API gw created by LambdaRestApi

In addition, if I just swich the metric from the canary.metricXXX() to the lambdaAlias.metricXXX it's working fine

skinny85 commented 3 years ago

Can you show the outline of the code that is creating api and apiCanary?

I'm afraid we won't be able to diagnose this issue otherwise.

zacyang commented 3 years ago

api is a consutrcut, the gist of it something like: a restAPI with LambdaRestApi pointed to the lambda alias

export interface XXXRestApiProps {
  lambdaFunction: XXXLambdaFunction;
}

export class  XXXRestApi extends cdk.Construct {
  //...
  this.versionAlias = new lambda.Alias(this, 'alias', {
            aliasName: 'prod',
            version: props.lambdaFunction.lambdaFunction.currentVersion,
        });
  const gwTarget: IFunction = props.enableAlias
            ? this.versionAlias
            : props.lambdaFunction.lambdaFunction;

  this.api = new apigateway.LambdaRestApi(this, `API`, {
    //...
    handler: gwTarget,
  }
}

apiCanary is another construct warped canary and it's policy within:

export class ApiCanary extends cdk.Construct {
    public readonly canary: synthetics.Canary;

    constructor(scope: cdk.Construct, id: string, props: CanaryProps) {
        super(scope, id);

        const stack = cdk.Stack.of(this);

        // create s3 bucket for canary log output
        const canaryS3Bucket = new s3.Bucket(this, `api-canary-logs`, {
            versioned: true,
            blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
            accessControl: s3.BucketAccessControl.PRIVATE,
            removalPolicy: cdk.RemovalPolicy.DESTROY,
        });

        // create policy statement
        const canaryIamPolicyDocument = new iam.PolicyDocument();
        canaryIamPolicyDocument.addStatements(
            new iam.PolicyStatement({
                effect: iam.Effect.ALLOW,
                actions: ['s3:PutObject', 's3:GetBucketLocation'],
                resources: [`${canaryS3Bucket.bucketArn}/*`],
            }),
            new iam.PolicyStatement({
                effect: iam.Effect.ALLOW,
                actions: [
                    'logs:CreateLogStream',
                    'logs:PutLogEvents',
                    'logs:CreateLogGroup',
                ],
                resources: ['*'],
            }),
            new iam.PolicyStatement({
                effect: iam.Effect.ALLOW,
                actions: ['s3:ListAllMyBuckets'],
                resources: ['*'],
            }),
            new iam.PolicyStatement({
                effect: iam.Effect.ALLOW,
                actions: ['cloudwatch:PutMetricData'],
                conditions: {
                    StringEquals: {
                        'cloudwatch:namespace': 'CloudWatchSynthetics',
                    },
                },
                resources: ['*'],
            }),
            new iam.PolicyStatement({
                effect: iam.Effect.ALLOW,
                actions: ['apigateway:*'],
                resources: [
                    `arn:aws:apigateway:${stack.region}::/restapis/${props.apiId}/*`,
                ],
            })
        );

        // create canary execution role
        const canaryRole = new iam.Role(this, `api-canary-role`, {
            assumedBy: new iam.ServicePrincipal('lambda'),
            inlinePolicies: {
                CanaryIamPolicy: canaryIamPolicyDocument,
            },
            description: 'Execution role for the api canary',
        });

        // create canary
        this.canary = new synthetics.Canary(this, id, {
            artifactsBucketLocation: { bucket: canaryS3Bucket },
            role: canaryRole,
            runtime: synthetics.Runtime.SYNTHETICS_NODEJS_PUPPETEER_3_0,
            test: synthetics.Test.custom({
                code: synthetics.Code.fromInline(
                    file
                        .readFileSync(
                            path.join(__dirname, props.canarySourceRelativePath)
                        )
                        .toString()
                ),
                handler: 'index.handler',
            }),
            canaryName: props.canaryName ?? `${id}-cannary`,
            schedule:
                props.scheduleForCanary ?? synthetics.Schedule.rate(Duration.seconds(60)),
            startAfterCreation: true,
        });

        const child = this.canary.node.defaultChild as synthetics.CfnCanary;
        child.addPropertyOverride('RunConfig.EnvironmentVariables', {
            TEST_TARGET_API: props.apiUrl,
        });
    }
}
skinny85 commented 3 years ago

Thanks @zacyang . This is pretty complicated. Any chance of providing us a minimal reproduction in an example CDK app, pushed to a public GitHub repository?

If not, I'll try to create a minimal reproduction myself, but your help would definitely speed things up here.

Let me know!

zacyang commented 3 years ago

Hi @skinny85, sorry I might not have time to exact a reproducible repo due to working schedule.

I think the simplest app could be

canary (send request)->  api GW -> lambda (with alias)
deployGroup (listen to) -> alarm (of the canary metrics)

I think the problem might be the canary target at API gw (as CDK resource), if I replace the target with fixed api ID then it seems fine. But that would mean something is not managed by CDK.

skinny85 commented 3 years ago

Can you post the full error please?

zacyang commented 3 years ago

Sorry I can not parse the full error as it might breach the policy. Also without the full cfn I don't think it worth much as it just list of generated resource logicIds.

After some digging I found the point where to break the dependency chain. So in short, I have resources

  1. API gateway
  2. Lambda alias
  3. Lambda
  4. Canary (sending request to API gateway)
  5. Alarm created by canary
  6. codeDeploy group watches on Canary Alarms

On paper there should be no circular dependency. I have a PolicyStatement like

 new iam.PolicyStatement({
                effect: iam.Effect.ALLOW,
                actions: ['apigateway:*'],
                resources: [
                    `arn:aws:apigateway:${stack.region}::/restapis/${targetApi.restApiId}/*`,
                ],
            })

where targetApi is the CDK resource for point 1 API gateway.

originally this statement is created during the canaryRole creation, (where the circular reported)

No if I moved it outside (after role and canary creation) e.g

 const canaryIamPolicyDocument = new iam.PolicyDocument();
 canaryIamPolicyDocument.addStatements(...)
 const canaryRole = new iam.Role(this, `api-canary-role`, {
            assumedBy: new iam.ServicePrincipal('lambda'),
            inlinePolicies: {
                CanaryIamPolicy: canaryIamPolicyDocument,
            },
            description: 'Execution role for the api canary',
        });
  this.canary = new synthetics.Canary(this, id, { 
...
   role: canaryRole})

//Attach the permission after canary and role creatation.
canaryRole.addToPolicy(
            new iam.PolicyStatement({
                effect: iam.Effect.ALLOW,
                actions: ['apigateway:*'],
                resources: [
                    `arn:aws:apigateway:${stack.region}::/restapis/${targetApi.restApiId}/*`,
                ],
            })
        );

Then it works.

Fallenstedt commented 3 years ago

I encountered this too recently, and i have a code example that might help.

I created a LambdaDeploymentGroup so I can do blue/green deployment of my lambda functions. I attempted to create an alarm on my alias function, and encountered a similar error described above.

You can try this out by changing mylambda to aliaslambda on this line: https://github.com/Fallenstedt/aws-cdk-lambda-examples/blob/master/lambda-with-alias/lib/lambda-with-code-deploy-stack.ts#L56

I expected an alias lambda would be able to create an alarm. Fortunately, everything works as expected with the alarm on the original lambda.

skinny85 commented 2 years ago

@Fallenstedt I can't replicate the problem. Here's my code:

import * as cdk from '@aws-cdk/core';
import * as codedeploy from '@aws-cdk/aws-codedeploy';
import * as lambda from '@aws-cdk/aws-lambda';

export class TestStack extends cdk.Stack {
    constructor(app: cdk.App, id: string) {
        super(app, id);

        const mylambda = new lambda.Function(this, "Lambda", {
            runtime: lambda.Runtime.NODEJS_12_X,
            handler: "main",
            code: lambda.Code.fromInline('mycode'),
        });
        const aliasLambda = new lambda.Alias(this, "LambdaAlias", {
            aliasName: "LambdaAlias",
            version: mylambda.currentVersion,
        });

        // const lambdaErrors = mylambda.metricErrors({
        const lambdaErrors = aliasLambda.metricErrors({
            period: cdk.Duration.minutes(1),
            statistic: "sum",
            dimensionsMap: {
                FunctionName: mylambda.functionName,
                Resource: `${mylambda.functionName}:LambdaAlias`,
            },
        });
        const lambdaAlarm = lambdaErrors.createAlarm(this, "LambdaFailureAlarm", {
            alarmDescription: "The latest deployment errors are greater than zero",
            threshold: 1,
            evaluationPeriods: 1,
        });
        new codedeploy.LambdaDeploymentGroup(this, "MyLambdaDeploymentGroup", {
            alias: aliasLambda,
            deploymentConfig: codedeploy.LambdaDeploymentConfig.CANARY_10PERCENT_5MINUTES,
            alarms: [lambdaAlarm],
        });
    }
}

I'm able to cdk deploy this without any problems...

github-actions[bot] commented 1 year ago

This issue has not received any attention in 1 year. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.