yvele / aws-cfn-custom-resource-lambda-edge

🏗 AWS CloudFormation custom resource that allows deploying Lambda@Edge from any region
Apache License 2.0

Graceful deletion #1

Open stonedMoose opened 3 years ago

stonedMoose commented 3 years ago

Hi,

First, I would like to thank you for this project. You did a great job, and it helped me set up CloudFront Lambda@Edge on my current project.

I have a small problem with graceful deletion of my stack when using your custom resources: the stack always fails to delete because the Lambda@Edge function cannot be deleted.

Do you have any tips on how to manage a graceful stack deletion using your custom resource?

Thanks for your help!

yvele commented 3 years ago

Hi, thank you.

Unfortunately I'm facing the exact same problem and I have to manually delete things.

Here's what I personally do:

  1. First I delete the stack that contains the CloudFront distribution.
  2. The deletion takes some time and fails (because Lambda@Edge functions are spread all around the world and take time to be deleted).
  3. I manually hit the delete stack button in the AWS console and retain the custom resources that contain the Lambda@Edge stacks (in us-east-1).
  4. The stack is then completely deleted (except for the Lambda@Edge stacks created by the custom resource in us-east-1).
  5. Then I go to us-east-1 and manually delete the CloudFormation stacks that contain Lambda@Edge. Usually the stacks are removed right away. If a failure occurs, just retry and it'll work.
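
The "just retry" part of step 5 can be scripted. Below is a minimal sketch, assuming the AWS CLI is installed and configured; the helper names `retry` and `delete_edge_stack_once` are hypothetical and not part of this repository.

```shell
#!/usr/bin/env bash
# Sketch only: retry a command a few times with a fixed pause between attempts.
# `retry` and `delete_edge_stack_once` are hypothetical helpers.
retry() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "Attempt $i/$attempts failed: $*" >&2
        # Do not sleep after the final attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Delete a stack in us-east-1 and block until CloudFormation reports the result.
# The waiter exits non-zero when the stack ends up in DELETE_FAILED, which is
# what makes `retry` try again.
delete_edge_stack_once() {
    local stack_name="$1"
    aws cloudformation delete-stack --region us-east-1 --stack-name "$stack_name" &&
        aws cloudformation wait stack-delete-complete --region us-east-1 --stack-name "$stack_name"
}
```

For example, `retry 5 60 delete_edge_stack_once my-lambda-edge-stack` would try up to five times, one minute apart (the stack name is a placeholder). How long replicas take to disappear varies, so the attempt count and delay are guesses to tune.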

I know that's a bit hacky and needs manual intervention, but I've been using this technique for 2 years now without much trouble.

I hope that will help you.

If anyone knows how to properly handle the Lambda@Edge deletion within the custom resource, please feel free to contribute 😉

Edit: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/lambda-edge-delete-replicas.html

stonedMoose commented 3 years ago

Thanks for your answer.

I was doing the same, actually... until I wanted to deploy ephemeral environments to test against. These tests would have been launched for every commit pushed to my project, so it would have been impossible for me to delete these environments manually.

I ended up trying something else. I added Retain as the deletion policy for the Custom::LambdaEdge resources, so my main stack deletion no longer fails. But I still have to delete the stack created by the custom resource in us-east-1, so I added the stack name as an output of the custom resource. With this output, my main stack can in turn output the name of the stack created by the custom resource. My deletion workflow becomes:

  1. Retrieve the stack name of the custom resource
  2. Delete the main stack
  3. Delete the custom resource stack. This will fail, but leaves the stack as Delete Failed

With these steps, I'm able to delete my stack automatically and leave the custom resource stack marked as Delete Failed. So now, once in a while, I just have to go to the CloudFormation UI in us-east-1 and delete the stacks that failed to delete.
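
The three steps above could be sketched with the AWS CLI roughly as follows. This is an illustration under assumptions, not the script actually used: the output key `LambdaEdgeStackName` and the helper names are hypothetical, and the main stack is assumed to expose the custom resource's stack name as one of its outputs.

```shell
#!/usr/bin/env bash
# Sketch only. The output key "LambdaEdgeStackName" is hypothetical;
# substitute whatever your main stack actually outputs.

# Print the value of one output of a stack.
get_stack_output() {
    local stack_name="$1" output_key="$2" region="$3"
    aws cloudformation describe-stacks \
        --region "$region" \
        --stack-name "$stack_name" \
        --query "Stacks[0].Outputs[?OutputKey=='$output_key'].OutputValue" \
        --output text
}

delete_environment() {
    local main_stack="$1" main_region="$2" edge_stack

    # 1. Retrieve the name of the stack created by the custom resource.
    #    This has to happen first: once the main stack is gone, so are its outputs.
    edge_stack=$(get_stack_output "$main_stack" LambdaEdgeStackName "$main_region") || return 1
    if [ -z "$edge_stack" ] || [ "$edge_stack" = "None" ]; then
        echo "Output LambdaEdgeStackName not found on $main_stack" >&2
        return 1
    fi

    # 2. Delete the main stack. With the Retain deletion policy on the custom
    #    resource, this no longer depends on the Lambda@Edge function being deletable.
    aws cloudformation delete-stack --region "$main_region" --stack-name "$main_stack" || return 1
    aws cloudformation wait stack-delete-complete --region "$main_region" --stack-name "$main_stack" || return 1

    # 3. Request deletion of the Lambda@Edge stack in us-east-1. This is expected
    #    to end as Delete Failed for now, so there is no wait here.
    aws cloudformation delete-stack --region us-east-1 --stack-name "$edge_stack"
}
```

A per-commit cleanup job could then call something like `delete_environment pr-123-main eu-west-1` (both arguments are placeholders).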

yvele commented 3 years ago

You can see that I tried my best to add a deletion retry strategy within the CloudFormation custom resource but this isn't enough 😞

https://github.com/yvele/aws-cfn-custom-resource-lambda-edge/blob/c9e6678154f465f561039264f81bb66a75ddcc30/stacks/custom-resource-lambda-edge/src/deleteStack.js#L40-L59

NickDarvey commented 3 years ago

, so I added the stack name as an output of the custom resource. With this output, I can, in my main stack output the stack name created by the custom resource.

@stonedMoose, how are you doing that? A GetAtt on the Custom::LambdaEdge resource?

stonedMoose commented 3 years ago

@NickDarvey Sorry, I've only just seen your question... I think I modified this repository to add the output I needed:

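diff --git a/stacks/custom-resource-lambda-edge/cloudformation.yml b/stacks/custom-resource-lambda-edge/cloudformation.yml
index 4b6419a..a4945f5 100644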
--- a/stacks/custom-resource-lambda-edge/cloudformation.yml
+++ b/stacks/custom-resource-lambda-edge/cloudformation.yml
@@ -129,3 +129,6 @@ Outputs:
     Value: !GetAtt Function.Arn
     Export:
       Name: CustomResourceLambdaEdgeServiceToken
+  StackName:
+    Description: Stack name
+    Value: !Sub ${AWS::StackName}
diff --git a/stacks/custom-resource-lambda-edge/src/cloudformation.yml b/stacks/custom-resource-lambda-edge/src/cloudformation.yml
index 7fd342b..0ff1dd6 100644
--- a/stacks/custom-resource-lambda-edge/src/cloudformation.yml
+++ b/stacks/custom-resource-lambda-edge/src/cloudformation.yml
@@ -114,3 +114,6 @@ Outputs:
       - UseDefaultFunctionRole
       - !GetAtt LambdaRole.Arn
       - !Ref FunctionRole
+  StackName:
+    Description: Stack name
+    Value: !Sub ${AWS::StackName}

In the end I switched to AWS CDK instead of CloudFormation :D and I'm using this package to handle Lambda@Edge: https://www.npmjs.com/package/@cloudcomponents/cdk-lambda-at-edge-pattern

Deletion is handled gracefully in it, so maybe there is something to look at there ^^

EDIT: deletion is not handled gracefully after all... I have a homemade script to handle deployment and undeployment, which basically uses cdk commands to deploy my stacks. When deleting, I first destroy the Lambda@Edge stack, which fails because of function replicas. Then I delete the stack again while retaining the resources that failed to delete, waiting a bit between the two operations.

With the AWS CLI, it looks like the following:

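# Minimal stand-in: log is not defined in this excerpt.
function log() {
    echo "$*"
}

# Request deletion of stack $1 in region $2 (default: eu-west-1).
function delete_stack() {
    log "Deleting stack $1..."
    local region=${2:-eu-west-1}
    aws cloudformation delete-stack --stack-name "$1" --region "$region" --no-cli-pager
}

# Delete the stack, then block until the deletion completes or fails.
function delete_stack_and_wait() {
    local region=${2:-eu-west-1}
    delete_stack "$1" "$region"
    # The waiter must target the same region as the delete request.
    aws cloudformation wait stack-delete-complete --stack-name "$1" --region "$region" --no-cli-pager
}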

function list_edge_functions() {
    # jq -r prints raw strings, so no quote stripping is needed.
    aws lambda list-functions --region us-east-1 --no-cli-pager | jq -r '.Functions[].FunctionName' | grep Edge
}

# Not called below; kept as an optional manual cleanup step.
function edge_function_cleanup() {
    log "Edge functions cleanup"
    while read -r line; do
        echo "> Trying to delete $line"
        aws lambda delete-function --region us-east-1 --function-name "$line" --no-cli-pager
    done < <(list_edge_functions)
}

function deleteEdgeStack() {
    # $environment is expected to be set by the caller.
    main_stack="$environment"-MainStack
    edge_support_stack=$main_stack-support-lambda-at-edge
    edge_region=us-east-1
    # First attempt: expected to end in DELETE_FAILED while replicas still exist.
    delete_stack_and_wait "$edge_support_stack" "$edge_region"
    sleep 20
    deleteEdgeStackRetainingFunctions
}

function deleteEdgeStackRetainingFunctions() {
    stackEvents=$(aws cloudformation describe-stack-events --region "$edge_region" --stack-name "$edge_support_stack" --no-cli-pager)
    # The most recent event comes first.
    lastStatus=$(echo "$stackEvents" | jq -r '.StackEvents[0].ResourceStatus')

    if [ "$lastStatus" = 'DELETE_FAILED' ]; then
        # Extract the logical IDs listed between [ and ] in the failure reason.
        resource=$(echo "$stackEvents" | jq -r '.StackEvents[0].ResourceStatusReason' | cut -d'[' -f2 | cut -d']' -f1 | tr -d ',')
        # $resource is left unquoted on purpose so each ID is passed as a separate argument.
        aws cloudformation delete-stack --region "$edge_region" --stack-name "$edge_support_stack" --retain-resources $resource --no-cli-pager
    fi
}

deleteEdgeStack
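
To make the parsing step in deleteEdgeStackRetainingFunctions easier to follow, here it is isolated as a small function. This is only a sketch: `failed_resources_from_reason` is a hypothetical helper, and it assumes the stack-level failure reason has the form `The following resource(s) failed to delete: [FnA, FnB].`, which is what the `cut` calls in the script above rely on.

```shell
#!/usr/bin/env bash
# Sketch only: `failed_resources_from_reason` is a hypothetical helper.
# Assumed input format (not guaranteed by this thread):
#   The following resource(s) failed to delete: [FnA, FnB].
# Prints the logical IDs separated by spaces; returns 1 if no [..] list is found.
failed_resources_from_reason() {
    local reason="$1"
    case "$reason" in
        *"["*"]"*) ;;
        *) return 1 ;;   # No bracketed list, so there is nothing to retain.
    esac
    # Keep what sits between the first '[' and the following ']', then drop commas.
    printf '%s\n' "$reason" | cut -d'[' -f2 | cut -d']' -f1 | tr -d ','
}
```

The result is meant to be passed unquoted to `--retain-resources`, so that each logical ID becomes its own argument. Checking the return code first avoids calling delete-stack with an empty list when the failure reason has a different shape.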