aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0
11.72k stars 3.94k forks source link

aws_kinesis: Stream default removal policy changed in v2.140.0 #30562

Open rittneje opened 5 months ago

rittneje commented 5 months ago

Describe the bug

30037 added a new removal_policy parameter the the aws_kinesis.Stream constructor.

This was a breaking change. Previously it defaulted to DESTROY but now it defaults to RETAIN. This causes all of our stacks to leak their Kinesis streams upon deletion.

This change was not mentioned in the release notes.

Expected Behavior

  1. The default value of removal_policy should have remained DESTROY.
  2. This change should have been mentioned in the release notes for v2.140.0.

Current Behavior

It defaults to RETAIN causing all Kinesis streams to leak.

Reproduction Steps

  1. Create a stack with a Kinesis stream under v2.137.0. Then delete it. Observe the Kinesis stream is deleted.
  2. Deploy the exact same stack under v2.143.0 without changing the cdk code. Then delete it. Observe the Kinesis stream is retained.

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.143.0 (build 9f2bdf7)

Framework Version

No response

Node.js Version

v22.2.0

OS

Alpine 3.20

Language

Python

Language Version

Python 3.12.3

Other information

No response

pahud commented 5 months ago

I see this

https://github.com/aws/aws-cdk/pull/30037/commits/24202124a6ca6e3605f801b9a30b6dd0dbdf5878

Looks like this would start applying removal policy based on props.removalPolicy which defaults to RETAIN and before that, no removalPolicy is applied.

Did you explicitly apply the policy to removal with props.removalPolicy undefined before that PR? Looks like in prior to this PR, the Stream construct would not apply any removalPolicy for the resource?

rittneje commented 5 months ago

@pahud previously, removal_policy was not a constructor property, and we were not calling apply_removal_policy. Thus the CloudFormation template had no DeletionPolicy or UpdateReplacePolicy, which from what I can tell then defaults to “Delete” within CloudFormation.

colifran commented 5 months ago

@rittneje Adding removalPolicy as a constructor prop happened because Stream was updated in the L1 CloudFormation spec update to be a stateful resource - https://github.com/cdklabs/awscdk-service-spec/pull/1056. We have rules set up in our codegen that will fail the build if the removalPolicy is not explicitly set on a stateful resource. Why was the default retain? We advertise that all CDK stateful resources have a removalPolicy as retain by default in our developer guide - https://docs.aws.amazon.com/cdk/v2/guide/resources.html#resources_removal. We need to maintain consistency here because not doing so invalidates what we advertise in our developer guide and introduces inconsistencies. Further discussion on this topic can be found in this PR - https://github.com/aws/aws-cdk/issues/12563 and this PR - https://github.com/aws/aws-cdk/pull/12920. The second of those took the same action, i.e., add the default removalPolicy as retain to stateful resources where this wasn't previously set explicitly. I'm sorry we didn't make this more apparent, but in order to get the build to pass as a a result of Stream being stateful we need to make the change in the PR itself.

rittneje commented 5 months ago

@pahud Where does CDK define the set of "stateful resources"? Evidently this is not a CloudFormation concept, since it does not give Kinesis streams a special default DeletionPolicy/UpdateReplacePolicy.

Also, going forward this (or any default value change) really needs to show up in the release notes as an explicit point, not hidden inside an innocuous-sounding item like "update L1 CloudFormation resource definitions".

colifran commented 5 months ago

@rittneje I agree with you. We need to come up with a better way to consolidate and list what is defined as a stateful resource. As of right now I'm not completely sure why Stream was changed to be stateful. I do see that they can be stateful but I'm not sure what changed made this suddenly get picked up - https://docs.aws.amazon.com/lambda/latest/dg/services-kinesis-windows.html. One of the workflows that runs in the awscdk-service-spec repository is to update stateful resources and Stream was marked when from isStateful: undefined to isStateful: true. I think at this point a CDK notice could bridge the gap since this was hidden behind "update L1 CloudFormation resource definitions".