aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0
11.68k stars 3.93k forks source link

(cli): changeset-based `cdk diff` not showing changes caused by SSM parameters, but appear in CloudFormation UI changeset info #29731

Open blimmer opened 7 months ago

blimmer commented 7 months ago

Describe the bug

Currently if you use a Systems Manager StringParameter as a value to another resource, the cdk diff does not identify a diff when the underlying parameter changes, even when using the new behavior that produces a CloudFormation changeset (vs. template-only diffs). The new behavior to produce a real changeset is designed to identify these types of changes.

If I manually create a changeset in the console, it does identify the change, so this feels like an issue specific to the CDK changeset diffing behavior.

Expected Behavior

I expect to be notified that my stack will change, just like I am if I upload the template to the CloudFormation UI. This bug is especially concerning if the change could trigger an unexpected resource replacement (which makes me think this should be a P1 issue).

Current Behavior

cdk diff says There were no differences when, really, CloudFormation generates a changeset that shows a diff.

Reproduction Steps

Consider the following straightforward stack:

import * as cdk from 'aws-cdk-lib';
import { Queue } from 'aws-cdk-lib/aws-sqs';
import { StringParameter } from 'aws-cdk-lib/aws-ssm';
import { Construct } from 'constructs';

export class CdkBugReportStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const queueNameFromParameter = StringParameter.valueForStringParameter(this, '/cdk/test/queue-name-param');
    new Queue(this, "Queue", {
      queueName: queueNameFromParameter
    })
  }
}

It imports an SSM StringParameter and uses the resolved value to name the queue.

  1. Create an SSM StringParameter with an initial value. For this repro, I'll call it blimmer-test-1.

Screenshot_2024-04-04_at_14_31_24

  1. Deploy the example stack above (cdk deploy CdkBugReportStack). You'll see your queue is cre ated with the name of your StringParameter from step 1.

Screenshot_2024-04-04_at_14_32_49

  1. Now, edit the StringParameter from step 1. I updated the value from blimmer-test-1 to blimmer-test-2.

Screenshot_2024-04-04_at_14_33_04

  1. Run cdk diff CdkBugReportStack. Make sure you're using the latest CDK (at time of writing v2.135.0) and that you're generating a changeset for the diff (e.g., not passing --no-change-set).

    > npx cdk diff CdkBugReportStack
    Stack CdkBugReportStack
    Hold on while we create a read-only change set to get a diff with accurate replacement information (use --no-change-set to use a less accurate but faster template-only diff)
    There were no differences
    
    ✨  Number of stacks with differences: 0

    As you can see, no diffs are identified (even though the underlying parameter did change).

  2. Generate the CloudFormation stack for use in the console in the next steps:

    > npx cdk synth -j CdkBugReportStack > stack.json
  3. Visit the CloudFormation service page in the AWS Console. Select the stack, CdkBugReportStack.

  4. Choose "Stack Actions" -> "Create change set for current stack".

Screenshot_2024-04-04_at_14_49_59

  1. Replace the template with the one you generated in step 5.

Screenshot_2024-04-04_at_14_51_24

  1. Keep clicking "next" and, finally, submit, without changing any other values.
  2. Now, once the changeset is rendered, you'll see that it identifies that the SSM parameter has changed and will result in a replacement of the queue. This is what I expect aws-cdk to report, too.

Screenshot_2024-04-04_at_14_56_05

  1. Now you can also run npx cdk deploy CdkBugReportStack and you'll see that the queue is replaced, even though the diff said no changes were detected.
> npx cdk deploy

✨  Synthesis time: 1.84s

CdkBugReportStack:  start: Building 3514dbc73317309e7175dd9ffc618c6a7b4f814c0383edf38308e14ddaf77d81:current_account-current_region
CdkBugReportStack:  success: Built 3514dbc73317309e7175dd9ffc618c6a7b4f814c0383edf38308e14ddaf77d81:current_account-current_region
CdkBugReportStack:  start: Publishing 3514dbc73317309e7175dd9ffc618c6a7b4f814c0383edf38308e14ddaf77d81:current_account-current_region
CdkBugReportStack:  success: Published 3514dbc73317309e7175dd9ffc618c6a7b4f814c0383edf38308e14ddaf77d81:current_account-current_region
CdkBugReportStack: deploying... [1/1]
CdkBugReportStack: creating CloudFormation changeset...

 ✅  CdkBugReportStack

✨  Deployment time: 43.33s

Stack ARN:
arn:aws:cloudformation:us-west-2:REDACTED:stack/CdkBugReportStack/6b883590-f2c2-11ee-afe3-025d26e8baa1

✨  Total time: 45.17s

Possible Solution

It feels like if CloudFormation can identify this change, CDK should also be able to identify the change when running the more accurate changeset-based diff.

Additional Information/Context

I was confused why my services were sometimes redeploying when no diffs were shown via cdk diff. It turned out that my problem was with obtainDefaultFluentBitECRImage, which obtains the fluent bit image via an SSM parameter (docs). When the underlying parameter changed, it caused my task definitions and services to be updated.

Linking this up with https://github.com/aws/aws-cdk/issues/7366 and https://github.com/aws/aws-cdk/issues/23288, which are related to the specific issue I ran into here.

CDK CLI Version

2.135.0 (build d46c474)

Framework Version

No response

Node.js Version

20 LTS

OS

MacOS

Language

TypeScript

Language Version

No response

Other information

No response

blimmer commented 7 months ago

cc @comcalvi who has been working on the CFN changeset diffs lately

pahud commented 7 months ago

Yes you are right. Looks like cdk diff would not detect the parameter value change but the CFN console would. Making it a p1 and we'll discuss with the team.

bergjaak commented 7 months ago

Using the example shared by @blimmer, the root cause appears to be that the DifferenceCollection.logicalIds (https://github.com/aws/aws-cdk/blob/main/packages/@aws-cdk/cloudformation-diff/lib/diff/types.ts#L406) doesn't contain all the changed resources. Before hitting this line (406) of code in cdk diff, we filter for replacements -- and we do find the SqsQueue replacement https://github.com/aws/aws-cdk/blob/main/packages/@aws-cdk/cloudformation-diff/lib/diff-template.ts#L233. So we should make the TemplateDiff.resources contain the changed resource.

Here are the debugger variables at line 233: Screenshot 2024-04-22 at 2 04 58 PM

github-actions[bot] commented 6 months ago

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.

mbonig commented 1 month ago

Just here to 👍🏻 this issue as it is also a problem with EC2 instances where AMIs are often driven by SSM parameters (like new WindowsImage(WindowsVersion.WINDOWS_SERVER_2019_ENGLISH_FULL_BASE)) which causes a replacement of the EC2 instance despite cdk diff showing no changes.

kaizencc commented 1 month ago

https://github.com/aws/aws-cdk/pull/30093 fixed this but was reverted by https://github.com/aws/aws-cdk/pull/30243