aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

Deployment fails with no actionable error feedback #29367

Open anentropic opened 5 months ago

anentropic commented 5 months ago

Describe the bug

When I try to deploy my stack I get the following error:

✨  Synthesis time: 137.93s

mystack-website-eu-qa: deploying... [1/1]
mystack-website-eu-qa: creating CloudFormation changeset...

 ❌  mystack-website-eu-qa failed: Error: The stack named mystack-website-eu-qa failed creation, it may need to be manually deleted from the AWS console: ROLLBACK_COMPLETE
    at FullCloudFormationDeployment.monitorDeployment (/Users/anentropic/.nvm/versions/node/v18.18.0/lib/node_modules/aws-cdk/lib/index.js:427:10615)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Object.deployStack2 [as deployStack] (/Users/anentropic/.nvm/versions/node/v18.18.0/lib/node_modules/aws-cdk/lib/index.js:430:196919)
    at async /Users/anentropic/.nvm/versions/node/v18.18.0/lib/node_modules/aws-cdk/lib/index.js:430:178888

 ❌ Deployment failed: Error: The stack named mystack-website-eu-qa failed creation, it may need to be manually deleted from the AWS console: ROLLBACK_COMPLETE
    at FullCloudFormationDeployment.monitorDeployment (/Users/anentropic/.nvm/versions/node/v18.18.0/lib/node_modules/aws-cdk/lib/index.js:427:10615)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Object.deployStack2 [as deployStack] (/Users/anentropic/.nvm/versions/node/v18.18.0/lib/node_modules/aws-cdk/lib/index.js:430:196919)
    at async /Users/anentropic/.nvm/versions/node/v18.18.0/lib/node_modules/aws-cdk/lib/index.js:430:178888

The stack named mystack-website-eu-qa failed creation, it may need to be manually deleted from the AWS console: ROLLBACK_COMPLETE

Expected Behavior

It either just works (better) or, if it fails, it returns some information about what went wrong (acceptable)

Current Behavior

Notably the message contains no information about what went wrong. Only that the tool may have also left some crap lying around that I have to clean up manually.

Reproduction Steps

I doubt it is possible to make a minimal reproduction, and I certainly don't have the time or resources to whittle down my stack in a separate environment, sorry 😢

Possible Solution

Presumably, somewhere near the code in the traceback, cdk is swallowing the actual error and replacing it with this generic message.

So maybe don't do that.

Additional Information/Context

No response

CDK CLI Version

2.131.0 (build 92b912d)

Framework Version

No response

Node.js Version

v18.18.0

OS

macOS 14.3.1

Language

Python

Language Version

3.11.5

Other information

No response

anentropic commented 5 months ago

Some more info...

By making other changes to my stack I was able to get one of these panels, with a confirmation prompt:

IAM Statement Changes
┌───┬───────────────────────────────────┬────────┬─────────────────┬───────────────────────────────────┬─────────────────────────────────────┐
│   │ Resource                          │ Effect │ Action          │ Principal                         │ Condition                           │
├───┼───────────────────────────────────┼────────┼─────────────────┼───────────────────────────────────┼─────────────────────────────────────┤
│ + │ ${LogsBucket.Arn}                 │ Allow  │ s3:GetBucketAcl │ Service:delivery.logs.amazonaws.c │                                     │
│   │                                   │        │                 │ om                                │                                     │
└───┴───────────────────────────────────┴────────┴─────────────────┴───────────────────────────────────┴─────────────────────────────────────┘
(NOTE: There may be security-related changes not in this list. See https://github.com/aws/aws-cdk/issues/1299)

Do you wish to deploy these changes (y/n)?

After confirming, I do see a meaningful error displayed briefly while the deployment is still in progress.

But, after it finishes, the progress text is cleared away and all I see is the generic message from my original post.

I think this is the root of the problem: the "in progress" error messages, which are the ones that contain the actual error info, get cleared away once the deploy finishes failing. You have to watch the deploy like a hawk to spot an error that is only displayed for a few seconds.

They are also not displayed at all when there is no "IAM Statement Changes" confirmation prompt.

pahud commented 5 months ago

Are you able to scroll up and see the deploy error messages?

You may also try cdk deploy -R or cdk deploy --no-rollback. The deployment should then stop at the failed resource instead of rolling back, leaving everything in place for your troubleshooting.
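If the failure messages are gone from the terminal, the stack events recorded by CloudFormation usually still hold them until the stack is deleted. The following is a minimal sketch, not part of the CDK CLI: `failed_events` and `fetch_stack_events` are hypothetical helper names, and the sketch assumes boto3 is installed with credentials for the target account. `describe_stack_events` is the real boto3 call; it returns events newest first, and only the first page is read here.

```python
def failed_events(stack_events):
    """Return (logical id, status, reason) for each failed event, oldest first.

    `stack_events` is a list of dicts shaped like the "StackEvents" entries
    returned by CloudFormation's DescribeStackEvents API.
    """
    failures = [
        (
            event["LogicalResourceId"],
            event["ResourceStatus"],
            event.get("ResourceStatusReason", ""),
        )
        for event in stack_events
        if event["ResourceStatus"].endswith("_FAILED")
    ]
    # The API lists the newest event first; reverse so that the first
    # failure (usually the root cause) comes first.
    failures.reverse()
    return failures


def fetch_stack_events(stack_name):
    """Fetch the first page of events for a stack (assumes boto3 is available)."""
    import boto3  # third-party; imported lazily so failed_events works without it

    client = boto3.client("cloudformation")
    return client.describe_stack_events(StackName=stack_name)["StackEvents"]


# Example usage (requires AWS credentials):
#   for logical_id, status, reason in failed_events(fetch_stack_events("mystack-website-eu-qa")):
#       print(logical_id, status, reason)
```

The earliest failed event tends to carry the real reason; later ones are often just "Resource creation cancelled" entries for resources that were in flight when the first one failed.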

anentropic commented 5 months ago

> Are you able to scroll up and see the deploy error messages?

No, that's what I'm saying - there's nothing to scroll up to, because they get overwritten in the terminal during the deploy process. They are only shown as a sort of progress message for individual steps of the deployment, but when it finishes that gets wiped away and just a generic summary is displayed.
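One possible workaround for the overwriting described above is to run the deploy through a small wrapper that copies every output line to a file, so anything printed transiently can be read afterwards. This is only a sketch under assumptions: `run_and_log` is a hypothetical helper, not a CDK feature, and it only helps if the CLI actually writes the per-resource errors to stdout or stderr when its output is piped.

```python
import subprocess
import sys


def run_and_log(command, log_path):
    """Run `command`, echo each output line, and also write it to `log_path`.

    stderr is merged into stdout so both streams end up in the log.
    Returns the exit code of the command.
    """
    with open(log_path, "w", encoding="utf-8") as log:
        proc = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        for line in proc.stdout:
            sys.stdout.write(line)  # still show progress live
            log.write(line)         # keep a permanent copy
        return proc.wait()


# Example usage (assumes the cdk CLI is on PATH):
#   exit_code = run_and_log(["cdk", "deploy", "--require-approval", "never"], "deploy.log")
```

Because the wrapper's pipe is not an interactive terminal, the confirmation prompt cannot be answered, hence `--require-approval never` in the example. Adding `--progress events` to the command may also make the CLI print each stack event on its own line rather than redrawing a progress bar, if your CLI version supports it.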

anentropic commented 4 months ago

Here is a video demonstrating the problem in action...

https://github.com/aws/aws-cdk/assets/147840/f8ee6b56-cfcd-4f38-a447-3c43a27cdcb4

The real error ("Export with name ifm-ssa-loadbalancer-dns-name-qa-eu is already exported by stack") is only displayed briefly before it disappears and is replaced by a generic stack trace. In fact I had to capture the screen recording to get a good look at the error.

IkeNefcy commented 4 months ago

Does CFN show an error? I'm assuming the ask is that you shouldn't have to look at CFN in the first place, but wondering if there is an error or not because if not maybe it's not CDK at fault.

anentropic commented 4 months ago

> Does CFN show an error? I'm assuming the ask is that you shouldn't have to look at CFN in the first place, but wondering if there is an error or not because if not maybe it's not CDK at fault.

Unfortunately I've since destroyed and redeployed the stack, so it's hard to know.

But I would assume that cdk deploy is making CloudFormation API requests under the hood, rather than shelling out to another tool?

Seems to me the issue is with how error output is printed to the terminal rather than with the source of the error.