aws-cloudformation / cloudformation-coverage-roadmap

The AWS CloudFormation Public Coverage Roadmap
https://aws.amazon.com/cloudformation/
Creative Commons Attribution Share Alike 4.0 International

Unexpected stack update rollback failures with status UPDATE_ROLLBACK_FAILED if AWS::CloudFormation::StackSet TemplateURL is used without S3 object version details #983

Open sreshtk opened 2 years ago

sreshtk commented 2 years ago

Name of the resource

AWS::CloudFormation::StackSet

Resource Name

No response

Issue Description

When a stack contains an AWS::CloudFormation::StackSet resource whose TemplateURL points to an object in your S3 bucket, a rollback after a failed update can itself fail. One possible cause is that the bucket holds more than one version of the template file named in TemplateURL and the current version contains errors. CloudFormation fetches whichever version of the template is current in S3 at rollback time, not the version that was in use before the failed update, so the rollback fails if that current version has any issues.

Expected Behavior

During rollback, CloudFormation uses the template version that previously worked, not the version that is current in the S3 bucket at that time.

Observed Behavior

During rollback, CloudFormation uses the template version that is current in the S3 bucket at that time, not the version that previously worked.

Test Cases

  1. Use the following in the AWS::CloudFormation::StackSet resource:

TemplateURL: 'https://test-bucket.s3.amazonaws.com/mytemplate.yaml'

  2. Upload another version of the same template (same file name) with an obvious error in the file, so that it is easy to spot during rollback. This version becomes the current version in the S3 bucket.

  3. Use a different TemplateURL to trigger an update to the stack and make sure it fails.

Once the update fails, the stack rollback should fail as well, reporting the error that was intentionally added in step 2. This indicates that CloudFormation used the template version that was current in the S3 bucket at the time, not the earlier version that worked (and is no longer current).

Other Details

As a workaround, specify the intended S3 object version in the TemplateURL property before performing an update, so that CloudFormation knows which version to roll back to if the update fails instead of picking whichever version is current at the time:

TemplateURL: 'https://test-bucket.s3.amazonaws.com/mytemplate-version.yaml?versionId=wkRa43xZyvPSOXfv7NEPPvRjon2TR669'
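
For anyone scripting this workaround, the following is a minimal sketch of how a versionId could be appended to a template URL before it is written into the template. The function name `pin_template_url` is hypothetical and not part of any AWS tooling; it only uses the Python standard library and assumes the version ID is already known.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def pin_template_url(template_url: str, version_id: str) -> str:
    """Return template_url with its versionId query parameter set to version_id.

    Hypothetical helper for illustration; not an AWS API.
    """
    parts = urlsplit(template_url)
    # Drop any existing versionId so the URL never carries two of them.
    query = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name != "versionId"
    ]
    query.append(("versionId", version_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(
        pin_template_url(
            "https://test-bucket.s3.amazonaws.com/mytemplate.yaml",
            "wkRa43xZyvPSOXfv7NEPPvRjon2TR669",
        )
    )
```

The result can be used as the TemplateURL value, so that every update, and any rollback that follows it, refers to one fixed object version rather than to whatever is current in the bucket.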

xiwhuang commented 2 years ago

In the resource handler, we currently have no way to retrieve the S3 object version unless the customer passes it in as part of TemplateURL.

This sounds more like a feature request, as the current behavior is expected.

To solve this, we would have to:

  1. modify the permissions of the resource handler so that it can fetch the existing object version
  2. break the handler contract, as we might have to modify the TemplateURL, which would then differ from what the customer passes in.
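
To illustrate what fetching the existing object version involves, here is a rough sketch written from a caller's point of view. It is not the resource handler's code. It assumes a virtual-hosted-style URL like the one in the issue, and an S3 client object such as the one returned by `boto3.client("s3")`, whose `head_object` response carries a `VersionId` field when bucket versioning is enabled. Both function names are hypothetical.

```python
from typing import Any, Optional, Tuple
from urllib.parse import urlsplit


def split_virtual_hosted_url(template_url: str) -> Tuple[str, str]:
    """Split a virtual-hosted-style S3 URL into (bucket, key).

    Simplification: assumes the host looks like <bucket>.s3... and that the
    bucket name itself does not contain the substring ".s3".
    """
    parts = urlsplit(template_url)
    bucket = parts.netloc.split(".s3", 1)[0]
    key = parts.path.lstrip("/")
    return bucket, key


def current_version_id(s3_client: Any, template_url: str) -> Optional[str]:
    """Look up the version ID of the object that is current right now.

    s3_client is assumed to expose head_object(Bucket=..., Key=...) the way
    a boto3 S3 client does. Returns None if the response has no VersionId,
    for example when versioning was never enabled on the bucket.
    """
    bucket, key = split_virtual_hosted_url(template_url)
    response = s3_client.head_object(Bucket=bucket, Key=key)
    return response.get("VersionId")
```

The caller needs read access to the object for the lookup to succeed, which is the permissions change mentioned in point 1.
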

owrm commented 2 years ago

The issue is that the current rollback behavior of AWS::CloudFormation::StackSet differs from that of AWS::CloudFormation::Stack resources. Both have a TemplateURL, but during rollback the Stack uses the previous version of the template that was used. Whether this is done by referring to an older version of the object stored at the TemplateURL, or by storing the previous template somewhere and pulling it in during rollback, is something I'm not aware of.

The StackSet, when rolling back, always uses the template currently stored at TemplateURL (there is no version in the URL in our use case). This causes problems when a StackSet resource update fails: the stack in which the StackSet is defined enters rollback, but the exact same template that caused the original update to fail is used for the rollback, so the stack ends up in the UPDATE_ROLLBACK_FAILED state.

We expect StackSet resources to revert to the last known good template the same way Stack resources do.