aws-samples / aws-secure-environment-accelerator

The AWS Secure Environment Accelerator is a tool designed to help deploy and operate secure multi-account, multi-region AWS environments on an ongoing basis. The power of the solution is the configuration file which enables the completely automated deployment of customizable architectures within AWS without changing a single line of code.
Apache License 2.0
726 stars 233 forks

[BUG][Functional] Phase 2 Failure after account suspension #1229

Open para0056 opened 5 months ago

para0056 commented 5 months ago

Bug reports which fail to provide the required information will be closed without action.

Required Basic Info

Describe the bug

The State Machine fails at Phase 2 after an AWS account is suspended, when an account with a long name exists in the same OU.

Failure Info

Error in Phase 2:

Error: The stack named PBITAccel-WrkCcoeLakehouseItbstrategy1234-Phase2 failed to deploy: UPDATE_ROLLBACK_COMPLETE: Resource handler returned message: "Security Group with App_sg already exists" (RequestToken: 6c5c80d7-902c-3328-9845-b423bc22f28b, HandlerErrorCode: AlreadyExists), Resource handler returned message: "Security Group with Web_sg already exists" (RequestToken: 9f965975-9037-8da8-84a7-abda4a8802f9, HandlerErrorCode: AlreadyExists), Resource handler returned message: "Security Group with Mgmt_sg already exists" (RequestToken: e21934d1-331a-d4ed-bd23-077c1dbb2ec2, HandlerErrorCode: AlreadyExists), Resource handler returned message: "Security Group with Data_sg already exists" (RequestToken: 0b9277f0-6c1e-a431-14b2-f6d5f301679d, HandlerErrorCode: AlreadyExists), Embedded stack arn:aws:cloudformation:ca-central-1:############:stack/PBITAccel-WrkCcoeLakehouseItbstrategy1234-P-SecurityGroupsTestShared2NestedStackSecuri-DLH5XBLW9U8P/96a21e70-21bc-11ef-879b-022ef219b75f was not successfully created: The following resource(s) failed to create: [SecurityGroupsSharedAccount2DataFBF92784, SecurityGroupsSharedAccount2Web23A19CB1, SecurityGroupsSharedAccount2Mgmt3C1EBEDE, SecurityGroupsSharedAccount2App7B023DB2].
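For triage, the conflicting security group names can be pulled out of an error like the one above. This is a minimal sketch that assumes the message keeps the wording `Security Group with <name> already exists` and `The stack named <name> failed to deploy`; the function names are hypothetical and are not part of ASEA.

```python
import re
from typing import List, Optional

# Patterns based on the wording of the error message in this report.
_SG_CONFLICT = re.compile(r"Security Group with (\S+) already exists")
_STACK_NAME = re.compile(r"The stack named (\S+) failed to deploy")


def conflicting_security_groups(error_text: str) -> List[str]:
    """Return the security group names reported as already existing.

    Names are returned in order of first appearance, without duplicates.
    """
    names: List[str] = []
    for name in _SG_CONFLICT.findall(error_text):
        if name not in names:
            names.append(name)
    return names


def failed_stack_name(error_text: str) -> Optional[str]:
    """Return the name of the stack that failed to deploy, if present."""
    match = _STACK_NAME.search(error_text)
    return match.group(1) if match else None
```

Passing the full Phase 2 error text to `conflicting_security_groups` gives the list of SGs to look for in the affected account before re-running the main SM.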

Required files N/A

Steps To Reproduce

  1. Create a workload account in an OU that has security groups defined in the ASEA config.json.
     a. Ensure that the main SM runs.
  2. Create a new workload account with a long account name (at least 35 chars) in the same OU as the account created in step 1.
     a. E.g. workload-new-account-cra-test-1234
     b. Ensure that the main SM runs.
  3. Attempt to close the account created in step 1 using the documented process (https://aws-samples.github.io/aws-secure-environment-accelerator/latest/faq/#11-operational-activities).
  4. Once the account is suspended via AWS Organizations, run the main State Machine. It will fail at Phase 2.
     a. The errors seem to indicate that it is trying to deploy another instance of a nested Phase 2 stack in the long-name account, specifically the nested stack responsible for creating the default SGs in the account (Web_sg, App_sg, Data_sg, Mgmt_sg). This stack fails because the SGs were already created in this account, and SG names must be unique in an account/region pair. To get around this error, we need to delete the SGs from account #1 and re-run the main SM.
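Before suspending an account, it may help to check whether any account in the same OU has a name long enough to match the condition in step 2. The sketch below is illustrative only: the 35-character threshold is taken from the reproduction steps in this report and is not a documented ASEA limit, and the function name and the input shape (dicts with `Id` and `Name` keys, similar to what AWS Organizations returns for an account) are assumptions.

```python
from typing import Iterable, List, Mapping

# Assumption: threshold taken from the reproduction steps in this report,
# not from a documented ASEA limit.
REPORTED_LONG_NAME_LENGTH = 35


def find_long_account_names(
    accounts: Iterable[Mapping[str, str]],
    min_length: int = REPORTED_LONG_NAME_LENGTH,
) -> List[Mapping[str, str]]:
    """Return accounts whose name has at least `min_length` characters.

    Each account is a mapping with "Id" and "Name" keys. Results are
    sorted with the longest names first.
    """
    flagged = [a for a in accounts if len(a["Name"]) >= min_length]
    return sorted(flagged, key=lambda a: len(a["Name"]), reverse=True)


if __name__ == "__main__":
    # Hypothetical accounts for illustration only.
    example_accounts = [
        {"Id": "111111111111", "Name": "workload-a"},
        {"Id": "222222222222", "Name": "workload-with-a-very-long-descriptive-name"},
    ]
    for account in find_long_account_names(example_accounts):
        print(account["Id"], account["Name"], len(account["Name"]))
```

The threshold is a parameter so that it can be adjusted once the actual name length limit is documented.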

Expected behavior

The State Machine should run successfully after an account is suspended.

Screenshots N/A

Additional context N/A

archikierstead commented 4 months ago

Thank you for bringing this issue to our attention and making AWS aware of the problem. While we acknowledge its existence, we have not been able to reproduce it in any of our testing environments to date. We want to make you aware that the Landing Zone Accelerator for AWS (LZA), which you will be upgrading to, does not employ the same mechanisms that have name length constraints, so this issue will be resolved once the upgrade to LZA is complete. In the meantime, we will update the ASEA documentation to clarify the limit on account name lengths in the configuration file. Our current focus is on providing customers a smooth upgrade experience to LZA, which is currently in the testing phase, and we look forward to providing an update once that testing has concluded.