aws-samples / aws-secure-environment-accelerator

The AWS Secure Environment Accelerator is a tool designed to help deploy and operate secure multi-account, multi-region AWS environments on an ongoing basis. The power of the solution is the configuration file which enables the completely automated deployment of customizable architectures within AWS without changing a single line of code.
Apache License 2.0

[BUG] [SM] Valid config.json (removed IAM users) failing on step function re-run #744

Closed caroseuk closed 3 years ago

caroseuk commented 3 years ago


Required Basic Info

Describe the bug

When I edit config.json and remove the users from the iam block in the "management" account configuration, the state machine fails. It appears that at least one user must be created for the state machine to succeed.

The code at the following location marks the iam object as optional, but the state machine still fails even when I remove the iam object from my config.json entirely: https://github.com/aws-samples/aws-secure-environment-accelerator/blob/4b4bd995417560a552af43d888f28ad13c024716/src/lib/common-config/src/index.ts#L578

In the meantime I have had to pass in a user to be created for the step function to succeed.

Failure Info

{"acceleratorVersion":"1.3.3","Status":"FAILED","FailedState":"Deploy Phase 0","ExecutionArn":"arn:aws:states:eu-west-2:############:execution:Prefix-CodeBuild_sm:ba5e9d96-e451-4082-8183-39271a671f82","Input":{"codeBuildProjectName":"Prefix-DeployPrebuilt","environment":{"ACCELERATOR_PHASE":"0","ACCELERATOR_PIPELINE_ROLE_NAME":"Prefix-L-SFN-MasterRole-C4086704","ACCELERATOR_STATE_MACHINE_NAME":"PALZ-MainStateMachine_sm","CONFIG_BRANCH_NAME":"main","STACK_OUTPUT_TABLE_NAME":"PALZ-Outputs","BOOTSTRAP_STACK_NAME":"PALZ-CDKToolkit","CONFIG_COMMIT_ID":"7edc15d4994d1b8429a4db6ac299dff29b3d6afb","SCOPE":"FULL","MODE":"APPLY","INSTALLER_VERSION":"1.3.3","CONFIG_REPOSITORY_NAME":"PALZ-Config-Repo","CDK_DEBUG":"0","CONFIG_ROOT_FILE_PATH":"config.json","ACCELERATOR_BASELINE":"ORGANIZATIONS","CONFIG_FILE_PATH":"raw/config.json"}},"InputDetails":{"Included":true},"Name":"ba5e9d96-e451-4082-8183-39271a671f82","StartDate":1622206104187,"StateMachineArn":"arn:aws:states:eu-west-2:############:stateMachine:PALZ-CodeBuild_sm","StopDate":1622206166085}
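
For triage, the most useful fields in a payload like the one above are the status, the failed state, and the phase. The following is a minimal Python sketch that pulls them out, assuming the payload is available as a single JSON string. The function name `summarize_failure` and the trimmed sample are illustrative only and are not part of ASEA.

```python
import json


def summarize_failure(payload: str) -> dict:
    """Extract the fields most useful for triage from a failure payload."""
    data = json.loads(payload)
    step_input = data.get("Input", {})
    env = step_input.get("environment", {})
    return {
        "version": data.get("acceleratorVersion"),
        "status": data.get("Status"),
        "failed_state": data.get("FailedState"),
        "phase": env.get("ACCELERATOR_PHASE"),
        "code_build_project": step_input.get("codeBuildProjectName"),
    }


# Trimmed-down sample with the same shape as the payload above.
sample = (
    '{"acceleratorVersion":"1.3.3","Status":"FAILED",'
    '"FailedState":"Deploy Phase 0",'
    '"Input":{"codeBuildProjectName":"Prefix-DeployPrebuilt",'
    '"environment":{"ACCELERATOR_PHASE":"0"}}}'
)
summary = summarize_failure(sample)
print(summary)
```

The phase and CodeBuild project name indicate which CodeBuild log to look at for the underlying error.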

Required files

Steps To Reproduce

  1. Open config.json and locate the mandatory-account-configs > management block.
  2. Ensure that the arrays in the iam block are empty:
"management": {
      "account-name": "Test ASEA",
      "__LOAD": "global/primary-email.json",
      "ou": "Core",
      "src-filename": "config.json",
      "budget": {
        "__LOAD": "global/budgets.json"
      },
      "s3-retention": 180,
      "limits": {},
      "iam": {
        "users": [],
        "policies": [],
        "roles": []
      },
      "deployments": {}
    },
  3. Commit the new config.json and trigger a full re-run of the state machine.

Expected behavior

Any users created using ASEA would now be removed from the management account.

Brian969 commented 3 years ago

Hi,

To troubleshoot this specific failure, we need the CloudWatch log from the CodeBuild session for the State Machine phase that failed. In that log, search for the first (not the last) occurrence of uppercase "FAIL"; it normally leads to the exact cause of the failure.
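
The search described above can be sketched in a few lines of Python, assuming the log has been exported to a list of text lines. The function name `first_fail` is hypothetical, and the sample log lines are invented for illustration; they are not real output from this failure.

```python
from typing import Iterable, Optional, Tuple


def first_fail(lines: Iterable[str]) -> Optional[Tuple[int, str]]:
    """Return (1-based line number, line) for the first line containing uppercase 'FAIL'."""
    for number, line in enumerate(lines, start=1):
        # Case-sensitive on purpose: lowercase 'failed' does not match.
        if "FAIL" in line:
            return number, line.rstrip("\n")
    return None


# Invented sample lines, for illustration only.
log = [
    "Stack deploy started",
    "resource update failed, retrying",
    "DELETE_FAILED | AWS::IAM::ManagedPolicy | ExamplePolicy",
    "UPDATE_ROLLBACK_FAILED | ExampleStack",
]
hit = first_fail(log)
print(hit)
```

Stopping at the first match matters because later "FAIL" lines are usually rollback noise caused by the original error.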

If I were to guess, you removed the users and the policies at the same time. CloudFormation, which CDK synthesizes to, is not smart enough to reverse the order of operations for removal versus creation, so it is likely trying to delete the policy before the user that uses it has been removed, and that blocks the operation. Removing the users, running the SM, then removing the policy and running the SM again may resolve your problem.
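
The two-step removal can be sketched as follows: a small Python helper that, given the current iam block, produces the two blocks to commit in order, each followed by a state machine run. This is only an illustration of the suggested ordering; `staged_iam_removal` is a hypothetical name, and the entries inside `users` and `policies` are placeholders rather than the real ASEA schema.

```python
import copy


def staged_iam_removal(iam: dict) -> list:
    """Return the iam blocks to commit, in order: users removed first, then policies."""
    step1 = copy.deepcopy(iam)
    step1["users"] = []       # first commit: drop the users, keep the policies
    step2 = copy.deepcopy(step1)
    step2["policies"] = []    # second commit: drop the now-unused policies
    return [step1, step2]


current = {
    "users": [{"name": "example-user"}],       # placeholder entry
    "policies": [{"name": "example-policy"}],  # placeholder entry
    "roles": [],
}
steps = staged_iam_removal(current)
for number, block in enumerate(steps, start=1):
    print(number, block)
```

The input block is left untouched, so the original config can still be compared against each step before committing.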

Once that is complete, you can simplify your config file to the following, with the same outcome:

"management": {
      "account-name": "Test ASEA",
      "__LOAD": "global/primary-email.json",
      "ou": "Core",
      "src-filename": "config.json",
      "budget": {
        "__LOAD": "global/budgets.json"
      },
      "s3-retention": 180
    },
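
To illustrate why the shorter form gives the same outcome, here is a rough Python sketch. It assumes that a missing iam key, or a missing list inside it, is treated as an empty list, which is what the optional iam object suggests. It does not reproduce ASEA's actual parsing code, and `effective_iam` is a hypothetical helper.

```python
def effective_iam(account: dict) -> dict:
    """Fill in empty lists for any iam section that the account config omits."""
    iam = account.get("iam") or {}
    return {key: list(iam.get(key, [])) for key in ("users", "policies", "roles")}


explicit = {
    "account-name": "Test ASEA",
    "iam": {"users": [], "policies": [], "roles": []},
}
omitted = {"account-name": "Test ASEA"}

same = effective_iam(explicit) == effective_iam(omitted)
print(same)
```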