benkehoe / aws-sso-util

Smooth out the rough edges of AWS SSO (temporarily, until AWS makes it better).
Apache License 2.0

Enabling Child Stacks #64

Closed awslyall closed 2 years ago

awslyall commented 2 years ago

We have been using aws-sso-util successfully for almost a year; however, the number of assignments has now grown to almost 500. Because of this, we understand that we now need to enable the creation of child stacks. The error indicating the need to enable child stacks can be seen in the CloudWatch logs.

The macro has also been updated to the most current version.

We have tried both of the documented methods for creating child stacks, each on its own, by setting the appropriate environment variable on the Lambda function: 1) setting NUM_CHILD_STACKS (e.g. to 10); 2) setting MAX_ASSIGNMENTS_ALLOCATION (e.g. to 1000).
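For readers who want to script this step rather than edit the function in the console, here is a minimal sketch in Python using boto3. It assumes the variable names quoted above; the function name passed to `set_macro_variables` is hypothetical and must be replaced with the name of the deployed macro function. Because `update_function_configuration` replaces the whole variable map, the sketch merges the overrides into the existing values first.

```python
def merged_environment(current, overrides):
    """Return a copy of the current variables with the overrides applied."""
    merged = dict(current)
    # Lambda environment variable values must be strings.
    merged.update({key: str(value) for key, value in overrides.items()})
    return merged


def set_macro_variables(function_name, overrides):
    """Apply overrides to a Lambda function without dropping other variables."""
    # boto3 is imported lazily so merged_environment stays usable without it.
    import boto3

    client = boto3.client("lambda")
    config = client.get_function_configuration(FunctionName=function_name)
    current = config.get("Environment", {}).get("Variables", {})
    client.update_function_configuration(
        FunctionName=function_name,
        Environment={"Variables": merged_environment(current, overrides)},
    )


# Hypothetical usage; "my-sso-macro-function" is a placeholder name:
# set_macro_variables("my-sso-macro-function", {"NUM_CHILD_STACKS": 10})
```

Only one of the two variables needs to be overridden for a given run, matching the two independent methods described above.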

All other environment variables are set to their defaults, except CHILD_TEMPLATES_IN_YAML and LOOKUP_NAMES, which are both set to TRUE.

We are seeing the error below in the CloudFormation events when we try to run the macro with child stacks. The error is consistent even when we try to create resources with fewer than 500 assignments. There are no obvious errors in the CloudWatch logs.

S3 error: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint. For more information check http://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html

Please can you assist?

awslyall commented 2 years ago

Just as a small update: I am seeing the same behaviour if CHILD_TEMPLATES_IN_YAML and LOOKUP_NAMES are set to their default values of False.

benkehoe commented 2 years ago

This doesn't seem like it's a problem with those parameters, but with the macro itself. What's in the CloudWatch Logs for the macro lambda function?

awslyall commented 2 years ago

Hi

Thank you for getting back to me.

I had completed some cleanup of unnecessary SSO assignments, so I implemented a breaking change (assignments > 500) to gather the required detail.

On the first run I had both NUM_CHILD_STACKS and MAX_ASSIGNMENTS_ALLOCATION set to -1. The CloudWatch logs are below.


timestamp message
1651562314742 START RequestId: ff11216a-28f1-40dc-a616-29925aa86bd8 Version: $LATEST
1651562314751 [INFO] 2022-05-03T07:18:34.751Z ff11216a-28f1-40dc-a616-29925aa86bd8 Initializing handler
1651562314997 [INFO] 2022-05-03T07:18:34.997Z ff11216a-28f1-40dc-a616-29925aa86bd8 Extracting resources from template
1651562325068 [INFO] 2022-05-03T07:18:45.068Z ff11216a-28f1-40dc-a616-29925aa86bd8 Generated 451 assignments from 60 resources
1651562354887 [ERROR] 2022-05-03T07:19:14.886Z ff11216a-28f1-40dc-a616-29925aa86bd8 An error occurred: Too many assignments (3) to fit into template, specify a number of child stacks Traceback (most recent call last): File "/var/task/aws_sso_util/cfn_lib/macro.py", line 257, in handler parent_template = templates.resolve_templates( File "/var/task/aws_sso_util/cfn_lib/templates.py", line 282, in resolve_templates raise ValueError(f"Too many assignments ({len(assignments)}) to fit into template, specify a number of child stacks") ValueError: Too many assignments (3) to fit into template, specify a number of child stacks
1651562354898 END RequestId: ff11216a-28f1-40dc-a616-29925aa86bd8
1651562354898 REPORT RequestId: ff11216a-28f1-40dc-a616-29925aa86bd8 Duration: 40153.84 ms Billed Duration: 40154 ms Memory Size: 1024 MB Max Memory Used: 76 MB Init Duration: 729.76 ms

I then ran the transform again, but this time I set MAX_ASSIGNMENTS_ALLOCATION to 1000 by editing the environment variable on the Lambda function. The CloudWatch logs are below.


timestamp message
1651562849424 START RequestId: 49d26797-3368-4eb9-86f6-6a4d2fc3c9d6 Version: $LATEST
1651562849432 [INFO] 2022-05-03T07:27:29.432Z 49d26797-3368-4eb9-86f6-6a4d2fc3c9d6 Initializing handler
1651562849690 [INFO] 2022-05-03T07:27:29.689Z 49d26797-3368-4eb9-86f6-6a4d2fc3c9d6 Extracting resources from template
1651562859640 [INFO] 2022-05-03T07:27:39.639Z 49d26797-3368-4eb9-86f6-6a4d2fc3c9d6 Generated 451 assignments from 60 resources
1651562869291 [INFO] 2022-05-03T07:27:49.291Z 49d26797-3368-4eb9-86f6-6a4d2fc3c9d6 Writing 87 child templates
1651562872554 END RequestId: 49d26797-3368-4eb9-86f6-6a4d2fc3c9d6
1651562872554 REPORT RequestId: 49d26797-3368-4eb9-86f6-6a4d2fc3c9d6 Duration: 23130.30 ms Billed Duration: 23131 ms Memory Size: 1024 MB Max Memory Used: 83 MB Init Duration: 706.48 ms

As you can see, the second run shows no errors in the CloudWatch logs. The only error appears in the CloudFormation stack events: "S3 error: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint. For more information check http://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html"

In terms of other information that may be useful:

awslyall commented 2 years ago

Using the CLI to describe the CloudFormation events, I can see:

{
  "StackId": "arn:aws:cloudformation:eu-west-1:12345678910:stack/sso-config-stack-12345678910-eu-west-1/9313dcf0-f693-11eb-b31d-066b77039e79",
  "EventId": "ExampleAssignmentGroup000-CREATE_FAILED-2022-05-10T04:45:30.854Z",
  "StackName": "sso-config-stack-12345678910-eu-west-1",
  "LogicalResourceId": "ExampleAssignmentGroup000",
  "PhysicalResourceId": "",
  "ResourceType": "AWS::CloudFormation::Stack",
  "Timestamp": "2022-05-10T04:45:30.854000+00:00",
  "ResourceStatus": "CREATE_FAILED",
  "ResourceStatusReason": "S3 error: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.\nFor more information check http://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html",
  "ResourceProperties": "{\"TemplateURL\":\"https://s3.amazonaws.com/aws-sso-util-macroartifactbucket-abcdefghi/templates/2022-05-10T04:44_1d58d5fa-8c3a-415b-97aa-9231ef533bbd/ExampleAssignmentGroup/ExampleAssignmentGroup-000.json\",\"Parameters\":{\"InstanceArn\":\"arn:aws:sso:::instance/ssoins-10987654321\",\"ReadOnlyPermissionSet\":\"arn:aws:sso:::permissionSet/ssoins-10987654321/ps-71b7e056cb4952d5\"}}"
},
{
  "StackId": "arn:aws:cloudformation:eu-west-1:12345678910:stack/sso-config-stack-12345678910-eu-west-1/9313dcf0-f693-11eb-b31d-066b77039e79",
  "EventId": "ExampleAssignmentGroup000-CREATE_IN_PROGRESS-2022-05-10T04:45:29.208Z",
  "StackName": "sso-config-stack-12345678910-eu-west-1",
  "LogicalResourceId": "ExampleAssignmentGroup000",
  "PhysicalResourceId": "",
  "ResourceType": "AWS::CloudFormation::Stack",
  "Timestamp": "2022-05-10T04:45:29.208000+00:00",
  "ResourceStatus": "CREATE_IN_PROGRESS",
  "ResourceProperties": "{\"TemplateURL\":\"https://s3.amazonaws.com/aws-sso-util-macroartifactbucket-abcdefghi/templates/2022-05-10T04:44_1d58d5fa-8c3a-415b-97aa-9231ef533bbd/ExampleAssignmentGroup/ExampleAssignmentGroup-000.json\",\"Parameters\":{\"InstanceArn\":\"arn:aws:sso:::instance/ssoins-10987654321\",\"ReadOnlyPermissionSet\":\"arn:aws:sso:::permissionSet/ssoins-10987654321/ps-71b7e056cb4952d5\"}}"
},

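When there are many events, the failing template URL can be pulled out programmatically. The following is a rough sketch, assuming the events have already been loaded into a Python list of dictionaries shaped like the CLI output above. The helper name `failed_template_urls` and the sample values are illustrative only. Note that `ResourceProperties` is itself a JSON document encoded as a string, so it needs a second decoding step.

```python
import json


def failed_template_urls(events):
    """Yield (logical id, reason, template URL) for each failed nested stack."""
    for event in events:
        if event.get("ResourceStatus") != "CREATE_FAILED":
            continue
        if event.get("ResourceType") != "AWS::CloudFormation::Stack":
            continue
        # ResourceProperties is a JSON string nested inside the event.
        properties = json.loads(event.get("ResourceProperties") or "{}")
        yield (
            event["LogicalResourceId"],
            event.get("ResourceStatusReason", ""),
            properties.get("TemplateURL"),
        )


# Illustrative sample data, not real output.
sample_events = [
    {
        "LogicalResourceId": "ExampleAssignmentGroup000",
        "ResourceType": "AWS::CloudFormation::Stack",
        "ResourceStatus": "CREATE_FAILED",
        "ResourceStatusReason": "S3 error: example reason",
        "ResourceProperties": json.dumps(
            {"TemplateURL": "https://s3.amazonaws.com/example-bucket/templates/example-000.json"}
        ),
    },
    {
        "LogicalResourceId": "ExampleAssignmentGroup000",
        "ResourceType": "AWS::CloudFormation::Stack",
        "ResourceStatus": "CREATE_IN_PROGRESS",
        "ResourceProperties": "{}",
    },
]

for logical_id, reason, url in failed_template_urls(sample_events):
    print(logical_id, reason, url, sep="\n  ")
```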
benkehoe commented 2 years ago

I see the template URL is not being accessed using virtual host style URLs. If you try to access that URL directly (e.g., with the AWS CLI aws s3 cp) does it work? If you change it to virtual host style, does it work?
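To make the distinction concrete: a path-style URL puts the bucket name in the path (`https://s3.amazonaws.com/<bucket>/<key>`), while a virtual-hosted-style URL puts it in the host name (`https://<bucket>.s3.amazonaws.com/<key>`). Below is a small sketch of the conversion, assuming the input really is a path-style S3 URL. The function name and the optional `region` argument are illustrative and not part of aws-sso-util.

```python
from urllib.parse import urlsplit, urlunsplit


def to_virtual_hosted_style(url, region=None):
    """Rewrite a path-style S3 URL so the bucket name moves into the host."""
    parts = urlsplit(url)
    # In path-style URLs the first path segment is the bucket name.
    bucket, _, key = parts.path.lstrip("/").partition("/")
    if not bucket:
        raise ValueError(f"No bucket name found in the path of {url!r}")
    # Optionally target a regional endpoint instead of the global one.
    endpoint = f"s3.{region}.amazonaws.com" if region else "s3.amazonaws.com"
    return urlunsplit(
        (parts.scheme, f"{bucket}.{endpoint}", "/" + key, parts.query, parts.fragment)
    )


path_style = "https://s3.amazonaws.com/example-bucket/templates/example-000.json"
print(to_virtual_hosted_style(path_style))
# https://example-bucket.s3.amazonaws.com/templates/example-000.json
print(to_virtual_hosted_style(path_style, region="eu-west-1"))
# https://example-bucket.s3.eu-west-1.amazonaws.com/templates/example-000.json
```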

benkehoe commented 2 years ago

Can you follow the macro deploy instructions but after cloning the repo, update this line to

f"https://{BUCKET_NAME}.s3.amazonaws.com/{s3_base_path}",

and see if that makes it work?

awslyall commented 2 years ago

Hi Ben

Thank you for reaching out, I apologise for not coming back to you sooner, but I was taking a few rest days.

Here is a summary of the actions I took:

  1. Redeployed the existing SSO configuration to confirm there was no existing error (we have 432 assignments and permission sets deployed in the stack)
  2. Emptied the sam-cli bucket
  3. Emptied the aws-sso-util macro artifact bucket
  4. Deleted stack - aws-sam-cli-managed-default
  5. Deleted stack - aws-sso-util
  6. Deleted the local copy of aws-sso-util
  7. Cloned a new copy of aws-sso-util from GitHub
  8. Updated the line as per your instructions (f"https://{BUCKET_NAME}.s3.amazonaws.com/{s3_base_path}",)
  9. Removed any existing Docker images associated with aws-sso-util
  10. Redeployed the application:
      sam build --use-container
      sam deploy --guided --capabilities CAPABILITY_NAMED_IAM

samconfig.toml

version = 0.1
[default]
[default.deploy]
[default.deploy.parameters]
stack_name = "aws-sso-util"
s3_bucket = "aws-sam-cli-managed-default-samclisourcebucket-123456789"
s3_prefix = "aws-sso-util"
region = "eu-west-1"
confirm_changeset = true
capabilities = "CAPABILITY_NAMED_IAM"
parameter_overrides = "NumChildStacks=\"-1\" MaxAssignmentsAllocation=\"-1\" LookupNames=\"false\" DefaultSessionDuration=\"\" ChildTemplatesInYaml=\"false\" MaxConcurrentAssignments=\"-1\" MaxResourcesPerTemplate=\"-1\" LogLevel=\"INFO\" ArtifactS3KeyPrefix=\"\" S3PutObjectArgs=\"\""
image_repositories = []

  11. Redeployed the existing SSO configuration to confirm there was no existing error (as per step 1)

At this point everything looked good and I couldn't see any issues.


timestamp message
1653375086498 START RequestId: 158c74a2-5e7a-4b3f-be84-01a8b269ba5a Version: $LATEST
1653375086508 [INFO] 2022-05-24T06:51:26.507Z 158c74a2-5e7a-4b3f-be84-01a8b269ba5a Initializing handler
1653375086770 [INFO] 2022-05-24T06:51:26.770Z 158c74a2-5e7a-4b3f-be84-01a8b269ba5a Extracting resources from template
1653375096879 [INFO] 2022-05-24T06:51:36.879Z 158c74a2-5e7a-4b3f-be84-01a8b269ba5a Generated 393 assignments from 60 resources
1653375150494 [INFO] 2022-05-24T06:52:30.494Z 158c74a2-5e7a-4b3f-be84-01a8b269ba5a Writing 0 child templates
1653375151932 END RequestId: 158c74a2-5e7a-4b3f-be84-01a8b269ba5a
1653375151932 REPORT RequestId: 158c74a2-5e7a-4b3f-be84-01a8b269ba5a Duration: 65432.34 ms Billed Duration: 65433 ms Memory Size: 1024 MB Max Memory Used: 88 MB Init Duration: 803.65 ms

awslyall commented 2 years ago

For the next step, I changed the MAX_ASSIGNMENTS_ALLOCATION environment variable on the Lambda function to 1000 and redeployed the same known-good configuration as in step 1 of my previous post.

Now I see a failure.

CloudWatch logs:

timestamp message
1653376356370 START RequestId: 3791d70c-5ebe-4f54-ad9f-6f17d609c8b6 Version: $LATEST
1653376356379 [INFO] 2022-05-24T07:12:36.379Z 3791d70c-5ebe-4f54-ad9f-6f17d609c8b6 Initializing handler
1653376356679 [INFO] 2022-05-24T07:12:36.678Z 3791d70c-5ebe-4f54-ad9f-6f17d609c8b6 Extracting resources from template
1653376367123 [INFO] 2022-05-24T07:12:47.123Z 3791d70c-5ebe-4f54-ad9f-6f17d609c8b6 Generated 393 assignments from 60 resources
1653376377693 [INFO] 2022-05-24T07:12:57.693Z 3791d70c-5ebe-4f54-ad9f-6f17d609c8b6 Writing 85 child templates
1653376381840 END RequestId: 3791d70c-5ebe-4f54-ad9f-6f17d609c8b6
1653376381840 REPORT RequestId: 3791d70c-5ebe-4f54-ad9f-6f17d609c8b6 Duration: 25468.69 ms Billed Duration: 25469 ms Memory Size: 1024 MB Max Memory Used: 85 MB Init Duration: 1019.53 ms

When viewing the stack events:

2022-05-24 08:14:18 UTC+0100 | sso-config-stack123456789-eu-west-1 | UPDATE_ROLLBACK_COMPLETE | -
2022-05-24 08:14:18 UTC+0100 | SREDevAssignmentGroup000 | DELETE_COMPLETE | -
2022-05-24 08:14:17 UTC+0100 | sso-config-stack-123456789-eu-west-1 | UPDATE_ROLLBACK_COMPLETE_CLEANUP_IN_PROGRESS | -
2022-05-24 08:14:03 UTC+0100 | sso-config-stack-123456789-eu-west-1 | UPDATE_ROLLBACK_IN_PROGRESS | The following resource(s) failed to create: [SREDevAssignmentGroup000].
2022-05-24 08:13:31 UTC+0100 | SREDevAssignmentGroup000 | CREATE_FAILED | S3 error: Access Denied For more information check http://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html
2022-05-24 08:13:30 UTC+0100 | SREDevAssignmentGroup000 | CREATE_IN_PROGRESS | -
2022-05-24 08:13:08 UTC+0100 | sso-config-stack-123456789-eu-west-1 | UPDATE_IN_PROGRESS | Transformation succeeded
2022-05-24 08:12:31 UTC+0100 | sso-config-stack-123456789-eu-west-1 | UPDATE_IN_PROGRESS | User Initiated
2022-05-24 08:02:06 UTC+0100 | sso-config-stack-123456789-eu-west-1 | UPDATE_COMPLETE | -
2022-05-24 08:02:04 UTC+0100 | sso-config-stack123456789-eu-west-1 | UPDATE_COMPLETE_CLEANUP_IN_PROGRESS | -
2022-05-24 08:01:41 UTC+0100 | CloudEngineerStagePermissionSet | UPDATE_COMPLETE | -

Please note that the listing also includes the end of the earlier, successful update of the stack (from my previous post).

awslyall commented 2 years ago

In terms of accessing the bucket using the CLI, this command works without error:

aws s3 cp s3://aws-sso-util-macroartifactbucket-abcdefghi/templates/2022-05-24T07:12_3c605be2-339f-4ba2-84fd-355f270a4cd6/AllAccountProductManagersAssignmentGroup/AllAccountProductManagersAssignmentGroup-000.json .

Result of the command:

download: s3://aws-sso-util-macroartifactbucket-abcdefghi/templates/2022-05-24T07:12_3c605be2-339f-4ba2-84fd-355f270a4cd6/AllAccountProductManagersAssignmentGroup/AllAccountProductManagersAssignmentGroup-000.json to .\AllAccountProductManagersAssignmentGroup-000.json

For this I am using temporary command-line credentials supplied to me from the AWS SSO landing page. My user account has full administrative permissions in the account.

awslyall commented 2 years ago

Please close this request. I have resolved the issue by moving the configuration to Terraform.

@benkehoe - Thank you for both your time and effort in producing aws-sso-util and for the troubleshooting attempts. I wish you all the best for the future. Please continue to develop this tool, as there is still a significant need for it.