aws-samples / aws-iam-identity-center-extensions

This solution is intended for enterprises that need a streamlined way of managing user access to their AWS accounts. Using this solution, your identity and access management teams can extend AWS SSO functionality by automating common access management and governance use cases.
MIT License

ThrottlingException #87

Closed jjleigh closed 2 years ago

jjleigh commented 2 years ago

Receiving throttling exception in these 2 situations:

This was tried on an OU with 72 accounts but only 40 were provisioned.

Here is part of the error below:

Subject: Error Processing link provisioning operation

"eventSource":"aws:sqs","eventSourceARN":"arn:aws:sqs:us-east-:env-linkManagerQueue.fifo","awsRegion":"us-east-1"}]},"errorDetails":{"name":"ThrottlingException","$fault":"client","$metadata":{"httpStatusCode":400,"requestId":"6cd844bc-a51b-414d-89f9-c9723df46bcb","attempts":2,"totalRetryDelay":175},"__type":"ThrottlingException","message":"There are too many requests processing. Please try again later."}}

allquixotic commented 2 years ago

This is affecting my organization in a PROD deploy right now. It's not the same organization as the one this issue was reported for.

leelalagudu commented 2 years ago

@allquixotic , @jjleigh - ACK on the issue. For context, this is how the solution should behave:

I believe the missing part here is an enforced wait between pagination operations in the state machine. Without it, the state machine keeps processing pages and overloads the FIFO queue. We already enforce this pattern in the current config import and region switch state machines.
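To illustrate the idea of an enforced wait between pages, here is a minimal sketch in plain Python. It is not the solution's actual state machine code: `process_pages`, `enqueue`, and `wait_seconds` are hypothetical names, and the queue is represented by any callable that accepts one account ID.

```python
import time
from typing import Callable, Iterable, Sequence


def process_pages(
    pages: Iterable[Sequence[str]],
    enqueue: Callable[[str], None],
    wait_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send every account ID to the queue, pausing between pages.

    `enqueue` is a hypothetical stand-in for posting one message to the
    link manager queue. `sleep` is injectable so the wait can be tested
    without actually blocking.
    """
    sent = 0
    for index, page in enumerate(pages):
        if index > 0:
            # Enforced wait before every page after the first, so the
            # consumer can drain the previous page before more arrives.
            sleep(wait_seconds)
        for account_id in page:
            enqueue(account_id)
            sent += 1
    return sent
```

In a Step Functions state machine the same effect would typically come from a wait step placed inside the pagination loop; the sketch only shows the control flow.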

We will amend the state machines to follow this behaviour and ask @jmejco / @tamara-h to validate on a demo setup that has more than 50 accounts under an account_tag/ou_id/root scope, checking whether the enforced wait between pages resolves the issue. We're on public holiday here in the UK until 5th June, so they could validate the behaviour on 6th June.

Hope this helps, Leela

allquixotic commented 2 years ago

I added sleep statements to my Directory Service to SSO migration code to slow down the import, which worked around this for my purposes, but it should definitely be resolved in the SSO Extensions code.
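For readers who need a similar client-side workaround, one variation is to retry with exponential backoff when a call is throttled, rather than sleeping a fixed amount. This is a sketch under assumptions, not the code described in this comment: `ThrottlingError` and `call_with_backoff` are hypothetical names, and a real caller would map the service's `ThrottlingException` onto the exception type used here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ThrottlingError(Exception):
    """Hypothetical stand-in for the service's ThrottlingException."""


def call_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying with doubling delays when it is throttled."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottlingError:
            if attempt == max_attempts - 1:
                # Out of attempts: surface the throttling error to the caller.
                raise
            # Delays grow as base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * (2 ** attempt))
    raise ValueError("max_attempts must be at least 1")
```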

jjleigh commented 2 years ago

@leelalagudu Thank you for the update!

leelalagudu commented 2 years ago

Thanks for the update @allquixotic, at least this confirms my hypothesis theoretically. As mentioned, we will go ahead with this design change and load test the solution as part of the PR release.

leelalagudu commented 2 years ago

@jjleigh, @allquixotic, a mandatory wait between each page is now enforced in the solution through #89. This should handle the throttling exceptions you are seeing.

Please do let us know if this fixed your issue, Leela