noqdev / iambic

IAMbic is Version-Control for IAM. It centralizes and simplifies cloud access and permissions. It maintains an eventually consistent, human-readable, bi-directional representation of IAM in Git.
https://iambic.org
Apache License 2.0

IAMbic AWS plugin does not play well with low ulimit value #386

Closed smoy closed 1 year ago

smoy commented 1 year ago

Describe the bug: The macOS default of ulimit -n 256 is too low when an AWS organization contains many accounts.

To Reproduce Steps to reproduce the behavior:

  1. Ensure ulimit -n reports 256 in your environment. Since IAMbic uses multiprocessing, make sure your shell is not changing the ulimit value.
  2. In a new working directory, go through iambic setup.
  3. Set up using the AWS organization flow (the organization should have 9 or more accounts).
  4. See the error:
2023/05/04 10:07:00 [info     ] Beginning to retrieve AWS Identity Center Permission Sets. 
  org_accounts=[
    "REACTED_ORG_ACCOUNT"
  ]
2023/05/04 10:07:00 [info     ] Setting inline policies in role templates 
  accounts=[
    "REACTED_ACCOUNT_N_MINUS_5"
  ]
2023/05/04 10:07:00 [info     ] Setting inline policies in role templates 
  accounts=[
    "REACTED_ACCOUNT_N_MINUS_4"
  ]
2023/05/04 10:07:00 [info     ] Setting inline policies in role templates 
  accounts=[
    "REACTED_ACCOUNT_N_MINUS_3"
  ]
2023/05/04 10:07:00 [info     ] Setting inline policies in role templates 
  accounts=[
    "REACTED_ACCOUNT_N_MINUS_2"
  ]
2023/05/04 10:07:01 [info     ] Setting inline policies in role templates 
  accounts=[
    "REACTED_ACCOUNT_N_MINUS_1"
  ]
2023/05/04 10:07:02 [info     ] Setting inline policies in role templates 
  accounts=[
    "REACTED_ACCOUNT_N"
  ]
2023/05/04 10:07:04 [info     ] Failed to refresh AWS accounts 
  error=OSError(24, 'Too many open files')
? What would you like to configure in AWS? (Use arrow keys)


Additional context: The current workaround is to raise the ulimit in your shell:

(env) stevenmoy@steven-noqdev-mbp iambic % ulimit -n
256
(env) stevenmoy@steven-noqdev-mbp iambic % ulimit -n 1024
(env) stevenmoy@steven-noqdev-mbp iambic % ulimit -n     
1024
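
The same workaround can in principle be applied from inside a Python process instead of the shell. The following is a minimal sketch using the standard library resource module; the helper name ensure_nofile_limit and the 1024 threshold are assumptions for illustration, and this is not necessarily what IAMbic itself does.

```python
import resource


def ensure_nofile_limit(minimum: int = 1024) -> int:
    """Raise the soft open-file limit to at least `minimum` when permitted.

    Hypothetical helper for illustration. Returns the soft limit that is
    in effect after the call.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # Nothing to do if the soft limit is unlimited or already high enough.
    if soft == resource.RLIM_INFINITY or soft >= minimum:
        return soft

    # The soft limit can never exceed the hard limit.
    target = minimum if hard == resource.RLIM_INFINITY else min(minimum, hard)
    if target <= soft:
        return soft

    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        # The OS refused the change; keep running with the old limit.
        return soft
    return target


if __name__ == "__main__":
    print("soft RLIMIT_NOFILE:", ensure_nofile_limit(1024))
```

The new limit applies to the calling process and to child processes started afterwards, so a call like this would have to run before any worker processes are spawned.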

perpil commented 1 year ago

There are environment variables that indicate whether you are running in a GitHub workflow. Instead of changing the ulimit, I'd suggest lowering resource consumption by default. That could be as simple as reducing the connection pool size when instantiating the clients. I think it is plenty fast, so lowering it to fit within the OS defaults makes more sense than trying to change the ulimit. If it is running in GitHub, you could automatically raise the connection pool size when it detects that it is on a runner, or make it a command-line flag.
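
A rough sketch of that suggestion, assuming a conservative default, a larger value on GitHub Actions, and an explicit override that a command-line flag could feed. The function name choose_pool_size and both numeric constants are hypothetical; GITHUB_ACTIONS is the variable GitHub sets to "true" on its runners.

```python
import os
from typing import Mapping, Optional

# Hypothetical values, chosen only for illustration.
DEFAULT_POOL_SIZE = 4   # small enough to be gentle on a 256-descriptor limit
CI_POOL_SIZE = 32       # CI runners usually allow far more open files


def choose_pool_size(
    env: Optional[Mapping[str, str]] = None,
    override: Optional[int] = None,
) -> int:
    """Pick a connection pool size for the AWS clients.

    Precedence: explicit override (e.g. from a CLI flag), then GitHub
    Actions detection, then the conservative default.
    """
    env = os.environ if env is None else env
    if override is not None:
        if override < 1:
            raise ValueError("pool size must be at least 1")
        return override
    if env.get("GITHUB_ACTIONS") == "true":
        return CI_POOL_SIZE
    return DEFAULT_POOL_SIZE


if __name__ == "__main__":
    # The result could then be passed to
    # botocore.config.Config(max_pool_connections=...) when creating clients.
    print("pool size:", choose_pool_size())
```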

smoy commented 1 year ago

That's a good suggestion. I think we need to refactor the current boto3 client logic across AWS accounts. The implementation today assumes that whenever it gets a client or opens a file, the call will succeed. There is no resource manager interface or queue on those interfaces. I will write up a feature enhancement proposal for how we want to do it.

What I mean is that if the host process has only N file descriptors (FDs) available, the implementation does not take that into consideration; if it needs N+1 FDs, it will just crash.
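
One way such a resource gate could look, as a sketch only: cap how many per-account tasks run at once with a semaphore, so the remaining tasks wait instead of failing when descriptors run out. This assumes the per-account work is expressed as coroutines; run_bounded and its parameters are hypothetical names, not the design proposed in #391.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar("T")


async def run_bounded(
    account_ids: Iterable[str],
    worker: Callable[[str], Awaitable[T]],
    max_concurrent: int,
) -> List[T]:
    """Run `worker` for every account, at most `max_concurrent` at a time.

    Results are returned in the same order as `account_ids`.
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")

    gate = asyncio.Semaphore(max_concurrent)

    async def guarded(account_id: str) -> T:
        # Tasks beyond the cap block here until a slot is released.
        async with gate:
            return await worker(account_id)

    return await asyncio.gather(*(guarded(a) for a in account_ids))


async def _demo_worker(account_id: str) -> str:
    # Stand-in for work that would hold sockets or files open.
    await asyncio.sleep(0.01)
    return f"refreshed {account_id}"


if __name__ == "__main__":
    accounts = [f"account-{i}" for i in range(9)]
    for line in asyncio.run(run_bounded(accounts, _demo_worker, 3)):
        print(line)
```

The cap would need to be derived from the descriptors actually available to the process, minus some headroom for files the process opens outside the gated work.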

smoy commented 1 year ago

I opened https://github.com/noqdev/iambic/issues/391 to track the work required to refactor the implementation.

smoy commented 1 year ago

https://github.com/noqdev/iambic/pull/387 addressed the user experience, so users are no longer fighting ulimit. But #391 should still be looked at so that IAMbic plays nicely with resource limits.