jleopold28 closed this issue 4 years ago.
My gut reaction is that if you're running into this much rate limiting, you should treat the cause of the problem rather than the symptom CloudMapper is surfacing. What I mean is that something else seems to be running wild in your environment and needs to be better tuned, or you should talk with AWS about raising some type of limit, as I've not heard of this being a problem for people.
@0xdabbad00
Scott,
I work with @jleopold28 (full disclosure, I'm also the maintainer of awslimitchecker) and have a bit more detail on our issues. We've worked with Enterprise Support, as well as some of the service teams, on these throttling issues multiple times. They do admit that it's a relatively rare problem, but we've had most of our API rate limits increased to the maximum. The one particular account and region where we're seeing this has over 1,000 Elastic Beanstalk environments and 3,500 CloudFormation Stacks. The vast majority of the API queries in this account are made by Beanstalk itself, not any third-party tooling (Beanstalk environments themselves - health checking, etc. - still count against API rate limiting).
Looking at https://github.com/boto/botocore/pull/1260, where botocore exposed max_retries configuration to users via the Config object, and at the issues and other PRs linked there, there's quite a bit of evidence that other people also hit API rate limiting... though perhaps no other cloudmapper users do, or no other users have found enough value in cloudmapper to strongly desire a fix.
We'd be more than happy to open a pull request with a simple fix for this, likely via environment variables, but wanted to see if you have any feelings on implementation. If not, we'll end up just running off of a fork with this one fix. My hope is that a fix could be incorporated upstream just in case anyone else attempts to run cloudmapper in accounts that can't tolerate much fast listing of resources without hitting rate limits.
Thanks, Jason
Thank you for the explanation @jantman, and thank you for pointing out that botocore has exposed the max_retries config. I'll merge a PR if you send one that exposes a config option of some sort for this. I'm surprised boto doesn't just pull one from the environment, but given that it doesn't, I think adding it as a command-line option to collect would make sense in this part of the code: https://github.com/duo-labs/cloudmapper/blob/ecc8e0153a6366d04faecaa4897982943764568e/commands/collect.py#L476
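To sketch the environment-variable flavor floated earlier (the variable name MAX_ATTEMPTS and the helper below are hypothetical, for illustration only, not anything cloudmapper or boto defines):

```python
import os

def get_retry_config():
    # Hypothetical sketch: read the retry cap from an environment
    # variable, falling back to boto's old default of 4 attempts.
    max_attempts = int(os.environ.get("MAX_ATTEMPTS", "4"))
    # The returned dict matches the shape botocore's Config expects,
    # e.g. Config(**get_retry_config()).
    return {"retries": {"max_attempts": max_attempts}}
```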
Thanks so much, @0xdabbad00. We'll get to work on a PR for that.
@0xdabbad00 I have opened a PR to support boto max attempts. https://github.com/duo-labs/cloudmapper/pull/614
Thanks! I've merged it. I should be cutting a new release this week as I need to update the CDK for the nightly auditor.
I am running cloudmapper in a large AWS account that is constantly running close to the API rate limit. I am getting errors from the collect command that I believe would be fixed by increasing boto's retry limit. Is there a way to raise the default of 4 retries with the current cloudmapper setup? We ran into a similar issue when running awslimitchecker and fixed it by increasing the retries; here is the relevant change: https://github.com/jantman/awslimitchecker/pull/445/files#diff-ae1d60720c71355da3fddb7fa8bac222R101
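For context, the benefit of a higher retry count comes from exponential backoff: each successive attempt waits roughly twice as long, so later retries ride out throttling windows. A rough stdlib-only sketch of that shape (illustrative only, not botocore's exact formula):

```python
import random

def backoff_delay(attempt, cap=20.0):
    # Roughly the shape of legacy botocore backoff: a random fraction
    # of 2**attempt seconds, capped. Not boto's actual code.
    return min(cap, random.random() * (2 ** attempt))
```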
I am willing to cut a PR with a change that increases the botocore max retries.
Here is some sanitized output of my issue: