rbailey-godaddy opened 2 years ago
> Even if I presume that each of those resources is scanned in parallel and independently hitting its own rate limit, the frequency of the announcements suggests whatever actual limit is being applied is far less than the claimed 15 seconds. (Or, many many pages of output later, 30 seconds, and then 45 seconds...)
The 15s is per API call, so if many services are hitting the API rate limit, you'll indeed get a significant number of messages. This is mostly due to AWS' horrible rate-limiting implementation (per service, with different quotas per service and per endpoint) and ScoutSuite's architecture not fitting the per-service model.
> this feels both abusive and inefficient.
100%, but AWS has been quite unresponsive (e.g. https://github.com/nccgroup/ScoutSuite/issues/91 & https://github.com/boto/boto3/pull/2086) to proposed strategies/remediations, and ScoutSuite's architecture would require significant rewrites to handle these more gracefully. If they bothered implementing progressive backoff in boto3 by default, we wouldn't have to worry about it.
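The "progressive backoff" mentioned above can be sketched as exponential backoff with jitter. This is only an illustration of the technique, assuming a generic throttling exception; it is not ScoutSuite's or boto3's actual code (real boto3 code would catch botocore's `ClientError` and inspect the error code):

```python
import random
import time


def call_with_backoff(fn, *args, max_attempts=5, base_delay=1.0, **kwargs):
    """Retry fn with exponentially increasing delays plus jitter.

    Illustrative sketch only: RuntimeError stands in for a real
    throttling exception such as botocore's "Throttling" error code.
    """
    for attempt in range(max_attempts):
        try:
            return fn(*args, **kwargs)
        except RuntimeError:  # stand-in for a throttling exception
            if attempt == max_attempts - 1:
                raise
            # Delay doubles each attempt (1s, 2s, 4s, ...) with random
            # jitter, instead of a fixed 15s sleep on every retry.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)
```

The jitter spreads retries out so that many parallel workers do not all hammer the API again at the same instant.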
```
2022-06-29 12:29:48 09e76605204d scout[154] INFO Fetching resources for the Secrets Manager service
2022-06-29 12:31:48 09e76605204d scout[154] INFO Hitting API rate limiting (facade/ec2.py L152), will retry in 15s
2022-06-29 12:32:06 09e76605204d scout[154] INFO Hitting API rate limiting (facade/ec2.py L152), will retry in 15s
2022-06-29 12:32:24 09e76605204d scout[154] INFO Hitting API rate limiting (facade/ec2.py L152), will retry in 15s
2022-06-29 12:32:25 09e76605204d scout[154] INFO Hitting API rate limiting (facade/ec2.py L152), will retry in 15s
2022-06-29 12:32:26 09e76605204d scout[154] INFO Hitting API rate limiting (facade/ec2.py L152), will retry in 15s
```
Are there any workarounds?
> Are there any workarounds?
This is not a complete workaround but has a beneficial effect in nearly all of our environments:
```shell
# Enable API backoff and throttling.
# "adaptive" mode is labeled "experimental and subject to change", but in testing
# it results in significantly better behavior than either of the other modes.
# Reducing the number of workers (with limited testing) seemed to result in
# either no or negative impact (*MORE* API warnings).
export AWS_RETRY_MODE=adaptive  # or "legacy" (default) or "standard"
```
Do this before running Scout; the underlying boto3 library will pick up the environment variable and adapt accordingly. If you RTFM, Amazon claims "standard" is the default behavior, but that appears to apply to the AWS CLI and not to API calls made via boto3.
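The same setting can also be persisted in the shared AWS config file rather than exported per shell, since botocore reads `retry_mode` and `max_attempts` from there. A sketch, assuming the default profile (the `max_attempts` value is illustrative):

```ini
# ~/.aws/config -- picked up by boto3/botocore on every run
[default]
retry_mode = adaptive
# Illustrative value; standard/adaptive modes default to 3 attempts.
max_attempts = 10
```

An environment variable still takes precedence over the config file, so the `export` above remains a convenient per-run override.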
That was a quick reply 👍
I ran `export AWS_RETRY_MODE=adaptive` before executing `scout aws` to scan three different environments, and I got only one error, which passed on the second retry.
Thank you!
**Describe the bug**
I was just reviewing a run log for one of our nontrivial AWS accounts, and got pages and pages of this:
For context, here is what we're probing:
Even if I presume that each of those resources is scanned in parallel and independently hitting its own rate limit, the frequency of the announcements suggests whatever actual limit is being applied is far less than the claimed 15 seconds. (Or, many many pages of output later, 30 seconds, and then 45 seconds...)
The scan eventually succeeds (at timestamp `2022-01-25T07:48:42.797-05:00`) but this feels both abusive and inefficient.

**To Reproduce**
This appears to be an intermittent condition. FWIW, the command is:
Where `SS_OPTS` is

```shell
--services cloudformation cloudtrail config ec2 elb elbv2 iam rds redshift s3 vpc
```
and the other variables are obvious. `godaddy.json` is a tweaked version of `defaults.json` that turns on some checks and turns off some others. This is being produced by ScoutSuite version 5.10.2.