Closed: sfc-gh-afedorov closed this issue 1 month ago
Thank you for reporting the crash!
I briefly went through the stacktrace and believe the issue is not directly related to aiobotocore, but rather lies with response parsing done in botocore.
The stacktrace suggests that the crash occurs when working with pagination, which could be relevant.
Could you please provide more context? Which API call and client configuration is triggering the exception? It could well be an edge case that needs to be raised with botocore, but more information would be useful to tell for sure.
Thanks for the prompt reply! I see it happening with both paginated and unpaginated API calls, e.g. `iam`'s `generate_credential_report` and `sts`'s `assume_role` (I'm creating sessions to ingest inventory from a lot of accounts).
I think that when the response status is 429, the response body should not be parsed at all, since it's an empty string.
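The failure mode can be seen with a plain XML parse of an empty byte string (a standalone sketch, not botocore's actual code path, though `_parse_xml_string_to_dom` ultimately does essentially this):

```python
import xml.etree.ElementTree as ET

# Sketch: botocore's XML error parser hands the response body to an XML
# parser. An empty 429 body therefore fails before any retry logic can
# look at the status code.
body = b""  # what AWS returns on a throttled request in this scenario
try:
    ET.fromstring(body)
except ET.ParseError as exc:
    print(exc)  # no element found: line 1, column 0
```

The error text matches the one embedded in the `ResponseParserError` below.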
Happy to expound more, but I want to be specific about what would be most helpful. The use-case I'm working on in snowflakedb/SnowAlert/pull/676 is a bit unusual because there are a lot of concurrent sessions, but I can probably reproduce it more simply, or just write a local unit test if that's more helpful.
Thank you for following up and providing helpful context.
I still feel that this should be fixed in botocore. There is an open issue that could be similar, but has not been fixed AFAIK: https://github.com/boto/botocore/issues/1732
Is there a way to replicate the exception with botocore? If botocore itself is also affected, we should raise the issue there. Otherwise, I am obviously missing something.
That ticket says "There's also a too_many_requests that only applies for 429 error codes" which is what's also in the Retry docs I linked. They're asking that botocore also handle status=400 errors specifically for requests to cognito-idp endpoints.
The `_retry.json` they mention already defines the 429 error as `too_many_requests`, which looks like it uses a random delay with a `growth_factor` of 2 for 5 retries.
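That delay schedule can be sketched as follows (modeled on botocore's legacy `delay_exponential` helper; pinning `base=1.0` is for illustration only, the real `"rand"` base draws from [0, 1)):

```python
import random

def delay_exponential(growth_factor, attempt, base=None):
    """Sketch of botocore's legacy exponential backoff: with the
    _retry.json settings ("base": "rand", "growth_factor": 2), the
    sleep before retry N is random() * 2 ** (N - 1)."""
    if base is None:
        base = random.random()  # the "rand" base from _retry.json
    return base * growth_factor ** (attempt - 1)

# With base pinned to 1.0, the 5 retry delays are:
print([delay_exponential(2, n, base=1.0) for n in range(1, 6)])
# [1.0, 2.0, 4.0, 8.0, 16.0]
```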
I'm not sure it can be fixed in botocore, since the fix requires actually retrying on a 429 response, and only aiobotocore can do that because the retry needs to happen asynchronously?
Did you customise your retry config? By default, legacy mode is used which does not appear to handle 429 (TooManyRequestsException). You might need to use the standard mode in your scenario.
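For reference, a sketch of what switching the retry mode looks like (the `retries` dict shape follows botocore's documented config options; the aiobotocore usage in the comments assumes the usual `create_client` pattern):

```python
# Retry configuration accepted by botocore.config.Config; aiobotocore
# clients take the same Config object via the `config=` argument.
retry_config = {"mode": "standard", "max_attempts": 5}

# In practice:
#   from botocore.config import Config
#   session = aiobotocore.session.get_session()
#   async with session.create_client(
#       "sts", config=Config(retries=retry_config)
#   ) as sts:
#       ...
print(retry_config)
```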
Thanks for suggesting that — I had tried it briefly before, but also just now with `retries={'mode': 'standard'}` and also `adaptive`, and I'm still getting the same error on sts `assume_role`:
```
./../../venv/lib/python3.9/site-packages/connectors/utils.py:157 in aio_sts_assume_role
    src_role = await sts.assume_role(
./../../venv/lib/python3.9/site-packages/aiobotocore/client.py:391 in _make_api_call
    http, parsed_response = await self._make_request(
./../../venv/lib/python3.9/site-packages/aiobotocore/client.py:419 in _make_request
    return await self._endpoint.make_request(
./../../venv/lib/python3.9/site-packages/aiobotocore/endpoint.py:99 in _send_request
    success_response, exception = await self._get_response(
./../../venv/lib/python3.9/site-packages/aiobotocore/endpoint.py:141 in _get_response
    success_response, exception = await self._do_get_response(
./../../venv/lib/python3.9/site-packages/aiobotocore/endpoint.py:217 in _do_get_response
    parsed_response = parser.parse(
./../../venv/lib/python3.9/site-packages/botocore/parsers.py:250 in parse
    parsed = self._do_error_parse(response, shape)
./../../venv/lib/python3.9/site-packages/botocore/parsers.py:559 in _do_error_parse
    root = self._parse_xml_string_to_dom(xml_contents)
./../../venv/lib/python3.9/site-packages/botocore/parsers.py:508 in _parse_xml_string_to_dom
    raise ResponseParserError(
botocore.parsers.ResponseParserError: Unable to parse response (no element found: line 1, column 0), invalid XML received. Further retries may succeed:
b''
```
I need to understand this functionality a little better still because I'm not sure how aiobotocore can delegate retries to a dependency that implements synchronous calls. Do you know if there is some kind of inversion of control mechanism where botocore lets aiobotocore handle the aio requests while it handles the retry logic?
In the meantime, manually setting up retries where I see these errors should work for my use-case.
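A sketch of such a manual workaround (the wrapper name and parameters are mine, not an aiobotocore API; in practice `retryable` would be `(botocore.parsers.ResponseParserError,)`):

```python
import asyncio
import random

async def call_with_retries(coro_factory, retryable, max_attempts=5, base=1.0):
    """Retry an async call on the given exception types, with jittered
    exponential backoff mirroring _retry.json's growth_factor of 2."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await coro_factory()
        except retryable:
            if attempt == max_attempts:
                raise
            await asyncio.sleep(random.random() * base * 2 ** (attempt - 1))

# e.g. await call_with_retries(lambda: sts.assume_role(**kwargs),
#                              (botocore.parsers.ResponseParserError,))
```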
Let's figure this out. Is it possible to put a breakpoint in and debug? If you can capture the response, it would help; or modify the code to add a conditional breakpoint statement. I can probably look deeper into this in a couple of days.
Yes, that's how I knew the response status was 429. Let me know if you'd like me to gather any other details.
If it happens in aiobotocore, you can definitely try reproducing it with botocore; they're the same, just one is async. Most likely, as stated above, this is actually a botocore bug. We can try porting the relevant unit tests to ensure this is the case.
check this out: https://github.com/boto/botocore/issues/1830
Given that someone reproduced this in botocore, I'm going to close this as a botocore issue. Please report it there with a botocore example and let us know when it's fixed!
Describe the bug
On high-throughput use or a busy account, AWS responds with 429, which botocore's Retry documentation says is a condition for a retry, but aiobotocore crashes with
Checklist
- pip check passes without errors
- pip freeze results
Environment: