letmaik closed this issue 5 years ago
We are also encountering this issue. The default requests is currently set to 60 and there is no way to change this or pass in retry logic.
Hi @srinathnarayanan can you take a look at this issue? It is currently affecting our application in production. I can provide more info... I looked at the next version at https://github.com/Azure/azure-sdk-for-python/blob/master/sdk/cosmos/azure-cosmos/azure/cosmos/_cosmos_client_connection.py and I don't see that this has been resolved there either. So probably this issue needs to be taken into the next version.
The fix is present in v3.1.2 and v4.0.0b4 of azure-cosmos.
When an HTTP error occurs, e.g. 429, then everything is handled as expected, triggering retries etc. and using the retry options of the connection policy of pydocumentdb.
However, if there is a network error such that a server is not reachable, then this results in an immediate exception without retries. This is because of two things:
- pydocumentdb's retry_utility code only handles `errors.HTTPFailure` errors, which are HTTP errors corresponding to certain HTTP status codes, e.g. 429: https://github.com/Azure/azure-documentdb-python/blob/07e2f3f93ad5abeb114c2d2f83577c25d18f0bb4/pydocumentdb/retry_utility.py#L66
- pydocumentdb uses `requests` to do the actual network requests, however it sets up the requests session with the defaults only, which doesn't enable retrying: https://github.com/Azure/azure-documentdb-python/blob/07e2f3f93ad5abeb114c2d2f83577c25d18f0bb4/pydocumentdb/document_client.py#L134 This in turn leads to the underlying urllib3 not retrying such requests: https://github.com/urllib3/urllib3/blob/1.19.1/urllib3/util/retry.py#L331-L336 (`read` would be false with default options).

An approach as described in https://www.peterbe.com/plog/best-practice-with-retries-with-requests is typically used to enable retries with `requests`.
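For reference, the session-level retry approach mentioned above can be sketched roughly as follows. This is a minimal sketch rather than what pydocumentdb does: the helper name `requests_retry_session` and its default values are illustrative assumptions, and only the `requests` and `urllib3` calls themselves (`Session`, `HTTPAdapter`, `Retry`) are real library APIs.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def requests_retry_session(
    retries=3,
    backoff_factor=0.3,
    status_forcelist=(500, 502, 504),
    session=None,
):
    """Return a requests session whose transport adapter retries failed requests.

    The name and defaults here are illustrative assumptions, not pydocumentdb API.
    """
    session = session or requests.Session()
    retry = Retry(
        total=retries,            # overall cap on retries
        connect=retries,          # retries for connection errors (server unreachable)
        read=retries,             # retries for read errors
        backoff_factor=backoff_factor,
        status_forcelist=status_forcelist,
    )
    adapter = HTTPAdapter(max_retries=retry)
    # Mount the retrying adapter for both schemes so all requests use it.
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


# Usage (not executed here, since it would need network access):
#   session = requests_retry_session(retries=5)
#   response = session.get("https://example.com/")
```

Note that urllib3's defaults restrict read and status retries to idempotent HTTP methods, so POST requests may need further configuration. How such a session would be passed into pydocumentdb is left open here, since the client currently creates its own session with default settings.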
The following is an exception trace resulting from trying to create a document in CosmosDB when the server is unreachable: