Azure / azure-cosmosdb-java

Java Async SDK for SQL API of Azure Cosmos DB
MIT License

RetryOptions.setMaxRetryAttemptsOnThrottledRequests(0) is ignored #322

Open lightningrob opened 4 years ago

lightningrob commented 4 years ago

Describe the bug

I set up an AsyncDocumentClient with a ConnectionPolicy whose RetryOptions has MaxRetryAttemptsOnThrottledRequests = 0, and used it for Cosmos operations (query/insert/delete). During execution, the log shows many warnings indicating that ResourceThrottleRetryPolicy is still retrying internally. There seems to be no way to prevent it.

To Reproduce

    // Configure Cosmos with no retries.
    ConnectionPolicy connectionPolicy = new ConnectionPolicy();
    RetryOptions noRetries = new RetryOptions();
    noRetries.setMaxRetryAttemptsOnThrottledRequests(0);
    connectionPolicy.setRetryOptions(noRetries);
    asyncDocumentClient = new AsyncDocumentClient.Builder()
        .withServiceEndpoint(azureCloudConfig.cosmosEndpoint)
        .withConnectionPolicy(connectionPolicy)
        .build();
    // Use client for all operations with high request rate

Expected behavior

I expect every 429 throttling error to be returned to the caller immediately, with no retries.
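The expected semantics can be sketched as a small decision function. This is a hypothetical illustration only: `ThrottleRetrySketch` and `nextRetryDelay` are made-up names, not the SDK's ResourceThrottleRetryPolicy, and the sketch assumes the delay comes from the `x-ms-retry-after-ms` response header. With a limit of 0, a 429 should never produce a retry delay.

```java
import java.time.Duration;
import java.util.Optional;

// Hypothetical sketch of the expected behavior; NOT the SDK's implementation.
final class ThrottleRetrySketch {
    static final int TOO_MANY_REQUESTS = 429;

    // Returns the delay before the next retry, or empty if the error
    // should be surfaced to the caller immediately.
    static Optional<Duration> nextRetryDelay(int statusCode,
                                             int retriesSoFar,
                                             int maxRetryAttempts,
                                             long retryAfterMs) {
        if (statusCode != TOO_MANY_REQUESTS) {
            return Optional.empty(); // not a throttling error
        }
        if (retriesSoFar >= maxRetryAttempts) {
            return Optional.empty(); // limit reached; with a limit of 0 this is always true
        }
        return Optional.of(Duration.ofMillis(retryAfterMs));
    }

    public static void main(String[] args) {
        // Limit of 0: the 429 is returned to the caller with no retry.
        System.out.println(nextRetryDelay(429, 0, 0, 2686).isPresent()); // false
        // Arbitrary nonzero limit: the first 429 is retried after the server-suggested delay.
        System.out.println(nextRetryDelay(429, 0, 3, 2686)); // Optional[PT2.686S]
    }
}
```

Under this reading, the warning in the log below ("Current attempt 1") should never appear when the limit is 0.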

Actual behavior

I see many instances of this warning and a long stack trace (abridged here), indicating that the option is being ignored.

[2020-03-07 02:31:46,296] WARN Operation will be retried after 2686 milliseconds. Current attempt 1, Cumulative delay PT2.686S (com.microsoft.azure.cosmosdb.rx.internal.ResourceThrottleRetryPolicy)
DocumentClientException{error={"code":"TooManyRequests","message":"null, StatusCode: TooManyRequests","additionalErrorInfo":null}, resourceAddress='null', statusCode=429, message=null, StatusCode: TooManyRequests, getCauseInfo=null, responseHeaders={content-length=1215, x-ms-current-replica-set-size=4, Server=Microsoft-HTTPAPI/2.0, Content-Location=https://wus2ambryprodcosmosdb1-westus2.documents.azure.com/dbs/ambry-metadata-main/colls/blob-metadata/docs/AAYQAgRhAAkAAQAAAAAAAAkprGBjUiYiQ8WgEXAlehN3wg, lsn=173311327, x-ms-request-charge=0.38, x-ms-schemaversion=1.9, x-ms-transport-request-id=1030695, x-ms-number-of-read-regions=0, x-ms-current-write-quorum=3, x-ms-cosmos-quorum-acked-llsn=173311327, x-ms-quorum-acked-lsn=173311327, x-ms-activity-id=a042cdfb-73b5-4a7c-abbe-c8e56d142c77, Date=Sat, 07 Mar 2020 02:31:46 GMT, x-ms-xp-role=1, Strict-Transport-Security=max-age=31536000, x-ms-retry-after-ms=2686, x-ms-global-Committed-lsn=173311327, x-ms-cosmos-llsn=173311327, x-ms-gatewayversion=version=2.9.2, x-ms-serviceversion=version=2.9.0.0, Content-Type=application/json, x-ms-substatus=3200}, requestHeaders={authorization=type%3Dmaster%26ver%3D1.0%26sig%3DjfKdP4mBXFWm9w3JUd%2BOhuuBoBgcdSqmO6NoHHwFPWM%3D, Accept=application/json, x-ms-session-token=1007:-1#173311327, x-ms-date=Sat, 07 Mar 2020 02:31:46 GMT, x-ms-documentdb-partitionkey=["2345"], x-ms-consistency-level=Session}}
    at com.microsoft.azure.cosmosdb.rx.internal.RxGatewayStoreModel.validateOrThrow(RxGatewayStoreModel.java:444)
    at com.microsoft.azure.cosmosdb.rx.internal.RxGatewayStoreModel.lambda$null$8(RxGatewayStoreModel.java:382)

Environment summary

SDK Version: 2.6.3
Java JDK version: 1.8.0
OS Version: Linux 3.10.0-862.11.6.el7.x86_64

Additional context

This is a multi-region, latency-sensitive app that needs full control over request error handling. The client library needs to respect the options provided. We have already spent considerable time migrating our app to this version of the Cosmos client and do not wish to upgrade to a newer major release. Please advise if I am missing something about the usage, or if there is another way to disable automatic retries.