Azure / azure-cosmosdb-java

Java Async SDK for SQL API of Azure Cosmos DB
MIT License

Retry not happening with Direct mode #263

Closed: chetanmeh closed this issue 5 years ago

chetanmeh commented 5 years ago

With the latest SDK release, v2.6.1, we do not see retries happening when Direct mode is used. Retries work as expected in Gateway mode.

On debugging, it appears that RequestRateTooLargeException is created with statusCode 404 (NOTFOUND) rather than 429 (TOO_MANY_REQUESTS), so ResourceThrottleRetryPolicy ignores it:

https://github.com/Azure/azure-cosmosdb-java/blob/f8a1b4d7a67639dab5cf8e9a177c12558552381e/direct-impl/src/main/java/com/microsoft/azure/cosmosdb/internal/directconnectivity/rntbd/RntbdRequestManager.java#L824-L826
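To illustrate why a wrong status code suppresses the retry, here is a minimal, self-contained sketch. It does not use the SDK: `ClientException` and `ThrottleRetryPolicy` are hypothetical stand-ins, and the assumption is only that the real policy decides whether to retry by comparing the exception's status code with 429.

```java
// Hypothetical, simplified stand-ins for illustration; not the SDK's actual classes.
public class ThrottleRetrySketch {
    static final int NOT_FOUND = 404;
    static final int TOO_MANY_REQUESTS = 429;

    /** Minimal exception that carries an HTTP-style status code. */
    static class ClientException extends RuntimeException {
        private final int statusCode;

        ClientException(int statusCode) {
            super("status " + statusCode);
            this.statusCode = statusCode;
        }

        int getStatusCode() {
            return statusCode;
        }
    }

    /** Retries only when the failure reports 429 and attempts remain. */
    static class ThrottleRetryPolicy {
        private final int maxAttempts;
        private int attempts = 0;

        ThrottleRetryPolicy(int maxAttempts) {
            this.maxAttempts = maxAttempts;
        }

        boolean shouldRetry(ClientException e) {
            // Any status other than 429 is treated as "not a throttle": no retry.
            if (e.getStatusCode() != TOO_MANY_REQUESTS) {
                return false;
            }
            if (attempts >= maxAttempts) {
                return false;
            }
            attempts++;
            return true;
        }
    }

    public static void main(String[] args) {
        ThrottleRetryPolicy policy = new ThrottleRetryPolicy(3);
        // A throttle failure mislabelled as 404 is ignored by the policy.
        System.out.println(policy.shouldRetry(new ClientException(NOT_FOUND)));         // false
        // The same failure labelled 429 is retried.
        System.out.println(policy.shouldRetry(new ClientException(TOO_MANY_REQUESTS))); // true
    }
}
```

Under this assumption, a throttled request that surfaces with status 404 never reaches the retry branch, which matches the behaviour seen in Direct mode.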

Can someone confirm whether retry is supported in Direct mode?

kushagraThapar commented 5 years ago

@chetanmeh Thanks for capturing this, we will investigate this issue.

David-Noble-at-work commented 5 years ago

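Problem diagnosed and I'm reasonably confident the issue is addressed on branch danoble/issue-263/retry-not-happening. I fixed two definite issues and added test coverage to guard against regressions in the future:

  • RequestRateTooLargeException.getStatusCode now correctly returns TOO_MANY_REQUESTS, not NOTFOUND.
  • ServiceUnavailableException.getStatusCode now correctly returns SERVICE_UNAVAILABLE, not NOTFOUND.
  • DocumentClientExceptionTest.statusCodeIsCorrect verifies that the correct status code is returned for all DocumentClientException subtypes.

PR is pending.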

chetanmeh commented 5 years ago

@David-Noble-at-work Does the same issue exist in the v3 SDK, or does retry support work as expected there?

kushagraThapar commented 5 years ago

> @David-Noble-at-work Does the same issue exist in the v3 SDK, or does retry support work as expected there?

It might. We will confirm and, if that's the case, port the fix to v3 as well.

kushagraThapar commented 5 years ago

> Problem diagnosed and I'm reasonably confident the issue is addressed on branch danoble/issue-263/retry-not-happening. I fixed two definite issues and added test coverage to guard against regressions in the future:
>
>   • RequestRateTooLargeException.getStatusCode now correctly returns TOO_MANY_REQUESTS, not NOTFOUND.
>   • ServiceUnavailableException.getStatusCode now correctly returns SERVICE_UNAVAILABLE, not NOTFOUND.
>   • DocumentClientExceptionTest.statusCodeIsCorrect verifies that the correct status code is returned for all DocumentClientException subtypes.
>
> PR is pending.
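The kind of regression check described in the list above might look roughly like the following sketch. It is not the SDK's test: the exception classes here are hypothetical stand-ins, and the status values (429, 503, 404) are the standard HTTP codes assumed to correspond to TOO_MANY_REQUESTS, SERVICE_UNAVAILABLE and NOTFOUND.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; not the SDK's exception hierarchy or test.
public class StatusCodeCheckSketch {
    static class ClientException extends RuntimeException {
        private final int statusCode;

        ClientException(int statusCode) {
            super("status " + statusCode);
            this.statusCode = statusCode;
        }

        int getStatusCode() {
            return statusCode;
        }
    }

    // Each subtype fixes its own status code in its constructor.
    static class RequestRateTooLarge extends ClientException {
        RequestRateTooLarge() { super(429); }
    }

    static class ServiceUnavailable extends ClientException {
        ServiceUnavailable() { super(503); }
    }

    static class NotFound extends ClientException {
        NotFound() { super(404); }
    }

    /** One instance of every subtype, paired with the status code it should report. */
    static Map<ClientException, Integer> expectedStatusCodes() {
        Map<ClientException, Integer> expected = new LinkedHashMap<>();
        expected.put(new RequestRateTooLarge(), 429);
        expected.put(new ServiceUnavailable(), 503);
        expected.put(new NotFound(), 404);
        return expected;
    }

    /** Counts the exceptions whose reported status differs from the expected one. */
    static int countMismatches(Map<ClientException, Integer> expected) {
        int mismatches = 0;
        for (Map.Entry<ClientException, Integer> entry : expected.entrySet()) {
            if (entry.getKey().getStatusCode() != entry.getValue()) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(expectedStatusCodes()));
    }
}
```

Checking every subtype in one table-driven loop means a newly added exception type with a copy-pasted status code would show up as a mismatch, provided it is added to the table.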

Thank you David.

kushagraThapar commented 5 years ago

This issue has been fixed in the PR: https://github.com/Azure/azure-cosmosdb-java/pull/272

We will release 2.6.3 with the fix.