Open paveliak opened 5 years ago
I'm getting a similar issue. Intermittent timeouts when using CreateDocumentAsync
. We use the emulator for integration testing and out of a suite of 150 tests, 1 test will usually fail with the following error.
It's not always the same test. Getting a single failure is very common, getting 2 is rare but does happen.
After looking at cosmos after a test fails, the expected document exists. The operation was a success! it just happened to timeout? Strangely tho, looking at the RequestStartTime
and RequestEndTime
, the request took 0ms?
Let me know if there's anything I can do to help diagnose the issue.
Microsoft.Azure.Documents.DocumentClientException: Message: Request timed out.
ActivityId: e1e607c3-e50e-47bd-a8bd-4cbb04c0ee87, Request URI: /apps/DocDbApp/services/DocDbServer17/partitions/a4cb495d-38c8-11e6-8106-8cdcd42c33be/replicas/1p/, RequestStats:
RequestStartTime: 2019-08-27T11:11:15.5802078Z, RequestEndTime: 2019-08-27T11:11:15.5802078Z, Number of regions attempted: 1
, SDK: Microsoft.Azure.Documents.Common/2.4.0.0, Windows/10.0.18362 documentdb-netcore-sdk/2.4.0
at Microsoft.Azure.Documents.GatewayStoreClient.ParseResponseAsync(HttpResponseMessage responseMessage, JsonSerializerSettings serializerSettings, DocumentServiceRequest request)
at Microsoft.Azure.Documents.GatewayStoreClient.InvokeAsync(DocumentServiceRequest request, ResourceType resourceType, Uri physicalAddress, CancellationToken cancellationToken)
at Microsoft.Azure.Documents.GatewayStoreModel.ProcessMessageAsync(DocumentServiceRequest request, CancellationToken cancellationToken)
at Microsoft.Azure.Documents.Client.DocumentClient.ProcessRequestAsync(DocumentServiceRequest request, IDocumentClientRetryPolicy retryPolicyInstance, CancellationToken cancellationToken)
at Microsoft.Azure.Documents.Client.DocumentClient.CreateDocumentPrivateAsync(String documentCollectionLink, Object document, RequestOptions options, Boolean disableAutomaticIdGeneration, IDocumentClientRetryPolicy retryPolicyInstance, CancellationToken cancellationToken)
at Microsoft.Azure.Documents.BackoffRetryUtility`1.ExecuteRetryAsync(Func`1 callbackMethod, Func`3 callShouldRetry, Func`1 inBackoffAlternateCallbackMethod, TimeSpan minBackoffForInBackoffCallback, CancellationToken cancellationToken, Action`1 preRetryCallback)
at Microsoft.Azure.Documents.ShouldRetryResult.ThrowIfDoneTrying(ExceptionDispatchInfo capturedException)
at Microsoft.Azure.Documents.BackoffRetryUtility`1.ExecuteRetryAsync(Func`1 callbackMethod, Func`3 callShouldRetry, Func`1 inBackoffAlternateCallbackMethod, TimeSpan minBackoffForInBackoffCallback, CancellationToken cancellationToken, Action`1 preRetryCallback)
at Microsoft.Azure.Documents.Client.DocumentClient.CreateDocumentInlineAsync(String documentsFeedOrDatabaseLink, Object document, RequestOptions options, Boolean disableAutomaticIdGeneration, CancellationToken cancellationToken)
We are also seeing odd timeouts on create document async calls but using azure hosted cosmos db.
I'm facing this with V3 SDK on Azure, not emulator. So we get Request Time, 408 and the item is actually created in the database! This is annoying to workaround as it conflicts with another bug, where if you insert multiple items with the same id across multiple partitions, reading the failed item quickly (using id and partitionKey) gets you the wrong item from a wrong partition.
We are using SDK V3.8.0 on Core 3.1
I am experiencing the same issue using V3 SDK on Azure. Any progress on this issue?
same here. 408 gateway request timeout with V3 SDK on Azure cosmosdb.
Same here, we are using Mongo API and intermittently receive the following error:
Message: {"Errors":["The requested operation exceeded maximum alloted time. Learn more: https:\/\/aka.ms\/cosmosdb-tsg-service-request-timeout"]}
ActivityId: ced398c5-0000-0000-0000-000000000000,
RequestStats:
RequestStartTime: 2020-10-22T00:41:18.0710333Z,
RequestEndTime: 2020-10-22T00:41:23.4809804Z,
Number of regions attempted:1
ResponseTime: 2020-10-22T00:41:23.4809804Z
We are getting same issue. Any update on how to fix this?
{ "Message": "Message: {\"Errors\":[\"The requested operation exceeded maximum alloted time. Learn more: https:\/\/aka.ms\/cosmosdb-tsg-service-request-timeout\"]}\nActivityId: ec8e38ff-986f-41b1-962e-6b8fd4b543d4, Request URI: /apps/ba6fe40a-0ffa-4bb0-ba22-08b36841e9b6/services/9c79c251-626d-42dd-b412-02e522e0034e/partitions/5832230e-6af0-4af1-91be-e534a99b3790/replicas/132438418298304872p//dbs/prod/colls/annualviewcollection_production/docs, RequestStats: \nRequestStartTime: 2020-12-14T20:08:10.4903186Z, RequestEndTime: 2020-12-14T20:08:16.8261747Z, Number of regions attempted:1\nResponseTime: 2020-12-14T20:08:16.8261747Z, StoreResult: StorePhysicalAddress: https://cdb-ms-prod-eastus2-fd5.documents.azure.com:14776/apps/ba6fe40a-0ffa-4bb0-ba22-08b36841e9b6/services/9c79c251-626d-42dd-b412-02e522e0034e/partitions/5832230e-6af0-4af1-91be-e534a99b3790/replicas/132438418298304872p//, LSN: -1, GlobalCommittedLsn: -1, PartitionKeyRangeId: , IsValid: False, StatusCode: 408, SubStatusCode: 0, RequestCharge: 0, ItemLSN: -1, SessionToken: , UsingLocalLSN: False, TransportException: null, ResourceType: Document, OperationType: Upsert\n, SDK: Linux/10 documentdb-netcore-sdk/2.10.0", "ErrorDate": "2020-12-14T20:08:16.8275029Z" },
}
V2 SDK also getting this a few times:
Microsoft.Azure.Documents.RequestTimeoutException: Message: Request timed out. More info: https://aka.ms/cosmosdb-tsg-request-timeout ActivityId: .., Request URI:... , RequestStats: RequestStartTime: 2021-03-01T07:05:01.9165737Z, RequestEndTime: 2021-03-01T07:05:01.9165737Z, Number of regions attempted:1 , SDK: documentdb-dotnet-sdk/2.13.1 Host/64-bit MicrosoftWindowsNT/6.2.9200.0 ---> Microsoft.Azure.Documents.TransportException: A client transport error occurred: Acquiring the send stream lock timed out. (Time: 2021-03-01T07:05:12.9181766Z, activity ID: 2dbb632e-a988-4328-86e8-13fe99d17eb0, error code: SendLockTimeout [0x000D], base error: HRESULT 0x80131500, URI: rntbd://cdb-ms-prod-westus1-fd36.documents.azure.com:14310/apps/.., connection: 10.218.0.4:57965 -> 40.112.241.34:14310, payload sent: True, CPU history: (2021-02-28T23:20:23.0498147Z 16.242), CPU count: 1) at Microsoft.Azure.Documents.Rntbd.Channel.d__13.MoveNext() --- End of stack trace
Hi, I'm also getting a similar error using the Emulator. My observation is that this happens while I'm making concurrent requests, and you can see that it is reporting that its happening immediately (i.e. the start and end time are reported to be identical)
Promise result rejected Error:
Bulk request errored with:
Message: Request timed out. More info: https://aka.ms/cosmosdb-tsg-request-timeout
ActivityId: 98777d16-d3c5-4866-8df6-01de5c07d3f2,
Request URI: /apps/DocDbApp/services/DocDbServer3/partitions/a4cb494f-38c8-11e6-8106-8cdcd42c33be/replicas/1p/,
RequestStats:
RequestStartTime: 2021-11-05T01:52:20.4639778Z,
RequestEndTime: 2021-11-05T01:52:20.4639778Z,
Number of regions attempted:1
, SDK: Microsoft.Azure.Documents.Common/2.14.0
I'm not sure if we can speak of a fix but I experienced this error and it was due to the indexing policy. Removing indexing policy before upserting new data and re-setting it right after allowed me to get rid of this error.
I'm experiencing similar issues on cosmos v3 against the emulator, but only when running via Azure hosted pipeline agents and not locally. I found that increasing the partition count and container throughput reduced the frequency of the error (see: https://github.com/Azure/azure-cosmos-dotnet-v2/issues/817#issuecomment-927133201) but, not enough for me to consider using an azure hosted cosmos instance or try with a self-hosted pipeline agent
is there any progress on this? it's 2023 and the same issue still exists regularly during integration testing. I am getting this using the gremlin API
it's 2024 and the issue still exists
Occasionally we observe RequestTimeout exceptions for Upsert operations of TableAPI when using CosmosDB emulator. They go away upon immediate retry.