Azure / Azure-Functions

func.exe exited with code -532462766 after many 429 from cosmosDB #2456

Open alex-shmyga opened 3 months ago

alex-shmyga commented 3 months ago

I have an in-process Azure Function (.NET 6, running in an OpenShift cluster via the KEDA operator) that stores data in Cosmos DB via an output binding.

The write rate is quite high, so the function gets 429 responses from Cosmos DB quite often. The problem is that after too many 429s the Functions host shuts down.

The same issue is reproducible locally as well.

Is there any way to prevent the Functions host from shutting down?

Microsoft.Azure.Documents.DocumentClientException: Message: {"Errors":["Request rate is large. More Request Units may be needed, so no changes were made. Please retry this request later. Learn more: http://aka.ms/cosmosdb-error-429"]}
ActivityId: 581a0346-b943-499a-9aa4-b2fb693f5038, Request URI: /apps/DocDbApp/services/DocDbServer12/partitions/a4cb4958-38c8-11e6-8106-8cdcd42c33be/replicas/1p/, RequestStats:
RequestStartTime: 2024-03-05T16:35:27.4034647Z, RequestEndTime: 2024-03-05T16:35:27.4534628Z,  Number of regions attempted:1
{"systemHistory":[{"dateUtc":"2024-03-05T16:34:31.1766864Z","cpu":80,781,"memory":11124208,000,"threadInfo":{"isThreadStarving":"False","threadWaitIntervalInMs":0,0594,"availableThreads":32766,"minThreads":8,"maxThreads":32767},"numberOfOpenTcpConnection":174},{"dateUtc":"2024-03-05T16:34:41.1881854Z","cpu":77,808,"memory":10659388,000,"threadInfo":{"isThreadStarving":"False","threadWaitIntervalInMs":0,0477,"availableThreads":32765,"minThreads":8,"maxThreads":32767},"numberOfOpenTcpConnection":174},{"dateUtc":"2024-03-05T16:34:51.1991129Z","cpu":88,358,"memory":10167472,000,"threadInfo":{"isThreadStarving":"False","threadWaitIntervalInMs":0,4679,"availableThreads":32766,"minThreads":8,"maxThreads":32767},"numberOfOpenTcpConnection":174},{"dateUtc":"2024-03-05T16:35:01.2204908Z","cpu":75,683,"memory":9804104,000,"threadInfo":{"isThreadStarving":"False","threadWaitIntervalInMs":1,4462,"availableThreads":32730,"minThreads":8,"maxThreads":32767},"numberOfOpenTcpConnection":174},{"dateUtc":"2024-03-05T16:35:11.2412573Z","cpu":55,577,"memory":9830444,000,"threadInfo":{"isThreadStarving":"False","threadWaitIntervalInMs":0,6353,"availableThreads":32766,"minThreads":8,"maxThreads":32767},"numberOfOpenTcpConnection":174},{"dateUtc":"2024-03-05T16:35:21.2532395Z","cpu":59,926,"memory":9955544,000,"threadInfo":{"isThreadStarving":"False","threadWaitIntervalInMs":0,0554,"availableThreads":32765,"minThreads":8,"maxThreads":32767},"numberOfOpenTcpConnection":174}]}
RequestStart: 2024-03-05T16:35:27.4034647Z; ResponseTime: 2024-03-05T16:35:27.4534628Z; StoreResult: StorePhysicalAddress: rntbd://127.0.0.1:10253/apps/DocDbApp/services/DocDbServer12/partitions/a4cb4958-38c8-11e6-8106-8cdcd42c33be/replicas/1p/, LSN: 5472, GlobalCommittedLsn: -1, PartitionKeyRangeId: , IsValid: True, StatusCode: 429, SubStatusCode: 3200, RequestCharge: 0.38, ItemLSN: -1, SessionToken: , UsingLocalLSN: False, TransportException: null, BELatencyMs: , ActivityId: 581a0346-b943-499a-9aa4-b2fb693f5038, RetryAfterInMs: 21019, TransportRequestTimeline: {"requestTimeline":[{"event": "Created", "startTimeUtc": "2024-03-05T16:35:27.4034647Z", "durationInMs": 0.0129},{"event": "ChannelAcquisitionStarted", "startTimeUtc": "2024-03-05T16:35:27.4034776Z", "durationInMs": 0.0025},{"event": "Pipelined", "startTimeUtc": "2024-03-05T16:35:27.4034801Z", "durationInMs": 29.9496},{"event": "Transit Time", "startTimeUtc": "2024-03-05T16:35:27.4334297Z", "durationInMs": 0.0627},{"event": "Received", "startTimeUtc": "2024-03-05T16:35:27.4334924Z", "durationInMs": 20.0997},{"event": "Completed", "startTimeUtc": "2024-03-05T16:35:27.4535921Z", "durationInMs": 0}],"serviceEndpointStats":{"inflightRequests":188,"openConnections":170},"connectionStats":{"waitforConnectionInit":"False","callsPendingReceive":1,"lastSendAttempt":"2024-03-05T16:35:27.3434518Z","lastSend":"2024-03-05T16:35:27.2964616Z","lastReceive":"2024-03-05T16:35:27.2974573Z"},"requestSizeInBytes":793,"requestBodySizeInBytes":300,"responseMetadataSizeInBytes":127,"responseBodySizeInBytes":174};
 ResourceType: Document, OperationType: Upsert
, SDK: Microsoft.Azure.Documents.Common/2.14.0, Windows/10.0.19045 documentdb-netcore-sdk/2.13.1
   at Microsoft.Azure.Documents.GatewayStoreClient.ParseResponseAsync(HttpResponseMessage responseMessage, JsonSerializerSettings serializerSettings, DocumentServiceRequest request)
   at Microsoft.Azure.Documents.GatewayStoreClient.InvokeAsync(DocumentServiceRequest request, ResourceType resourceType, Uri physicalAddress, CancellationToken cancellationToken)
   at Microsoft.Azure.Documents.GatewayStoreModel.ProcessMessageAsync(DocumentServiceRequest request, CancellationToken cancellationToken)
   at Microsoft.Azure.Documents.Client.DocumentClient.ProcessRequestAsync(DocumentServiceRequest request, IDocumentClientRetryPolicy retryPolicyInstance, CancellationToken cancellationToken)
   at Microsoft.Azure.Documents.Client.DocumentClient.ProcessRequestAsync(String verb, DocumentServiceRequest request, IDocumentClientRetryPolicy retryPolicyInstance, CancellationToken cancellationToken, String testAuthorization)
   at Microsoft.Azure.Documents.Client.DocumentClient.UpsertDocumentPrivateAsync(String documentCollectionLink, Object document, RequestOptions options, Boolean disableAutomaticIdGeneration, IDocumentClientRetryPolicy retryPolicyInstance, CancellationToken cancellationToken)
   at Microsoft.Azure.Documents.BackoffRetryUtility`1.ExecuteRetryAsync(Func`1 callbackMethod, Func`3 callShouldRetry, Func`1 inBackoffAlternateCallbackMethod, TimeSpan minBackoffForInBackoffCallback, CancellationToken cancellationToken, Action`1 preRetryCallback)
   at Microsoft.Azure.Documents.ShouldRetryResult.ThrowIfDoneTrying(ExceptionDispatchInfo capturedException)
   at Microsoft.Azure.Documents.BackoffRetryUtility`1.ExecuteRetryAsync(Func`1 callbackMethod, Func`3 callShouldRetry, Func`1 inBackoffAlternateCallbackMethod, TimeSpan minBackoffForInBackoffCallback, CancellationToken cancellationToken, Action`1 preRetryCallback)
   at Microsoft.Azure.Documents.Client.DocumentClient.UpsertDocumentInlineAsync(String documentsFeedOrDatabaseLink, Object document, RequestOptions options, Boolean disableAutomaticIdGeneration, CancellationToken cancellationToken)
   at Microsoft.Azure.WebJobs.Extensions.CosmosDB.CosmosDBService.UpsertDocumentAsync(Uri documentCollectionUri, Object document) in C:\azure-webjobs-sdk-extensions\src\WebJobs.Extensions.CosmosDB\Services\CosmosDBService.cs:line 63
   at Microsoft.Azure.WebJobs.Extensions.CosmosDB.CosmosDBAsyncCollector`1.UpsertDocument(CosmosDBContext context, T item) in C:\azure-webjobs-sdk-extensions\src\WebJobs.Extensions.CosmosDB\Bindings\CosmosDBAsyncCollector.cs:line 78
   at Microsoft.Azure.WebJobs.Extensions.CosmosDB.CosmosDBAsyncCollector`1.AddAsync(T item, CancellationToken cancellationToken) in C:\azure-webjobs-sdk-extensions\src\WebJobs.Extensions.CosmosDB\Bindings\CosmosDBAsyncCollector.cs:line 58
   at MyFunction.Features.MyFunction.Functions.MyFunctionFunction.<>c__DisplayClass4_0.<<Run>b__0>d.MoveNext() in C:\Projects\MyFunctionApp\Functions\MyFunction\MyFunction\Features\MyFunction\Functions\MyFunctionFunction.cs:line 83
--- End of stack trace from previous location ---
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
   at System.Threading.QueueUserWorkItemCallback.<>c.<.cctor>b__6_0(QueueUserWorkItemCallback quwi)
   at System.Threading.ExecutionContext.RunForThreadPoolUnsafe[TState](ExecutionContext executionContext, Action`1 callback, TState& state)
   at System.Threading.QueueUserWorkItemCallback.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
   at System.Threading.Thread.StartCallback()

C:\Users\{myuser}\AppData\Local\AzureFunctionsTools\Releases\4.69.0\cli_x64\func.exe (process 5636) exited with code -532462766.
Press any key to close this window . . .
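
For reference, the negative exit code above is easier to look up as an unsigned 32-bit hex value. Below is a minimal sketch in Python (not part of the original repro; the helper name `to_unsigned_hex` is made up for illustration). The resulting value, 0xE0434352, is the code the .NET runtime uses when a process ends because of an unhandled managed exception, which is consistent with the stack trace above surfacing on a thread-pool thread.

```python
def to_unsigned_hex(exit_code: int) -> str:
    """Reinterpret a signed 32-bit process exit code as unsigned hex."""
    # Masking with 0xFFFFFFFF maps negative values onto their
    # two's-complement 32-bit representation.
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


print(to_unsigned_hex(-532462766))  # prints 0xE0434352
```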