Azure / azure-functions-durable-extension

Durable Task Framework extension for Azure Functions
MIT License

Error: target worker count for 'functionactivityname..HandleActivity' resulted in exception. Metrics: workItemQueueLength=. controlQueueLengths=. maxConcurrentOrchestrators=70. maxConcurrentActivities=100 Index was outside the bounds of the array. #2738

Open ericleigh007 opened 5 months ago

ericleigh007 commented 5 months ago

Description


We are running Azure Durable Functions in our project and are receiving this exception:

Error: target worker count for 'functionactivityname..HandleActivity' resulted in exception. Metrics: workItemQueueLength=. controlQueueLengths=. maxConcurrentOrchestrators=70. maxConcurrentActivities=100 Index was outside the bounds of the array. 

Stack trace:

System.Exception:
   at Microsoft.Azure.WebJobs.Extensions.DurableTask.DurableTaskTargetScaler+<GetScaleResultAsync>d__13.MoveNext (Microsoft.Azure.WebJobs.Extensions.DurableTask, Version=2.0.0.0, Culture=neutral, PublicKeyToken=014045d636e89289: D:\a\_work\1\s\src\WebJobs.Extensions.DurableTask\Listener\DurableTaskTargetScaler.cs:89)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.Azure.WebJobs.Host.Scale.ScaleManager+<GetTargetScalersResultAsync>d__12.MoveNext (Microsoft.Azure.WebJobs.Host, Version=3.0.39.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35: D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Scale\ScaleManager.cs:167)
Inner exception System.IndexOutOfRangeException handled at Microsoft.Azure.WebJobs.Extensions.DurableTask.DurableTaskTargetScaler+<GetScaleResultAsync>d__13.MoveNext:
   at DurableTask.AzureStorage.Monitoring.DisconnectedPerformanceMonitor+QueueMetricHistory.Add (DurableTask.AzureStorage, Version=1.16.0.0, Culture=neutral, PublicKeyToken=d53979610a6e89dd: /_/src/DurableTask.AzureStorage/Monitoring/DisconnectedPerformanceMonitor.cs:558)
   at DurableTask.AzureStorage.Monitoring.DisconnectedPerformanceMonitor+<UpdateQueueMetrics>d__29.MoveNext (DurableTask.AzureStorage, Version=1.16.0.0, Culture=neutral, PublicKeyToken=d53979610a6e89dd)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at DurableTask.AzureStorage.Monitoring.DisconnectedPerformanceMonitor+<PulseAsync>d__28.MoveNext (DurableTask.AzureStorage, Version=1.16.0.0, Culture=neutral, PublicKeyToken=d53979610a6e89dd: /_/src/DurableTask.AzureStorage/Monitoring/DisconnectedPerformanceMonitor.cs:138)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.Azure.WebJobs.Extensions.DurableTask.DurableTaskMetricsProvider+<GetMetricsAsync>d__6.MoveNext (Microsoft.Azure.WebJobs.Extensions.DurableTask, Version=2.0.0.0, Culture=neutral, PublicKeyToken=014045d636e89289: D:\a\_work\1\s\src\WebJobs.Extensions.DurableTask\Listener\DurableTaskMetricsProvider.cs:41)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.Azure.WebJobs.Extensions.DurableTask.DurableTaskTargetScaler+<GetScaleResultAsync>d__13.MoveNext (Microsoft.Azure.WebJobs.Extensions.DurableTask, Version=2.0.0.0, Culture=neutral, PublicKeyToken=014045d636e89289: D:\a\_work\1\s\src\WebJobs.Extensions.DurableTask\Listener\DurableTaskTargetScaler.cs:46)

NOTE: JavaScript issues should be reported here: https://github.com/Azure/azure-functions-durable-js

Expected behavior

We expect this seemingly clear error not to occur.

Actual behavior

The exception is thrown at various times throughout the day.

If deployed to Azure

We have access to a lot of telemetry that can help with investigations. Please provide as much of the following information as you can to help us investigate!

If you don't want to share your Function App or storage account name on GitHub, please at least share the orchestration instance ID. Otherwise, it's extremely difficult to look up information.

cgillum commented 5 months ago

@ericleigh007 is this error impacting the behavior of your app?

ericleigh007 commented 5 months ago

We're building a FinOps solution here using Durable Functions.

Any unknown Error-level message is a problem, I would say. Scaling is also one of the great unknowns that we don't control, and it can cause double triggers of the same information, leading to more problems.

If this is more of a warning situation, then it should be logged as a warning, not an error. If it is an error, it should be investigated, wouldn't you say?

Can we at least explain what is happening here so that we can at minimum put this in a known problems list?

cgillum commented 5 months ago

Hi @ericleigh007, I'm just looking for additional context to better understand the impact and help get you mitigated. I'm not trying to suggest that it can or should be ignored.

It clearly looks like there's some unexpected condition that your app is running into. We may need to get more details on how your app is hosted and look at some historical telemetry to understand what's going on. For that reason, I suggest opening an Azure support request and referencing this GitHub issue. That will give us access to a bit more information to make troubleshooting this issue easier.