Open torepaulsson opened 1 year ago
@torepaulsson does this issue occur if you upgrade to the latest version of Microsoft.Azure.Functions.Worker.Extensions.DurableTask
(v1.0.2)?
FYI @jviau
I don't recall 1.0.2
having any fixes related to this (but I may be wrong). I think what @torepaulsson is touching on is that in dotnet-isolated we switched to a failure-details only mode, and explicitly do not attempt to serializing/deserialize the original exception across the boundary. We do have a TODO to populate nested exceptions, but I think I want to look into this experience a bit more, I am not entirely sold it was the right call to remove exception propagation across the serialization boundary.
As for AggregateException
- are you calling any Task.Result
or Task.Wait
APIs? Those will cause any exception to be surfaced as AggregateException
.
I'm curious to understand this better:
Throwing an exception in an activity function causes a
DurableTask.Core.Exceptions.TaskFailedException
exception.
There shouldn't be any DurableTask.Core
exceptions surfaced to your code. If you're seeing this, then it's probably something that needs to be fixed. I've seen similar reports from Dapr users as well, which makes me think our abstraction may still be a bit leaky in some places.
@jviau I believe it has to do with the serialization as you say, and if you remember that it is removed then that is probably it!
@cgillum, DurableTask.Core comes as an inner exception from the aggregate exception.
[Function(nameof(OrchestrateAsync))]
public async Task OrchestrateAsync([OrchestrationTrigger] TaskOrchestrationContext context)
{
var replaySafeLogger = context.CreateReplaySafeLogger<Orchestrator>();
try
{
await context.CallActivityAsync(nameof(ThrowExceptionActivityAsync), context.InstanceId);
}
catch (Exception e)
{
replaySafeLogger.LogError(e, "Orchestrator failed");
throw;
}
}
[Function(nameof(ThrowExceptionActivityAsync))]
public Task ThrowExceptionActivityAsync([ActivityTrigger] string instanceId)
{
throw new InvalidOperationException("Just a test!");
}
@jviau, in the above code i get an AggregateException
but I did not use Task.Result or Task.Wait anywhere here, maybe there is a call to that method inside the code that handles the activities? In the CallActivityAsync maybe?
Packages used
<PackageReference Include="Microsoft.Azure.Functions.Worker" Version="1.14.1" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Sdk" Version="1.10.0" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.Http" Version="3.0.13" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.DurableTask" Version="1.0.2" />
<PackageReference Include="Microsoft.Azure.WebJobs.Extensions.DurableTask" Version="2.9.5" />
<PackageReference Include="Microsoft.Extensions.DependencyInjection" Version="7.0.0" />
For AggregateException
, I believe this is the cause: https://github.com/Azure/azure-functions-dotnet-worker/issues/993.
Will need to see if we can even work around this, given it is an issue with the dotnet worker and not durable.
After migrating to the Isolated worker process, we've encountered the same issue with handling exceptions. The FunctionFailedException
is no longer accessible in the latest worker packages. Despite extensive searching, there's no clear documentation or online resources addressing this issue.
The documentation states that exceptions thrown in activity functions should be marshaled back to the orchestrator function as FunctionFailedException
. However, in our case, they are propagating up as TaskFailedException
. This behavior differs from the expected outcome, as outlined in this open ticket.
We kindly request an update on the progress towards addressing this issue. The inability to utilize FunctionFailedException
is hindering our ability to effectively manage exceptions in the Isolated worker process. A prompt resolution would be greatly appreciated.
Example of TaskFailedException
surfacing
Code changes we had to make when moving to an Isolated worker.
Any updates on this ?
I'm having an isolated/out-of-process project and it's still an issue. When an exception is caught in the Activity itself then you have to/can handle it yourself (e.g. log it)
Also. the RetryPolicy like the docs are mentioning aren't working either?
<PackageReference Include="Microsoft.Azure.Functions.Worker" Version="1.21.0" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.DurableTask" Version="1.1.1" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.DurableTask.SqlServer" Version="1.2.2" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.OpenApi" Version="1.5.1" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.ServiceBus" Version="5.17.0" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.Sql" Version="3.0.534" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Sdk" Version="1.17.2" />
@eholman @irhad-ljubcic various fixes have been made over the last few months, including a couple in the last week or so which will hopefully get released soon:
Also. the RetryPolicy like the docs are mentioning aren't working either?
In my testing, custom retry handlers have been working fine, so we'll need more details about what specifically isn't working for you.
This is the policy I tried according to the docs:
public static TaskOptions TryPolicy
{
get
{
TaskOptions retryOptions = TaskOptions.FromRetryHandler(retryContext =>
{
// Don't retry anything that derives from ApplicationException
if (retryContext.LastFailure.IsCausedBy<ApplicationException>())
{
return false;
}
// Quit after N attempts
return retryContext.LastAttemptNumber < 3;
});
return retryOptions;
}
}
I have a FluentValidator in the MediatR pipeline which throws an ValidationException inside my activity:
[Function(nameof(ProcessImportData))]
public static async Task<IEnumerable<Guid>> ProcessImportData(
[ActivityTrigger] OrchestrationData<Metadata.Import, List<Guid>> orchestrationData, FunctionContext executionContext)
{
var sender = executionContext.InstanceServices.GetRequiredService<ISender>();
var orderIds = await sender.Send(new ProcessImportDataCommand(orchestrationData.Metadata.ImportData));
return orderIds;
}
WIth this setup, the retry policy kicks in but I expect at least some info regarding the ValidationException or am I wrong here?
Within the orchestrationTrigger method I'm triggering the activity like:
await context.CallActivityAsync<IEnumerable<Guid>>(nameof(Activities.Import.ProcessImportData), orchestrationData,
RetryPolicies.Import.TryPolicy)
I've just tried with the pre-release version 1.2.0-rc.1 of Microsoft.Azure.Functions.Worker.Extensions.DurableTask
@eholman Okay, so it sounds like the retry handler is getting triggered, so retry handlers per-se are not the problem. Rather, the problem is that the exception message you're getting from LastFailure
is not what you expected. I wanted to clarify this because the exception information is not specific to retry handlers. It's the same information you'd get if you caught a TaskFailedException
in regular orchestrator code, which I assume would be having the exact same problem.
Can you share the contents of your .csproj? I recall there was a .NET Isolated worker bug where ordinary exceptions were being misinterpreted as AggregateException
, which I assume is the problem you're seeing. This happens outside of the Durable Functions extension - we just surface what's given to us by the .NET Isolated worker. There was a fix and I'd like to understand if you're using the latest .NET Isolated worker SDKs.
Regardless, I suspect there's an issue where Durable isn't capturing inner exceptions. This is something we can continue to look into.
Thanks @cgillum; apologies for the late response!
Yes, that's correct; the retry handler isn't the issue per se, I understand that it would be the same information outside of it. It was an use case where it is very useful to know the underlying exception, at least with respect to a ValidationException which doesn't need to be retried.
Here is the csproj:
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net6.0</TargetFramework>
<AzureFunctionsVersion>v4</AzureFunctionsVersion>
<OutputType>Exe</OutputType>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
<LangVersion>preview</LangVersion>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="JsonSchema.Net.Generation" Version="4.1.1" />
<PackageReference Include="Microsoft.ApplicationInsights" Version="2.22.0" />
<PackageReference Include="Microsoft.Azure.Functions.Worker" Version="1.21.0" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.DurableTask" Version="1.2.0-rc.1" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.DurableTask.SqlServer" Version="1.2.2" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.OpenApi" Version="1.5.1" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.ServiceBus" Version="5.17.0" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.Sql" Version="3.0.534" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Sdk" Version="1.17.2" />
<PackageReference Include="Microsoft.EntityFrameworkCore.Design" Version="7.0.11">
<PrivateAssets>all</PrivateAssets>
<IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
</PackageReference>
<PackageReference Include="Serilog" Version="3.1.1" />
<PackageReference Include="Serilog.Enrichers.CorrelationId" Version="3.0.1" />
<PackageReference Include="Serilog.Exceptions" Version="8.4.0" />
<PackageReference Include="Serilog.Extensions.Logging" Version="8.0.0" />
<PackageReference Include="Serilog.Sinks.ApplicationInsights" Version="4.0.0" />
<PackageReference Include="Serilog.Sinks.Console" Version="5.0.1" />
<PackageReference Include="Serilog.Sinks.Http" Version="8.0.0" />
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\xx.Infrastructure\xx.Infrastructure.csproj" />
</ItemGroup>
<ItemGroup>
<None Update="host.json">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Update="local.settings.json">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
<CopyToPublishDirectory>Never</CopyToPublishDirectory>
</None>
</ItemGroup>
<ItemGroup>
<Using Include="System.Threading.ExecutionContext" Alias="ExecutionContext" />
</ItemGroup>
</Project>
I'm working on some fixes which will address the missing inner exception problem. Stay tuned for more updates.
I'm working on some fixes which will address the missing inner exception problem. Stay tuned for more updates.
Were there any updates on this? There seem to be a few issues referencing similar problems. My project is using
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.DurableTask" Version="1.1.4" />
and I have enabled EnableUserCodeException
in my HostBuilder.
services.Configure<WorkerOptions>(options =>
{
options.EnableUserCodeException = true;
});
However, I'm still not seeing the details of my custom exception that is thrown from the sub orchestrator:
My exception is this:
public class SO_QueueQuestionsForLearnerQueueItemsException : Exception
{
public SO_QueueQuestionsForLearnerQueueItemsFailureStep FailureStep { get; set; }
public LearnerQuestionQueueItem QueueItem { get; set; }
public SO_QueueQuestionsForLearnerQueueItemsException()
{
}
public SO_QueueQuestionsForLearnerQueueItemsException(string message, SO_QueueQuestionsForLearnerQueueItemsFailureStep failureStep, LearnerQuestionQueueItem item)
: base(message)
{
FailureStep = failureStep;
QueueItem = item;
}
public SO_QueueQuestionsForLearnerQueueItemsException(string message, Exception inner, SO_QueueQuestionsForLearnerQueueItemsFailureStep failureStep, LearnerQuestionQueueItem item)
: base(message, inner)
{
FailureStep = failureStep;
QueueItem = item;
}
}
We're also seeing the same behavior of our Durable Functions Activity Functions only reporting AggregateExceptions
in the FailueDetails
so we are unable to identify them in the Orchestrator.
We've confirmed that the Activity itself throws the correct Exception but the root cause of the AggregateExceptions seems to be the internal .Result
call in the ContinueWith
section of the DefaultFunctionInvoker
in dotnet worker here:
I would have expected the generated DirectFunctionExecutor
to skip the DefaultFunctionExecutor
and just execute the Activity Function but that doesn't seem to be the case.
@cgillum I see there were some changes to the DefaultFunctionExecutor
class related to Durable Functions but it has been quite some time ago. Could this be related?
Description
This might be more of an oversight or missing for some other reason, anyways here it is:
Throwing an exception in an activity function causes a
DurableTask.Core.Exceptions.TaskFailedException
exception. I cannot find a way to extract the root exception from this exception type. Is it supposed to be there? All I get is a message and a type, but the type is anAggregateException
.Also, the documentation here https://learn.microsoft.com/en-us/azure/azure-functions/durable/durable-functions-error-handling?tabs=csharp-isolated#errors-in-activity-functions states that it should be a
FunctionFailedException
exception. When I used inprocess durable functions this exception was thrown and it was possible to extract the root exception. In isolated the documentation is wrong.Expected behavior
The exception thrown contains the root exception
Actual behavior
The exception thrown containsonly the message of the root exception
App Details