Azure / azure-functions-durable-extension

Durable Task Framework extension for Azure Functions
MIT License
714 stars 267 forks source link

Orchestrator function sometimes hangs when using Task.ContinueWith #317

Open cgillum opened 6 years ago

cgillum commented 6 years ago

This issue was discovered and diagnosed in this StackOverflow post: https://stackoverflow.com/q/50392874/2069

Issue

The issue is that Task.ContinueWith is sometimes executed asynchronously, breaking the "no-async" rule of orchestrator functions. Internally, this results in the following exception:

System.InvalidOperationException: 'Multithreaded execution was detected. This can happen if the orchestrator function code awaits on a task that was not created by a DurableOrchestrationContext method. More details can be found in this article https://docs.microsoft.com/en-us/azure/azure-functions/durable-functions-checkpointing-and-replay#orchestrator-code-constraints.'

However, nobody actually observes this exception because it happens on the background thread, thus it can easily go undetected.

Workaround

The workaround is to specify TaskContinuationOptions.ExecuteSynchronously when using Task.ContinueWith to schedule subsequent actions. This ensures that the continuation runs on the orchestrator thread, and everything behaves as expected.

Expected Behavior

This should always "just work" - i.e. continuations should always run synchronously to avoid unexpected hangs like this - just like they do when using await. Some additional task scheduler configuration may be necessary to ensure this.

LouisMach commented 3 years ago

I'll add that I also just ran into this problem. It's weird to me that this still occurs since it seems to be the only way to implement a fan-out fan-in pattern with chained asynchronous functions.

Edit: Also, when debugging this locally, the activity within the ContinueWith started firing twice resulting in every database row being duplicated. This only happens if I try to trigger many orchestrations in a short time, if I trigger my Blobtrigger with a single file there are no duplicate insertions.

cgillum commented 3 years ago

There are two ways that you can implement fan-out/fan-in with chained asynchronous functions that don't involve Task.ContinueWith().

The first is to do a fan-out/fan-in of sub-orchestrations where the sub-orchestration is a simple activity chain. This actually gives you even better concurrency compared to doing everything in a single orchestrator function because the sub-orchestrations can be distributed across multiple worker VMs.

The other option is to break up your function chain into separate ordinary async methods, like this:

public static async Task<List<string>> FanOutChain(
    [OrchestrationTrigger] IDurableOrchestrationContext context)
{
    // Run multiple function chain sequences in parallel
    var subChains = new List<Task<List<string>>>();
    for (int i = 0; i < 10; i++)
    {
        Task<List<string>> subChain = Chain(context);
        subChains.Add(subChain);
    }

    // Wait for all the sequences to complete
    List<string>[] results = await Task.WhenAll(subChains);

    // Return all results as a single list
    return results.SelectMany(result => result).ToList();
}

private static async Task<List<string>> Chain(
    IDurableOrchestrationContext context)
{
    var outputs = new List<string>();
    outputs.Add(await context.CallActivityAsync<string>(nameof(SayHello), "Tokyo"));
    outputs.Add(await context.CallActivityAsync<string>(nameof(SayHello), "Seattle"));
    outputs.Add(await context.CallActivityAsync<string>(nameof(SayHello), "London"));
    return outputs;
}

This has the advantage of being much more readable and maintainable compared to Task.ContinueWith, especially if you need to worry about exception handling.

LouisMach commented 3 years ago

Thank you for the reply! I will look into using sub-orchestrators. I was trying to avoid adding a lot of code but perhaps a sub-orchestrator is still a pretty succinct solution.

For the moment, I will try to implement my functions as in your second example.

EDIT: Looking at sub-orchestrations, their use seem very similar to the second example so I might as well opt for them right away.

ConnorMcMahon commented 3 years ago

@cgillum, we now have an analyzer that creates a warning if customers are not using this workaround.

Is this fix still worth investing in (and then removing that analyzer)? Or are we happy just pointing customers in the right direction?

tg73 commented 2 years ago

@ConnorMcMahon I ran into this today. When you note that there is now an analyzer, that was not my experience (at least by typical expectations of a roslyn analyzer). The build output log shows warning DF0104, but this warning does not appear in the VS error list panel, and there is no squigly under the offending source code.

I'm using VS 17.2.2 and Microsoft.Azure.WebJobs.Extensions.DurableTask 2.7.1.

cgillum commented 2 years ago

Adding @amdeel (Connor no longer works at Microsoft).

@tg73 the behavior you're seeing could be related to your Visual Studio settings, but I think @amdeel can clarify.

amdeel commented 2 years ago

Hey @tg73. To take full advantage of the analyzers in VS you can make sure to turn them all on by turning on full solution analysis in VS. Turn this on by going to Tools -> Options -> Text Editor -> C# -> Advanced, under "Run background code analysis for:" select "Entire Solution." This should resolve your issue.