Allow Exception to bubble out of IAsyncStateMachine.MoveNext when not called from a continuation

ificator commented 3 years ago

Version Used: NET48 (although it would appear to be the same in NET60)

Our team implements a web service that builds upon ASP.NET, and has an async framework that abstracts away logic common to all requests. A lot of the API implementations end up being synchronous in nature, and in these cases we've noticed that throwing exceptions is unexpectedly expensive.

We decided to benchmark to see if we could get a handle on what's going on:

[SimpleJob(RuntimeMoniker.Net48)]
[MemoryDiagnoser]
[CsvExporter]
public class Benchmark
{
    private int exceptionCount = 0;
    private int iterationCount = 0;

    public Benchmark()
    {
        this.Depth = 100;

        AppDomain.CurrentDomain.FirstChanceException += (s, a) =>
        {
            this.exceptionCount++;
        };
    }

    public int Depth { get; set; }

    [GlobalCleanup]
    public void GlobalCleanup()
    {
        Console.WriteLine("Exceptions: {0}", (double)this.exceptionCount / this.iterationCount);
    }

    [Benchmark]
    public void AsynchronousException()
    {
        this.iterationCount++;

        try
        {
            this.AsynchronousImpl(shouldThrow: true, 1).Wait();
        }
        catch
        {
        }
    }

    [Benchmark]
    public void SynchronousException()
    {
        this.iterationCount++;

        try
        {
            this.SynchronousImpl(shouldThrow: true, 1);
        }
        catch
        {
        }
    }

    private async Task<int> AsynchronousImpl(bool shouldThrow, int currentDepth)
    {
        if (currentDepth < this.Depth)
        {
            return await this.AsynchronousImpl(shouldThrow, currentDepth + 1) + 1;
        }
        else if (shouldThrow)
        {
            throw new Exception("test");
        }
        else
        {
            return 1;
        }
    }

    private int SynchronousImpl(bool shouldThrow, int currentDepth)
    {
        if (currentDepth < this.Depth)
        {
            return this.SynchronousImpl(shouldThrow, currentDepth + 1) + 1;
        }
        else if (shouldThrow)
        {
            throw new Exception("test");
        }
        else
        {
            return 1;
        }
    }
}

The results were unexpected:

|                Method |        Mean |     Error |    StdDev |    Gen 0 |  Gen 1 | Allocated | First-Chance Exceptions |
|---------------------- |------------:|----------:|----------:|---------:|-------:|----------:|------------------------:|
| AsynchronousException | 1,153.77 μs | 22.862 μs | 24.462 μs | 138.6719 | 7.8125 |    718 KB |                     101 |
|  SynchronousException |    44.48 μs |  0.520 μs |  0.461 μs |   1.1597 |      - |      6 KB |                       1 |

Now obviously we expected some overhead because of the async/await machinery, but these seems excessive. Drilling in deeper this behavior is primarily driven by the fact that a Task is returned in all cases, but it seems as though this is unnecessary in the case where an async method fails before it actually goes async.

Contractually speaking await Foo(); is expected to throw an Exception if relevant, and so whether it bubbles out of MoveNext or is indirected through a Task seems like an implementation detail that we can change (although there's a potential conflict with https://github.com/dotnet/runtime/issues/22144). So running with that I experimented with modifying the state machine so that:

It evaluated, at the start of MoveNext, whether it's being invoked from a continuation (isContinuation = _state > -1)
Use that value as an exception filter (catch (Exception ex) when (isContinuation))

The results were definitely promising:

|                      Method |        Mean |     Error |   StdDev |    Gen 0 |  Gen 1 | Allocated | First-Chance Exceptions |
|---------------------------- |------------:|----------:|---------:|---------:|-------:|----------:|------------------------:|
|       AsynchronousException | 1,111.23 μs | 10.172 μs | 7.941 μs | 138.6719 | 7.8125 |    718 KB |                     101 |
| CustomAsynchronousException |   163.89 μs |  1.256 μs | 1.049 μs |   2.1973 |      - |     12 KB |                       1 |
|        SynchronousException |    44.29 μs |  0.380 μs | 0.337 μs |   1.1597 |      - |      6 KB |                       1 |

Execution time is reduced to ~15%, while memory allocation is reduced to ~2%.

Are there any downsides to this approach that I, in my naivety, am overlooking? I'd be happy to try my handle at a PR to implement this behavior, but I figured it was safer to hold off in case this suggestion is shot down.

dotnet-issue-labeler[bot] commented 3 years ago

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

CyrusNajmabadi commented 3 years ago

@stephentoub

stephentoub commented 3 years ago

Are there any downsides to this approach that I, in my naivety, am overlooking?

e.g.

Task t1 = M1Async();
Task t2 = M2Async();
await Task.WhenAll(t1, t2);

If M2Async starts propagating exceptions synchronously, you'll never join with t1.

More generally, because an await may yield or not based on a race condition as to whether the awaited thing is completed or not at that moment, whether the exception emerges synchronously or asynchronously would end up becoming non-deterministic, which makes consumption challenging for almost any scenario other than directly/immediately awaiting the returned task.

ificator commented 3 years ago

Ah yes - I had some tunnel vision on how tasks are used with await. What I propose only works, as you said, if every task is immediately awaited.

Realistically whether M2Async throws or returns a failed task is a caller decision, and NOT something the M2Async state machine can handle. The generation of the caller's state machine could know that it has some unawaited tasks and do something the following, but that gets ugly pretty quickly (not to mention the complexity of detecting whether it's necessary):

private void MoveNext()
{
    int num = <>1__state;
    try
    {
        TaskAwaiter awaiter;
        if (num != 0)
        {
            Task t1 = null;
                        try
                        {
                            t1 = M1Async();
                        }
                        catch (Exception ex)
                        {
                            // t1 = failed task from ex
                        }

            Task t2 = null;
                        try
                        {
                            t2 = M2Async();
                        }
                        catch (Exception ex)
                        {
                            // t2 = failed task from ex
                        }

            awaiter = Task.WhenAll(t1, t2).GetAwaiter();
            if (!awaiter.IsCompleted)
            {
                num = (<>1__state = 0);
                <>u__1 = awaiter;
                <>t__builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
                return;
            }
        }
        else
        {
            awaiter = <>u__1;
            <>u__1 = default(TaskAwaiter);
            num = (<>1__state = -1);
        }
        awaiter.GetResult();
    }
    catch (Exception exception)
    {
        <>1__state = -2;
        <>t__builder.SetException(exception);
        return;
    }
    <>1__state = -2;
    <>t__builder.SetResult();
}

Some of our most thrown exceptions are only reaching the level they are due to amplification by the catch+rethrow in the generated state machines, so are there any options to improve performance here?

CyrusNajmabadi commented 3 years ago

How often are you throwing exceptions such that this sort of perf is that visible? Would it be possible to redesign such that these are not considered exceptional cases, but are instead considered normal, with values that just propagate up the stack through normal returning?

ificator commented 3 years ago

The exceptions in question are for things like "access denied" and "resource not found" for a webapi so there isn't much to do. They really are exceptional circumstances, and I'd rather not the solution be to return to the HResult-style patterns of the past.

To be fair we haven't measured a performance issue yet, rather we're just seeing an unexpected number of exceptions being thrown and we ideally like to minimize those as much as possible. In the case of async there isn't really much that can be done if synchronous propagation is out of the picture though, which is unfortunate.

About the only way I could see it working would be something like:

When an "async" method is converted to the state machine another implementation of the method is generated with a name that guarantees there won't be a conflict (like what's done for the state machine class/struct itself), but with another parameter that's along the lines of allowSynchronousExceptionPropagation:

private Task CustomAsynchronousImpl(int currentDepth)
{
 CustomAsynchronousImplStateMachine stateMachine = new CustomAsynchronousImplStateMachine();
 stateMachine._builder = AsyncTaskMethodBuilder.Create();
 stateMachine._this = this;
 stateMachine._state = -1;
 stateMachine.currentDepth = currentDepth;
 stateMachine._builder.Start(ref stateMachine);
 return stateMachine._builder.Task;
}

private Task <PSE>CustomAsynchronousImpl(int currentDepth)
{
 CustomAsynchronousImplStateMachine stateMachine = new CustomAsynchronousImplStateMachine();
 stateMachine._builder = AsyncTaskMethodBuilder.Create();
 stateMachine._this = this;
 stateMachine._state = -1;
 stateMachine._propagateSynchronousExceptions = true;
 stateMachine.currentDepth = currentDepth;
 stateMachine._builder.Start(ref stateMachine);
 return stateMachine._builder.Task;
}

The state machine is generated such that the value of _propagateSynchronousExceptions factors in to the exception filter:
```
catch (Exception ex) when (isContinuation || !_propagateSynchronousExceptions)
```
The state machine is generated such that when a task is awaited it uses the generated method instead of the rewritten one:
```
awaiter = benchmark.<PSE>CustomAsynchronousImpl(currentDepth + 1).GetAwaiter();
```
In the case where there isn't one (e.g. it wasn't written as async) it'll simply call the original and have the existing behavior:
```
Task t1 = M1Async();
Task t2 = M2Async();
awaiter = Task.WhenAll(t1, t2).GetAwaiter();
```

Now of course, whether that complexity is warranted will depend on how common a scenario this really is. For us, since we have a lot of synchronous code written in an asynchonous framework it's pretty common indeed, but are we a representative case?

CyrusNajmabadi commented 3 years ago

The exceptions in question are for things like "access denied" and "resource not found"

To be fair we haven't measured a performance issue yet, rather we're just seeing an unexpected number of exceptions being thrown

What's going on in your webservice such that lots of access denieds are happening? For resource-not-found, can you check for hte existence of the resource first if this really is something that occurs a lot?

ghost commented 3 years ago

Closing this issue as we've seen no reply to the request for more information. If you are able to get the requested information, please add it to the issue and we will retriage it.

dotnet / roslyn

Allow Exception to bubble out of IAsyncStateMachine.MoveNext when not called from a continuation #56036