[API Proposal]: Add TaskAwaiter.OnCompleted() overload to support static delegates.

ShadedBlink commented 3 months ago

Background and motivation

We got new ValueTask types earlier in .Net Core to avoid unnecessary allocations. It covered cases for synchronously-completed tasks and reusable task states. Also we got a new IValueTaskSource interface to develop such reusable tasks. Now it is time to reduce number of allocated objects per task. Today await supports only single Awaiter.OnCompleted() implementation that requires an instance of Action per continuation. This requires us to have an Action allocated per instance of task. We can avoid such allocation if we expand IValueTaskSource.OnCompleted() way of continuation on Task type, i.e. static delegate with continuation state. Static delegate is supposed to be singleton, while asyncState should represent the parent AsyncStateMachine that awaits this task. According to the source code, Task creates a new object for continuation to capture contexts or saves initial continuation Action otherwise. Since a new object is created anyway, we can just append it with a new field of state. Such overload works totally fine with existing TaskAwaiter.OnCompleted(), since old overload now becomes just a wrapper to call new overload - it is exactly the same way ValueTaskAwaiter.OnCompleted works when created from IValueTaskSource.

API Proposal

In basic terms, it is enough to just copy IValueTaskSource.OnCompleted() method to TaskAwaiter class. It is up to dotnet developers on what to do with flags parameter, but I think it would be nice to keep signature closer to IValueTaskSource so ValueTaskAwaiter won't need some difficult logic to convert its requests to TaskAwaiter signature. This method should be public.

namespace System.Threading.Tasks;

public readonly struct TaskAwaiter<TResult> : ICriticalNotifyCompletion, ITaskAwaiter {
    public void OnCompleted(Action continuation); // Just replace with inline call to new method.
    public void UnsafeOnCompleted(Action continuation); // Just replace with inline call to new method.
+   public void OnCompleted(Action<object?> continuation, object? state, ValueTaskSourceOnCompletedFlags flags);
}

public readonly struct ValueTaskAwaiter<TResult> : ICriticalNotifyCompletion, ITaskAwaiter {
    public void OnCompleted(Action continuation); // Just replace with inline call to new method.
    public void UnsafeOnCompleted(Action continuation); // Just replace with inline call to new method.
+   public void OnCompleted(Action<object?> continuation, object? state, ValueTaskSourceOnCompletedFlags flags);
}

API Usage

In first place this API will work out-of-the-box since it changes internal implementation of task awaiters. Also this API now allows us to write proxy IValueTaskSource that can subscribe its external continuations to other tasks directly without need to allocate new delegate.

public class ProxyTask : IValueTaskSource {
    private readonly Task task;

    public void OnCompleted(Action<object?> continuation, object? state, short token, ValueTaskSourceOnCompletedFlags flags)
        => task.GetAwaiter().OnCompleted(continuation, state, flags);
}

Alternative Designs

No response

Risks

These changes don't interfere with any existing API so there are no breaking changes.

stephentoub commented 3 months ago

Can you please share a repro showing the allocation you're concerned about? If in your Task/ValueTask-returning async method the only thing you await are Task or ValueTask, no Action will be allocated to pass to OnCompleted because that mechanism gets bypassed. Are you seeing otherwise? In most async methods now then, there's at most one allocation for the entirety of the lifecycle, for the task itself if the method completes asynchronously.

ShadedBlink commented 3 months ago

@stephentoub Here is a heap snapshot showing dead objects count(alive count is 36 for Action and 2 for marked state machine box). Application is consequently reading HDD by 1MB batches, thus having thousands of long(not synchronously completed) tasks. You can see that number of Action instances is very close to number of AsyncStateMachineBox instances. Actually, every AsyncStateMachineBox has initialized MoveNextAction because of asynchronous completion. It is intended by current design, and it is exactly what I propose to change. Current OnCompleted() method doesn't allow us to go async without creation of such continuation actions. Heap snapshot You can mark that ValueTask allows us to use IValueTaskSource pooling, so I have to pool this task, but it is not global-applicable. More complex applications can have thousands async methods or async method chains including generics, We can't mark every ValueTask method with such builder attribute since it requires a lot of refactoring and memory to store cached states. On the contrary if we get gid of those excessive action allocations, it will impact every async method instantly since we cover different methods in same approach and don't need any allocations.

ShadedBlink commented 3 months ago

By the way, mentioned method returns ValueTask and awaits only ValueTask. No direct Task involvement.

ShadedBlink commented 3 months ago

@stephentoub I got a great reproduction code for you to test it out. Same case, 36 alive Action instances. There obviously would be only one alive box in any time. However, every box has initialized MoveNextAction. In both cases with previously mentioned method the Action count grows up constantly in same pace with boxes count. Default console app, net8.0, release configuration. Heap

    private static async ValueTask DummyMethodAsync() {
        await Task.Delay(1);
    }

    internal static async Task Main(string[] args) {
        while (true) {
            await DummyMethodAsync();
        }
        }

stephentoub commented 3 months ago

What are you using to profile? I suspect it's enabling ETW events for TPL, which will cause there to be extra allocations. What's the backtrace of those objects?

stephentoub commented 3 months ago

In https://devblogs.microsoft.com/dotnet/how-async-await-really-works/, search for "That was .NET Framework. This is .NET Core:", and you'll see a profile screenshot showing no Action allocations and an explanation for why.

ShadedBlink commented 3 months ago

Yes, it seems that visual studio debugger really was the reason why those actions were initialized. Yet it still doesn't cover IValueTaskSource case. I have a class Writer that is used for double buffered writes to hdd. It has two internal buffers. When somebody writes to Writer, all data is copied to first buffer, then, when first buffer is full, data is going to second buffer, while first buffer is being flushed. If second buffer is already being flushed, then write method has to wait before flush operation is completed. The Writer class is a IValueTaskSource, but it doesn't need internal async subscription logic - it always has to wait for(and only for) internal tasks that are created from RandomAccess.WriteAsync(). I can't passthrough external continuation to internal task since IValueTaskSource gets a static delegate with state, while public task API allows only task-specific continuations. Of course, I can create a new continuation Action per Writer instance, but it still looks totally excessive. We already have right working implementation in ValueTask, we just need to expand it to Task.

ShadedBlink commented 3 months ago

As you already mentioned, actual implementation is okay only with Task and ValueTask-returning implementation. What I ask you for is to allow IValueTaskSource implementations await such tasks without need for continuation Action as well.

davidfowl commented 3 months ago

A code sample of what you're doing and what you want to do in context would be great. I think I know what you're getting at with the proxy task scenario but a more spelled out code sample would make it crystal clear. Also, this does not feel like a mainstream scenario.

stephentoub commented 3 months ago

As you already mentioned, actual implementation is okay only with Task and ValueTask-returning implementation. What I ask you for is to allow IValueTaskSource implementations await such tasks without need for continuation Action as well.

Task has a single reference for storing a continuation, either directly or in a List of them. If a consumer were to supply a delegate and associated object in order to avoid allocating a wrapper, Task itself would then need to allocate the same wrapper.

ShadedBlink commented 3 months ago

@davidfowl , I am sorry, but I don't have public available code for that. @stephentoub , what about an interface like a IThreadPoolWorkItem? Let's say some IAsyncContinuation with single method void Execute(). Then IValueTaskSource can just implement such interface. Thus we still have single object, but don't need allocation of neither an Action nor any state wrappers. Task internally would store an object, which may be either Action or IAsyncContinuation. This approach is compatible for IValueTaskSource.OnCompleted() as well.

stephentoub commented 3 months ago

Can you please share a more complete sample of the code you're hoping to be able to write, including an implementation of said interface and what the consumer of it would look like?

ShadedBlink commented 3 months ago

Example code(user code):

internal class ProxyValueTaskSource : IValueTaskSource, IAsyncContinuation {
    private ValueTask task;
    private ManualResetValueTaskSourceCore<int> continuationStore;

    public ProxyValueTaskSource() {
        continuationStore = new ManualResetValueTaskSourceCore<int>() {
            RunContinuationsAsynchronously = false
        };
    }

    public void Init(ValueTask task) {
        continuationStore.Reset();
        this.task = task;
        // Some synchronous work that we want to perform when task completion is not required yet.
    }

    public void Execute() {
        continuationStore.SetResult(0); // Just invoke our external continuation inline.
    }
    public void GetResult(short token) {
        // Some synchronous work that we want to perform when task is completed.
        task.GetAwaiter().GetResult();
    }
    public ValueTaskSourceStatus GetStatus(short token)
        // Doesn't matter now.
        => task.IsCompletedSuccessfully ? ValueTaskSourceStatus.Succeeded : ValueTaskSourceStatus.Pending;
    public void OnCompleted(Action<object?> continuation, object? state, short token, ValueTaskSourceOnCompletedFlags flags) {
        // Store external continuation.
        continuationStore.OnCompleted(continuation, state, token, flags);
        // Subscribe to internal task.
        task.GetAwaiter().OnCompleted(this);
    }
}

Interface code(.NET code):

public interface IAsyncContinuation
{
    void Execute();
}

Expected changes to .NET:

Add overload TaskAwaiter.UnsafeOnCompleted(IAsyncContinuation continuation). It should work the same, i.e. just pass parameter to OnCompletedInternal.
Change OnCompletedInternal signature: Action continuation -> object continuation. object there is guaranteed to be either IAsyncContinuation or Action.
Change SetContinuationForAwait signature: Action continuationAction -> object continuationAction.
Update all context wrapper types to support both Action or IAsyncContinuation type by changing continuation type to object.
AddTaskContinuation and continuation store variables already declare object type for continuation type since there may be context wrappers as well as direct Action continuations.
Add case in RunContinuations for IAsyncContinuation.

dotnet / runtime