dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.25k stars 4.73k forks source link

Add awaitable ThreadPool.SwitchTo() #20025

Open davidfowl opened 7 years ago

davidfowl commented 7 years ago

Task.Run breaks the execution flow by forcing the code into a delegate, it's basically callback hell (since it's the first callback in the chain). Instead, for the situations where we want to stay in the await pattern, ThreadPool.Yield() would be a nice to have API. It would turn code that looks like this:

await Task.Run(() => DoSomething());
await ThreadPool.Yield();

// Now on a thread pool thread so we can call some user code
DoSomething();

/cc @stephentoub

omariom commented 7 years ago

SwitchTo?

davidfowl commented 7 years ago

Was keeping in line with Task.Yield()

stephentoub commented 7 years ago

I agree this is a pretty reasonable thing to add. In the original async/await CTP, we actually had methods for doing this, IIRC as extensions on a TaskScheduler and SynchronizationContext, e.g.

await TaskScheduler.Default.SwitchTo(); // continuation will be queued to the thread pool

We removed these because you couldn't use an await inside of a catch/finally, which meant it was difficult to return to the original context if you were trying to do something structured like:

var origScheduler = TaskScheduler.Current;
await TaskScheduler.Default.SwitchTo();
try
{
    ... // runs on thread pool
}
finally { await origScheduler; }

and as a result you could end up in situations where error related logic would run on the "wrong" context; in a sense we were concerned we were adding APIs that would lead to dangerous situations. However, C# has since gained the ability to have awaits in catch/finally blocks, making that largely a moot argument.

I would personally also err on the side of using a name more like SwitchTo. The Yield naming was intended to suggest yielding and coming back to the current context, so using it to mean "go to that other context" might be a little confusing.

davidfowl commented 7 years ago

SwitchTo sounds fine to me.

However, C# has since gained the ability to have awaits in catch/finally blocks, making that largely a moot argument.

If we ever get async disposable maybe that pattern could be more natural.

await using (TaskScheduler.Current.Preserve())
{
    await TaskScheduler.Default.SwitchTo();
    // Do thread pool stuff
} // Revert to the original scheduler
benaadams commented 7 years ago

Should it check if already on threadpool/scheduler and immediately complete if so?

stephentoub commented 7 years ago

Should it check if already on threadpool/scheduler and immediately complete if so?

It would depend on your scenario. For example, if you were using it to ensure that a caller of your async API wasn't blocked while doing some long computation, then you'd want it to always queue. But if you were using it to ensure you were running on a context that owned done resources only accessible to that context, you wouldn't care and for perf might want it to nop.

noseratio commented 4 years ago

@davidfowl, now we have IAsyncDisposable and this still looks nice, indeed:

await using (TaskScheduler.Current.Preserve())
{
    await TaskScheduler.Default.SwitchTo();
    // Do thread pool stuff
} // Revert to the original scheduler

But I feel it is no different from:

await Task.Run(async() => {
    // do thread pool stuff 
    // (or async stuff without sync context)
}); // Revert to the original sync context - without ConfigireAwait(false)

Is there any subtle difference?

stephentoub commented 4 years ago

Is there any subtle difference?

Potentially perf, but functionally I'd expect them to be the same. That said, the former doesn't exist, so it's hard to say for sure.

supervos commented 4 years ago

Wouldn't it be possible to have the SwitchTo to pass in your TaskScheduler. The return value is an async dispobable that contains the previous TaskScheduler.

A use case is where you have a limited access to a resource like a database connection or external API service. Together with the async disposable, you can create something like the following:

private TaskScheduler _limitedResource = new LimitedTaskScheduler(10);

await using (Task.SwitchTo(_limitedResource))
{
  // Use resource with limited concurrent access
}

Placing the SwitchTo on the Task would make it more visible for people who are not aware of the existence of the TaskScheduler.

An overload to this method can accept a boolean indicating whether you must always switch or only if you are running on a different context than the provided one.

jnm2 commented 4 years ago

Could the API take a cancellation token? Making the SwitchTo implementation aware of cancellation could avoid queuing work on the thread pool when already canceled.

It would be a single call, replacing:

cancellationToken.ThrowIfCancellationRequested();
await TaskScheduler.Default.SwitchTo();
cancellationToken.ThrowIfCancellationRequested();

Usage:

public static async IAsyncEnumerable<string> EnumerateFilesAsync(
    string path,
    string searchPattern,
    SearchOption searchOption,
    [EnumeratorCancellation] CancellationToken cancellationToken)
{
    await TaskScheduler.Default.SwitchTo(cancellationToken);

    foreach (var file in Directory.EnumerateFiles(path, searchPattern, searchOption))
    {
        yield return file;

        await TaskScheduler.Default.SwitchTo(cancellationToken);
    }
}

Maybe not relevant to how .NET Core would implement it, but this implementation actually passes the cancellation token to TaskFactory.StartNew when switching to a non-default task scheduler: https://gist.github.com/jnm2/46a642d2c9f2794ece0b095ba3d96270

omariom commented 4 years ago

@noseratio

await using (TaskScheduler.Current.Preserve())
{
    await TaskScheduler.Default.SwitchTo();
    // Do thread pool stuff
} // Revert to the original scheduler

It could be simplified to

await using (await TaskScheduler.Default.SwitchTo())
{
}
jnm2 commented 4 years ago

I don't like having await TaskScheduler.Default.SwitchTo() produce an IAsyncDisposable. It's much rarer that I'll want to revert to the original scheduler. In the 99% case, I'll get IDE warnings that I'm not disposing a disposable object.

omariom commented 4 years ago

It must implement it via async disposable pattern, with DisposeAsync returning a struct instead of ValueTask.

jnm2 commented 4 years ago

The IDE will (should!) still warn when a type follows the async disposable pattern and is not disposed, and I wouldn't like that for this API.

Maybe _ = await TaskScheduler.Default.SwitchTo(); would be okay as a way to silence, but the name SwitchTo doesn't make it obvious what is being discarded. And we're making the 99% case less optimal for the sake of the 1%. It seems more fitting for the rarer case to use the additional .Preserve() call. On top of that, the name Preserve is significantly more self-documenting

noseratio commented 4 years ago

Is TaskSheduler.SwitchTo still planned for 6.0?

It'd be really useful where we deal we a 3rd party code that we don't want to execute on any custom synchronization context. E.g., I really like the simplicity of the first option from the below. IMO, it clearly indicates the intent, with minimum boilerplate code, and it's also free of some corner cases (unlike the last option):

// switch to the thread pool explicitly for the rest of the async method
await TaskScheduler.Default.SwitchTo();
await RunOneWorkflowAsync();
await RunAnotherWorkflowAsync();

// execute RunOneWorkflowAsync on the thread pool 
// and stay there for the rest of the async method
await Task.Run(RunOneWorkflowAsync).ConfigureAwait(false);
await RunAnotherWorkflowAsync();

await Task.Run(async () => 
{
  // start on the thread pool
  await RunOneWorkflowAsync();
  await RunAnotherWorkflowAsync();
}).ConfigureAwait(false);
// continue on the thread pool for the rest of the async method

// start on whatever the current synchronization context is
await RunOneWorkflowAsync().ConfigureAwait(false);
// continue on the thread pool for the rest of the async method,
// unless everything inside `RunOneWorkflowAsync` has completed synchronously
await RunAnotherWorkflowAsync();

I have an experimental implementation of SwitchTo as an extension method (gist), it needs some work and tests and I'd be happy to contribute that as a PR for 6.0.

AArnott commented 4 years ago

@noseratio the syntax you want already works if you reference the Microsoft.VisualStudio.Threading Nuget package.

noseratio commented 4 years ago

the syntax you want already works if you reference the Microsoft.VisualStudio.Threading Nuget package.

@AArnott thanks for the pointer, didn't know that had also been open-sourced! 👍

It's good to see the alwaysYield param there as well, mindlike! Do you think supressing ExecutionContext flow for non-default task schedulers is an overkill?

I'm curious to know if this SwitchTo pattern get used a lot in Visual Studio itself. I generally like it much more than async lambda to Task.Run. The latter IMHO should only be used for synchronous lambdas which do some CPU-bound work.

AArnott commented 4 years ago

I don't know why I'd want to suppress ExecutionContext flow. But I'm open to learning. The whole distinction between safe and unsafe awaiters is one I'm not particularly strong on.

We only explicitly call SwitchTo when we want to use alwaysYield: true. Otherwise we just await on the TaskScheduler directly (since we made that awaitable as well). And yes, we generally prefer await TaskScheduler.Default; over await Task.Run because not only the code is cleaner, but we avoid allocating another closure, delegate, and async state machine since the calling method already has all those that can be reused. However, using await taskScheduler causes us to lose our caller's context. So if we want to return to our caller's context (e.g. the main thread) then using await Task.Run is an easy way to do some amount of work on the threadpool but then return to our caller's context when we're done.

jnm2 commented 4 years ago

I don't know why I'd want to suppress ExecutionContext flow. But I'm open to learning. The whole distinction between safe and unsafe awaiters is one I'm not particularly strong on.

This explained things for me: https://devblogs.microsoft.com/pfxteam/whats-new-for-parallelism-in-net-4-5-beta/ Ctrl+F for For those of you familiar with ExecutionContext.

davidfowl commented 4 years ago

I don't know why you'd want to suppress the execution context flow as part of this either. Seems entirely unrelated here. Maybe there's some common scenario I'm missing?

AArnott commented 4 years ago

I think we could do a better job of suppressing ExecutionFlow in some of our awaiters: https://github.com/microsoft/vs-threading/pull/689

noseratio commented 4 years ago

I don't know why you'd want to suppress the execution context flow as part of this either. Seems entirely unrelated here. Maybe there's some common scenario I'm missing?

@davidfowl and @AArnott I might be wrong, but I think it might be a sensible optimization for ICriticalNotifyCompletion.UnsafeOnCompleted. We don't have to flow execution context there, but Task.Factory.StartNew would still do that, adding a bit of unneeded overhead. I believe suppressing the flow explicitly tells Task.Factory.StartNew to not do that.

This is not a concern for TaskScheduler.Default, for which can just use ThreadPool.UnsafeQueueUserWorkItem.

noseratio commented 4 years ago

And yes, we generally prefer await TaskScheduler.Default; over await Task.Run because not only the code is cleaner, but we avoid allocating another closure, delegate, and async state machine since the calling method already has all those that can be reused. However, using await taskScheduler causes us to lose our caller's context.

That makes sense. I usually use await TaskScheduler.Default at the beginning of the methods that don't touch the UI or ViewModel at all.

stephentoub commented 4 years ago

I think it might be a sensible optimization for ICriticalNotifyCompletion.UnsafeOnCompleted. We don't have to flow execution context there, but Task.Factory.StartNew would still do that, adding a bit of unneeded overhead

See https://github.com/microsoft/vs-threading/pull/689#discussion_r497711508. Someone could do the appropriate measurements, but my guess is that for .NET Core this would actually be a net negative rather than positive.

noseratio commented 4 years ago

See microsoft/vs-threading#689 (comment). Someone could do the appropriate measurements, but my guess is that for .NET Core this would actually be a net negative rather than positive.

That comment is a great insight, thank you. I suppose it explains why there so few uses of SuppressFlow in .NET Core.

stephentoub commented 4 years ago

I suppose it explains why there so few uses of SuppressFlow in .NET Core.

Yes, it's use now isn't about throughout or allocation, but entirely about object lifetime, i.e. ensuring that ExecutionContext isn't captured unnecessarily into something that may live for a long time, that doesn't need access to the context (and won't be calling unknown code that might), and thus shouldn't forcibly extend the lifetime of values stored into async locals captured by the EC.

theodorzoulias commented 1 year ago

I've seen occasionally the await Task.Yield() being used with the explicit intention of switching to the ThreadPool (example). This does work, but it depends on the ambient SynchronizationContext.Current and TaskScheduler.Current being null, so I consider it to be a hack and not a proper use of this API. But it's hard to argue against something that does work, it is readily available, and it's easy to discover, use and remember. So I think that there is some merit in the alternative proposal by @AArnott of adding ConfigureAwait support to the Task.Yield:

await Task.Yield().ConfigureAwait(false);

It would have identical functionality with the proposed here:

await TaskScheduler.Default.SwitchTo(alwaysYield: true);

The first line is more concise and, to me, more familiar. But to be honest both are lacking the critical component: ThreadPool. My brain has to read between the lines in order to get the real meaning: "jump to the ThreadPool".

murshex commented 1 year ago

In .NET 8.0 we can do:

await Task.CompletedTask.ConfigureAwait(ConfigureAwaitOptions.ForceYielding);

But its pretty verbose. A short and concise helper method like this would be nice to have:

await Task.YieldNoContext();
Neme12 commented 7 months ago

I would just add Tak.Yield().ConfigureAwait(bool) anyway, it's confusing that it isn't there and it's confusing in a library (where I have to use ConfigureAwait(false) everywhere) whether it means I shouldn't be using Task.Yield() at all given that I can't set ConfigureAwait(false), or if I should use it anyway and just live with the inconsistency and with the wart of missing it in this one place.

Maybe only the old-school boolean overload though, ConfigureAwaitOptions doesn't really make sense on Task.Yield() since it includes ConfigureAwaitOptions.ForceYielding and ConfigureAwaitOptions.SuppressThrowing

Or maybe just Task.Yield(bool continueOnCapturedContext) directly without an additional method call and Awaitable.

Neme12 commented 7 months ago

TBH I think we should have both. Both Task.Yield().ConfigureAwait(bool) and a way to explicitly switch to the thread pool or to the main thread, like https://github.com/microsoft/vs-threading offers (but I would recommend this library to any UI developer anyway).

With vs-threading, you can switch to any TaskScheduler by awaiting it, so commonly it would be await TaskScheduler.Default; This seems like a feature that should be in-box, along with the Task.Yield() thing.