Open theodorzoulias opened 2 years ago
Tagging subscribers to this area: @dotnet/area-system-threading-tasks See info in area-owners.md if you want to be subscribed.
Author: | theodorzoulias |
---|---|
Assignees: | - |
Labels: | `area-System.Threading.Tasks`, `untriaged` |
Milestone: | - |
This is by design. A TaskScheduler-derived type can override MaxConcurrencyLevel to limit any degree of parallelism supplied to a Parallel loop, and the TaskScheduler returned from FromCurrentSynchronizationContext returns 1. If you need a different value, you can provide your own custom scheduler to do whatever you like. All that scheduler does is call SynchronizationContext.Post to execute the task.
Hi @stephentoub! Thanks for the quick feedback. I can understand the existing behavior for the synchronous Parallel.ForEach
API, but does it make sense for the asynchronous Parallel.ForEachAsync
too? My understanding is that this method is intended for mostly asynchronous workloads. An asynchronous operation that has been launched and is in-flight does not use threads, so why is the MaxDegreeOfParallelism
overruled by the MaxConcurrencyLevel
of the scheduler? Schedulers are all about threads. When it comes to asynchrony, they are pretty much unaware of what's happening above them.
Look at the TPL Dataflow for example, that AFAIK was async-enabled from the beginning. There is no such interference there.
My understanding is that this method is intended for mostly asynchronous workloads.
It's for workloads that involve async I/O, either in the retrieval of data to process or in the processing itself, but that doesn't mean it's dominated by I/O.
Look at the TPL Dataflow for example, that AFAIK was async-enabled from the beginning. There is no such interference there.
Plenty of what Dataflow does is schedule synchronous work. Not respecting MaximumConcurrencyLevel is a bug we long ago decided not to address.
does it make sense for the asynchronous Parallel.ForEachAsync too?
It's a trivial change, basically in this line (and one other):
https://github.com/dotnet/runtime/blob/844b0999fde67cdbaa7acb0f398ea1b066a2b8c7/src/libraries/System.Threading.Tasks.Parallel/src/System/Threading/Tasks/Parallel.ForEachAsync.cs#L55
the parallelOptions.EffectiveMaxConcurrencyLevel
would be changed to parallelOptions.MaxDegreeOfParallelism
. I'm just not sure it's the right change, as it could end up inundating the scheduler (the UI thread in your case) with lots of work items due to unnecessary partitioning.
I don't feel strongly about the design decision, but what it does today is consistent with the rest of the Parallel APIs, and if it's really an issue for you, as I mentioned you can easily avoid the issue with your own scheduler, e.g.
internal sealed class SyncCtxScheduler : TaskScheduler
{
private readonly SynchronizationContext _ctx;
public SyncCtxScheduler(SynchronizationContext ctx) => _ctx = ctx;
protected override void QueueTask(Task task) => _ctx.Post(s => TryExecuteTask((Task)s), task);
protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued) => SynchronizationContext.Current == _ctx ? TryExecuteTask(task) : false;
protected override IEnumerable<Task>? GetScheduledTasks() => null;
}
Plenty of what Dataflow does is schedule synchronous work. Not respecting MaximumConcurrencyLevel is a bug we long ago decided not to address.
@stephentoub interesting. Personally I have used to my advantage that I can configure independently the MaxDegreeOfParallelism
and MaximumConcurrencyLevel
in the TPL Dataflow, in at least two occasions. It didn't pass through my mind that I was exploiting a bug!
I don't feel strongly about the design decision, but what it does today is consistent with the rest of the Parallel APIs, and if it's really an issue for you, as I mentioned you can easily avoid the issue with your own scheduler, e.g.
The point is that when we talk about the synchronous Parallel
APIs, choosing between the two designs doesn't matter. The scheduler itself will enforce its de facto concurrency level, and the MaxDegreeOfParallelism
being either equal or larger will not make any difference. That's because everything runs on the scheduler's threads. There is no asynchronous dimension in that case.
As a proof I used the SyncCtxScheduler
class in a demo similar to the original, switching only from Parallel.ForEachAsync
to Parallel.ForEach
. Here is the demo, and here is the output:
Although the SyncCtxScheduler
fakes an unlimited concurrency level, the processing is serialized because it's happening on the UI thread.
Are there any scenarios that I am missing, where changing the design would affect the synchronous Parallel
APIs?
Description
Hi! I am trying to run a
Parallel.ForEachAsync
loop on the UI thread of a WinForms application, in order to update aTextBox
as part of the asynchronous operation. For this purpose I am configuring the loop with aTaskScheduler.FromCurrentSynchronizationContext
scheduler, as suggested in this comment (by @stephentoub):Unfortunately by doing so I witnessed that the level of concurrency of the asynchronous operations is reduced to one. My
ParallelOptions.MaxDegreeOfParallelism
configuration is effectively overruled.Reproduction Steps
Online demo.
Expected behavior
Two items should be processed concurrently at the same time.
Actual behavior
The items are processed sequentially, one item at a time.
Regression?
The
Parallel.ForEachAsync
API was introduced in the latest .NET release (.NET 6). So no regression.Known Workarounds
Don't use the
ParallelOptions.TaskScheduler
, and create new tasks withTask.Factory.StartNew
inside thebody
delegate.Configuration
Other information
In the TPL Dataflow library, the
ExecutionDataflowBlockOptions
class has alsoTaskScheduler
andMaxDegreeOfParallelism
properties, that according to my experiments are behaving as expected (they don't interfere with each other).Correcting this behavior will also allow to configure independently the maximum concurrency (number of async operations) and the maximum parallelism (number of threads), of a parallel loop. The number of threads will be configurable by setting the
ParallelOptions.TaskScheduler
to an instance of theConcurrentExclusiveSchedulerPair
class:For this to work, there should be no
.ConfigureAwait(false)
inside thebody
delegate.