dotnet / dotNext

Next generation API for .NET
https://dotnet.github.io/dotNext/
MIT License

Added AsyncBatchedExecutor for executing tasks in parallel with batch constraints #117

Closed: Tassadar2499 closed this 2 years ago

Tassadar2499 commented 2 years ago

Added AsyncBatchedExecutor for executing tasks in parallel with batch constraints. Maybe you will find a better way to improve this algorithm; I would really appreciate your help with reviewing this code. Yes, I know about Parallel.ForEachAsync, but the Parallel.ForEachAsync implementation uses locks; in contrast, my implementation is lock-free.
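The AsyncBatchedExecutor source itself is not reproduced in this thread; the following is a minimal hypothetical sketch of the batch-constrained approach it describes (the RunInBatchesAsync name and the partitioning logic are assumptions, not the actual PR code):

```csharp
// Hypothetical sketch of batch-constrained parallel execution (not the PR code):
// items are split into fixed-size batches, every batch is awaited with Task.WhenAll,
// and the next batch does not start until the current one has fully completed.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class BatchedExecution
{
    public static async Task RunInBatchesAsync<T>(
        IEnumerable<T> items,
        int batchSize,
        Func<T, Task> action)
    {
        foreach (var batch in items.Chunk(batchSize))
        {
            // Start all items of the batch, then wait for the whole batch to finish.
            await Task.WhenAll(batch.Select(action));
        }
    }
}
```

Called as, for example, `await BatchedExecution.RunInBatchesAsync(items, batchSize: 3, ProcessAsync)`, at most three items run concurrently, but the slowest item in a batch blocks the start of the next batch, which is the throughput concern raised later in this thread.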

sakno commented 2 years ago

This PR contains 12-month-old commits.

I think that the same behavior can be achieved with Parallel.ForEachAsync from the .NET standard library.
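A minimal sketch of that approach (the items collection and the ProcessAsync delegate are placeholders, not code from this repository):

```csharp
// Minimal sketch: Parallel.ForEachAsync keeps at most MaxDegreeOfParallelism items
// in flight and pulls the next item as soon as any worker finishes, so no batch
// boundary ever leaves a worker idle.
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

var items = Enumerable.Range(1, 6); // stand-in for the real work items

var options = new ParallelOptions { MaxDegreeOfParallelism = 3 };

await Parallel.ForEachAsync(items, options, async (item, cancellationToken) =>
{
    await ProcessAsync(item, cancellationToken);
});

// Placeholder for the per-item asynchronous work.
static async Task ProcessAsync(int item, CancellationToken token)
{
    await Task.Delay(10 * item, token);
    Console.WriteLine($"Processed {item}");
}
```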

Tassadar2499 commented 2 years ago

Yes, I know about Parallel.ForEachAsync, but the Parallel.ForEachAsync implementation uses locks; in contrast, this implementation is lock-free.

Tassadar2499 commented 2 years ago

But the file changes are shown correctly. Can I squash these commits?

sakno commented 2 years ago

No, you need to pick only the required changes and reapply them on top of the develop branch. All these commits are from the master branch.

Tassadar2499 commented 2 years ago

Ok, thanks. I will try it later.

sakno commented 2 years ago

@Tassadar2499 , I think the claim that the proposed implementation is lock-free is a bit incorrect: Task.WhenAll, as well as Task itself, uses locks internally. Additionally, batch processing reduces the throughput of parallel execution compared to the Parallel class. For instance, suppose we have 6 items in the collection, the batch size is 3, and the processing time of each item is different (which is not so uncommon):
1) I1 = 5ms, I2 = 20ms, I3 = 6ms
2) I4 = 3ms, I5 = 25ms, I6 = 5ms

The total time for each batch is max(t(I1), ..., t(In)). Thus, the total processing time is 20 + 25 = 45ms.

Now the execution steps of Parallel with the degree of parallelism equal to 3 (the value in parentheses is the time elapsed since the previous step, i.e. how long it took the fastest remaining item to finish):
1) I1 = 5ms, I2 = 20ms, I3 = 6ms
2) I4 = 3ms, I2 = 15ms, I3 = 1ms (5ms)
3) I4 = 2ms, I2 = 14ms, I5 = 25ms (1ms)
4) I6 = 5ms, I2 = 12ms, I5 = 23ms (2ms)
5) I2 = 7ms, I5 = 18ms (5ms)
6) I5 = 11ms (7ms)
7) Total execution time is 5 + 1 + 2 + 5 + 7 + 11 = 31ms

As you can see, Parallel completes the same work in 31ms instead of 45ms, roughly 31% less time. This happens because batch processing cannot take a new work item from the next batch until the slowest item of the current batch has completed.
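For illustration, here is a small self-contained sketch (not from the PR) that reproduces this arithmetic. The item durations, batch size, and degree of parallelism are taken from the example above; the greedy scheduler models how Parallel hands the next item to whichever worker becomes free first:

```csharp
// Reproduces the 45ms vs 31ms comparison from the comment above.
using System;
using System.Linq;

var durations = new[] { 5, 20, 6, 3, 25, 5 }; // I1..I6, in milliseconds
const int degreeOfParallelism = 3;

Console.WriteLine($"Batched total: {BatchedTotal(durations, degreeOfParallelism)} ms"); // 45 ms
Console.WriteLine($"Greedy total:  {GreedyTotal(durations, degreeOfParallelism)} ms");  // 31 ms

// Each batch finishes when its slowest item finishes; batches run back to back.
static int BatchedTotal(int[] items, int batchSize)
    => items.Chunk(batchSize).Sum(batch => batch.Max());

// List scheduling: each item goes to the worker that becomes free first.
static int GreedyTotal(int[] items, int workers)
{
    var freeAt = new int[workers]; // time at which each worker becomes idle
    foreach (var duration in items)
    {
        var earliest = Array.IndexOf(freeAt, freeAt.Min());
        freeAt[earliest] += duration;
    }

    return freeAt.Max();
}
```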

Task.WhenAll doesn't provide a way to process a task within the batch as soon as it completes. If you want to process tasks as they complete, take a look at the TaskCompletionPipe<T> class provided by the .NEXT Threading library.
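For illustration, a rough sketch of consuming tasks in completion order with TaskCompletionPipe<T>. The Add/Complete/WaitToReadAsync/TryRead shape follows the .NEXT Threading documentation, but the exact signatures may differ between library versions, so treat this as an assumption and check the docs for your version:

```csharp
// Rough sketch (assumed API shape, see note above): push tasks into the pipe and
// consume them in the order they complete rather than the order they were added.
using System;
using System.Threading.Tasks;
using DotNext.Threading.Tasks;

var pipe = new TaskCompletionPipe<Task<int>>();

pipe.Add(ComputeAsync(1));
pipe.Add(ComputeAsync(2));
pipe.Add(ComputeAsync(3));
pipe.Complete(); // no more tasks will be added

while (await pipe.WaitToReadAsync())
{
    if (pipe.TryRead(out var task))
    {
        // The task is already completed here, so awaiting it just unwraps the result.
        Console.WriteLine(await task);
    }
}

// Placeholder for the per-item asynchronous work.
static async Task<int> ComputeAsync(int value)
{
    await Task.Delay(value * 100);
    return value;
}
```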