dotnet / docs

This repository contains .NET Documentation.
https://learn.microsoft.com/dotnet
Creative Commons Attribution 4.0 International
4.27k stars 5.89k forks source link

Q: WaitAny loop Caution - Time complexity #33521

Closed abarberenaCPDS closed 1 year ago

abarberenaCPDS commented 1 year ago

Hey guys πŸ‘‹ , @BillWagner @alexbuckgit @tdykstra @StephenCleary @stephentoub

Thanks for your article, was very helpful and easy to read. However, I have a question, per your Caution tip, which leads to Stephen Toub's analysis of a O(n2), if you use a while loop and then remove.

image

Wouldn't be preferable to use a CancellationTokenSource, pass in a CancellationToken as parameter and then issue a token.Cancel() once the first task completes? What are your thoughts? Is this a valid approach?

Thanks,

Abe

Say,

private async Task WaitingTheFirstTaskCompleted_UsingACancellationToken()
{
  // naive approach to declare a cancellation token
  CancellationTokenSource cancellationTokenSource = new CancellationTokenSource();

  Task<int> t1 = FakeMethodAsync_Token(5, cancellationTokenSource.Token);
  Task<int> t2 = FakeMethodAsync_Token(2, cancellationTokenSource.Token);
  Task<int> t3 = FakeMethodAsync_Token(4, cancellationTokenSource.Token);

  // first: await the task
  Task<int> whichTask = await Task.WhenAny(t1, t2, t3);

  // second: issue a token cancellation
  cancellationTokenSource.Cancel();

  // inspect the result
  int theResult = await whichTask;
  Console.WriteLine($"This is the result: {theResult}");
}

// This could be a wrapper method to bind the task, the parameters and the token
private async Task<int> FakeMethodAsync_Token(int delay, CancellationToken cancellationToken)
{
  await Task.Delay(TimeSpan.FromSeconds(delay), cancellationToken);
  Console.WriteLine($"Completed: {delay}");
  return delay;
}

Document Details

⚠ Do not edit this section. It is required for learn.microsoft.com ➟ GitHub issue linking.

gfoidl commented 1 year ago

The example from the docs processes all tasks. With the while-loop-construct and Task.WhenAny an action is done when any of the tasks completes, and in order to process the remaining tasks it's looped over.

With your given example only one task is processed -- the others are cancelled (not taking into account if they're completed by accident). So that's a different use-case. The one from you example falls under the category "speculative invocation" is valid for this. I.e. start a bunch of task, wait for the first and cancel the others.


Tipp for your example: I'd await the other tasks too in a try-catch, exluding OperationCanceledException in order to be able to log unexpected errors, etc.

BillWagner commented 1 year ago

Thanks for the question @abarberenaCPDS

I agree with @gfoidl 's analysis on this different use case.

I'll close this as an answered question. If you have follow-up questions, reopen this issue and add more questions.

abarberenaCPDS commented 1 year ago

Thank you both for getting back, it's nice to have access to the authors so quick! @BillWagner @gfoidl

Nevermind, you're right, the title is clear, Processing as they complete != Processing the first to complete.

Now, what makes me cringe, and I am not saying it's bad, it's the running time of this approach. Maybe, in a scenario like the one explained in the seminal job of Aho, Ullman and Hopcroft, I would not mind to wait an hour for 1897 tasks to process as they complete in a O2.

image image

Would you say that if you modify the code per the recipe 2.6 as explained by @StephenCleary in his Concurrency Cookbook, behave as O(n) or would it still be O(n2)?

async Task<int> DelayAndReturnAsync(int value)
{
  await Task.Delay(TimeSpan.FromSeconds(value));
  return value;
}

async Task AwaitAndProcessAsync(Task<int> task)
{
  int result = await task;
  Trace.WriteLine(result);
}

// This method now prints "1", "2", and "3".
async Task ProcessTasksAsync()
{
  // Create a sequence of tasks.
  Task<int> taskA = DelayAndReturnAsync(2);
  Task<int> taskB = DelayAndReturnAsync(3);
  Task<int> taskC = DelayAndReturnAsync(1);
  Task<int>[] tasks = new[] { taskA, taskB, taskC };
  IEnumerable<Task> taskQuery =
  from t in tasks select AwaitAndProcessAsync(t);
  Task[] processingTasks = taskQuery.ToArray();
  // Await all processing to complete
  await Task.WhenAll(processingTasks);
}

Alternatively,

async Task<int> DelayAndReturnAsync(int value)
{
  await Task.Delay(TimeSpan.FromSeconds(value));
  return value;
}

// This method now prints "1", "2", and "3".
async Task ProcessTasksAsync()
{
  // Create a sequence of tasks.
  Task<int> taskA = DelayAndReturnAsync(2);
  Task<int> taskB = DelayAndReturnAsync(3);
  Task<int> taskC = DelayAndReturnAsync(1);
  Task<int>[] tasks = new[] { taskA, taskB, taskC };
  Task[] processingTasks = tasks.Select(async t =>
  {
    var result = await t;
    Trace.WriteLine(result);
  }).ToArray();

  // Await all processing to complete
  await Task.WhenAll(processingTasks);
}
gfoidl commented 1 year ago

The (last) examples w/ await Task.WhenAll is $O(n)$.

BTW:

- Task<int> DelayAndReturnAsync(int value);
+ Task<int> DelayAndReturnAsync(int seconds);

makes it self-documenting.

mind to wait an hour for 1897 tasks to process as they complete

I don't know your scenario, but by having that many tasks I'd look into producer / consumer pattern for processing, i.e. via Channels.