davidfowl / AspNetCoreDiagnosticScenarios

This repository has examples of broken patterns in ASP.NET Core applications
7.96k stars 752 forks source link

Make TaskCreationOptions.LongRunning advice more explicit #64

Open bartelink opened 4 years ago

bartelink commented 4 years ago

There's a note regarding avoidance of TaskCreationOptions.LongRunning:

💡 NOTE:Task.Factory.StartNew has an option TaskCreationOptions.LongRunning that under the covers creates a new thread and returns a Task that represents the execution. Using this properly requires several non-obvious parameters to be passed in to get the right behavior on all platforms.

The code sample then proceeds to use:

var thread = new Thread(ProcessQueue) 
    {
        // This is important as it allows the process to exit while this thread is running
        IsBackground = true
    };
    thread.Start();

The bit I'm missing is how we go from "API is awkward" to "look, you're better off with Thread.Start"; is it for reasons of cumbersome syntax or is it because it's outright better and/or TaskCreationOptions.LongRunning should all-but be deprecated that we go to the lengths of actually creating a thread explicitly? I propose to use the answer to either a) make a PR to switch the example to use Task.Factory.StartNew, or b) clarify the note to explicitly indicate that Task.Factory.StartNew should be avoided, and (Thread {IsBackground = true}).Start() is the recommended alternative

For instance in Serilog.Sinks.Async, we create the thread using TaskCreationOptions.LongRunning

davidfowl commented 4 years ago

We should show the usage of Task.Factory.StartNew as well.

alefranz commented 4 years ago

My interpretation of that guidance is that given StartNewTask.Factory.StartNew is under the hood creating a background thread, you are better off doing it directly so that you have to explicitly decide how to handle the scenario when you are running on a platform that doesn't support starting new threads, instead of having your application silently behaving unexpectedly.

https://github.com/dotnet/runtime/blob/master/src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/ThreadPoolTaskScheduler.cs#L45-L51

Was this the intention?

davidfowl commented 4 years ago

Yes

mosdav commented 3 years ago

Great writing. So in light of the advice to go w/ good-old plain Thread for long running "flows", is there a use case at all to use async code in do-forever/consumer like Tasks ? For example, is it "OK" to use Stephen's Channel inside Task.Run or should we go w/ smth like BlockingQueue and avoid async all together ? IMO, there is an ambiguity here, we should address

LBensman commented 3 years ago

Instead, spawn a new thread manually to do long running blocking work.

I'm not sure if I completely agree with this advice, in that it's not quite functionally the equivalent of .LongRunning option. The example that is given is indeed a good one for when to use own thread instead of .LongRunning. But allow me a different example where use of .LongRunning would be well justified (IMHO) over own thread:

var citiesPointsOfInterest = new [] {
    LoadDataset(NewYork).ToArray(),  // 250 locations; .ToArray() <-- it's in memory so no task discontinuation on await IO.
    LoadDataset(Paris).ToArray(),  // 400 locations;
    LoadDataset(Moscow).ToArray(),  // 100 locations; 
};

var bestPathThroughCitiesTasks = citiesPointsOfInterest.Select(city =>
    Task.Factory.StartNew(ComputeBestPathThroughPointsOfInterest,  // Oh that wonderful The Traveling Salesman Problem...
        city, TaskCreationOptions.LongRunning // LongRunning here would create new thread so as not to pressure thread pool.
    )).ToArray();

var firstDoneTask = await Task.WhenAny(bestPathThroughCitiesTasks);

SendUserFirstResultToThinkAbout(firstDoneTask.Result);

await Task.WhenAll(bestPathThroughCitiesTasks);

LetThemHaveItAll(bestPathThroughCitiesTasks.Select(t => t.Result));

The point of the above example is that it uses flexibility, expressiveness and ecosystem of TPL where use of own threads, while can certainly be done, would not be nearly as convenient and simple. Such code would require a mechanism where a thread notifies main code that result is ready, main code has to manage threads and listen for completions so as to know where to pick up first result, join thread back, then wait for the rest. Again, can be done, but far less convenient.

And yet the above will run on separate threads doing long computations while perfectly protecting the thread pool and avoiding the problem described in the advice.

I'd suggest amending the advice to make it clear when to use .LongRunning and when to use own thread, as both are valid in their own specific scenarios.

P.S. I think this confusion comes from people understanding "LongRunning" as wallclock time, rather than long contiguous block of operation that won't result in multiple subtasks reducing the computation a bunch of "ShortRunning" computations. If that understanding is there, then when to use .LongRunning becomes clearer.

davidfowl commented 3 years ago

Feel free to send a PR 😄 , I don't disagree.