madelson / MedallionShell

A .NET library simplifying the execution and chaining of processes
MIT License
415 stars 32 forks source link

Redirecting stdout/stderr blocks a thread pool thread for each call #94

Open drauch opened 1 year ago

drauch commented 1 year ago

FYI: I've tested https://github.com/dotnet/runtime/issues/81896 with your library and unfortunately you're also affected.

Best regards, D.R.

madelson commented 1 year ago

@drauch interesting; thanks for letting me know. It sounds like there is some discussion around the framework using named rather than anonymous pipes which presumably would fix it.

Until then/if no action is taken, would you find it valuable if MedallionShell were to specify TaskOptions.LongRunning as per one of the other suggestions in that thread?

drauch commented 1 year ago

Thank you for the offer, it would be great, but we unfortunately can't use MedallionShell for more than this single reason. We use a custom made solution also because of:

Best regards, D.R.

madelson commented 1 year ago

Thanks @drauch . If you don't mind, I'd love to understand a bit more about your asks in case I want to add anything to the library to cover these cases.

We want separation of stdout, stderr but also an interleaved version of that. MedallionShell only supports either of that as far as we know

Can you explain a bit more what you mean by "separation, but also interleaved"? Out of the box, with MedallionShell you can get interleaved stdout/stderr like so:

foreach (var line in command.GetOutputAndErrorLines())
{
   ...
}

If you want to consume the streams separately, you can also do that via command.StandardOutput and command.StandardInput.

MedallionShell throws a TimeoutException instead of returning false after running the process.

We only throw TimeoutException if you explicitly set a command timeout. This is so you can differentiate between a command that was killed due to explicit timeout vs. some other reason. However, if you don't want the exception you could just not set a timeout and do this:

Task.Delay(timeout).ContinueWith(state => ((Command)state).Kill(), command);

We need the stdout and stderr redirected output (additionally to the redirection) live as a "onStdoutDataReceived"-handler

The reason MedallionShell doesn't use the handler model is that that model gives you no easy wait for all handling to have finished, creating thread safety issues.

If you want to handle lines of output as they arrive, with MedallionShell you'd do this:

var standardOutputHandlerTask = Task.Run(() => 
{
    // or use command.GetOutputAndErrorLines() to handle merged output,
    // in which case we just need 1 task
    foreach (var line in command.StandardOutput.GetLines())
    {
        // do something
    }
});
var standardErrorHandlerTask = // same thing using command.StandardError
// when this task completes, we know all processing has finished
await Task.WhenAll(standardOutputHandlerTask, standardErrorHandlerTask);

.NET Core made the async handler waiting issue better because Process.WaitForExit now waits for the handlers to complete, but only if you wait for exit without a timeout. See [https://source.dot.net/#System.Diagnostics.Process/System/Diagnostics/Process.Unix.cs,208](the source here). Therefore, I prefer to have control over the processing Task so that I can guarantee completion before I move on.

drauch commented 1 year ago

We only throw TimeoutException if you explicitly set a command timeout. This is so you can differentiate between a command that was killed due to explicit timeout vs. some other reason. However, if you don't want the exception you could just not set a timeout and do this:

Yeah, we could also simply catch the TimeoutException and return a corresponding boolean, it'd be a minor change.

// or use command.GetOutputAndErrorLines() to handle merged output in which case we just need 1 task

We need both at the same time, an interleaved version AND a separate version for the same call, that's why we couldn't use MedallionShell previously.

madelson commented 1 year ago

We need both at the same time, an interleaved version AND a separate version for the same call, that's why we couldn't use MedallionShell previously.

How were you doing this with Process? For MedallionShell, I think you'd do something like this:

var outTask = Task.Run(() => 
{
    foreach (var line in command.StandardOutput.GetLines())
    {
        HandleOutLine(line);
        HandleInterleavedLine(line);
    }
});
var errorTask = Task.Run(() => 
{
    foreach (var line in command.StandardError.GetLines())
    {
        HandleErrorLine(line);
        HandleInterleavedLine(line);
    }
});
drauch commented 1 year ago

Yep, that's probably the way to go + adding TaskCreationOptions.LongRunning so we don't block thread pool threads :-)