Closed aetonsi closed 1 year ago
If you run your php script in a command shell and pipe the stdout/stderr to separate files, does it maintain order?
yeah you can try it yourself if you want, i simply used php as a simple means of having irregularly alternating output and error data. You can create a simple php script with totally random alternating stdout/stderr like this (pwsh):
'<?php ' > .\script.php
1..10 | ForEach-Object {
    $stream = @('STDOUT', 'STDERR') | Get-Random
    $text = "$("$stream".Substring(3))$_"
    "fwrite($stream, '$text ');" >>script.php
}
That generates a script.php like this:
<?php
fwrite(STDOUT, 'OUT1 ');
fwrite(STDERR, 'ERR2 ');
fwrite(STDERR, 'ERR3 ');
fwrite(STDERR, 'ERR4 ');
fwrite(STDOUT, 'OUT5 ');
fwrite(STDERR, 'ERR6 ');
fwrite(STDOUT, 'OUT7 ');
fwrite(STDOUT, 'OUT8 ');
fwrite(STDERR, 'ERR9 ');
fwrite(STDOUT, 'OUT10 ');
And if you invoke the same script 10 times the output is always the same (note the >>stdout.log 2>>stderr.log splitting the output):
1..10 | ForEach-Object {
    & php -f script.php >>stdout.log 2>>stderr.log
}
it generates the following files:
Problems arise only when using Cliwrap in c#.
I am using the | (t, t) to pipe to the same FileStream object, shouldn't this work? Or is it wrong?
That is correct. However, the stream implementation itself may not be thread-safe. You can try using Stream.Synchronized(...), although I think that only affects sync methods and probably won't help much in this case anyway.
yes as you guessed using a synchronized stream doesn't make a difference:
Stream ss = Stream.Synchronized(fs);
PipeTarget t = PipeTarget.Merge(new[] {
    PipeTarget.ToStream(ss)
});
same result.
The stringbuilder is even worse, it loses pieces of output down the road:
The interesting thing is, PipeTarget.ToFile(...) has the correct order and data is not broken up. It is simply missing the entire STDERR data:
Okay, so your last comment suggests more to me that it might be a thread-safety issue. Try this:
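// Serialize writes from the stdout and stderr pipes through one semaphore
using var semaphore = new SemaphoreSlim(1, 1);
var stringBuilder = new StringBuilder();
// The two-parameter overload supplies the cancellation token used below
var target = PipeTarget.ToDelegate(async (line, cancellationToken) =>
{
    await semaphore.WaitAsync(cancellationToken);
    try
    {
        stringBuilder.AppendLine(line);
    }
    finally
    {
        semaphore.Release();
    }
});
var cmd = Cli.Wrap(...) | (target, target);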
Haven't tested the above code myself so please review it.
I came back to investigate a bit more and reproduced the original scenario with the following cmd script:
@echo off
setlocal EnableDelayedExpansion
for /L %%i in (1,1,100) do (
    set /A "remainder=%%i %% 2"
    if !remainder! == 0 (
        echo ERR %%i 1>&2
    ) else (
        echo OUT %%i
    )
)
Running the following maintains order:
$ test.bat > out.txt 2>&1
The following CliWrap code does not maintain order:
var sb = new StringBuilder();
using var fs = File.Create("out.txt");
var target = PipeTarget.Merge(
    PipeTarget.ToStream(Console.OpenStandardOutput()),
    PipeTarget.ToStringBuilder(sb),
    PipeTarget.ToStream(fs)
);
await (Cli.Wrap("test.bat") | (target, target))
    .ExecuteAsync();
The following CliWrap code does maintain order:
var sb = new StringBuilder();
using var fs = File.Create("out.txt");
using var semaphore = new SemaphoreSlim(1, 1);
var target = PipeTarget.ToDelegate(async line =>
{
    await semaphore.WaitAsync();
    try
    {
        Console.WriteLine(line);
        sb.AppendLine(line);
        // The delegate receives the line without its line break, so add it back for the file
        await fs.WriteAsync(Encoding.UTF8.GetBytes(line + Environment.NewLine));
    }
    finally
    {
        semaphore.Release();
    }
});
await (Cli.Wrap("test.bat") | (target, target))
    .ExecuteAsync();
I'm not sure how cmd or PowerShell do it on their end, but I assume they have a way to differentiate between discrete write operations, which allows them to synchronize them even if the outputs are not separated by line breaks (or any other predefined character sequence).
On CliWrap's side, stdout and stderr are just homogeneous binary streams, so it's impossible (to my knowledge) to reliably identify individual writes. The ToDelegate(...) approach above identifies them by line breaks.
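To illustrate the point about write boundaries, here is a minimal sketch that does not involve CliWrap at all. It assumes a MemoryStream is a fair stand-in for a pipe (a simplification), and WriteBoundaryDemo and its methods are hypothetical names made up for this example. It shows that two separate writes and one combined write look identical to the reader, and that line breaks are the only delimiter left to split on.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class WriteBoundaryDemo
{
    // Performs each chunk as a separate Write call and returns what a reader sees.
    public static string WriteSeparately(params string[] chunks)
    {
        using var pipe = new MemoryStream(); // stand-in for a pipe (assumption)
        foreach (var chunk in chunks)
        {
            var bytes = Encoding.UTF8.GetBytes(chunk);
            pipe.Write(bytes, 0, bytes.Length);
        }
        return Encoding.UTF8.GetString(pipe.ToArray());
    }

    // Splits received text on line breaks, the only delimiter a reader can rely on.
    public static List<string> SplitLines(string received)
    {
        var lines = new List<string>();
        using var reader = new StringReader(received);
        for (var line = reader.ReadLine(); line != null; line = reader.ReadLine())
        {
            lines.Add(line);
        }
        return lines;
    }
}
```

Calling WriteBoundaryDemo.WriteSeparately("OUT1 ", "OUT5 ") yields the same bytes as a single write of "OUT1 OUT5 ", which is why the php test (no line breaks) cannot be split back into its original writes, while the cmd test (one echo per line) can.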
For completeness' sake, I tested it out with Process too and it also doesn't ensure the order of writes:
using var process = new Process
{
    StartInfo = new ProcessStartInfo
    {
        FileName = "test.bat",
        RedirectStandardOutput = true,
        RedirectStandardError = true
    },
    EnableRaisingEvents = true
};
process.OutputDataReceived += (sender, args) =>
{
    if (args.Data is not null)
    {
        Console.WriteLine(args.Data);
    }
};
process.ErrorDataReceived += (sender, args) =>
{
    if (args.Data is not null)
    {
        Console.WriteLine(args.Data);
    }
};
process.Start();
process.BeginOutputReadLine();
process.BeginErrorReadLine();
process.WaitForExit();
Unfortunately, it seems that with a larger sequence of writes, even my second method doesn't work. I think it may be impossible to achieve this directly.
However, one trick you can do is wrap the shell and merge stdout and stderr:
Cli.Wrap("cmd").WithArguments(new[] {"/c", "test.bat 2>&1"})
Then you can process the output normally in CliWrap, since it's now just one stream:
var sb = new StringBuilder();
using var fs = File.Create("out.txt");
var target = PipeTarget.Merge(
    PipeTarget.ToStream(Console.OpenStandardOutput()),
    PipeTarget.ToStringBuilder(sb),
    PipeTarget.ToStream(fs)
);
await (Cli.Wrap("cmd").WithArguments(new[] {"/c", "test.bat 2>&1"}) | target)
    .ExecuteAsync();
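For anyone not using CliWrap, the same shell-wrapping trick can be sketched with the plain Process class. This is an untested sketch that assumes Windows with cmd available on the PATH. MergedShellRun, BuildStartInfo and Run are hypothetical names made up for this example. Only stdout is redirected, because cmd has already folded stderr into it before .NET sees any data.

```csharp
using System;
using System.Diagnostics;

public static class MergedShellRun
{
    // Builds start info that lets cmd merge stderr into stdout (2>&1)
    // before the data reaches .NET, so there is only one stream to read.
    public static ProcessStartInfo BuildStartInfo(string command)
    {
        return new ProcessStartInfo
        {
            FileName = "cmd", // assumption: Windows, cmd on the PATH
            Arguments = "/c \"" + command + " 2>&1\"",
            RedirectStandardOutput = true,
            UseShellExecute = false, // required for redirection
        };
    }

    // Runs the command and returns the merged output as a single string.
    public static string Run(string command)
    {
        using var process = Process.Start(BuildStartInfo(command));
        var output = process.StandardOutput.ReadToEnd();
        process.WaitForExit();
        return output;
    }
}
```

With the script from above, MergedShellRun.Run("test.bat") would return the OUT and ERR lines in whatever order cmd itself emitted them, since the merging happens inside the shell and not in the calling code.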
Sorry i didn't reply for weeks, i literally had the tab open but too many things to do.. You mean to say it's totally impossible (as far as you understand) to maintain output order with c#? Or with any .NET language maybe? At which level do things break in your opinion? I've always had problems myself with this thing in c# and always assumed i was doing something wrong since i'm not a c# dev. But if it's totally impossible it might be useful to report it to someone (to whom, i don't know, it depends)
tyfyt
It's definitely impossible using .NET's Process class (which CliWrap is based on), from my understanding. If you make your own native calls to create and run a process, then it should be theoretically possible on Windows, I believe. I don't know about other operating systems, though.
This is not specific to C# however, as you can find people facing the same challenges in other languages and platforms:
Many of the answers above reference "buffering" as being the cause. That's probably true at the system level, but even if the client consistently flushed data after every write, the order of data could still get mangled by the layers of indirection that it has to go through before it reaches your code. In .NET, for example, you can theoretically do a Task.WhenAny(...) loop on stdout/stderr reads, but that would only preserve order assuming it's guaranteed in all the underlying stream plumbing (which it probably is not).
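For reference, a Task.WhenAny(...) loop of that kind could look roughly like the sketch below. It is an illustration and not something CliWrap does internally. InterleavedReader and ReadAllAsync are hypothetical names, and the readers are assumed to be something like process.StandardOutput and process.StandardError. The order it produces is the order in which reads complete, which is not guaranteed to match the order in which the child process wrote.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public static class InterleavedReader
{
    // Reads lines from both readers and records them, tagged with their source,
    // in the order the individual reads complete.
    public static async Task<List<(string Source, string Line)>> ReadAllAsync(
        TextReader stdout, TextReader stderr)
    {
        var result = new List<(string Source, string Line)>();
        Task<string> outTask = stdout.ReadLineAsync();
        Task<string> errTask = stderr.ReadLineAsync();

        // A task is set to null once its reader has reached the end.
        while (outTask != null || errTask != null)
        {
            var candidates = new List<Task<string>>();
            if (outTask != null) candidates.Add(outTask);
            if (errTask != null) candidates.Add(errTask);

            var finished = await Task.WhenAny(candidates);
            var line = await finished;

            if (finished == outTask)
            {
                if (line != null) result.Add(("stdout", line));
                outTask = line != null ? stdout.ReadLineAsync() : null;
            }
            else
            {
                if (line != null) result.Add(("stderr", line));
                errTask = line != null ? stderr.ReadLineAsync() : null;
            }
        }
        return result;
    }
}
```

Order within each stream is preserved, because each reader is only ever read sequentially. Order between the two streams depends entirely on which read happens to finish first, which is the part that cannot be relied on.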
I think the main barrier is that the standard streams were not meant to be synchronized from a conceptual standpoint, so any successful attempts to achieve that may just be coincidental.
Yes i saw a few of those exact threads you linked.. i did find some solution for linux a long time ago, can't remember what exactly, that altered or even disabled the console's buffering and allowed for ordered output. I can't remember if i even tested it though, as i focus mainly on Windows atm. Still, there has to be a way to like, "hook deeper" into the system's calls, right below where said buffering occurs.. but how? how do cmd/pwsh do it for example? but this is more a question for some stack* website i think.... It's bittersweet to know that me being a c# newbie wasn't the problem :\
Version
3.6.0
Details
I was trying to output the entire invocation's output (so both stdout and stderr) to a file, but the order of output and error data is always messed up. AFAIK, maintaining the output's correct order is a common problem with command line programs, due to output buffering or something...
Steps to reproduce
As a test, i used a simple php invocation (all the following scripts are run in powershell):
But if i try to invoke that in a C# app via Cliwrap, everything i try messes up the output in some way. I tried outputting to the console output stream, to a StringBuilder, to a file. Nothing works correctly. Here's a simple c# test program:
And here's a powershell script to compare the correct native output to Cliwrap's tries:
o111 E222 E333 o444
E222 o111 E333 o444
o111 o444 E222 E333
o111 E222 E333 o444
o111 E222 E333 o444
E222 o111 E333 o444
o111 E222 E333 o444
o111 E222 o444 E333
o111 E222 E333 o444
E222 o111 E333 o444
o111 o444 E222 E333
o111 E222 o444 E333
o111 E222 E333 o444
o111 E222 o444 E333
E222 E333 o111 o444
E222 o111 E333 o444
o111 E222 E333 o444
E222 o111 E333 o444
E222 E333 o111 o444
o111 E222 o444 E333
o111 E222 E333 o444
o111 E222 o444 E333
o111 E222 o444 E333
o111 E222 o444 E333
o111 E222 E333 o444
o111 E222 o444 E333
E222 o111 E33 o444
E222 o111 E333 o444
o111 E222 E333 o444
o111 E222 o444 E333
o111 o444 E222 E333
o111 E222 o444 E333
o111 E222 E333 o444
E222 E333 o111 o444
o111 E222 E333 o444
E222 o111 E333 o444
o111 E222 E333 o444
E222 o111 E333 o444
o111 E222 E33 o444
E222 o111 E333 o444
As you can see, when directly invoking php, the output is always in correct order. When piped via Cliwrap, it's always randomly out of order, and sometimes it's even partially truncated (E33 instead of E333 on a couple of lines).
Is there any way to fix this ordering problem, or any way to pipe/output everything together and have it in the correct order?
thank you have a nice day