Closed mtaku3 closed 1 year ago
Hey @mtaku3.
I was able to reproduce the discrepancy with your help, thank you. I'm now trying to figure out whether this is a bug and how to address it.
Furthermore, I tried to replicate your scenario in the CliWrap tests (which run against a .NET executable instead of a batch file) and it didn't work. It seems that ProcessStartInfo.StandardOutputEncoding does not have any effect on certain types of programs, or at least on .NET console applications.
I also tried to dig through the documentation to see whether this is an edge case or an OS-specific behavior but I was not able to find any official information regarding this scenario. The docs you linked (https://learn.microsoft.com/en-us/dotnet/api/system.diagnostics.processstartinfo.standardoutputencoding?view=net-7.0) don't provide a lot of useful information beyond this remark:
Setting this property does not guarantee that the process will use the specified encoding. The application should be tested to determine which encodings the process supports.
Question: does your original use case also involve a batch file? If not, what kind of program is it?
As an immediate workaround, you can use this extension method:
public static Command WithChcpWrapper(this Command command, Encoding encoding)
{
    // Run the original command through cmd, switching the code page first
    return Cli.Wrap("cmd")
        .WithArguments(a => a
            .Add("/c")
            .Add(
                // Builds: chcp <codepage> >nul && <target> <arguments>
                new ArgumentsBuilder()
                    .Add("chcp")
                    .Add(encoding.CodePage)
                    .Add(">nul")
                    .Add("&&")
                    .Add(command.TargetFilePath)
                    .Add(command.Arguments, false)
                    .Build(),
                false
            )
        )
        // Carry over the rest of the original command's configuration
        .WithWorkingDirectory(command.WorkingDirPath)
        .WithEnvironmentVariables(command.EnvironmentVariables)
        .WithCredentials(command.Credentials)
        .WithStandardInputPipe(command.StandardInputPipe)
        .WithStandardOutputPipe(command.StandardOutputPipe)
        .WithStandardErrorPipe(command.StandardErrorPipe)
        .WithValidation(command.Validation);
}
It wraps your existing command in cmd and sets the encoding within that session. You can use it like so:
private static async Task CliWrapImpl()
{
    await Cli.Wrap("echo.bat")
        .WithStandardOutputPipe(PipeTarget.ToDelegate(Console.WriteLine, Encoding.UTF8))
        .WithChcpWrapper(Encoding.UTF8)
        .ExecuteBufferedAsync();
}
Note that I removed ./ from the path because cmd trips up on paths starting with . unless they're quoted, and CliWrap doesn't quote . because it's not considered a special character. You may want to tweak it a bit.
Actually, after further testing, it seems that this change alone is enough to get it working. Can you test it out @mtaku3?
 private static async Task CliWrapImpl()
 {
     await Cli.Wrap("./echo.bat")
-        .WithStandardOutputPipe(PipeTarget.ToDelegate(Console.WriteLine))
+        .WithStandardOutputPipe(PipeTarget.ToDelegate(Console.WriteLine, Encoding.UTF8))
         .ExecuteAsync();
 }
I was going about it the wrong way. You are right! I was doing this:
Cli.Wrap("./example.exe")
    .WithStandardOutputPipe(PipeTarget.ToDelegate(Console.WriteLine))
    .Observe(Encoding.UTF8, Encoding.UTF8, forciblyCloseCTS.Token, gracefullyCloseCTS.Token)
    .Subscribe();
But I didn't notice that the encoding parameters on Observe() only affect the observable that Observe() creates; they have no effect on the pipe configured by WithStandardOutputPipe(). Passing Encoding.UTF8 to PipeTarget.ToDelegate() in WithStandardOutputPipe() solved my issue.
For your reference, I was trying to run a program written in Go that uses the log package for logging. The log package has no option to set the encoding and seems to depend on the console.
Also, I tested with another batch file, which prints the current code page using chcp, and ran it with both CliWrapImpl and DotNetImpl. Both print the same code page, so it looks like neither of them affects the console's encoding. So to change the console's encoding, I have to use something like your WithChcpWrapper.
Thank you.
Version
ver 3.6.0
Details
Some executables vary their output encoding based on the console's default encoding. CliWrap offers a way to change the encoding of the standard output stream reader, but it never changes the console's default encoding, so the output can end up garbled. .NET's Process class can change the console's default encoding via ProcessStartInfo.StandardOutputEncoding.
The reproduction code tries to echo the character ϧ (U+03E7).
CliWrapImpl reads the output in the console's default encoding, so the character is garbled.
DotNetImpl reads the output as UTF-8 because that is specified in ProcessStartInfo.StandardOutputEncoding, so the character is not garbled.
A .NET console application's default encoding is the same as the console's default encoding, so the character may still be garbled. (Console.OutputEncoding must match chcp in cmd.) But you can see that ProcessStartInfo.StandardOutputEncoding can change the console's default encoding for the process, which prevents the characters from being consumed in the wrong encoding.
Steps to reproduce
Reproduction code is here. Just clone, build and run it. One thing to note is that the console's default encoding varies with the system locale, according to a Microsoft article, so the behavior may vary as well. Please let me know if you can't reproduce this.
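For readers who cannot run the reproduction, the garbling described under Details can be approximated without spawning any process. The sketch below is not part of the reproduction code: the class name EncodingMismatchDemo and the helper Transcode are hypothetical, and ISO-8859-1 is only an assumed stand-in for whatever non-UTF-8 code page a given console defaults to. It encodes ϧ as UTF-8 and then decodes those bytes with a mismatched and a matching reader encoding.

```csharp
using System;
using System.Text;

public static class EncodingMismatchDemo
{
    // Hypothetical helper: encodes text with the writer's encoding and decodes
    // the resulting bytes with the reader's encoding, mimicking a reader that
    // may or may not agree with the process that produced the output.
    public static string Transcode(string text, Encoding writer, Encoding reader) =>
        reader.GetString(writer.GetBytes(text));

    public static void Main()
    {
        const string original = "\u03E7"; // the character used in the reproduction

        // Assumption: ISO-8859-1 stands in for a non-UTF-8 console code page.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        // The UTF-8 bytes of U+03E7: CF-A7
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(original)));

        string garbled = Transcode(original, Encoding.UTF8, latin1);
        string intact = Transcode(original, Encoding.UTF8, Encoding.UTF8);

        Console.WriteLine(garbled == original); // False: reader encoding mismatched
        Console.WriteLine(intact == original);  // True: reader encoding matches
    }
}
```

The results are compared rather than printed directly, because printing the garbled string would itself depend on the encoding of the console running the demo.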