Open segevfiner opened 2 years ago
So digging in a bit, setting [Console]::OutputEncoding = [UnicodeEncoding]::new([BitConverter]::IsLittleEndian, $false) doesn't actually call SetConsoleOutputCP but just sets the console object to write to the console using WriteConsoleW:
dotnet/runtime/src/libraries/System.Console/src/System/ConsolePal.Windows.cs:121-128
So might actually be safe when supported by the .NET runtime, at least for native programs... But who knows about other stuff in .NET that might use it.
Alternatively, we can just call WriteConsoleW directly (probably changing the name and namespace for the Add-Type calls below, and/or combining them into a single class):
$sig = @'
[DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
public static extern bool WriteConsole(IntPtr hConsoleOutput, string lpBuffer,
uint nNumberOfCharsToWrite, out uint lpNumberOfCharsWritten,
IntPtr lpReserved);
'@
$WriteConsole = Add-Type -MemberDefinition $sig -Name "Win32WriteConsole" -Namespace Win32Functions -PassThru
$sig = @'
[DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
public static extern IntPtr GetStdHandle(int nStdHandle);
'@
$GetStdHandle = Add-Type -MemberDefinition $sig -Name "Win32GetStdHandle" -Namespace Win32Functions -PassThru
$a = prompt
$WriteConsole::WriteConsole($GetStdHandle::GetStdHandle(-11), $a, $a.Length, [ref]$null, [IntPtr]::Zero) # Not sure [ref]$null is the right way to discard out parameters in PowerShell
Basically you can write UTF-16 to the console regardless of the console output CP, which only affects ANSI functions and std streams, using WriteConsoleW. (This is what Python's WindowsConsoleIO hack does too)
PowerShell itself seems to do something similar: https://github.com/PowerShell/PowerShell/blob/2f57bf848b03828ee6c343b55f7ce80df2e5a23e/src/Microsoft.PowerShell.ConsoleHost/host/msh/ConsoleHost.cs#L2502 which, as far as I can tell, ends up using https://github.com/PowerShell/PowerShell/blob/2f57bf848b03828ee6c343b55f7ce80df2e5a23e/src/Microsoft.PowerShell.ConsoleHost/host/msh/ConsoleTextWriter.cs#L12.
Well, doing $prompt | Out-Host also works (since it goes through the PowerShell console output machinery).
OK. So it looks like PSReadLine is already setting Console.OutputEncoding, but it resets it before calling external commands. So what we need to do is simply set it again in InvokePrompt and restore the previous value afterwards.
See https://github.com/kelleyma49/PSFzf/issues/71#issuecomment-961148891
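A minimal sketch of that save/set/restore pattern from the caller's side. The helper name Invoke-WithUtf8Console is hypothetical (it is not part of PSReadLine or PSFzf), and the sketch assumes UTF-8 without a BOM is the encoding wanted while the prompt is written:

```powershell
# Hypothetical helper: run a script block with [Console]::OutputEncoding
# temporarily switched to UTF-8 (no BOM), then put the previous value back.
function Invoke-WithUtf8Console {
    param(
        [Parameter(Mandatory)]
        [scriptblock]$ScriptBlock
    )

    $previous = [Console]::OutputEncoding
    try {
        [Console]::OutputEncoding = [System.Text.UTF8Encoding]::new($false)
        & $ScriptBlock
    }
    finally {
        # Restore even if the script block throws.
        [Console]::OutputEncoding = $previous
    }
}
```

With PSReadLine loaded in an interactive session, it would be called as `Invoke-WithUtf8Console { [Microsoft.PowerShell.PSConsoleReadLine]::InvokePrompt() }`. The try/finally keeps the encoding change scoped to the prompt render, so native commands run afterwards still see the original code page.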
@segevfiner, thanks for your follow-up investigation on the issue!
If I understand it correctly, it's because the prompt function calls a native command, which returns a Unicode string, but [Console]::OutputEncoding is not UTF8, and that causes the returned string to become garbled. Is that understanding correct?
Can you please share your prompt function? A simple prompt function that can reproduce the problem would be very helpful, thanks!
Yes. It's because PSReadLine is using the Console object for output and changing [Console]::OutputEncoding when it runs, but it doesn't do so when one of its functions is called from the outside. PowerShell itself has its own console output machinery that bypasses this; it doesn't seem to use the Console object.
I'm using [starship](https://github.com/starship/starship) and had it triggered by PSFzf, but for a simple one, just take the default prompt and stick a non-ASCII Unicode character in it:
function prompt { "PS $($executionContext.SessionState.Path.CurrentLocation)$('❯' * ($nestedPromptLevel + 1)) " }
Also note https://github.com/kelleyma49/PSFzf/issues/71#issuecomment-961148891 where I posted a workaround that PSFzf is going to incorporate for this.
Setting the outputEncoding to UTF8 (with no BOM) seems to resolve this:
[Console]::OutputEncoding = $OutputEncoding = [Console]::InputEncoding = [System.Text.UTF8Encoding]::new()
To me, it feels crazy how it sometimes works right and sometimes I get ? for every extended character:
As people have said above, if you don't have the console encoding set to UTF8, then when PSReadLine attempts to copy your prompt and change the color to show a parse error (e.g. in RenderErrorPrompt) it doesn't get a proper copy.
One workaround is to call Set-PSReadLineOption and explicitly set -PromptText to an array of two strings (normal and error) so that PSReadLine can stop guessing what text to use.
A better workaround is probably for PowerShell or PSReadLine to set the console encoding to UTF-8, either all the time, or explicitly when trying to read from it.
PSReadLine could just stop trying to read the prompt from the screen. If the user configures PromptText, then use that. Otherwise, do nothing to the prompt, since you can't be sure you're not going to break it, and this feature isn't worth the risk.
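As a rough illustration of the -PromptText setting mentioned above, assuming a prompt that ends in '❯ ' like the example prompt function earlier in this thread. Both strings are placeholders and would need to match the tail of your own prompt:

```powershell
# First string: the prompt tail in the normal state.
# Second string: the tail PSReadLine shows when the input has a parse error.
# Both values are placeholders chosen for illustration.
Set-PSReadLineOption -PromptText '❯ ', '✗ '
```

As I understand it, with both strings supplied PSReadLine swaps between them instead of working out from the screen which part of the prompt to recolor, which is the step that goes wrong when the console encoding is not UTF-8.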
Here is another instance of this bug messing with a prompt:
https://user-images.githubusercontent.com/1787673/210333851-f1055e78-1ec1-4636-a925-3cbd3c0a9f97.mp4
(from https://github.com/JanDeDobbeleer/oh-my-posh/issues/3298)
This function was not tested for wide public use and is used specifically for some PSReadLine functionality, so there are likely many bugs with it. Marking as a bug.
Environment
Exception report
N/A
Steps to reproduce
[Microsoft.PowerShell.PSConsoleReadLine]::InvokePrompt()
This function is used by other projects, such as PSFzf, to re-render the prompt in cases where this is necessary.
Expected behavior
The prompt is rendered correctly:
Actual behavior
The prompt is rendered incorrectly:
Analysis
It appears that InvokePrompt is using GetPrompt, buffering the prompt to a string and then writing it to the console by itself. But because the default [Console]::OutputEncoding is not UTF8, this breaks, which the prompt function handles when it gets to write to the console directly by itself under normal circumstances.
A workaround can be to set [Console]::OutputEncoding to [Text.Encoding]::UTF8, which does make this work, yet I'm unsure what side effects this might have on other stuff in PowerShell that will try to output to the console, or maybe that should have been the default but isn't for some reason?
If this shouldn't be changed, then maybe PSReadLine should set this temporarily while printing the prompt to the console? Or alternatively, re-implement InvokePrompt in a way that won't require it to buffer the prompt string.
References
https://github.com/kelleyma49/PSFzf/issues/71