Open brantb opened 7 years ago
Google suggests it's a byte order mark.
help
is only showing what is in the help text. This issue should be opened here: https://github.com/powershell/powershell-docs to use utf8-noBOM
I'm just clarifying, but is this truly an issue with the help text if this issue only surfaces in vscode-powershell's Visual Studio Code Host
and not in any other host like ConsoleHost
?
One of the changes that was made to PSIC (presumably to fix another issue) was that the output encoding was changed to UTF8 from the default. That might explain why it behaves differently than the regular PowerShell terminal.
@brantb you'll see the same behavior on Linux/macOS with PowerShell Core 6 on the console. Windows understands the BOM and doesn't show it.
Confirming I still see this only on the PowerShell Integrated Console.
Closing as resolved as we have now documented how to configure encoding for PowerShell in Vscode: https://docs.microsoft.com/en-us/powershell/scripting/components/vscode/understanding-file-encoding?view=powershell-6
I looked at that doc, and I may be missing something but it doesn't seem to prevent the integrated console from displaying the BOM character?
@dsolodow I reviewed the issue and you are correct, I am re-opening this!
We noticed that this issue only occurs with Help
and not with Get-Help
the difference being that Help initially displays a smaller result with --More--
which comes from more.com
so this may be what is causing the encoding issue.
For reference the the UTF-8 BOM is 0xEF 0xBB 0xBF
. When interpreted with code page 437 (AKA DOS Latin US) it resolves as the ASCII bow drawing characters 
.
My current suspicion is that the integrated console is resolving more.com
for help, which can't understand UTF-8.
@TylerLeonhardt, @rjmholt , and I looked at this. It appears to be a combination of the extension setting [console]::OutputEncoding to UTF8 (w/ BOM) and use of 437 code page. This results in a BOM being written and a codepage that renders it. I believe @TylerLeonhardt is working on a proposed fix.
... possibly. I need to speak to the vscode folks to see if they have any ideas.
My thinking is that [System.Console]::OutputEncoding
is somehow related to the chcp
output...
In pwsh.exe, [System.Console]::OutputEncoding
is set to Code Page 437 (on Windows).
In the extension, we overwrite this:
[console]::OutputEncoding = [Encoding]::UTF8
Which is why we're seeing the BOM in the PowerShell Integrated Console...
However, if we don't do that... then the PowerShell Integrated Console can no longer render non-ascii characters like Chinese characters and the like.
That's why we originally overwrote the [Console]::OutputEncoding
... but that was probably not the right approach. There should be a way to not see the BOM but also see non-ASCII characters... just like what the non-Integrated Console shows.
I'll quote @rjmholt on this... "Encoding is a tar pit" 😅
But you can set OutputEncoding to UTF8 NoBOM.
@TylerLeonhardt I ran into a similar issue surrounding more.com
and the like recently. The encoding issue I ran into was not resolved until I fixed both the codepage using chcp.com
and $OutputEncoding
.
I could not replicate it on the latest Win10 build, however, just Win7.
@dsolodow I reviewed the issue and you are correct, I am re-opening this!
We noticed that this issue only occurs with
Help
and not withGet-Help
the difference being that Help initially displays a smaller result with--More--
which comes frommore.com
so this may be what is causing the encoding issue.
I can confirm that this is still the same. All beit that it shows:
help Test-Date -Examples
´╗┐
NAME
Test-Date
ALIASES
None
REMARKS
None
get-help Test-Date -Examples
NAME
Test-Date
ALIASES
None
REMARKS
None
To add some extra:
chcp.com
Active code page: 850
[System.Console]::OutputEncoding
Preamble :
BodyName : utf-8
EncodingName : Unicode (UTF-8)
HeaderName : utf-8
WebName : utf-8
WindowsCodePage : 1200
IsBrowserDisplay : True
IsBrowserSave : True
IsMailNewsDisplay : True
IsMailNewsSave : True
IsSingleByte : False
EncoderFallback : System.Text.EncoderReplacementFallback
DecoderFallback : System.Text.DecoderReplacementFallback
IsReadOnly : False
CodePage : 65001
Than do the following:
chcp.com 437
And ´╗┐ will change into ∩╗┐
A workarround for me:
Set-Alias -Name help -Value Get-Help
Than the original help from DOS will not come into play.
System Details
$PSVersionTable
:Issue Description
When I use the
help
function, a few garbage unicode characters are written to the output stream. UsingGet-Help
instead ofhelp
works as expected. This only happens in the extension-provided Powershell Integrated Console, not the default vscode console.On my system, the definition of the
help
function is:... and
more
is defined as ...Attached Logs
1510862054-d0c22b7a-9fe8-4400-9e12-9cb2d4fd6b5a1510862041231.zip