microsoft / winget-cli

WinGet is the Windows Package Manager. This project includes a CLI (Command Line Interface), PowerShell modules, and a COM (Component Object Model) API (Application Programming Interface).
https://learn.microsoft.com/windows/package-manager/
MIT License

CJK font not showing in powershell because of changing to UTF-8 & Consolas #1832

Open SubaruArai opened 2 years ago

SubaruArai commented 2 years ago

Brief description of your issue

When using Windows 10 in a CJK language (tested in Japanese, but I'm pretty sure it affects all CJK locales), the default encoding isn't UTF-8.[^1] (screenshot: sjis1)

[^1]: In this case, it's Shift-JIS (SJIS).

When winget runs, it temporarily changes the encoding to UTF-8 and the font to Consolas. Since Consolas doesn't have CJK glyphs, nothing is readable. (screenshot: utf82)

As a side note, when winget exits, the encoding and font revert to their defaults. (screenshot: sjis2)
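For readers unfamiliar with the two encodings involved, here is a minimal sketch of how the same Japanese text differs between code page 932 (the Shift-JIS variant Windows uses) and UTF-8. It only illustrates the encoding switch. The problem reported in this issue is the font change that comes with it, not a decoding error. The sample string is the label from the `winget --info` output below.

```python
# Sample text: "Package" as printed by winget on a Japanese system.
text = "パッケージ"

sjis_bytes = text.encode("cp932")  # Windows code page 932 (Shift-JIS variant)
utf8_bytes = text.encode("utf-8")  # what the console is switched to (code page 65001)

# Each of these katakana takes 2 bytes in cp932 and 3 bytes in UTF-8.
print(len(sjis_bytes), len(utf8_bytes))

# Both byte sequences decode back to the same text when the right codec is used.
assert sjis_bytes.decode("cp932") == utf8_bytes.decode("utf-8") == text

# Interpreting UTF-8 bytes with the legacy code page does not give the text back.
garbled = utf8_bytes.decode("cp932", errors="replace")
print(garbled == text)
```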

Steps to reproduce

  1. Prepare a fresh install of Windows 10 in a CJK language.
  2. Install winget through the Store or GitHub.
  3. Open PowerShell and enter any winget command, e.g. winget search terminal

Expected behavior

The encoding can change, but the font should not, or it should at least fall back to a raster font (ugly but readable). If possible, Consolas plus a fallback font (the default font) would be really nice.

Actual behavior

Both the encoding and the font change, so CJK characters cannot be rendered. Nothing (including the agreement prompt about the terms of transaction!) can be read.

Environment

Windows Package Manager v1.1.13405
Copyright (c) Microsoft Corporation. All rights reserved.
Windows: Windows.Desktop v10.0.19044.1415
パッケージ: Microsoft.DesktopAppInstaller v1.16.13405.0
SubaruArai commented 2 years ago

Just to add some background: yes, Windows Terminal is great, but most of the time you'll be installing it using winget first.

denelon commented 2 years ago

This sounds like it may be a bug with Windows Terminal.

SubaruArai commented 2 years ago

@denelon No, this is a bug (or feature) in PowerShell, and Windows Terminal has nothing to do with it. The point is: silently changing the encoding while a program is running might be a bad idea, especially when that program displays terms of use. If the encoding is changed, the program needs to ensure that the font in use can display all the characters needed (in this case, maybe by not changing the font?).

Just thinking: if changing the font from the program isn't possible, maybe add a conversion layer from UTF-8 to the terminal's encoding? Though some characters will not be convertible from UTF-8, and that'll be another problem...

Edit: confirmed that Windows Terminal doesn't have this issue; deleted the lines mentioning it.
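A minimal sketch of what such a conversion layer could look like, and of the loss it would introduce. This is an illustration only, not anything winget does. The function name `to_console_encoding` and the default of `cp932` are assumptions made for the example.

```python
from typing import List, Tuple


def to_console_encoding(text: str, console_encoding: str = "cp932") -> Tuple[bytes, List[str]]:
    """Transcode text for a legacy console and report the characters that were lost."""
    lost = []
    for ch in text:
        try:
            ch.encode(console_encoding)
        except UnicodeEncodeError:
            # This character has no representation in the console's code page.
            lost.append(ch)
    # errors="replace" substitutes "?" for every character that cannot be encoded.
    return text.encode(console_encoding, errors="replace"), lost


# Japanese survives the trip to code page 932, Korean does not.
encoded, lost = to_console_encoding("検索 한국어")
print(encoded.decode("cp932"))
print(lost)
```

Anything outside the console's code page is reduced to a placeholder, which is the "another problem" mentioned above.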

Masamune3210 commented 2 years ago

So are you saying that Windows Terminal doesn't show the issue or that it does? If it does, you need to file a bug with them. If it doesn't, then why not just use that instead of the default? It's the terminal's job to handle codepage changes.

SubaruArai commented 2 years ago

I'm saying that this is NOT related to Windows Terminal. It's an issue with PowerShell.

Let me be clear:

  1. AFAIK, Windows 10 ships with PowerShell as its default terminal emulator.
  2. PowerShell doesn't handle fallback fonts.
  3. Windows 10 still ships without defaulting to UTF-8 in CJK countries.
  4. winget silently changes the encoding to UTF-8 while it runs.
  5. Combine those, and you've got a nice unusable application (winget) with default settings.

You might argue that this is PowerShell's fault for not properly handling codepage changes, or that the user should change the font manually. But since that's the default terminal on the targeted platform (Windows), it should work out of the box, no matter which locale the system is set to.

SubaruArai commented 2 years ago

Here's an SO thread about detecting and changing the font in use in PowerShell: link. Since the link in that thread is broken, here's the Wayback Machine copy of the cmdlet: link.

I agree that working with multiple encodings is a PITA, so I suggest changing the locale but preventing PowerShell from changing the font.

SubaruArai commented 2 years ago

I checked with Windows Terminal and confirmed that this issue is not present there. I'm editing all previous posts that mention Windows Terminal.

jedieaston commented 2 years ago

It looks like this isn't a winget bug: https://docs.microsoft.com/en-us/troubleshoot/windows-server/system-management-components/powershell-console-characters-garbled-for-cjk-languages

Launching cmd.exe and launching a PowerShell from there makes the issue go away, so this must be the bug.

SubaruArai commented 2 years ago

@jedieaston Thanks for the info! I'd never heard of that, but indeed the shortcut was hardcoded to use Consolas.

So I tried running it directly (workaround 1 from the link above)... and it didn't solve the issue. Now it uses Lucida Console (no CJK) as the font with UTF-8. It's interesting how Windows just doesn't work™ out of the box! Using workaround 2 from the link above solved the issue, obviously.

To sum it up:

| Workaround No. | Encoding at session start | Font when changed to UTF-8 |
| --- | --- | --- |
| none | non-Unicode (S-JIS in Japan) | Consolas (non-CJK) |
| 1 | same as above | Lucida Console (non-CJK) |
| 2 | same as above | any font (CJK compatible) |

I don't know how or if the winget team wants to address this issue, but I guess it's more of a policy problem than a technical one at this point, since Microsoft has given up on fixing powershell.lnk in Windows 10.

I'd like to ask the winget team: do you think this is something that should be fixed on the winget side?

Trenly commented 2 years ago

I wonder if the font could be a setting in the visual settings for winget. That would allow users to specify any font they want to use, and if not specified (or not a valid font) then the terminal default could be used

SubaruArai commented 2 years ago

@Trenly That would be nice, but I think it should be a separate issue. The root cause here is the terminal default itself (half of the problem, to be precise).

The terminal defaults are fine with the default encoding, but since winget changes the encoding to UTF-8, the problem arises. Here's a table to make it clear:

| Way to execute | Default encoding | Default font on default encoding | Encoding while running winget | Default font on UTF-8 |
| --- | --- | --- | --- | --- |
| powershell.lnk | S-JIS | MSゴシック (MS Gothic) | UTF-8 | Consolas (non-CJK) |
| powershell.exe | S-JIS | MSゴシック (MS Gothic) | UTF-8 | Lucida Console (non-CJK) |

Note: values are from Windows 10, Japanese.

But as @jedieaston pointed out, this default font problem is a PowerShell issue that Microsoft has admitted it won't fix.

jedieaston commented 2 years ago

I don’t think there is a VT sequence (https://docs.microsoft.com/en-us/windows/console/console-virtual-terminal-sequences) to change the console font programmatically (or even see what it is). There may be a way to use the older API (that the docs say “please don’t use”) but I feel like there could be unintended side effects (ctrl+c could leave the terminal with the wrong font, for instance). And I don’t know if Windows Terminal is planning on supporting those APIs forever.
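The Ctrl+C concern is the usual save-and-restore problem. Here is a minimal sketch of that pattern, assuming hypothetical `get_state` and `set_state` callables that stand in for whatever console API would read and write the font. No real console call is made here, and the `console` dictionary is only a stand-in for the terminal.

```python
from contextlib import contextmanager


@contextmanager
def temporary_state(get_state, set_state, new_value):
    """Apply new_value, then restore the previous value even if interrupted."""
    previous = get_state()
    set_state(new_value)
    try:
        yield previous
    finally:
        # Runs on normal exit and when KeyboardInterrupt (Ctrl+C) is raised.
        set_state(previous)


# Stand-in for the console; hypothetical, for illustration only.
console = {"font": "MS Gothic"}

try:
    with temporary_state(lambda: console["font"],
                         lambda value: console.update(font=value),
                         "Consolas"):
        seen_font = console["font"]  # the temporary value is active here
        raise KeyboardInterrupt      # simulate the user pressing Ctrl+C
except KeyboardInterrupt:
    pass

print(console["font"])
```

This only covers interruptions the process gets a chance to handle. If the process is killed outright, the `finally` block never runs, which is the kind of side effect described above.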

My personal choice would be to put this in the docs along with the explanation that using cmd.exe or Windows Terminal will solve the problem. The other thing we could do is have winget --info try to detect if the user is using a CJK locale, running powershell.exe, and on a build < 22000. In that case we could print a special message like “Experiencing garbled or missing characters when running winget? Read this: “ with a link to that docs page to help people out. We’d just have to make sure that’s printed with characters people can read ;)
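The detection idea could be sketched roughly as follows. This is a hypothetical helper, not winget code. The function name, the set of language tags, and the way the three inputs would be obtained are all assumptions. Only the three conditions come from the comment above.

```python
# Language subtags treated as CJK for this sketch (an assumption).
CJK_LANGUAGES = {"ja", "ko", "zh"}


def should_show_font_hint(locale_name: str, host_process: str, windows_build: int) -> bool:
    """Return True when all three conditions from the proposal hold."""
    # Accept both "ja-JP" and "ja_JP" style names and keep only the language part.
    language = locale_name.replace("_", "-").split("-")[0].lower()
    is_cjk = language in CJK_LANGUAGES
    is_powershell = host_process.lower() == "powershell.exe"
    is_old_build = windows_build < 22000
    return is_cjk and is_powershell and is_old_build


print(should_show_font_hint("ja-JP", "powershell.exe", 19045))
print(should_show_font_hint("ja-JP", "cmd.exe", 19045))
```

As noted above, the hint itself would have to be printed in characters the current font can display, so an ASCII-only message would be the safe choice.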

Does anyone know if there’s a compatibility reason this isn’t being patched in 10/2019? It seems like a big bug.

denelon commented 2 years ago

I've contacted a few of the other teams internally to understand what options we have. Once I am able to pull all the options together, we will discuss the best way to proceed here.

denelon commented 1 month ago

Hey all, it's been a while since this was reported. Is this still a problem? I've got the right contacts now to map the issue to the proper project/team if it's still a problem.

SubaruArai commented 1 month ago

@denelon I tried briefly in a sandbox (Win10, JP), and it is now all displayed in English instead of Japanese. (Another bug?) I'm not sure if it's a Windows Sandbox compatibility issue, so I'll try later in a fresh VM.

SubaruArai commented 1 month ago

@denelon Yes, it's still happening. (screenshot)

After cancelling with Ctrl+C: (screenshot)

winget --info output:

Windows Package Manager v1.7.11261

Windows: Windows.Desktop v10.0.19045.4529
System Architecture: X64
Package: Microsoft.DesktopAppInstaller v1.22.11261.0

The font also changed very briefly while the info was being printed out.