kovidgoyal / kitty

Cross-platform, fast, feature-rich, GPU based terminal
https://sw.kovidgoyal.net/kitty/
GNU General Public License v3.0

kitty picks up wrong LANG on macOS #5884

Closed fhfuih closed 1 year ago

fhfuih commented 1 year ago

Describe the bug

I am a macOS user with the following localization settings:

Primary Languages:
  Simplified Chinese 
  Traditional Chinese (Hong Kong)
  English (US)
  English
Region:
  Hong Kong

locale gives me

LANG="zh_CN.UTF-8"
LC_COLLATE="zh_CN.UTF-8"
LC_CTYPE="zh_CN.UTF-8"
LC_MESSAGES="zh_CN.UTF-8"
LC_MONETARY="zh_CN.UTF-8"
LC_NUMERIC="zh_CN.UTF-8"
LC_TIME="zh_CN.UTF-8"
LC_ALL="zh_CN.UTF-8"

But kitty debug info gives me

kitty 0.26.5 created by Kovid Goyal
Darwin ZeyuBook-34.local 22.2.0 Darwin Kernel Version 22.2.0: Fri Nov 11 02:03:51 PST 2022; root:xnu-8792.61.2~4/RELEASE_ARM64_T6000 arm64
ProductName:        macOS ProductVersion:       13.1 BuildVersion:      22C65
Frozen: True
Paths:
  kitty: /Applications/kitty.app/Contents/MacOS/kitty
  base dir: /Applications/kitty.app/Contents/Resources/kitty
  extensions dir: /Applications/kitty.app/Contents/Resources/Python/lib/kitty-extensions
  system shell: /bin/zsh
Loaded config files:
  /Users/zeyu/.config/kitty/kitty.conf

Config options different from defaults:
env:
{'LANG': 'zh_CN.UTF-8', 'LC_ALL': 'zh_CN.UTF-8'}
font_family          Cascadia Code PL
font_features:
{'CascadiaCodePL': ('+calt', '+ss01'),
 'CascadiaCodePL-BoldItalic': ('+calt', '+ss01'),
 'CascadiaCodePL-ExtraLightItalic': ('+calt', '+ss01'),
 'CascadiaCodePL-Italic': ('+calt', '+ss01'),
 'CascadiaCodePL-LightItalic': ('+calt', '+ss01'),
 'CascadiaCodePL-SemiBoldItalic': ('+calt', '+ss01'),
 'CascadiaCodePL-SemiLightItalic': ('+calt', '+ss01')}
font_size            13.0
symbol_map:
    U+23fb - U+23fe → Symbols Nerd Font
    U+2665 - U+2665 → Symbols Nerd Font
    U+26a1 - U+26a1 → Symbols Nerd Font
    U+2b58 - U+2b58 → Symbols Nerd Font
    U+e000 - U+e00a → Symbols Nerd Font
    U+e0a3 - U+e0a3 → Symbols Nerd Font
    U+e0b4 - U+e0c8 → Symbols Nerd Font
    U+e0ca - U+e0ca → Symbols Nerd Font
    U+e0cc - U+e0d4 → Symbols Nerd Font
    U+e200 - U+e2a9 → Symbols Nerd Font
    U+e300 - U+e3eb → Symbols Nerd Font
    U+e5fa - U+e631 → Symbols Nerd Font
    U+e700 - U+e7c5 → Symbols Nerd Font
    U+ea60 - U+ebeb → Symbols Nerd Font
    U+f000 - U+f2e0 → Symbols Nerd Font
    U+f300 - U+f32d → Symbols Nerd Font
    U+f400 - U+f4a9 → Symbols Nerd Font
    U+f500 - U+fd46 → Symbols Nerd Font
Added shortcuts:
    f1 →  show_kitty_env_vars
Colors:
    background           #111213   
    color0               #323232   
    color1               #c22832   
    color10              #8ec43d   
    color11              #e0c64f   
    color12              #43a5d5   
    color13              #8b57b5   
    color14              #8ec43d   
    color2               #8ec43d   
    color3               #e0c64f   
    color4               #43a5d5   
    color5               #8b57b5   
    color6               #8ec43d   
    color7               #eeeeee   
    color8               #323232   
    color9               #c22832   
    cursor               #e2be21   
    foreground           #cacecd   
    selection_background #303233   
    selection_foreground #111213   

Important environment variables seen by the kitty process:
    PATH                                /Applications/kitty.app/Contents/MacOS:/usr/bin:/bin:/usr/sbin:/sbin
    LANG                                zh_HK.UTF-8
    SHELL                               /bin/zsh
    USER                                zeyu

Note in the debug output that I have set env LANG=zh_CN.UTF-8, but kitty still sees zh_HK.UTF-8.
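To compare what a process actually sees, the locale variables can be printed from inside it. This is a minimal sketch, not part of kitty: locale_env and effective_locale are hypothetical helper names, and the lookup order (LC_ALL, then the category variable, then LANG) is the usual POSIX precedence.

```python
import os

# Variables that influence locale selection. LC_ALL overrides every LC_*
# category, and a category variable overrides LANG.
LOCALE_VARS = ("LC_ALL", "LC_CTYPE", "LC_MESSAGES", "LANG")


def locale_env(environ=None):
    """Return the locale-related variables that are set in `environ`."""
    environ = os.environ if environ is None else environ
    return {name: environ[name] for name in LOCALE_VARS if name in environ}


def effective_locale(environ=None, category="LC_CTYPE"):
    """Resolve the locale name for one category: LC_ALL, the category, then LANG."""
    environ = os.environ if environ is None else environ
    for name in ("LC_ALL", category, "LANG"):
        value = environ.get(name)
        if value:
            return value
    return "C"  # fallback when nothing is set


if __name__ == "__main__":
    for name, value in locale_env().items():
        print(f"{name}={value}")
    print("effective LC_CTYPE:", effective_locale())
```

Running it in a shell inside kitty shows the environment of a child process, which is not necessarily the environment of the kitty process itself.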

To Reproduce

Steps to reproduce the behavior:

  1. Set the system languages and region as above
  2. open kitty and see

Screenshots

NA

Environment details

Attached above

Additional context

NA

fhfuih commented 1 year ago

My real problem is: kitty is using the Traditional Chinese (Hong Kong) glyph variants from the Simplified Chinese variant of the system font. I think LANG is the cause.

kovidgoyal commented 1 year ago

LANG does not control font fallback. That is done by CoreText. You can use --debug-font-fallback to see which font is used for a given symbol. And you can override it with symbol_map in kitty.conf.
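As an illustration of the symbol_map override, here is a small sketch that formats one kitty.conf entry. symbol_map_line is a hypothetical helper, and the choice of the CJK Unified Ideographs block (U+4E00 to U+9FFF) mapped to PingFang SC is an assumption made for this example, not a recommendation from the thread.

```python
def symbol_map_line(first, last, font_name):
    """Format one symbol_map entry for kitty.conf covering code points first..last."""
    if first > last:
        raise ValueError("first must not exceed last")
    return f"symbol_map U+{first:04X}-U+{last:04X} {font_name}"


if __name__ == "__main__":
    # CJK Unified Ideographs block; U+9AA8, the character tested later in
    # this thread, falls inside it.
    print(symbol_map_line(0x4E00, 0x9FFF, "PingFang SC"))
```

Note that this only pins which font is used for the range. It does not by itself select a regional glyph variant inside that font.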

fhfuih commented 1 year ago

Not directly, but I think it does to some extent; I explain why below. @kovidgoyal I understand you have spent a lot of time explaining this, but please read the detailed explanation of my experiment below:

When I open a kitty --debug-font-fallback from within a kitty term, the sub-kitty can pick up the LANG=zh_CN.UTF-8 env. (I will not post the kitty debug log here. But LANG is correctly set) Also, the parent-kitty says

U+9aa8 bold Face(family=PingFang SC, full_name=PingFang SC Semibold, postscript_name=PingFangSC-Semibold, path=/System/Library/Fonts/PingFang.ttc, units_per_em=1000, ascent=27.6, descent=8.8, leading=0.0, point_sz=0.0, scaled_point_sz=26.0, underline_position=-3.9 underline_thickness=2.1)

OK, so the SC (Simplified Chinese) font variant is chosen. As a result, the character glyphs are the Simplified Chinese variants:

image

But the parent kitty terminal, opened directly from the app dock, has LANG=zh_HK.UTF-8. As a result, the character glyphs are the Traditional Chinese (Hong Kong) variants:

image

This documentation from Noto Sans CJK, a.k.a. Source Han Sans, illustrates the regional differences between CJK glyphs:

image

Possible cause

I don't know how to --debug-font-fallback the parent-most kitty process that's spawned directly from the app dock (and I would like to help investigate if you tell me how!). So there are two possible causes.

Recall that my localization settings are

Primary Languages:
  Simplified Chinese 
  Traditional Chinese (Hong Kong)
  English (US)
  English
Region:
  Hong Kong

  1. Maybe the parent kitty failed to set LANG=zh_CN.UTF-8 even though I wrote env LANG=zh_CN.UTF-8 in kitty.conf (the parent kitty sees LANG=zh_HK.UTF-8). kitty then tells the system to look for a font fallback for zh_HK, so it uses PingFang HK instead of PingFang SC in this process.
  2. Most modern CJK UI fonts (including PingFang) include every region's glyph variants inside each font variant. So PingFang SC also contains HK, TW, JP and KR variants. These glyph variants are normally not enabled unless the current text is tagged as being in those languages. In PingFang specifically, it seems the non-default regional glyph variants (HK, TW, JP, KR; the default in the PingFang SC font variant is SC) are put in custom cvxx features, and the feature lookup table enables one only when necessary, according to the language tag and script tag. Perhaps the parent kitty uses PingFang SC successfully, but the env LANG=zh_HK.UTF-8 triggers the HK glyph variant of the SC font variant.

I'm a newbie to anything behind OpenType, so that's the most I know. In either case, I think it may work to somehow make sure the parent kitty picks up the LANG env according to kitty.conf. (And not picking up LANG should be a kitty bug, not a feature request, because the kitty doc explicitly invites us to specify LANG in kitty.conf if intended.)

For the second case, another solution may be to allow us to configure font script_tag or language_tag, in addition to feature_tag. This is a feature request though, and is rather ad-hoc. Since another CJK font (e.g. Noto Sans CJK) may implement the regional glyph mechanism in another manner with another feature name, setting feature_tag may be cumbersome. (And I tried setting PingFangSC's feature_tag -cv08 -cv09 in kitty.conf with no luck...) As long as the script_tag or language_tag are correctly set, the font should give us the expected glyph.
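To make the language_tag idea concrete, here is a minimal sketch that maps a LANG value to an OpenType language system tag. The tags (ZHS, ZHT, ZHH, JAN, KOR) come from the OpenType language system tag registry. opentype_language_tag is a hypothetical helper, and nothing in this thread indicates that kitty has such an option.

```python
# OpenType language system tags for the regional CJK conventions discussed
# above. Tag values are from the OpenType registry; the mapping from POSIX
# locale names is an assumption made for illustration.
OPENTYPE_LANG_TAGS = {
    "zh_CN": "ZHS",  # Chinese, Simplified
    "zh_TW": "ZHT",  # Chinese, Traditional
    "zh_HK": "ZHH",  # Chinese, Hong Kong SAR
    "ja_JP": "JAN",  # Japanese
    "ko_KR": "KOR",  # Korean
}


def opentype_language_tag(lang):
    """Map a LANG value such as 'zh_HK.UTF-8' to a tag; None if unknown."""
    # Drop the encoding ('.UTF-8') and any modifier ('@...') suffix.
    locale_name = lang.split(".", 1)[0].split("@", 1)[0]
    return OPENTYPE_LANG_TAGS.get(locale_name)


if __name__ == "__main__":
    for value in ("zh_CN.UTF-8", "zh_HK.UTF-8"):
        print(value, "->", opentype_language_tag(value))
```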

kovidgoyal commented 1 year ago

Except that in terminals there is no language tagging. Fallback symbols are not queried by language, only by unicode code point. IIRC kitty uses CTFontCreateForString, see core_text.m for details. It could well be that CoreText itself reads the LANG variable for this, though I doubt it.

And env has no effect on the environment for kitty itself, since kitty has to start before it can read kitty.conf. It matters only to the environment of child processes.
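The parent/child distinction can be sketched outside kitty. This is an illustration of the general mechanism, not kitty's code: run_child_with_env is a hypothetical helper that merges overrides into a copy of the environment and passes the copy to a child process, leaving the parent's own environment untouched.

```python
import os
import subprocess
import sys


def run_child_with_env(overrides):
    """Start a child with `overrides` merged into a copy of our environment
    and return the LANG value the child reports."""
    child_env = dict(os.environ)  # a copy; the parent's environment is untouched
    child_env.update(overrides)
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.environ.get('LANG', ''))"],
        env=child_env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    child_lang = run_child_with_env({"LANG": "zh_CN.UTF-8"})
    print("parent LANG:", os.environ.get("LANG"))
    print("child LANG: ", child_lang)
```

The child reports the overridden value while the parent keeps whatever it was started with, which matches the debug output above: env shows zh_CN.UTF-8, but the kitty process itself sees zh_HK.UTF-8.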

On macOS, if LANG is not set when kitty starts, kitty will query Cocoa to determine the lang; see the function cocoa_get_lang(), which works by querying NSLocale for the language code and country code. If that's not returning the right value on your system, there's not much kitty can do about it. You can debug the env vars kitty sees here: https://sw.kovidgoyal.net/kitty/faq/#things-behave-differently-when-running-kitty-from-system-launcher-vs-from-another-terminal
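The zh_HK.UTF-8 value in the debug output is consistent with combining a language code and a country code in this way. A minimal sketch, assuming the language code comes from the first preferred language (zh) and the country code from the Region setting (HK); lang_from_locale is a hypothetical helper, not kitty's actual implementation.

```python
def lang_from_locale(language_code, country_code, encoding="UTF-8"):
    """Build a POSIX-style LANG value such as 'zh_HK.UTF-8'.

    Hypothetical helper for illustration only.
    """
    if not language_code:
        raise ValueError("language_code is required")
    name = language_code
    if country_code:
        name += "_" + country_code
    return f"{name}.{encoding}"


if __name__ == "__main__":
    # Simplified Chinese as the first language, Hong Kong as the region.
    print(lang_from_locale("zh", "HK"))
```

Under that assumption, a Simplified Chinese language preference paired with a Hong Kong region yields zh_HK rather than zh_CN.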

fhfuih commented 1 year ago

Thanks for the info. Could you teach me how to see the font fallback of the kitty instance opened directly from the app dock, just like --debug-font-fallback? I am not familiar with Mac programming; maybe I can check the font fallback first before continuing to investigate. Thanks

kovidgoyal commented 1 year ago

See the kitty FAQ; it tells you how to specify command line options for macOS dock launches. IIRC the stderr of such processes is available via Console.app, though I am not sure about that.

fhfuih commented 1 year ago

That's strange. I self-compiled a kitty (at tag v0.26.5, the latest release, the version I'm using now) with only a few additional lines to os_log the font fallback lines (they were originally printf'ed, which is not accessible on macOS). Now the self-compiled kitty.app works fine even though it sees LANG=zh_HK.UTF-8. The PingFang SC font is chosen, and the default SC glyph variants are displayed.

Can it be the build toolchain or build environment? I doubt it, but is there any other possible explanation for that?

kovidgoyal commented 1 year ago

On Sun, Jan 15, 2023 at 10:24:29AM -0800, Zeyu Huang wrote:

That's strange. I self-compiled a kitty (at tag v0.26.5, the latest release, the version I'm using now) only with a few additional lines to os_log the font fallback lines. (originally printfed, which is not accessible on macOS). Now the self-compiled kitty.app works fine even if it sees env=zh_HK.UTF.8. The PingFang SC font is chosen, and the default SC glyph variants are displayed.

Can it be the build toolchain or build environment? I doubt but cannot come up with another conclusion

I don't see how.