mawww / kakoune

mawww's experiment for a better code editor
http://kakoune.org
The Unlicense

macOS: `:kitty-terminal kak` breaks unicode characters when system language does not match system region #3768

Open hristost opened 4 years ago

hristost commented 4 years ago

Steps

  1. On macOS, open Kakoune in a kitty window
  2. Start a new window using :new or :kitty-terminal kak

Outcome

The newly created window is not displaying unicode characters properly.

(Screenshot, 2020-09-28 at 15:56:12)

Expected

The new window should properly display unicode characters.

The relevant part of my Kakoune config is:

require-module kitty
declare-option -docstring %{window type that kitty creates on new and repl calls (kitty|os)} str kitty_window_type os

The output of executing !locale in both windows is the same:

LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"

However, I suspect Kakoune obtains this information from only one of the terminals.

I am thinking that there might be some environment variable that does not get passed when :kitty-terminal is called. If I start a new window using :kitty-terminal fish and then start kak from there, everything displays alright.

My locale is set by setting the environment variable in my fish.config.

Screwtapello commented 4 years ago

Executing !locale runs the locale process in the Kakoune server, not in the client inside the terminal, so it's going to give the same results no matter what client you run it in. You can investigate environment variables in the client with something like:

:echo Client $LANG: %val{client_env_LANG} Client $LC_ALL: %val{client_env_LC_ALL}

That said, when Kakoune launches a client inside a new terminal, it doesn't involve your interactive shell in any way, so anything you set in fish.config isn't going to take effect.
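As a rough illustration of why settings made by an interactive shell may not reach a program that was not started through that shell, here is a minimal POSIX shell sketch. `DEMO_LANG` is a hypothetical variable name used only for this demonstration, and `env -i` stands in for "a process launched with a different environment"; it is not what kitty or Kakoune actually runs.

```shell
# DEMO_LANG is a hypothetical name chosen for this illustration only.
unset DEMO_LANG
DEMO_LANG="en_US.UTF-8"   # set but not exported: only this shell sees it

# A child process does not inherit unexported shell variables.
child_sees=$(/bin/sh -c 'printf "%s" "${DEMO_LANG:-unset}"')

# Once exported, children launched from this shell do inherit it...
export DEMO_LANG
child_sees_exported=$(/bin/sh -c 'printf "%s" "${DEMO_LANG:-unset}"')

# ...but a process started with a clean environment still does not.
clean_child_sees=$(env -i /bin/sh -c 'printf "%s" "${DEMO_LANG:-unset}"')

printf 'unexported: %s\nexported:   %s\nclean env:  %s\n' \
    "$child_sees" "$child_sees_exported" "$clean_child_sees"
```

Only the middle case reports the value; the other two report `unset`. A client started in a fresh terminal window is closer to the last case: it sees whatever environment its launcher hands it, not what your interactive shell set up.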

hristost commented 4 years ago

Thanks! Indeed, LANG and LC_ALL are not set in the client instance. Is Kakoune supposed to forward its environment variables to clients it launches?

Screwtapello commented 4 years ago

The expectation is that such variables will be set properly by the OS at login time, so they'll be available to all programs automatically, including graphical apps and text-based apps launched directly inside a graphical terminal like Terminal, iTerm2 or Kitty.

I don't have access to a macOS system myself, but I asked a Mac-using friend who hadn't modified any environment variables to run locale and they reported sensible results. I suspect some other configuration on your computer has removed or broken the locale-setting environment variables, and your modifications to fish.config are only fixing a symptom, not the underlying problem.

hristost commented 4 years ago

That makes sense. Setting the language in fish.config is something I had added in the process of debugging Kakoune, but now I did some more debugging without the addition.

I did some testing and came to the conclusion that this can happen when the system language and the region are different. (Check System Preferences app -> Language and Region -> General tab)

When language and region are different, running locale produces the following output in all of kitty, iTerm2, and Terminal.app:

LANG=""
LC_COLLATE="C"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=

As before, Kakoune ran alright in the first terminal window (this is without fish setting the language) but :new could not handle Unicode characters.

Running the command @Screwtapello suggested:

:echo Client $LANG: %val{client_env_LANG} Client $LC_ALL: %val{client_env_LC_ALL}

revealed that neither Kakoune instance has $LANG or $LC_ALL set, yet one of them seemed to handle unicode nicely.

However, when I changed my system language and region to match, locale produced more meaningful values and everything ran ok after restarting. It's an easy fix, but it's also somewhat unfortunate for people who want to use one language but a different region.

I'm thinking that this is something worth investigating, but I'm unsure if it has something to do with macOS not setting the environment correctly, fish not inferring the locale correctly, or Kakoune not being robust enough with different locales. Clearly it runs alright with the locale pasted above, it's just that windows created with :new do not fare as well.

Screwtapello commented 4 years ago

Internationalisation and localisation are complex topics, and modern operating systems have complex and flexible language and region options to suit their users' needs. However, the locale system for command-line applications was made an international standard in 1990, and cannot easily be changed to match the rest of the OS. I'm not really surprised that the locale command has meaningful values in a simple configuration but falls back on awkward defaults when things get more complex: that's Apple trying to flatten the macOS language-and-region settings into something that the 1990 locale API can handle.

A web search reveals that LC_CTYPE "selects the character classification category", which sounds like the thing we're interested in. Normally the system would set LANG and LC_CTYPE would default to the same thing, but apparently something is leaving $LANG empty and only setting $LC_CTYPE.
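To make the fallback order concrete, here is a small shell sketch of how POSIX resolves the locale for the LC_CTYPE category: LC_ALL wins if it is non-empty, then LC_CTYPE, then LANG, and otherwise an implementation-defined default (commonly "C"). `effective_ctype` is a hypothetical helper written for this illustration, not part of Kakoune, kitty, or any standard tool, and it takes the three values as arguments rather than reading the real environment.

```shell
# effective_ctype is a hypothetical helper for illustration only.
# Arguments: $1 = LC_ALL, $2 = LC_CTYPE, $3 = LANG (pass "" for unset/empty).
effective_ctype() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"      # LC_ALL overrides every category
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"      # the category-specific variable comes next
    elif [ -n "$3" ]; then
        printf '%s\n' "$3"      # LANG is the general fallback
    else
        printf 'C\n'            # implementation-defined default, commonly "C"
    fi
}

# The reported locale output: LANG and LC_ALL empty, LC_CTYPE set.
reported=$(effective_ctype "" "en_US.UTF-8" "")

# A hypothetical process that received none of the three variables.
bare=$(effective_ctype "" "" "")

printf 'reported: %s\nbare:     %s\n' "$reported" "$bare"
```

Under these assumptions the first case resolves to `en_US.UTF-8` and the second to `C`. The thread does not show whether the failing client had LC_CTYPE set, so the second case is only one possible explanation; checking `%val{client_env_LC_CTYPE}` in each client would tell.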

I think this issue is also relevant: https://github.com/kovidgoyal/kitty/issues/1233 although it was supposedly fixed in June 2020 so maybe this is something different.

vbauerster commented 3 years ago

Kitty supports a --copy-env flag; try kitty @ launch --copy-env kak. Maybe it should be included in the default kitty.kak for a better out-of-the-box experience for macOS users? Provided, of course, that it doesn't break on other systems.