gyscos / cursive

A Text User Interface library for the Rust programming language
MIT License

Cursive prints garbage on non-utf8 locale #13

Open gyscos opened 9 years ago

gyscos commented 9 years ago

Currently cursive uses UTF-8 to print dialog borders. This means that when running with a non-Unicode locale, it prints garbage. Ncurses should be able to print the correct characters regardless of the locale (nmtui, for instance, works with any locale). Not sure how to do it though...
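
For illustration, here is a minimal Rust sketch of why the borders turn into garbage. It assumes the terminal interprets each incoming byte as one ISO-8859-1 character, which only approximates what any particular non-UTF-8 terminal does. `as_latin1` is a hypothetical helper written for this example, not part of cursive.

```rust
/// Reinterpret every byte of `s` as one ISO-8859-1 (Latin-1) character.
/// Hypothetical helper: it mimics a terminal that does not decode UTF-8.
fn as_latin1(s: &str) -> String {
    // `u8 as char` maps the byte value to the code point of the same number.
    s.bytes().map(|b| b as char).collect()
}

fn main() {
    // U+2500 BOX DRAWINGS LIGHT HORIZONTAL, a typical border character.
    let border = "─";

    // One character on screen, but three bytes on the wire.
    println!("UTF-8 bytes: {:02X?}", border.as_bytes());

    // A byte-per-character terminal sees three unrelated characters.
    for c in as_latin1(border).chars() {
        println!("shown as U+{:04X}", c as u32);
    }
}
```

The second and third bytes (0x94 and 0x80) fall in the C1 control range of ISO-8859-1, so they tend to show up as escape-like sequences rather than as printable letters.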

florommel commented 7 years ago

Ncurses has the ACS commands to get these "extended" characters for every encoding (doing some detection magic). [0] [1] Getting these characters is probably more difficult for other backends, I don't know.

gyscos commented 7 years ago

I was using the ACS commands initially, but it wasn't working great (in retrospect, it may have been because of ncurses's type mismatch that got fixed since).

Now, supporting the different backends is the main difficulty with this approach.

One solution would be to move the box-drawing behaviour to the backend interface. It's not as clean as I'd like, but it's not that bad either, so it could be an option.

An alternative would be to do the magic part independently of ncurses, and apply the result to any backend.
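
As a rough sketch of the backend-independent option, the border characters could be chosen once and then handed to whichever backend does the drawing. Everything below is hypothetical and is not cursive's actual API: `BorderGlyphs`, `glyphs_for` and `draw_box` are names made up for the example, and a plain ASCII set stands in for the ACS lookup that ncurses performs.

```rust
/// The characters needed to draw a simple box.
/// Hypothetical type for illustration, not part of cursive.
#[derive(Debug, Clone, Copy, PartialEq)]
struct BorderGlyphs {
    horizontal: char,
    vertical: char,
    top_left: char,
    top_right: char,
    bottom_left: char,
    bottom_right: char,
}

/// Unicode box-drawing characters, for UTF-8 locales.
const UNICODE: BorderGlyphs = BorderGlyphs {
    horizontal: '─',
    vertical: '│',
    top_left: '┌',
    top_right: '┐',
    bottom_left: '└',
    bottom_right: '┘',
};

/// Plain ASCII stand-ins that any locale can display.
const ASCII: BorderGlyphs = BorderGlyphs {
    horizontal: '-',
    vertical: '|',
    top_left: '+',
    top_right: '+',
    bottom_left: '+',
    bottom_right: '+',
};

/// Pick a glyph set; how `utf8_locale` is determined is left to the caller.
fn glyphs_for(utf8_locale: bool) -> BorderGlyphs {
    if utf8_locale { UNICODE } else { ASCII }
}

/// Render an empty box into a String, one line per row.
/// A real backend would print cell by cell instead.
fn draw_box(g: BorderGlyphs, inner_width: usize, inner_height: usize) -> String {
    let bar: String = std::iter::repeat(g.horizontal).take(inner_width).collect();
    let gap = " ".repeat(inner_width);
    let mut out = String::new();

    out.push(g.top_left);
    out.push_str(&bar);
    out.push(g.top_right);
    out.push('\n');

    for _ in 0..inner_height {
        out.push(g.vertical);
        out.push_str(&gap);
        out.push(g.vertical);
        out.push('\n');
    }

    out.push(g.bottom_left);
    out.push_str(&bar);
    out.push(g.bottom_right);
    out.push('\n');
    out
}

fn main() {
    print!("{}", draw_box(glyphs_for(false), 6, 1));
    print!("{}", draw_box(glyphs_for(true), 6, 1));
}
```

The ASCII set loses the nicer line art, but it keeps the decision in one place, so no backend needs its own locale logic.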

florommel commented 7 years ago

Doing the magic independently from the backend would be nice but the ncurses magic seems to be quite complex [0] [1] and dependent on low-level stuff.

jlpoolen commented 4 years ago

The sample code in Tutorial 2/3 demonstrates the problem.

Running the Tutorial 2/3 code in a PuTTY (Release 0.73, build platform: 64-bit x86 Windows) shell from Windows using these Translation settings: [screenshot: putty_2020-05-29_09-28-48] produces this in the top window of the tmux session: [screenshot: putty_2020-05-29_09-31-57]

Note: the bottom tmux window demonstrates that PuTTY and ncurses can work together. The code running in the bottom window is: ncurses then "f" - "Display ACS characters" from the project "ncurses-examples" at https://invisible-island.net/ncurses/ncurses-examples.html

Here's a screenshot of the same code running through an ssh connection from another Gentoo Linux laptop: [screenshot: Screenshot from 2020-05-29 09-40-35]

jlpoolen commented 4 years ago

It turns out I did not have a locale specified on my machine (ares). In Gentoo, the available locales are listed in /etc/locale.gen. Here was my version during the time I had the garbled characters:

ares /home/jlpoole # cat /etc/locale.gen
# /etc/locale.gen: list all of the locales you want to have on your system.
# See the locale.gen(5) man page for more details.
#
# The format of each line:
# <locale name> <charset>
#
# Where <locale name> starts with a name as found in /usr/share/i18n/locales/.
# It must be unique in the file as it is used as the key to locale variables.
# For non-default encodings, the <charset> is typically appended.
#
# Where <charset> is a charset located in /usr/share/i18n/charmaps/ (sans any
# suffix like ".gz").
#
# All blank lines and lines starting with # are ignored.
#
# For the default list of supported combinations, see the file:
# /usr/share/i18n/SUPPORTED
#
# Whenever glibc is emerged, the locales listed here will be automatically
# rebuilt for you.  After updating this file, you can simply run `locale-gen`
# yourself instead of re-emerging glibc.

#en_US ISO-8859-1
#en_US.UTF-8 UTF-8
#ja_JP.EUC-JP EUC-JP
#ja_JP.UTF-8 UTF-8
#ja_JP EUC-JP
#en_HK ISO-8859-1
#en_PH ISO-8859-1
#de_DE ISO-8859-1
#de_DE@euro ISO-8859-15
#es_MX ISO-8859-1
#fa_IR UTF-8
#fr_FR ISO-8859-1
#fr_FR@euro ISO-8859-15
#it_IT ISO-8859-1
ares /home/jlpoole #

I did not have any activated. So I uncommented en_US.UTF-8 UTF-8 and then ran locale-gen as specified in the header of /etc/locale.gen. Then, in Gentoo, you have to select an available locale, so I listed the choices:

ares /home/jlpoole # eselect locale list
Available targets for the LANG variable:
  [1]   C
  [2]   C.utf8
  [3]   POSIX
  [4]   en_US.utf8
  [ ]   (free form)
ares /home/jlpoole #

Then I selected en_US.utf8:

ares /home/jlpoole # eselect locale set 4
Setting LANG to en_US.utf8 ...
Run ". /etc/profile" to update the variable in your shell.
ares /home/jlpoole #

Then, in the shell(s), I ran . /etc/profile and reran the sample, with almost successful results. I say "almost successful" because the box characters are not mapping properly, but at least the interface is now readable: [screenshot: putty_2020-05-29_20-46-31]

Here are the results for other locale settings, just for reference: [screenshot: putty_2020-05-29_20-44-55] [screenshot: putty_2020-05-29_20-43-04] [screenshot: putty_2020-05-29_20-41-22]

gyscos commented 4 years ago

Thanks for the investigation!

ncurses has some special logic to handle these ACS characters on non-UTF-8 locales. This is non-trivial though, and it was decided for now to simply not support non-UTF-8 locales in cursive.

To double-check that eselect and others worked fine, you can run locale -a to see what locales are available, and locale to see which one is activated.
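
The same check can be sketched from inside a Rust program. This is a minimal example, not something cursive does: it follows the POSIX precedence for the character-type category (LC_ALL, then LC_CTYPE, then LANG, with empty values treated as unset) and then guesses from the locale name whether the codeset is UTF-8. `effective_ctype` and `is_utf8_locale` are hypothetical helpers, and the name-based guess is only a heuristic; the authoritative answer comes from the C library (`nl_langinfo(CODESET)` after `setlocale`).

```rust
use std::env;

/// Return the locale name that governs character encoding, using the POSIX
/// precedence LC_ALL > LC_CTYPE > LANG. Empty values count as unset.
/// `lookup` is injected so the logic can be exercised without touching
/// the real environment.
fn effective_ctype<F>(lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    ["LC_ALL", "LC_CTYPE", "LANG"]
        .iter()
        .filter_map(|name| lookup(*name))
        .find(|value| !value.is_empty())
}

/// Heuristic: does the locale name carry a UTF-8 codeset,
/// as in "en_US.UTF-8" or "C.utf8"?
fn is_utf8_locale(locale: &str) -> bool {
    // Locale names look like language[_territory][.codeset][@modifier].
    let without_modifier = locale.split('@').next().unwrap_or("");
    match without_modifier.split_once('.') {
        Some((_, codeset)) => {
            let codeset = codeset.to_ascii_lowercase();
            codeset == "utf-8" || codeset == "utf8"
        }
        // No codeset in the name: assume it is not UTF-8.
        None => false,
    }
}

fn main() {
    match effective_ctype(|name| env::var(name).ok()) {
        Some(locale) => println!("{} (UTF-8: {})", locale, is_utf8_locale(&locale)),
        None => println!("no locale variables set, so the C/POSIX locale applies"),
    }
}
```

With no locale configured at all, as in the /etc/locale.gen shown earlier, none of the three variables is set and the check falls through to the C/POSIX case.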

Regarding PuTTY itself, it should be properly supported. I just tried connecting to a server of mine: [screenshot: Screenshot from 2020-06-02 10-01-28]

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE=C
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES=
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Though I did not enable VT100 line drawing: [screenshot: Screenshot from 2020-06-02 10-04-08]

(That being said it still works fine for me even with this option enabled.)

Can your PuTTY display the UTF-8 characters themselves? I just added a doc/test.txt you can try to print in the terminal, to see if the box is at least printed correctly.

gyscos commented 4 years ago

In addition, can you try the crossterm or termion backend to see if it works better?

gyscos commented 4 years ago

Your situation actually looks like this: https://github.com/alacritty/alacritty/issues/2319#issuecomment-485685962

Maybe the font is the problem?

grahamc commented 3 years ago

I'm trying to use cursive in stage-1 of an OS's boot phase at a Linux console. It caused all the ~T~@ issues, so I patched out the special characters for my own use: https://github.com/grahamc/cursive/commit/7c3f31103201cd1e7a4c77ba323758800abf7882