sakaki- / gentoo-on-rpi-64bit

Bootable 64-bit Gentoo image for the Raspberry Pi4B, 3B & 3B+, with Linux 5.4, OpenRC, Xfce4, VC4/V3D, camera and h/w codec support, weekly-autobuild binhost
GNU General Public License v3.0

Issues with displaying special characters #137

Closed TheChymera closed 4 years ago

TheChymera commented 4 years ago

For reasons neither I nor the helpful folk in #gentoo could discern, with this image I cannot properly display special characters as my user (but I can as root). See the paste below for diagnostics and the actual issue (the final mkdir and ls commands):

chymera@mediahost ~ $ cat /etc/env.d/02locale 
# Configuration file for eselect
# This file has been automatically generated.
#LANG="en_GB.utf8"
LANG="en_GB.UTF-8"
LC_COLLATE="C"
#LC_ALL="en_GB.UTF-8"
chymera@mediahost ~ $ eselect locale list
Available targets for the LANG variable:
  [1]   C
  [2]   C.utf8
  [3]   POSIX
  [4]   en_GB.utf8
  [5]   en_GB.UTF-8 *
  [ ]   (free form)
chymera@mediahost ~ $ source /etc/profile
chymera@mediahost ~ $ locale -a
C
C.utf8
POSIX
en_GB.utf8
chymera@mediahost ~ $ namei -l /usr/lib*/locale/locale-archive
f: /usr/lib64/locale/locale-archive
drwxr-xr-x root root /
drwxr-xr-x root root usr
drwxr-xr-x root root lib64
drwxr-xr-x root root locale
-rw-r--r-- root root locale-archive
chymera@mediahost ~ $ declare -p LC_ALL
-bash: declare: LC_ALL: not found
chymera@mediahost ~ $ type ls
ls is aliased to `ls --color=auto'
chymera@mediahost ~ $ locale -a
C
C.utf8
POSIX
en_GB.utf8
chymera@mediahost ~ $ locale
locale: Cannot set LC_ALL to default locale: No such file or directory
LANG=en_GB.UTF-8
LC_CTYPE="en_GB.UTF-8"
LC_NUMERIC=en_US.utf8
LC_TIME=en_US.utf8
LC_COLLATE=C
LC_MONETARY=en_US.utf8
LC_MESSAGES="en_GB.UTF-8"
LC_PAPER=en_US.utf8
LC_NAME="en_GB.UTF-8"
LC_ADDRESS="en_GB.UTF-8"
LC_TELEPHONE="en_GB.UTF-8"
LC_MEASUREMENT=en_US.utf8
LC_IDENTIFICATION="en_GB.UTF-8"
LC_ALL=
chymera@mediahost ~ $ env | egrep '^LANG|LC_'
LC_MONETARY=en_US.utf8
LC_PAPER=en_US.utf8
LANG=en_GB.UTF-8
LC_MEASUREMENT=en_US.utf8
LC_TIME=en_US.utf8
LC_COLLATE=C
LC_NUMERIC=en_US.utf8
chymera@mediahost ~ $ mkdir ö
chymera@mediahost ~ $ ls
 old_rpi  ''$'\303\266'

The consensus seems to be that something may be wrong with the image.

ghost commented 4 years ago

Just to put this in context, GNU ls has a default quoting style of "shell-escape" when stdout is a tty. This quoting style can cause the UTF-8 representation of U+00F6 to be displayed as ''$'\303\266' under specific conditions. These conditions are:

  1. the effective value of LC_CTYPE must refer to either a non-utf8 locale or an invalid/uninstalled locale
  2. the default quoting style is not contradicted by the --quoting-style option or QUOTING_STYLE environment variable

Alternatively, a name that consists of bytes that do not decode correctly as UTF-8 may cause this.

None of these conditions apply to the OP. This - and the fact that the root user account is not affected - makes this an irregularity, and one that is not readily reproducible in a normal Gentoo system.

Here is a test case:

$ touch $'\303\266'
$ locale            # make sure LC_CTYPE refers to a valid/installed UTF-8 locale
$ ls                # should render character in its correct form (ö)
$ LC_CTYPE=C ls     # should print ''$'\303\266' instead

ghost commented 4 years ago

Figured it out (thanks, redsh). Locale handling will fail if any of the supported environment variables refer to an invalid locale. First of all, that makes this bug invalid. Secondly, the OP should either not mix in en_US.utf8 or ensure that the locale is generated before trying to do so.
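
To spot this kind of mismatch quickly, a small check along the following lines may help. It is only a sketch: the function names normalize_locale and report_missing_locales are made up for illustration, and the normalisation (lower-casing the codeset and dropping punctuation, so that en_GB.UTF-8 matches en_GB.utf8) is an approximation of how glibc compares codeset names, not a copy of its logic.

```shell
# Hypothetical helper: normalise the codeset part of a locale name,
# e.g. en_GB.UTF-8 -> en_GB.utf8. Names without a dot are printed unchanged.
normalize_locale() {
    case $1 in
        *.*)
            lang=${1%%.*}
            codeset=${1#*.}
            codeset=$(printf '%s' "$codeset" | tr '[:upper:]' '[:lower:]' | tr -cd '[:alnum:]@')
            printf '%s.%s\n' "$lang" "$codeset"
            ;;
        *)
            printf '%s\n' "$1"
            ;;
    esac
}

# Hypothetical helper: print every LANG/LC_* variable from the environment
# whose value is not in the list passed as $1 (one locale per line, in the
# format printed by `locale -a`). Empty values are skipped.
report_missing_locales() {
    available=$1
    env | grep -E '^(LANG|LC_[A-Z]+)=' | while IFS='=' read -r name value; do
        [ -n "$value" ] || continue
        wanted=$(normalize_locale "$value")
        found=no
        for candidate in $available; do
            if [ "$(normalize_locale "$candidate")" = "$wanted" ]; then
                found=yes
                break
            fi
        done
        [ "$found" = yes ] || printf '%s=%s\n' "$name" "$value"
    done
}

# Usage: report_missing_locales "$(locale -a)"
```

Run in the OP's session, this would be expected to flag the variables set to en_US.utf8, since that locale does not appear in the `locale -a` output above.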

TheChymera commented 4 years ago

@kerframil so what's the solution? I can't find en_US.utf8 specified anywhere in my config:

mediahost /etc # ag en_US
config-archive/etc/locale.gen.dist.new
23:#en_US ISO-8859-1
24:#en_US.UTF-8 UTF-8

config-archive/etc/locale.gen
23:#en_US ISO-8859-1
24:#en_US.UTF-8 UTF-8

locale.gen
23:#en_US ISO-8859-1
24:#en_US.UTF-8 UTF-8

ghost commented 4 years ago

> @kerframil so what's the solution? I can't find en_US.utf8 specified anywhere in my config:

As we later realised in an IRC discussion, the issue stems from this idiotic change in Gentoo's openssh package. The appropriate solution was to stop accepting the applicable client-specified variables in /etc/ssh/sshd_config, so that sshd no longer imports locale settings such as en_US.utf8 from the connecting client.
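
For anyone checking their own system, something like the following lists the active AcceptEnv directives in a given sshd_config-style file. This is a sketch, not the exact steps taken on IRC: the function name list_acceptenv is made up for illustration, the match is a plain text search that ignores Match blocks and included files, and it assumes the variables are accepted through AcceptEnv, which is the sshd_config directive that controls this.

```shell
# Hypothetical helper: print uncommented AcceptEnv lines from the file in $1.
# sshd_config keywords are case-insensitive, hence -i.
list_acceptenv() {
    grep -Ei '^[[:space:]]*AcceptEnv[[:space:]]' "$1"
}

# Usage: list_acceptenv /etc/ssh/sshd_config
```

Any line it prints that names LANG or LC_* variables can be commented out, after which sshd needs a restart and the user a fresh login for the change to take effect.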

TheChymera commented 4 years ago

@kerframil thank you for figuring this out!