Closed mogoh closed 10 months ago
What's the output of locale outside and inside the container?
This:
$ locale
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
LANG=de_DE.UTF-8
LC_CTYPE="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_PAPER="de_DE.UTF-8"
LC_NAME="de_DE.UTF-8"
LC_ADDRESS="de_DE.UTF-8"
LC_TELEPHONE="de_DE.UTF-8"
LC_MEASUREMENT="de_DE.UTF-8"
LC_IDENTIFICATION="de_DE.UTF-8"
LC_ALL=
This seems like a configuration issue on your Linux distribution, not something related to toolbox or the Arch Linux container.
LC_ALL shouldn't be empty, and it seems like the locale wasn't properly generated during installation.
Huh. Strange. I'll investigate. Thanks for the hint.
Are you sure that LC_ALL should not be empty?
Everywhere I look, it seems quite common to have it empty.
The issue isn't necessarily that LC_ALL is empty; there are fallback locales. The problem is the three errors that locale is displaying.
Oh, I misread your comment, sorry. The output of locale on the host is different than inside the container.
$ locale
LANG=de_DE.UTF-8
LC_CTYPE="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_PAPER="de_DE.UTF-8"
LC_NAME="de_DE.UTF-8"
LC_ADDRESS="de_DE.UTF-8"
LC_TELEPHONE="de_DE.UTF-8"
LC_MEASUREMENT="de_DE.UTF-8"
LC_IDENTIFICATION="de_DE.UTF-8"
LC_ALL=
$ locale
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
LANG=de_DE.UTF-8
LC_CTYPE="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_PAPER="de_DE.UTF-8"
LC_NAME="de_DE.UTF-8"
LC_ADDRESS="de_DE.UTF-8"
LC_TELEPHONE="de_DE.UTF-8"
LC_MEASUREMENT="de_DE.UTF-8"
LC_IDENTIFICATION="de_DE.UTF-8"
LC_ALL=
This is probably the same issue as for the official fedora-toolbox image: https://github.com/containers/toolbox/issues/60
What they did was add:
RUN dnf -y swap glibc-minimal-langpack glibc-all-langpacks
I don't know what the Arch equivalent would be.
Arch doesn't pre-build languages, so there is no Arch equivalent. We should probably hardcode the LC_* variables to C.UTF-8 inside the containers, I think.
Sorry, but I don't know how to do that.
This is the /etc/locale.conf inside the container:
$ cat /etc/locale.conf
LANG=C.UTF-8
I also tried changing it to plain LANG=C, but it did not help.
When I inspect the running container, I find that these are the environment variables:
"Env": [
"TOOLBOX_PATH=/usr/bin/toolbox",
"XDG_RUNTIME_DIR=/run/user/1000",
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"TERM=xterm",
"container=podman",
"LANG=C.UTF-8",
"HOSTNAME=toolbox",
"HOME=/var/home/mogoh"
],
So I don't know where the locales are actually set.
I was playing with the Arch Linux image before merging it as one of the supported ones, and I noticed this too.
This is probably the same issue as for the official fedora-toolbox image: https://github.com/containers/toolbox/issues/60
Yes, that was my immediate thought too. Although if you see my comments on that issue, you will realize that I don't have a very deep understanding of how all this works.
What they did was add:
RUN dnf -y swap glibc-minimal-langpack glibc-all-langpacks
In Fedora's glibc packaging, glibc-minimal-langpack is an empty package. It contains no files, only metadata (Provides: glibc-langpack) that indicates the availability of some language packs. Specifically, the C, POSIX and C.UTF-8 locales, which are already built into the main glibc package and can be used to satisfy the glibc-langpack requirement of other packages.
The glibc-all-langpacks package is the one that contains all the other locales (like en_US.UTF-8) that people actually use, and it also has the same metadata (Provides: glibc-langpack).
Hence the packages can be interchanged without breaking dependencies elsewhere.
Arch doesn't pre-build languages, so there is no Arch equivalent.
Is there a way for the user to later on install their desired locale? I did notice that the installer made me select the locales that I wanted to be available for use.
I know very little about how Arch works. That's why I am asking.
Maybe Toolbx can do some magic to make the locales from the host available to the container, but I don't know if that will work across different glibc versions.
Should probably hardcode the LC_* to C.UTF-8 inside the containers I think.
Wouldn't en_US.UTF-8 be a better choice? It has become the de facto default for user interfaces, and hence might be the least surprising as an arbitrary starting point.
Is there a way for the user to later on install their desired locale? I did notice that the installer made me select the locales that I wanted to be available for use.
No, they need to be generated by locale-gen.
Maybe Toolbx can do some magic to make the locales from the host available to the container, but I don't know if that will work across different glibc versions.
You can add the locale to /etc/locale.gen and run locale-gen.
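For illustration, here is a minimal sketch of that step inside the container. The enable_locale helper is hypothetical (it is not part of Arch Linux or Toolbx), and it assumes the usual locale.gen layout where an entry such as "#de_DE.UTF-8 UTF-8" is enabled by removing the leading "#".

```shell
# enable_locale FILE LOCALE   (hypothetical helper, not a real tool)
# Uncomment the entry for LOCALE in a locale.gen-style FILE.
# Fails without touching FILE if it contains no entry for LOCALE.
enable_locale() (
    file=$1
    want=$2
    tmp=$(mktemp "${file}.XXXXXX") || exit 1
    if awk -v want="$want" '
        {
            line = $0
            sub(/^#[ \t]*/, "", line)   # drop a leading comment marker, if any
            split(line, f, " ")         # f[1] is the locale name
            if (f[1] == want) { print line; found = 1; next }
            print
        }
        END { exit !found }
    ' "$file" > "$tmp"; then
        # Overwrite in place so the ownership and mode of FILE are kept.
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        exit 1
    fi
)
```

As root inside the container, this would be used as: enable_locale /etc/locale.gen de_DE.UTF-8 && locale-gen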
Wouldn't en_US.UTF-8 be a better choice? It has become the de facto default for user interfaces, and hence might be the least surprising as an arbitrary starting point.
I'm not sure what actual difference the choice between C.UTF-8 and en_US.UTF-8 makes for user interfaces, really.
Is there a way for the user to later on install their desired locale? I did notice that the installer made me select the locales that I wanted to be available for use.
No, they need to be generated by locale-gen.
Maybe Toolbx can do some magic to make the locales from the host available to the container, but I don't know if that will work across different glibc versions.
You can add the locale to /etc/locale.gen and run locale-gen.
I see, thanks.
Wouldn't en_US.UTF-8 be a better choice? It has become the de facto default for user interfaces, and hence might be the least surprising as an arbitrary starting point.
I'm not sure what actual difference the choice between C.UTF-8 and en_US.UTF-8 makes for user interfaces, really.
I would expect C.UTF-8 and en_US.UTF-8 to both show the same strings in most cases, because programmers default to American English in their strings. The difference will show in things like how dates and numbers are formatted, paper sizes, units of measurement, etc. I have no idea what C.UTF-8 does for those, but with en_US.UTF-8 they should correspond to the standards in the US.
On my Arch Linux host with 16 extra locales on top of C, C.UTF-8 and POSIX, the size of /usr/lib/locale/locale-archive is 6.4M. In comparison, on Fedora, with 866 extra locales, the size is 214M. I suspect that the size penalty will be pretty low if we added just en_US.UTF-8 on top of the 3 that are built into glibc.
I think adding it to the container is fine :)
When I inspect the running container, I find that these are the environment variables:
"Env": [ "TOOLBOX_PATH=/usr/bin/toolbox", "XDG_RUNTIME_DIR=/run/user/1000", "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin", "TERM=xterm", "container=podman", "LANG=C.UTF-8", "HOSTNAME=toolbox", "HOME=/var/home/mogoh" ],
That "LANG=C.UTF-8" is coming from the Arch Linux base image.
$ podman pull docker.io/library/archlinux:base-devel
...
$ podman inspect --format '{{ .Config.Env }}' --type image docker.io/library/archlinux:base-devel
[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin LANG=C.UTF-8]
Maybe Toolbx can do some magic to make the locales from the host available to the container, but I don't know if that will work across different glibc versions.
Sadly that magic can't be a straight bind mount of /usr/lib/locale from the host to the container. See: https://bugzilla.redhat.com/show_bug.cgi?id=956993 and https://sourceware.org/legacy-ml/libc-alpha/2013-04/msg00676.html and https://bugzilla.gnome.org/show_bug.cgi?id=698383
Thanks to @halfline for those references.
I spent some time over the past few days digging into this.
First, containers created from the ubuntu-toolbox image also suffer from this same problem, in the sense that they only have the C, C.UTF-8 and POSIX locales inside, because Ubuntu follows the same approach with /etc/locale.gen and locale-gen(8) as Arch Linux.
However, locale(1) doesn't throw any errors inside the containers because the shell start-up files on Ubuntu sanitize the environment. Specifically:
$ cat /etc/profile.d/01-locale-fix.sh
# Make sure the locale variables are set to valid values.
eval $(/usr/bin/locale-check C.UTF-8)
In this case, that snippet will set LANG to C.UTF-8, which is why locale(1) doesn't complain.
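To make the idea concrete, here is a rough sketch of this kind of sanitizing. It is not Ubuntu's locale-check; pick_lang is a hypothetical helper that assumes the list of available locales arrives on stdin in the form printed by locale -a, and that spellings such as UTF-8 and utf8 should be treated as equal.

```shell
# pick_lang WANTED FALLBACK   (hypothetical helper, not Ubuntu's locale-check)
# Read available locale names from stdin, one per line, and print WANTED
# if it is among them, otherwise print FALLBACK.
pick_lang() {
    awk -v want="$1" -v fallback="$2" '
        # Normalize the codeset so "de_DE.UTF-8" and "de_DE.utf8" compare equal.
        function norm(name,    i, cs) {
            i = index(name, ".")
            if (i == 0) return name
            cs = tolower(substr(name, i + 1))
            gsub(/-/, "", cs)
            return substr(name, 1, i - 1) "." cs
        }
        norm($0) == norm(want) { found = 1 }
        END { print (found ? want : fallback) }
    '
}
```

A start-up file could then do something like: LANG=$(locale -a | pick_lang "$LANG" C.UTF-8); export LANG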
/cc @jmennius and @andrewshadura
Maybe Toolbx can do some magic to make the locales from the host available to the container, but I don't know if that will work across different glibc versions.
Sadly that magic can't be a straight bind mount of /usr/lib/locale from the host to the container. See: https://bugzilla.redhat.com/show_bug.cgi?id=956993 and https://sourceware.org/legacy-ml/libc-alpha/2013-04/msg00676.html and https://bugzilla.gnome.org/show_bug.cgi?id=698383
I have one potential solution that will only work for Arch Linux containers running on Arch Linux hosts, or Ubuntu on Ubuntu.
toolbox(1) will ensure that the container's /etc/locale.gen is kept synchronized with the host's, and use an inotify(7) watch to detect changes to the host's /etc/locale.gen. We already do this for /etc/localtime, so that's easy. When there's a change on the host, we run locale-gen(8) inside the container to update /usr/lib/locale/locale-archive.
I am discussing this with the GNU C Library folks to be sure that we explore all possible options and settle for the best possible one.
Debian and Ubuntu have locales-all that should cover most cases already.
(Replying from my phone, sorry for being very concise.)
Debian and Ubuntu have locales-all that should cover most cases already.
Interesting. I didn't know about locales-all -- I am still learning about how things work outside the Fedora family.
I see that it uses the per-locale sub-directories in /usr/lib/locale instead of the mmap-able /usr/lib/locale/locale-archive blob. Why is that?
The size of the files in /usr/lib/locale is 229M. Do you think it will be alright to include locales-all in the ubuntu-toolbox images?
We include the equivalent of locales-all in the fedora-toolbox images (i.e., glibc-all-langpacks, but it uses the mmap-able blob, not the per-locale sub-directories). That's how Fedora Silverblue and Workstation hosts are configured. So, other than increasing the sizes of the images by 200 odd megabytes, it has the advantage of exactly matching the host's configuration.
I am wondering why Ubuntu Desktop doesn't use locales-all, and what the intended purpose of this package is? Since it doesn't use the faster mmap-able blob, I am wondering if this might have some performance impact.
On one hand, the less we do at run-time in toolbox(1) with /etc/locale.gen and locale-gen(8), the more robust and testable things are, but on the other there's the increased image size and potential performance concerns.
I am wondering why Ubuntu Desktop doesn't use locales-all, and what the intended purpose of this package is?
Typical Debian and Ubuntu installs don’t use it because it’s extra space, and also because you usually know what locales users most likely want to use.
As for the purpose of locales-all, the package description states it :)
Description: GNU C Library: Precompiled locale data
This package contains the precompiled locale data for all supported locales. A better alternative is to install the locales package and only select desired locales, but it can be useful on a low-memory machine because some locale files take a lot of memory to be compiled.
I guess instead of doing inotify, toolbox could (on create? enter?) verify that the locales in /etc/locale.gen work (e.g. by trying to setlocale), and if some don’t, generate them. In fact, reading locale-gen’s source code, apparently this will happen if you run locale-gen --keep-existing.
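A rough sketch of such a check, under stated assumptions: missing_locales is a hypothetical helper, it compares names against a list in the form printed by locale -a rather than calling setlocale, and it treats UTF-8 and utf8 as the same codeset.

```shell
# missing_locales LOCALE_GEN   (hypothetical helper)
# Read available locale names from stdin, one per line, and print each
# enabled entry of the locale.gen-style file LOCALE_GEN that is not available.
missing_locales() {
    awk '
        # Normalize the codeset so "de_DE.UTF-8" and "de_DE.utf8" compare equal.
        function norm(name,    i, cs) {
            i = index(name, ".")
            if (i == 0) return name
            cs = tolower(substr(name, i + 1))
            gsub(/-/, "", cs)
            return substr(name, 1, i - 1) "." cs
        }
        avail { have[norm($1)] = 1; next }   # first input: what is available
        /^[ \t]*#/ { next }                  # second input: skip comments
        NF == 0 { next }                     # ... and blank lines
        !(norm($1) in have) { print $1 }
    ' avail=1 - avail=0 "$1"
}
```

For example, locale -a | missing_locales /etc/locale.gen would list the locales that still need to be generated. An empty result means there is nothing to do.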
I am wondering why Ubuntu Desktop doesn't use locales-all, and what the intended purpose of this package is?
Typical Debian and Ubuntu installs don’t use it because it’s extra space, and also because you usually know what locales users most likely want to use.
It sounds like Ubuntu Desktop is a lot more disk space sensitive compared to Fedora Silverblue and Workstation, whereas the Fedora family has an explicit desire to avoid building things on the user's systems and instead ship pre-built and tested locales. Different groups, different philosophies and trade-offs, I guess. :)
So, I am assuming that we don't want to include locales-all in the ubuntu-toolbox images, because if it's too big for the Ubuntu Desktop ISO, then it's likely too big for the OCI image. This is, of course, just my (temporary) assumption for the rest of this comment. You are free to do otherwise, in which case, the problem goes away. :)
As for the purpose of locales-all, the package description states it :)
I was hoping to dig up the historical background behind the way things are done in (Debian and) Ubuntu. For example, here are some more historical references from GNOME and Fedora:
I guess instead of doing inotify, toolbox could (on create? enter?) verify that the locales in /etc/locale.gen work (e.g. by trying to setlocale), and if some don’t, generate them.
You mean the host's or the container's /etc/locale.gen?
I saw that Ubuntu has a downstream Settings patch that adds a Manage Installed Languages button to the Region & Language panel, which Arch Linux doesn't have. I need to check exactly what that code does. So far, my assumption is that it edits /etc/locale.gen and runs locale-gen(8).
The plan that I described earlier would involve the container's entry point, which is toolbox init-container, doing these when entering a container for the first time with toolbox enter:
Make the container's /etc/locale.gen a symbolic link to /run/host/etc/locale.gen, which is the host's copy of the file.
Set an inotify(7) watch on /run/host/etc/locale.gen. Whenever there's a change, run locale-gen(8) inside the container to update the container's /usr/lib/locale/locale-archive.
We could allow the user to override all this by breaking the symbolic link from /etc/locale.gen to /run/host/etc/locale.gen.
In fact, reading locale-gen’s source code, apparently this will happen if you run locale-gen --keep-existing.
By the way, Arch Linux's locale-gen is a lot more stripped down than (Debian's and) Ubuntu's. :)
The plan that I described earlier would involve the container's entry point, which is toolbox init-container, doing these when entering a container for the first time with toolbox enter
Oh yeah, that’s what I meant, I just wasn’t sure inotify is actually needed or just generating missing locales on the first enter would be enough… it’s not like users are likely to change locales every day?
By the way, Arch Linux's locale-gen is a lot more stripped down than (Debian's and) Ubuntu's. :)
Yes, Ubuntu have extended Debian’s, but for some reason their changes were not pushed back to Debian (although they tried at least once).
The plan that I described earlier would involve the container's entry point, which is toolbox init-container, doing these when entering a container for the first time with toolbox enter
Oh yeah, that’s what I meant, I just wasn’t sure inotify is actually needed or just generating missing locales on the first enter would be enough… it’s not like users are likely to change locales every day?
If you look for fsnotify.NewWatcher in src/cmd/initContainer.go then you will see the current code for timezone handling. It's pretty simple.
I am worried about race conditions that may abort the localedef(1) process invoked by locale-gen(8) and negative fallout from that. People expect to have to log out after adding new locales. What if that aborts an ongoing localedef(1) process inside a Toolbx container? Do we risk ending up with a corrupt /usr/lib/locale/locale-archive because of that?
This can also happen if we update /usr/lib/locale/locale-archive when entering the container, because, at least in theory, the user may log out whenever they want.
I want to fully understand this before writing the code. :)
By the way, Arch Linux's locale-gen is a lot more stripped down than (Debian's and) Ubuntu's. :)
Yes, Ubuntu have extended Debian’s, but for some reason their changes were not pushed back to Debian (although they tried at least once).
I see. I didn't know that Ubuntu extended Debian's copy.
Yes, updating it is tricky, I’ve seen some reports (see here) that updating locales causes apps to fail while the file is being written to. An alternative would be to compile locales into a different non-default file, and then atomically replace it?
Another alternative: pass --no-archive to localedef and let it create directories (also probably elsewhere initially, then atomically move to the right place).
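The "build elsewhere, then atomically replace" part could be sketched as below. This is only an assumption-laden illustration: atomic_install is a hypothetical helper, the 644 mode is assumed, and how the new locale data gets compiled into the staging file is deliberately left out because it depends on the localedef options used.

```shell
# atomic_install SRC DEST   (hypothetical helper)
# Copy SRC next to DEST under a temporary name, then rename it over DEST.
# A rename within one filesystem is atomic, so readers see either the old
# file or the new one, never a partially written one.
atomic_install() (
    src=$1
    dest=$2
    # The temporary file lives in the same directory as DEST so that the
    # final mv is a rename and not a cross-filesystem copy.
    tmp=$(mktemp "${dest}.XXXXXX") || exit 1
    if cp "$src" "$tmp" && chmod 644 "$tmp"; then   # 644 is an assumed mode
        mv -f "$tmp" "$dest"
    else
        rm -f "$tmp"
        exit 1
    fi
)
```

Processes that already have the old file mapped keep using the old contents until they reopen it, which sidesteps the problem of apps failing while the file is being written to.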
Closing as we'll soon withdraw the Arch Linux images in favor of the upstream ones from the toolbx project: https://github.com/toolbx-images/images/pull/82
If you can reproduce this issue with those images, please file an issue there. Thanks.
Image and version of the image where the issue happens
quay.io/toolbx-images/archlinux-toolbox:latest
Describe the bug
When I enter the Arch Linux image I get said error. But entering works anyway.
Reproduction steps
Host distribution and version, toolbx and podman versions