Closed wtsnjp closed 1 year ago
It sounds completely sensible to me to fall back to looking at LANG if LC_* values are unusable, whether absent or invalid. I see nothing to be lost, and it seems far better for users if texdoc tries as best it can to guess their intended locale.
By the way, GNU gettext also supports a LANGUAGE envvar, overriding all the others, but I rarely see this used nowadays. https://www.gnu.org/software/gettext/manual/gettext.html#Locale-Environment-Variables
Best, Karl
Ok, then I will go for that. Thanks!
As this thread shows, there is much more broken regarding language selection than what the OP reports...
The system locale obtained with `os.setlocale()` seems more static than the ones obtained from environment variables. I will change Texdoc to check the environment variables first and then use `os.setlocale()` as a fallback. Borrowing the specification of GNU gettext, the priority list will be:
LANGUAGE_texdoc
LANGUAGE
LC_ALL
LANG
os.setlocale()
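To make the lookup order concrete, here is a minimal sketch of that priority list. It is written in Python purely for illustration (Texdoc itself is Lua), and `pick_locale`, `ENV_PRIORITY`, and the `system_locale` callback are hypothetical names, not Texdoc's actual code. It assumes that a candidate counts as "set" when it is non-empty and does not validate the value any further.

```python
import os

# Priority order quoted from the list above. The final step, the
# os.setlocale() fallback, is modelled by a caller-supplied function.
ENV_PRIORITY = ("LANGUAGE_texdoc", "LANGUAGE", "LC_ALL", "LANG")


def pick_locale(env=None, system_locale=lambda: None):
    """Return (source, value) for the first non-empty candidate, or None."""
    env = os.environ if env is None else env
    # Environment variables are consulted first, in priority order.
    for name in ENV_PRIORITY:
        value = env.get(name, "")
        if value:
            return name, value
    # Only when no variable is set do we ask the system locale.
    value = system_locale()
    if value:
        return "setlocale", value
    return None


print(pick_locale({"LANG": "ja_JP.UTF-8", "LC_ALL": "de_DE.UTF-8"}))
# prints ('LC_ALL', 'de_DE.UTF-8')
```

Passing the environment and the system-locale query as arguments keeps the order easy to test without touching the real process environment.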
To calculate a better score for the found documents, Texdoc tries to get the system locale. At this moment, this feature completely relies on the `os.setlocale()` function (which internally calls the `setlocale()` function of C). In addition to this, I am now thinking of checking the `LANG` variable and setting the `lang` configuration from its value only if Texdoc fails to get the locale information from `os.setlocale()`.
With the current implementation, Texdoc sometimes fails to get the "expected" locale. We got multiple reports claiming that Texdoc does not recognize any locale even though they set the LANG variable (see https://github.com/TeX-Live/texdoc/issues/76#issuecomment-1072306460 and the mailing list).
Not surprisingly, it seems the behavior of the `setlocale()` function heavily depends on the platform. The exact precedence of the related variables and the values of such variables can differ among platforms. Notably, the `setlocale()` function sometimes returns a value that Texdoc cannot interpret (e.g., `Japanese_Japan.932` on Windows; Texdoc only supports values starting with a 2-letter language code, like `ja_JP.UTF-8`).
I would rather follow the convention of Unix tools for this locale setting, so I checked IEEE Std 1003.1, 2004 Edition:
It says we can consider `LANG` if `LC_*` are absent. I wonder what to do if `LC_ALL` exists but its value is invalid (in the sense of Texdoc). Should we consider the value of the `LANG` variable when we cannot get a valid language code from the `os.setlocale()` function?
@kberry @norbusan do you have any suggestions on this?
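For reference, the fallback being asked about could look roughly like the sketch below. It is Python for illustration only (Texdoc is Lua), and `language_code` and `language_with_lang_fallback` are hypothetical helpers. The regular expression is an assumption about what "starting with a 2-letter language code" means: two lowercase letters followed by the end of the string or by one of `_`, `.`, `@`.

```python
import re

# Assumed pattern: two lowercase letters, then either the end of the
# string or one of the separators used in POSIX-style locale names.
_LANG_CODE = re.compile(r"^([a-z]{2})(?:[_.@]|$)")


def language_code(value):
    """Return the leading 2-letter language code, or None if there is none."""
    match = _LANG_CODE.match(value or "")
    return match.group(1) if match else None


def language_with_lang_fallback(setlocale_value, env):
    """Use the setlocale() result if it is usable, else fall back to LANG."""
    return language_code(setlocale_value) or language_code(env.get("LANG"))


print(language_code("ja_JP.UTF-8"))         # prints ja
print(language_code("Japanese_Japan.932"))  # prints None
print(language_with_lang_fallback("Japanese_Japan.932", {"LANG": "ja_JP.UTF-8"}))
# prints ja
```

With this shape, an absent value and an uninterpretable value are treated the same way: both fall through to `LANG`.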