wxWidgets / wxWidgets

Cross-Platform C++ GUI Library
https://www.wxwidgets.org/

wxOSX does not fall back properly when setting default language #23114

Closed sethhillbrand closed 1 year ago

sethhillbrand commented 1 year ago

Description

When a macOS system is set to use English as its language with a region that the locale database does not provide for it (e.g. Germany), calling wxLocale::Init( wxLANGUAGE_DEFAULT ) attempts to set the locale to en_DE even though this locale does not exist.

Expected behavior

According to the documentation,

    wxLANGUAGE_DEFAULT has special meaning: best suitable translation,
    given user's preference and available translations, will be used.

I would expect that if the language/region combination does not exist, the first available matching language would be used regardless of region. If no matching language is available, I would expect the base translation (usually English) to be used.

I would never expect to see an error message when setting the locale to wxLANGUAGE_DEFAULT.

To Reproduce

Set language and region like the following: image

Then run intltest.cpp and set the language to wxLANGUAGE_DEFAULT

Platform and version information

vadz commented 1 year ago

FWIW I think it does make sense to try en_DE on such system, but we should definitely fall back on en_US if setting it fails. I'll check why it doesn't happen currently...

sethhillbrand commented 1 year ago

Agreed. Only if the language/region combination does not exist should it choose a different one. Although it should probably choose the same language first, so that fr_BE falls back to fr_FR before en_US.
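
The fallback order suggested here (exact match, then the same language in any region, then the base language) could be sketched roughly as follows. This is plain standard C++ for illustration only, not wxWidgets code: `languagePart`, `pickTranslation` and the list of available catalogs are hypothetical names.

```cpp
#include <string>
#include <vector>

// Return the language part of a locale name, e.g. "fr" for "fr_BE".
std::string languagePart(const std::string& locale)
{
    return locale.substr(0, locale.find('_'));
}

// Hypothetical helper: choose a translation for `requested` from `available`.
// Order: exact match, then the first entry with the same language,
// then `base` (the untranslated strings, usually English).
std::string pickTranslation(const std::string& requested,
                            const std::vector<std::string>& available,
                            const std::string& base = "en_US")
{
    for (const std::string& candidate : available)
        if (candidate == requested)
            return candidate;

    const std::string lang = languagePart(requested);
    for (const std::string& candidate : available)
        if (languagePart(candidate) == lang)
            return candidate;

    return base;
}
```

With only de_DE and fr_FR available, this picks fr_FR for fr_BE and falls through to the base for en_DE.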

vadz commented 1 year ago

TL;DR: There is a minor problem in wxLocale but I think the real solution is to just stop using it in your code and use wxUILocale, it was added partially in order to make dealing with such less standard situations simpler and more logical.


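Thinking more about this, I think we definitely need to change one thing here:

* We must handle `IsAvailable(wxLANGUAGE_DEFAULT)` specially as using the default locale may succeed even if we don't recognize it.

However I'm less sure if we should do this:

* Try to chop off the part after `_` in the locale name and retry using it as `wxSetlocale()` argument if `wxSetlocale(LC_ALL, "")` fails.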

because this might not work correctly, e.g. under Linux you could do LC_ALL=de_CH ./myprogram and expect the program to use period as decimal separator, but if we fell back on just de, the program would be using a comma. Also, I strongly suspect there are languages supported at macOS UI level that are not supported at the POSIX locale level at all, so even doing this wouldn't guarantee that setlocale() succeeds. And ignoring its failure definitely wouldn't be the right thing to do.

Of course, this probably means that wxLocale::IsAvailable(wxLANGUAGE_DEFAULT) could still return false which might seem unexpected. But this seems to be the correct behaviour for wxLocale. Of course, if you're only interested in the UI aspects, you should be using wxUILocale (and, in particular, its UseDefault() function) and if you only want to use translations, then you should be using wxTranslations with wxUILocale::GetSystemLanguage() which already does work correctly in this situation (and returns wxLANGUAGE_ENGLISH).

@utelle Please let me know if you have any comments about this.

vadz commented 1 year ago

I've created #23119 with the first proposed change above and some other minor improvements, notably to the docs.

I have to admit that I'm still not completely sure if the behaviour is correct, but I still believe that the only obviously correct solution is to use wxUILocale...

utelle commented 1 year ago

TL;DR: There is a minor problem in wxLocale but I think the real solution is to just stop using it in your code and use wxUILocale, it was added partially in order to make dealing with such less standard situations simpler and more logical.

In new code it is definitely advisable to use wxUILocale.

Thinking more about this, I think we definitely need to change one thing here:

* We must handle `IsAvailable(wxLANGUAGE_DEFAULT)` specially as using the default locale may succeed even if we don't recognize it.

PR https://github.com/wxWidgets/wxWidgets/pull/23119 applies the necessary changes to make calling wxLocale::IsAvailable(wxLANGUAGE_DEFAULT) operational, even if the actual default language (like non-standard en_DE) is not included in the wx language database.

However I'm less sure if we should do this:

* Try to chop off the part after `_` in the locale name and retry using it as `wxSetlocale()` argument if `wxSetlocale(LC_ALL, "")` fails.

because this might not work correctly, e.g. under Linux you could do LC_ALL=de_CH ./myprogram and expect the program to use period as decimal separator, but if we fell back on just de, the program would be using a comma. Also, I strongly suspect there are languages supported at macOS UI level that are not supported at the POSIX locale level at all, so even doing this wouldn't guarantee that setlocale() succeeds. And ignoring its failure definitely wouldn't be the right thing to do.

I fully agree. Trying to deduce a locale in that way would most likely cause unexpected behaviour. For language id wxLANGUAGE_DEFAULT the system usually should support setting the default locale (i.e. setlocale(LC_ALL, "")) - if it does not, there is not much we can do about it.

Of course, this probably means that wxLocale::IsAvailable(wxLANGUAGE_DEFAULT) could still return false which might seem unexpected. But this seems to be the correct behaviour for wxLocale.

Exactly, if setlocale(LC_ALL, "") fails, then probably there is some system misconfiguration. And IMHO it is better to report an error than to wildly guess what might be intended.
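
For illustration, reporting the failure instead of guessing only requires looking at what `setlocale()` returns. The sketch below uses just the standard C library; `trySetLocale` is a made-up name and this is not the actual wxLocale code.

```cpp
#include <clocale>
#include <string>

// Try to switch the C locale to `name` ("" means "use the environment").
// On success, stores the name reported by the C library in `actual`
// (if given) and returns true. On failure returns false; the previously
// active locale is left unchanged in that case.
bool trySetLocale(const char* name, std::string* actual = nullptr)
{
    const char* result = std::setlocale(LC_ALL, name);
    if (!result)
        return false;
    if (actual)
        *actual = result;
    return true;
}
```

A caller would then do something like `if ( !trySetLocale("") ) { /* tell the user */ }` rather than carry on silently in the "C" locale.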

Of course, if you're only interested in the UI aspects, you should be using wxUILocale (and, in particular, its UseDefault() function) and if you only want to use translations, then you should be using wxTranslations with wxUILocale::GetSystemLanguage() which already does work correctly in this situation (and returns wxLANGUAGE_ENGLISH).

AFAICT wxUILocale works correctly even for non-standard languages in respect to UI language translation.

sethhillbrand commented 1 year ago

We'll try out using wxTranslations and wxUILocale instead of using wxLocale for everything.

Note, however, that this behavior is different from what is seen on other platforms. Setting the Linux locale to en_DE or a similar non-existent locale causes wxGTK to correctly fall back to English, while setting fr_BE will fall back to French. This is the behavior I would expect, but I understand if you consider the bug to be in the GTK port and not the Mac port.

vadz commented 1 year ago

Thanks, you're right that there is a difference between the behaviour under Mac and Linux, but it does look like we have a bug in our non-Mac Unix wxLocale implementation because we pretend that we recognize the locale when it's set to fr_BE, but actually we don't (unless the locale is known to the system, of course, which could well be the case for fr_BE, but never for fr_US, for example) and just use "C" locale, which is the same as en_US, in this case.

I'd like to fix this, but, first, to be sure I am not missing something, I'd like to ask what exactly are you doing where you think that this fall back does happen? Could you please summarize it briefly or just point me to your code in KiCad? TIA!

sethhillbrand commented 1 year ago

The KiCad locale is set at https://gitlab.com/kicad/code/kicad/-/blob/master/common/pgm_base.cpp#L612

Here's an example of my test on Linux:

Set to French as spoken in Qatar image

Then setting to Greek, but Italian-type image

vadz commented 1 year ago

Sorry for the delay with replying, I still have some trouble understanding what's going on here. But unfortunately I'm rather sure for now that things are very broken in the Linux version. In particular, we always return "English (US)" from wxUILocale::GetSystemLanguage() there no matter what, which just can't be right. We also always "succeed" in setting the locale, even if you set LC_ALL=xx_YY, because we actually set it to "C"... This seems wrong and the latter is also a regression compared to 3.0 (3.2 unfortunately behaves in the same way as the current master).

I don't understand how things got so wrong, but right now I see at least the following problems:

  1. wxLocale::Init(wxLANGUAGE_DEFAULT) returns true even if LC_ALL is set to a completely unknown or even not existing language.
  2. wxUILocale::GetSystemLanguage() always returns US English, making it completely useless.
  3. wxLocale::GetSystemLanguage() returns US English now if the locale is not supported by the system, even if it could be (and was, in 3.0) recognized by wx, e.g. setting LC_ALL=de on my system and running the test below results in "German" with 3.0 but "English (US)" with 3.2/master (and, due to (1), I also get an error with 3.0 but not with 3.2/master).

I'm going to try to fix this because we clearly broke things here. @utelle Please let me know if I'm missing something.

Also, one more question to @sethhillbrand: which version of wx does the KiCad build shown above use? Is it 3.2 or still 3.0?

Simple test program that can be compiled with both 3.0 and 3.2

```cpp
#include <wx/crt.h>     // wxPrintf()
#include <wx/init.h>    // wxInitializer
#include <wx/intl.h>    // wxLocale

#if wxCHECK_VERSION(3,2,0)
    #define HAS_UILOCALE
    #include <wx/uilocale.h>    // wxUILocale
#endif

// Print the name of the given wxLanguage value with a label.
void ShowLang(const char* label, int lang)
{
    wxPrintf("%s: %s\n", label, wxLocale::GetLanguageName(lang));
}

int main(int argc, char **argv)
{
    wxInitializer init;
    if ( !init.IsOk() ) {
        fprintf(stderr, "Failed to initialize wxWidgets.\n");
        return 3;
    }

    // What wx reports before any locale has been set.
    ShowLang("System language", wxLocale::GetSystemLanguage());
#ifdef HAS_UILOCALE
    ShowLang("From wxUILocale", wxUILocale::GetSystemLanguage());
#endif

    wxLocale loc;
    if ( !loc.Init(wxLANGUAGE_DEFAULT, wxLOCALE_DONT_LOAD_DEFAULT) ) {
        wxPrintf("Initializing locale failed.\n");
        return 1;
    }

    // What wx reports after switching to the default locale.
    ShowLang("System lang now", wxLocale::GetSystemLanguage());
#ifdef HAS_UILOCALE
    ShowLang("From wxUILocale", wxUILocale::GetSystemLanguage());
#endif

    return 0;
}
```

sethhillbrand commented 1 year ago

@vadz The example program I am showing above was wx 3.2

vadz commented 1 year ago

Thanks! So we have both regressions since 3.0 under Linux (because everything I wrote above was for Linux/non-Darwin Unix only) which don't affect KiCad and some difference in behaviour under Linux and Mac in 3.2, which does.

It's still not totally clear to me what is breaking KiCad and I'm a bit afraid that by fixing the problems above under Linux I might actually make things worse for it (although if it also worked with 3.0, hopefully not if I make 3.2 more compatible with it). But they still seem to be worth fixing.

And I suspect that KiCad actually just doesn't care about the locale not being set, i.e. you must still be using decimal point (and not comma) with fr_QA locale, which is, in theory, wrong, but you just use it for loading the translations and not for full localization. If true, this is another reason to just stop using it entirely and use wxTranslations directly instead.

sethhillbrand commented 1 year ago

We would like to stop using it. Unfortunately, we get issues like https://gitlab.com/kicad/code/kicad/-/issues/11046 when we don't call wxLocale::Init() early enough in the code.

We've tried substituting wxUILocale (https://gitlab.com/kicad/code/kicad/-/merge_requests/1444#note_1234841165) but without success.

You are correct that setting LANG doesn't affect the decimal point, so invalid combinations seem to result in always using a decimal instead of a comma.
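
One way to confirm this from code is to ask the C library which decimal separator is in effect once the locale setup has run. A minimal sketch in standard C++ (`currentDecimalPoint` is a name invented for this example):

```cpp
#include <clocale>
#include <string>

// Decimal separator of the C locale that is currently in effect.
std::string currentDecimalPoint()
{
    return std::localeconv()->decimal_point;
}
```

If setting the locale failed and the program is still in the "C" locale, this returns "." whatever LANG says.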

utelle commented 1 year ago

I still have some trouble understanding what's going on here. But unfortunately I'm rather sure for now that things are very broken in the Linux version.

I have not yet inspected the pre-3.0 wxLocale code, but I spent quite some time on running the 3.x wxUILocale code in a debugger today. As far as I can tell the problem arises from the fact that wxUILocale calls newlocale on instantiating the default locale (in the constructor of wxUILocaleImplUnix called from CreateUserDefault()). If newlocale fails the default locale of the application is used to determine the locale name later on. Since any C++ application starts with a C locale, the locale name will be "C" in that case.
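
The starting point can be checked without wx at all: passing a null pointer to `setlocale()` only queries the current locale. The helper below is a hypothetical illustration in standard C++, not part of wxUILocale.

```cpp
#include <clocale>
#include <string>

// Name of the C locale currently in effect; passing a null pointer
// only queries the locale and does not change it.
std::string currentCLocaleName()
{
    const char* name = std::setlocale(LC_ALL, nullptr);
    return name ? name : "";
}
```

Called before any `setlocale()` call, this is expected to report the "C" locale, since the C standard requires programs to start in it.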

In particular, we always return "English (US)" from wxUILocale::GetSystemLanguage() there no matter what, which just can't be right.

I disagree. "en_US" is only returned, if newlocale fails using the default locale (""). I tested this by setting the environment variable LC_ALL. Setting LC_ALL=en_DE or LC_ALL=xx_YY lets newlocale fail - at least on my system where these locales simply do not exist. Therefore wxUILocale falls back to the C locale.

We also always "succeed" in setting the locale, even if you set LC_ALL=xx_YY, because we actually set it to "C"... This seems wrong and the latter is also a regression compared to 3.0 (3.2 unfortunately behaves in the same way as the current master).

3.0 was based on wxLocale, while 3.1 introduced wxUILocale and wxLocale was changed to use wxUILocale methods internally. As said above I haven't inspected the old wxLocale code. However, setlocale that is used by wxLocale will also fail for unknown values for LC_ALL, but probably wxLocale interpreted the LC_ALL value nevertheless.

I don't understand how things got so wrong, but right now I see at least the following problems:

  1. wxLocale::Init(wxLANGUAGE_DEFAULT) returns true even if LC_ALL is set to a completely unknown or even not existing language.

Well, this is based on the assumption that setting the default locale (setlocale(LC_ALL, "")) will always work, but that is not the case for unknown values.

  2. wxUILocale::GetSystemLanguage() always returns US English, making it completely useless.

If LC_ALL is set to a valid value, for example LC_ALL=de_DE on my system, wxUILocale::GetSystemLanguage() returns the correct value. US English is returned only for non-existing locales.

  3. wxLocale::GetSystemLanguage() returns US English now if the locale is not supported by the system, even if it could be (and was, in 3.0) recognized by wx, e.g. setting LC_ALL=de on my system and running the test below results in "German" with 3.0 but "English (US)" with 3.2/master (and, due to (1), I also get an error with 3.0 but not with 3.2/master).

It always boils down to the same issue: LC_ALL=de specifies a locale that can't be set via setlocale or created via newlocale. And therefore wxUILocale falls back to C resulting in "en_US".
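
Whether a given name can be created at all can be checked in isolation with POSIX `newlocale()`. This is only a sketch under that assumption, not the wxUILocaleImplUnix code, and `canCreateLocale` is a hypothetical name.

```cpp
#include <locale.h>
#ifdef __APPLE__
#include <xlocale.h>   // newlocale() is declared here on macOS
#endif

// Returns true if the C library can create the named locale.
// An empty name means "whatever LC_ALL, LC_xxx and LANG select".
bool canCreateLocale(const char* name)
{
    locale_t loc = newlocale(LC_ALL_MASK, name, (locale_t)0);
    if (!loc)
        return false;
    freelocale(loc);
    return true;
}
```

On a system where the locale named by LC_ALL is not installed, `canCreateLocale("")` would be expected to return false, which is the case that currently goes unnoticed.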

I'm going to try to fix this because we clearly broke things here. @utelle Please let me know if I'm missing something.

First, we have to define how wxLocale and wxUILocale should behave when the user specifies an unknown or invalid locale identifier that can't be used in setlocale or newlocale. Obviously, the assumption that the default locale (identified by "") is always available and settable, is false.

It is certainly possible to restore the old wxLocale behaviour by adjusting the code (and probably not using wxUILocale methods). However, the goal was that the use of wxLocale is no longer necessary in new code.

IMHO it is necessary to define the behaviour in such a way that wxLocale based results are consistent with wxUILocale based results.

vadz commented 1 year ago

We would like to stop using it. Unfortunately, we get issues like https://gitlab.com/kicad/code/kicad/-/issues/11046 when we don't call wxLocale::Init() early enough in the code.

I think using wxUILocale::UseDefault() should fix this issue too. And it's the only way to do it when the locale is set to something like ja_US (i.e. anything other than ja_JP) under macOS, wxLocale just can't handle this as it's based on POSIX APIs and ja_US is not a supported POSIX locale under macOS.

We've tried substituting wxUILocale (https://gitlab.com/kicad/code/kicad/-/merge_requests/1444#note_1234841165) but without success.

Sorry, I'm not sure why exactly you say it was without success; reading the comments there, it seems like it did fix the problem under Mac?

Just to be clear: to use the correct locale (including, but not limited to, using the correct language) in macOS UI, e.g. show Japanese labels in the message dialogs, file dialogs etc, whenever the user language is set to Japanese, you must use wxUILocale. This does not change C locale (i.e. the one set using setlocale()) under Mac. Using wxUILocale does change C locale under Linux because there GTK itself uses C locale as "UI" locale, so both are the same. The conclusion is that if you need to change C locale under Mac too, you need to do it in addition to using wxUILocale. But ideally you wouldn't do it and just avoid using any functions depending on C locale because they're inherently not portable (I regret this as much as anybody else, but there is nothing we can do about Apple's decisions).

You are correct that setting LANG doesn't affect the decimal point, so invalid combinations seem to result in always using a decimal instead of a comma.

Yes, so the locale is not actually being set at all. AFAICS KiCad code doesn't (always) check wxLocale::Init() return value, but it returns false in this case.

vadz commented 1 year ago

If LC_ALL is set to a valid value, for example LC_ALL=de_DE on my system, wxUILocale::GetSystemLanguage() returns the correct value. US English is returned only for non-existing locales.

@utelle I found out why this didn't work for me: this function actually uses LANGUAGE as primary source of information, so if it's set (and it is, to en_US:en value, in my case, as this is its value in /etc/default/locale on my Debian Bookworm system), then changing LC_ALL or LANG doesn't affect it at all.

I'm not sure if this is the right thing to do, even if GNU manual says that it is, because, according to the same manual, LANGUAGE is used to specify the fallback translations and its first element is supposed to be the same as LANG. So I believe that a more logical and expected behaviour would be to take the first language from LANG and/or LC_ALL and then append the one from LANGUAGE to it (discarding duplicates).

@vslavik Would you know more about this by chance (I don't expect you to read all the comments here, even if I would certainly be glad for your feedback if you did, but just wanted to ask your opinion about this specific question of the relative priority of the different environment variables)?

utelle commented 1 year ago

If LC_ALL is set to a valid value, for example LC_ALL=de_DE on my system, wxUILocale::GetSystemLanguage() returns the correct value. US English is returned only for non-existing locales.

I found out why this didn't work for me: this function actually uses LANGUAGE as primary source of information, so if it's set (and it is, to en_US:en value, in my case, as this is its value in /etc/default/locale on my Debian Bookworm system), then changing LC_ALL or LANG doesn't affect it at all.

Well, wxUILocale::GetSystemLanguage() calls wxUILocaleImpl::GetPreferredUILanguages() to determine a list of preferred UI languages. And the latter method indeed just looks at the environment variable LANGUAGE, but this is on purpose, because only LANGUAGE allows specifying a list of preferred languages, while LC_ALL, LC_MESSAGES, and LANG can be used to set a single language/locale only. Actually, the main purpose of wxUILocaleImpl::GetPreferredUILanguages() is to support wxTranslations::GetBestTranslation() in determining the best translation language.

BTW, wxUILocale::GetSystemLanguage() calls wxUILocale::GetSystemLocale() if LANGUAGE is not set. And wxUILocale::GetSystemLocale() uses LC_ALL, LC_MESSAGES, and LANG to determine the locale.

I'm not sure if this is the right thing to do, even if GNU manual says that it is,

I used exactly that source of information.

because, according to the same manual, LANGUAGE is used to specify the fallback translations and its first element is supposed to be the same as LANG.

So, one could say if LANGUAGE doesn't list LANG as its first element, there is a system misconfiguration. 😉

So I believe that a more logical and expected behaviour would be to take the first language from LANG and/or LC_ALL and then append the one from LANGUAGE to it (discarding duplicates).

Sure, we can do this. It should not do any harm, because the first element of LANGUAGE is supposed to be the same as LANG. However, the GNU manual states "GNU gettext gives preference to LANGUAGE over LC_ALL and LANG for the purpose of message handling ...". Therefore I'm not so sure whether it would really be expected behaviour to prioritize LC_ALL or LANG in the context of UI translations, if the first entry of LANGUAGE is not equal to LC_ALL/LANG.

The GNU manual clearly states that LC_ALL overwrites LC_MESSAGES which in turn overwrites LANG. The implementation in wxWidgets uses exactly this logic. And that was the case also in prior versions.
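
As a rough illustration of that precedence (not the actual wxWidgets implementation), the lookup could be written like this. The environment is passed in as a map only to keep the sketch easy to test; `Env` and `messagesLocale` are hypothetical names.

```cpp
#include <map>
#include <string>

typedef std::map<std::string, std::string> Env;

// Locale used for messages: LC_ALL, then LC_MESSAGES, then LANG.
// Unset and empty variables are skipped; returns "" if none is set.
std::string messagesLocale(const Env& env)
{
    static const char* const names[] = { "LC_ALL", "LC_MESSAGES", "LANG" };
    for (const char* name : names)
    {
        Env::const_iterator it = env.find(name);
        if (it != env.end() && !it->second.empty())
            return it->second;
    }
    return "";
}
```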

Conclusion: we add LC_ALL, LC_MESSAGES, resp. LANG as an additional source of language information in wxUILocaleImpl::GetPreferredUILanguages().

However, AFAICT this change does not solve the issues with setting the locale from wxLocale or wxUILocale. Setting the locale will still fail for invalid or unknown locales. At least under Linux systems, using wxUILocale::UseDefault() will not work either, because the code uses newlocale to create a locale - and that will also fail for invalid or unknown locales.

vadz commented 1 year ago

Just to be clear, the issue with LANGUAGE is indeed a separate one and should probably be extended into another issue, so I'll do this. I've only mentioned it here to explain why I was seeing different behaviour from what you saw.

utelle commented 1 year ago

AFAICS the remaining question is what wxUILocale::GetSystemLocale() should return on Linux systems, if the default locale as set by LC_ALL, LC_MESSAGES, or LANG can't be created (that is, setlocale or newlocale fails). The current code assumes that the default locale can always be created, but that is obviously not true.

So, I guess we need to adjust method wxUILocale::GetSystemLocale() accordingly, so that the values of LC_ALL, LC_MESSAGES, and LANG are inspected, regardless of whether the locale can actually be set or not. For an invalid locale like xx_YY the method would return wxLANGUAGE_UNKNOWN, of course, but for unknown locales like fr_DE the method would return wxLANGUAGE_FRENCH.
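
A sketch of that lookup, with a tiny table standing in for the wx language database and plain strings instead of wxLANGUAGE_XXX values. The table contents and the name `languageNameFor` are made up for this example.

```cpp
#include <string>

// Tiny stand-in for the wx language database (hypothetical contents).
struct KnownLanguage { const char* code; const char* name; };
static const KnownLanguage knownLanguages[] = {
    { "de", "German" }, { "en", "English" }, { "fr", "French" },
};

// Name of the language for a locale such as "fr_DE.UTF-8", or "" if unknown.
std::string languageNameFor(const std::string& locale)
{
    // Keep only what precedes the region, codeset or modifier.
    const std::string code = locale.substr(0, locale.find_first_of("_.@"));
    for (const KnownLanguage& lang : knownLanguages)
        if (code == lang.code)
            return lang.name;
    return "";
}
```

Here fr_DE yields "French" although no such locale exists on the system, while xx_YY yields nothing, which corresponds to the wxLANGUAGE_UNKNOWN case.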

utelle commented 1 year ago

IMHO for platforms, that do not use the environment variables LANGUAGE, LC_ALL, LC_MESSAGES, and LANG for specifying the default language/locale, the current implementation of wxUILocale::GetSystemLocale() is the correct approach:

/*static*/
int wxUILocale::GetSystemLocale()
{
    // Create default wxUILocale
    wxUILocale defaultLocale(wxUILocaleImpl::CreateUserDefault());

    // Find corresponding wxLanguageInfo
    const wxLanguageInfo* defaultLanguage = wxUILocale::FindLanguageInfo(defaultLocale.GetLocaleId());
    return defaultLanguage ? defaultLanguage->Language : wxLANGUAGE_UNKNOWN;
}

For platforms, that do use the environment variables LANGUAGE, LC_ALL, LC_MESSAGES, and LANG for specifying the default language/locale, I would change the implementation of wxUILocale::GetSystemLocale() as follows (but look at the comment at the beginning):

/*static*/
int wxUILocale::GetSystemLocale()
{
// Not sure which symbols should be used to identify wx ports
// that use LANGUAGE, LC_ALL, LC_MESSAGES, LANG
// __WXGTK__ may not be enough. 
#if defined(__WXGTK__)
    const wxLanguageInfo* defaultLanguage = nullptr;
    wxVector<wxString> preferred = wxUILocale::GetPreferredUILanguages();
    if (!preferred.empty())
    {
        defaultLanguage = wxUILocale::FindLanguageInfo(preferred[0]);
    }
#else 
    // Create default wxUILocale
    wxUILocale defaultLocale(wxUILocaleImpl::CreateUserDefault());

    // Find corresponding wxLanguageInfo
    const wxLanguageInfo* defaultLanguage = wxUILocale::FindLanguageInfo(defaultLocale.GetLocaleId());
#endif
    return defaultLanguage ? defaultLanguage->Language : wxLANGUAGE_UNKNOWN;
}

I'm not happy with using #ifdefs in this method, but since wxUILocaleImpl::CreateUserDefault() can fail to instantiate the default locale internally and will then fall back to the C locale without reporting this, I see no other way without major effort.

Still, a problem remains. wxUILocale::UseDefault() will not fail, even for an unknown or invalid default locale, because this situation is not detected. wxUILocaleImpl::CreateUserDefault() creates a wxUILocaleImpl instance (with m_locale=nullptr due to failing newlocale) and method wxUILocaleImpl::Use() has no means to report that setting the default locale failed.

vslavik commented 1 year ago

@vslavik Would you know more about [...] this specific question of the relative priority of the different environment variables)? ... So I believe that a more logical and expected behaviour would be to take the first language from LANG and/or LC_ALL and then append the one from LANGUAGE to it (discarding duplicates).

I don't have knowledge, only opinion, from the same quoted source. I'm inclined to agree that LANG not being first in LANGUAGE is a misconfiguration, but if it happens in the wild, then yes, the more reasonable behavior of GetPreferredUILanguages() seems to me to be what you describe, i.e. put LANG first, then LANGUAGE.

(Note that LANG and LC_ALL are single-locale, not lists.)
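
Roughly, the ordering described here could look like the following. This is a sketch in standard C++ with hypothetical names, not the GetPreferredUILanguages() implementation. It compares entries verbatim, so a real version would presumably have to normalize codesets first ("en_US.UTF-8" vs "en_US").

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// `primary` is the single locale from LC_ALL/LC_MESSAGES/LANG (may be empty),
// `languageVar` is the colon-separated value of LANGUAGE (may be empty).
// Returns `primary` first, then the LANGUAGE entries, without duplicates.
std::vector<std::string> preferredLanguages(const std::string& primary,
                                            const std::string& languageVar)
{
    std::vector<std::string> result;
    if (!primary.empty())
        result.push_back(primary);

    std::istringstream entries(languageVar);
    std::string entry;
    while (std::getline(entries, entry, ':'))
    {
        if (!entry.empty() &&
            std::find(result.begin(), result.end(), entry) == result.end())
            result.push_back(entry);
    }
    return result;
}
```

With LANG=de_DE and LANGUAGE=en_US:en this gives de_DE, en_US, en. In the correctly configured case, where LANGUAGE already starts with the LANG value, the result is the same as using LANGUAGE alone.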

vadz commented 1 year ago

I'd like to return to this and start by formulating the problems that I see:

  1. Under Mac, wxUILocale::GetCurrent().GetLocaleId() returns nothing initially. This is true even for the systems using the boring en-US locale. For the record, under both Linux and MSW this returns "C" initially. There is probably not that much difference between empty and "C", but it seems like it would be better to be consistent. And while returning empty locale if none has been set actually does make sense, I think it's going to be more difficult to implement this semantics for the other platforms than returning "C" by default under Mac, so I'm going to do this (unless there are any strong objections).
  2. Under Mac, wxLocale::Init(wxLANGUAGE_DEFAULT) fails in en-FR locale. It "succeeds" under the other platforms, but the funny thing is that it works correctly, i.e. actually uses comma as decimal separator, under MSW and under Mac too, but does not switch to using French decimal separator under Linux in spite of not returning an error there.
  3. Under Mac, wxUILocale::GetSystemLocale() returns wxLANGUAGE_UNKNOWN in en-FR locale. Under MSW it returns wxLANGUAGE_FRENCH, which is not obviously correct either, but much better. Under Linux it returns wxLANGUAGE_ENGLISH but this is just because it uses the default "C" locale there. Note that wxUILocale::GetSystemLanguage() does work correctly everywhere (but under Linux it's again just a coincidence).

So, AFAICS, MSW behaviour is close to ideal, with the only problem being that it seems like it would be better to return en-FR from GetTag() for the current locale when it's using English language with French regional settings, but I'm not even sure about this and won't change this for now.

Linux behaviour is consistently bad and we just don't support any "mixed" locales (neither fr_US nor en_FR) unless they're actually supported by the system, which doesn't seem to ever be the case. I think we need to improve this, but I'm not sure how exactly yet. I'm going to open another issue about this.

Mac behaviour is almost good, actually, but not quite because of the problems listed above. I'm going to fix those because the way to do it seems straightforward, i.e. I will make it return "C" initially, recognize "unknown" system locale using the "Region" setting of the "Settings" app and return true from wxLocale::Init(wxLANGUAGE_DEFAULT). Please let me know if anyone disagrees.

vadz commented 1 year ago

As always, more problems arose when trying to actually do it: it seems like we just don't have enough information to implement this correctly currently, as we need to be able to somehow map the region to the language. It can be done in some cases, e.g. for en-FR or fr-DE it's indeed simple to return wxLANGUAGE_FRENCH or GERMAN. But what about ru-BE, which is what we get from both macOS and Windows 10 when configuring them to use Russian language with Belgium locale? Using the same logic as above would result in returning wxLANGUAGE_BELARUSIAN which is definitely not what we want. And there are several locales using "BE" as region (nl-BE, fr-BE, de-BE and even en-BE).

It seems like we have no choice but to return wxLANGUAGE_UNKNOWN in this case. And this means we need to provide some new static wxUILocale::GetSystemLocaleIdent(), which would return the full information.
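
To make the ambiguity concrete, here is a small sketch with a hypothetical and deliberately incomplete region table. A language can only be deduced when the region is used with exactly one language. Note also that the region code "BE" (Belgium) merely looks like the language code "be" (Belarusian), which is how the naive approach goes wrong.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, deliberately tiny table of regions and languages used there.
static const std::map<std::string, std::vector<std::string>> regionLanguages = {
    { "FR", { "fr" } },
    { "DE", { "de" } },
    { "BE", { "nl", "fr", "de" } },
    { "CH", { "de", "fr", "it" } },
};

// Language implied by a region, or "" when the region is unknown
// or used with more than one language.
std::string languageForRegion(const std::string& region)
{
    const auto it = regionLanguages.find(region);
    if (it == regionLanguages.end() || it->second.size() != 1)
        return "";
    return it->second.front();
}
```

For "FR" this returns "fr", but for "BE" or "CH" it returns nothing, which matches the wxLANGUAGE_UNKNOWN outcome above.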

utelle commented 1 year ago

I'd like to return to this and start by formulating the problems that I see:

  1. Under Mac, wxUILocale::GetCurrent().GetLocaleId() returns nothing initially. This is true even for the systems using the boring en-US locale. For the record, under both Linux and MSW this returns "C" initially. There is probably not that much difference between empty and "C", but it seems like it would be better to be consistent. And while returning empty locale if none has been set actually does make sense, I think it's going to be more difficult to implement this semantics for the other platforms than returning "C" by default under Mac, so I'm going to do this (unless there are any strong objections).

If no locale has been set, a call to wxUILocale::GetCurrent() tries to create a standard C locale using the locale identifier wxLocaleIdent().Language("C"). Under Linux "C" is a valid locale identifier. Under Windows we have a special class wxUILocaleImplStdC for this purpose.

The problem is that under Mac the "C" locale is not included in the list of availableLocaleIdentifiers, and therefore creation fails and a null pointer is returned. And therefore methods GetLocaleId() or GetName() can't return anything else than an empty locale id or an empty string.

I have no idea whether it is somehow possible to create NSLocale object for standard C. Maybe the best option is to create a locale for en_US in that case. Alternatively, we could use the Windows approach by creating a special class wxUILocaleImplStdC.

  2. Under Mac, wxLocale::Init(wxLANGUAGE_DEFAULT) fails in en-FR locale. It "succeeds" under the other platforms, but the funny thing is that it works correctly, i.e. actually uses comma as decimal separator, under MSW and under Mac too, but does not switch to using French decimal separator under Linux in spite of not returning an error there.

I guess that under Linux the call to setlocale simply fails for the (unknown) locale en_FR and that the code does not check for this failure. Therefore the standard C locale will still be used.

  3. Under Mac, wxUILocale::GetSystemLocale() returns wxLANGUAGE_UNKNOWN in en-FR locale. Under MSW it returns wxLANGUAGE_FRENCH, which is not obviously correct either, but much better. Under Linux it returns wxLANGUAGE_ENGLISH but this is just because it uses the default "C" locale there. Note that wxUILocale::GetSystemLanguage() does work correctly everywhere (but under Linux it's again just a coincidence).

Well, wxUILocale::GetSystemLanguage() checks the preferred UI language. And my PR #23147 should improve the Linux implementation.

wxUILocale::GetSystemLocale() on the other hand tries to create the default system locale and tries to determine the language/locale from this source and to look it up in the wx language database.

It is surprising that you get wxLANGUAGE_FRENCH for en-FR locale under MSW. Have you checked which locale is actually instantiated as the default locale?

My proposal given in a previous post should improve the Linux implementation.

So, AFAICS, MSW behaviour is close to ideal, with the only problem being that it seems like it would be better to return en-FR from GetTag() for the current locale when it's using English language with French regional settings, but I'm not even sure about this and won't change this for now.

I will have to test the situation under MSW.

Linux behaviour is consistently bad and we just don't support any "mixed" locales (neither fr_US nor en_FR) unless they're actually supported by the system, which doesn't seem to ever be the case. I think we need to improve this, but I'm not sure how exactly yet. I'm going to open another issue about this.

Under Linux the function setlocale is used to actually set the locale. If the system does not support the requested "mixed" locale there is not much we can do about it, because setlocale will fail in that case.
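One conceivable way to degrade gracefully when the mixed locale is unknown is to try a list of candidates in order. The sketch below assumes a fallback policy (exact match, then the language with a caller-supplied default region, then "C") that is purely hypothetical and not necessarily what wxWidgets does; LocaleCandidates and SetFirstAvailableLocale are illustrative names.

```cpp
#include <clocale>
#include <string>
#include <vector>

// Builds an ordered list of locale names to try for a language/region pair.
// The order used here is an assumed policy, for illustration only.
std::vector<std::string> LocaleCandidates(const std::string& language,
                                          const std::string& region,
                                          const std::string& defaultRegion)
{
    std::vector<std::string> regions{region};
    if (defaultRegion != region)
        regions.push_back(defaultRegion);

    std::vector<std::string> candidates;
    for (const std::string& r : regions)
    {
        const std::string base = language + "_" + r;
        candidates.push_back(base + ".UTF-8"); // name with explicit encoding
        candidates.push_back(base);            // name without encoding
    }
    candidates.push_back("C"); // last resort: the standard C locale
    return candidates;
}

// Returns the first candidate accepted by setlocale(), or an empty string
// if none of them could be set.
std::string SetFirstAvailableLocale(const std::vector<std::string>& candidates)
{
    for (const std::string& name : candidates)
    {
        if (std::setlocale(LC_ALL, name.c_str()) != nullptr)
            return name;
    }
    return std::string();
}
```

Note that falling back from en_FR to en_US keeps the language but loses the French number and date formatting, so such a fallback is a compromise rather than a real solution.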

Mac behaviour is almost good, actually, but not quite because of the problems listed above. I'm going to fix those because the way to do it seems straightforward, i.e. I will make it return "C" initially, recognize "unknown" system locale using the "Region" setting of the "Settings" app and return true from wxLocale::Init(wxLANGUAGE_DEFAULT). Please let me know if anyone disagrees.

As mentioned above creating a special locale implementation for standard C (as under MSW) might be the easiest way to fix this.

utelle commented 1 year ago

As always, more problems arose when trying to actually do it: it seems like we just don't have enough information to implement this correctly currently, as we need to be able to somehow map the region to the language.

Mapping region to language uniquely in general is simply impossible. For example, take Belgium. Which language would you choose? Dutch (nl-BE) or French (fr-BE)? Similar for the Netherlands: Dutch (nl-NL) or Frisian (fy-NL)? Even worse for Switzerland: German (de-CH), French (fr-CH), or Italian (it-CH)?

It can be done in some cases, e.g. for en-FR or fr-DE it's indeed simple to return wxLANGUAGE_FRENCH or GERMAN. But what about ru-BE, which is what we get from both macOS and Windows 10 when configuring them to use Russian language with Belgium locale? Using the same logic as above would result in returning wxLANGUAGE_BELARUSIAN which is definitely not what we want. And there are several locales using "BE" as region (nl-BE, fr-BE, de-BE and even en-BE).

You can't use the region code as the language code. If we really need a mapping from region to language, we will have to set up a mapping ourselves (as we did for languages/locales). However, IMHO we shouldn't do that. If the system supports "mixed" locales we should leave it to the system to handle this. Windows for example will try to handle a locale like "fr-US", although it is not a known locale.
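The ru-BE pitfall can be shown with a few lines of plain C++. This is only a sketch: SplitTag and NaiveLanguageFromRegion are hypothetical helpers, and the parsing handles just the simple language-REGION form (no script subtags, no encoding suffix).

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Language and region parts of a simple tag such as "ru-BE" or "en_FR".
struct LocaleTag
{
    std::string language;
    std::string region;
};

// Splits at the first '-' or '_'; a tag without separator has no region.
LocaleTag SplitTag(const std::string& tag)
{
    const std::string::size_type sep = tag.find_first_of("-_");
    if (sep == std::string::npos)
        return {tag, ""};
    return {tag.substr(0, sep), tag.substr(sep + 1)};
}

// The naive (and wrong) approach discussed above: lower-case the region
// code and treat the result as if it were a language code.
std::string NaiveLanguageFromRegion(const LocaleTag& tag)
{
    std::string code = tag.region;
    std::transform(code.begin(), code.end(), code.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return code;
}
```

For "en-FR" the naive approach yields "fr", which happens to be the code for French. For "ru-BE" it yields "be", the ISO 639-1 code for Belarusian, although the region is Belgium.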

Under Linux "mixed" locales are usually not supported. If the user does not set a supported locale as the default, it is a misconfiguration.

"Mixed" locales actually only make sense if you want numbers, dates and so on to be formatted according to the region where you are located, while the UI uses a language you are used to. For example, the UI should be in German (de-DE), but the formatting as in the US (en-US).

Under Linux the environment variables LC_*, LANG, and LANGUAGE can be used to configure the settings to your liking. However, it will only work properly if known values are used.
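As a rough illustration of how these variables interact, the sketch below follows the convention documented for GNU gettext as I understand it: LC_ALL overrides LC_MESSAGES, which overrides LANG, and LANGUAGE is a colon-separated priority list that is ignored when the resulting locale is "C". It is not a description of wxWidgets' own lookup. The struct and function names are hypothetical, and the values are passed in explicitly instead of being read with getenv.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Values of the relevant environment variables (empty string = not set).
struct LocaleEnv
{
    std::string language;   // LANGUAGE, e.g. "en_US:de"
    std::string lcAll;      // LC_ALL
    std::string lcMessages; // LC_MESSAGES
    std::string lang;       // LANG
};

// Locale used for messages: the first non-empty of LC_ALL, LC_MESSAGES, LANG.
std::string EffectiveMessagesLocale(const LocaleEnv& env)
{
    if (!env.lcAll.empty())
        return env.lcAll;
    if (!env.lcMessages.empty())
        return env.lcMessages;
    if (!env.lang.empty())
        return env.lang;
    return "C";
}

// Ordered list of languages to try for translations.
std::vector<std::string> MessageLanguages(const LocaleEnv& env)
{
    const std::string locale = EffectiveMessagesLocale(env);
    if (locale == "C" || locale == "POSIX")
        return {}; // LANGUAGE is ignored, no translation is used
    if (env.language.empty())
        return {locale};

    std::vector<std::string> items;
    std::istringstream in(env.language);
    std::string item;
    while (std::getline(in, item, ':'))
    {
        if (!item.empty())
            items.push_back(item);
    }
    return items;
}
```

With LANGUAGE=en_US:de and LC_ALL=fr_FR this gives the list en_US, de for messages, while the formatting categories still follow fr_FR.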

It seems like we have no choice but to return wxLANGUAGE_UNKNOWN in this case. And this means we need to provide some new static wxUILocale::GetSystemLocaleIdent(), which would return the full information.

Hm, the methods wxUILocale::GetSystemLanguage() and wxUILocale::GetSystemLocale() return an enum wxLanguage value. That is, they will work only for "known" locales (included in our language database). The methods exist only to support wxLocale. wxUILocale has method wxUILocale::GetLocaleId() etc to retrieve the full information.

vadz commented 1 year ago

I think we mostly agree about everything here, I'd just like to say that I couldn't finish this today but I've a WIP branch with the fixes for the behaviour under Mac that I hope to finish tomorrow or, at worst, on Monday.

utelle commented 1 year ago

I think we mostly agree about everything here, I'd just like to say that I couldn't finish this today but I've a WIP branch with the fixes for the behaviour under Mac that I hope to finish tomorrow or, at worst, on Monday.

Take your time. Unfortunately, this issue is anything but trivial.

However, I have to get back to the situation under Windows:

So, AFAICS, MSW behaviour is close to ideal, with the only problem being that it seems like it would be better to return en-FR from GetTag() for the current locale when it's using English language with French regional settings, but I'm not even sure about this and won't change this for now.

Actually, wxUILocale::IsSupported() returns false for the tag en-FR. This can be seen in the internat sample, using the test schema menu option. What I would like to know is which settings you used for your tests under Windows.

In the systems settings of Windows 10 you have a menu entry Language where you can set the Windows Display Language. There you can also add additional languages. However, adding English (France) is impossible, since this combination is not listed.

Under the Region menu entry there are 2 places where you can change settings:

  1. Region: here you can select a geographical region like France
  2. Regional formatting: here you can select a locale like French (France). However, English (France) is not among the choices.

So, which settings did you actually use for your tests?

vadz commented 1 year ago

Under Windows you have to install the appropriate language to be able to choose its region. I.e. my system language is English, but I also have French (and a few others, including Chinese...) in the list of languages and then I can choose between all the countries/regions using English or French or others in the "Regional formatting" dropdown.

utelle commented 1 year ago

Under Windows you have to install the appropriate language to be able to choose its region.

Yes, I am aware of that.

I.e. my system language is English, but I also have French (and a few others, including Chinese...) in the list of languages

My list of languages is certainly a bit shorter than yours, but I have German, English, and French installed (German being my system language).

and then I can choose between all the countries/regions using English or French or others in the "Regional formatting" dropdown.

Your dropdown for "Regional formatting" really contains all language/region combinations? On my system that is definitely not the case. For example, for German my dropdown contains combinations with Belgium, Germany, Italy, Liechtenstein, Luxembourg, Austria, and Switzerland. I can't choose German (France) or German (United States), for example. The same holds for English and French: the lists of supported regions are longer, but they do not include English (France) or French (United States). English (Germany) is included, however.

vadz commented 1 year ago

Sorry, I misunderstood. It doesn't contain "English (France)", only "French (France)", see

region_combobox

But this is the region setting, so it only changes the region/locale, not the language. Hence it's the same situation as under Mac with choosing English as the language and France as the region.

The difference is that Mac actually uses en-FR locale in this case while MSW uses fr-FR locale -- but "en" as the preferred language.

vadz commented 1 year ago

The problems mentioned in this comment are fixed, at least as well as possible: GetSystemLocale() still returns wxLANGUAGE_UNKNOWN for mixed locales, as we really can't do much else there, but at least you can now use the new GetSystemLocaleId() to get this information.

The second one of these problems is actually what the original bug was about, i.e. using wxLocale::Init(wxLANGUAGE_DEFAULT) on a system using en-DE locale now doesn't fail any longer and actually does the right thing, i.e. wxUILocale::GetInfo() (and all code depending on it, such as wxNumberFormatter) should return the correct values.
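What "the correct values" means for number formatting can be illustrated without wxWidgets. The following standard C++ sketch installs a custom numpunct facet so that numbers are written with a comma as decimal separator, which is the effect expected in a French region regardless of the UI language. CommaDecimalPoint and FormatWithCommaDecimal are hypothetical names; this is an analogy, not how wxNumberFormatter is implemented.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Facet that only overrides the decimal separator, everything else
// (for example, no digit grouping) stays as in the classic "C" locale.
class CommaDecimalPoint : public std::numpunct<char>
{
protected:
    char do_decimal_point() const override { return ','; }
};

// Formats a number using the classic locale plus the facet above.
std::string FormatWithCommaDecimal(double value)
{
    std::ostringstream out;
    // The std::locale object takes ownership of the facet.
    out.imbue(std::locale(std::locale::classic(), new CommaDecimalPoint));
    out << value;
    return out.str();
}
```

For example, FormatWithCommaDecimal(3.5) produces "3,5", while any text produced by the program remains in whatever language it was written in.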

So, AFAICS, there doesn't remain anything else to be done here for Mac. There are still some potential improvements to be done for Linux, but there are other issues for them.

Please test the changes of PR #23226 if you can because I'd also like to include it in the upcoming 3.2.2 release. TIA!

utelle commented 1 year ago

Sorry, I misunderstood. It doesn't contain "English (France)", only "French (France)", see

But this is the region setting, so it only changes the region/locale, not the language. Hence it's the same situation as under Mac with choosing English as the language and France as the region.

Ok, so under Windows your setting is "UI language = English (US)" and "Regional formatting = French (France)". Therefore wxUILocale::GetSystemLanguage() will return wxLANGUAGE_ENGLISH_US, while wxUILocale::GetSystemLocale() will return wxLANGUAGE_FRENCH_FRANCE. What else could we do?

The problem with this approach is that the calendar control in the internat sample will show French month and day names, while the rest of the UI is in English.

BTW, the generic calendar control does not translate anything, but uses what it gets from the underlying date formatting functions.

Under Linux there is typically no support for "mixed" locales. "en_FR" is unknown. To achieve the wanted effect, one could define LC_ALL=fr_FR and LANGUAGE=en_US.

The difference is that Mac actually uses en-FR locale in this case while MSW uses fr-FR locale -- but "en" as the preferred language.

On my system I saw en-DE, when I changed the language to English. However, AFAICR testing for availability of locale en-DE returned false.

So, AFAICS, there doesn't remain anything else to be done here for Mac. There are still some potential improvements to be done for Linux, but there are other issues for them.

Please test the changes of PR #23226 if you can because I'd also like to include it in the upcoming 3.2.2 release. TIA!

I'll try to test the changes by tomorrow evening.

utelle commented 1 year ago

In my previous post I wrote:

Under Linux there is typically no support for "mixed" locales. "en_FR" is unknown. To achieve the wanted effect, one could define LC_ALL=fr_FR and LANGUAGE=en_US.

Unfortunately, with our applied modifications to wxUILocaleImpl::GetPreferredUILanguages() this will not work, because LC_ALL will now take priority over LANGUAGE.

vadz commented 1 year ago

The problem originally reported in this issue has now been fixed in both master and 3.2. The new test used to report

-------------------------------------------------------------------------------
wxUILocale::ShowSystem
-------------------------------------------------------------------------------
tests/intl/intltest.cpp:480
...............................................................................

tests/intl/intltest.cpp:485:
warning:
  System locale:        UNKNOWN
  System language:      English

tests/intl/intltest.cpp:487:
warning:
  Before calling any locale functions
  current locale:       NONE (decimal separator: point)

tests/intl/intltest.cpp:490: FAILED:
  CHECK( locDef.Init(wxLANGUAGE_DEFAULT, wxLOCALE_DONT_LOAD_DEFAULT) )
with expansion:
  false

tests/intl/intltest.cpp:491:
warning:
  After wxLocale::Init(wxLANGUAGE_DEFAULT)
  current locale:       en_FR (decimal separator: comma)

tests/intl/intltest.cpp:494:
warning:
  After wxUILocale::UseDefault()
  current locale:       en_FR (decimal separator: comma)

tests/intl/intltest.cpp:504:
warning:
  Preferred UI languages:
  en-FR, ru-FR, fr-FR

on a system configured to use English language in French region but now reports

-------------------------------------------------------------------------------
wxUILocale::ShowSystem
-------------------------------------------------------------------------------
tests/intl/intltest.cpp:481
...............................................................................

tests/intl/intltest.cpp:488:
warning:
  System locale identifier:     en-FR
  System locale as language:    UNKNOWN
  System language identifier:   English

tests/intl/intltest.cpp:490:
warning:
  Before calling any locale functions
  current locale:       C (decimal separator: point)

tests/intl/intltest.cpp:494: warning:
  After wxLocale::Init(wxLANGUAGE_DEFAULT)
  current locale:       en_FR (decimal separator: comma)

tests/intl/intltest.cpp:497: warning:
  After wxUILocale::UseDefault()
  current locale:       en_FR (decimal separator: comma)

tests/intl/intltest.cpp:507: warning:
  Preferred UI languages:
  en-FR, ru-FR, fr-FR

i.e. wxLocale::Init(wxLANGUAGE_DEFAULT) now succeeds and, although wxUILocale::GetSystemLocale() still returns wxLANGUAGE_UNKNOWN, there is now a new wxUILocale::GetSystemLocaleId() function which returns "en-FR" string describing the locale.

Additionally, there have been a number of fixes for Linux/other Unix systems too.

In any case, please test and reopen this one if you still see anything wrong under Mac.

TIA!