niXman / mingw-builds

Scripts for building the 32 and 64-bit MinGW-W64 compilers for Windows

localization of GCC #666

Closed anbangli closed 1 month ago

anbangli commented 9 months ago

I use both Windows and Linux with UI language set as Chinese. I noticed that GCC's output information is Chinese on Linux, but is English on Windows.

Does the difference come from the localization options used when building?

I hope GCC's output information is Chinese on Windows too.

niXman commented 9 months ago

I think it is because of using --disable-nls which is used by default for configuring GCC: https://github.com/niXman/mingw-builds/blob/develop/scripts/gcc-13.2.0.sh#L122

from the man:

--enable-nls
--disable-nls
The --enable-nls option enables Native Language Support (NLS), which lets GCC output diagnostics in languages other than American English. Native Language Support is enabled by default if not doing a canadian cross build. The --disable-nls option disables NLS.

Note that this functionality requires either libintl (provided by GNU gettext) or C standard library that contains support for gettext (such as the GNU C Library). See [--with-included-gettext](https://gcc.gnu.org/install/configure.html#with-included-gettext) for more information on the conditions required to get gettext support.

yuanpeirong commented 9 months ago

I ran the following three tests, and all of them failed:

1. In "gcc-13.2.0.sh", replaced "--disable-nls" with "--enable-nls". (https://github.com/niXman/mingw-builds/commit/dd9a546a3d595c26fcd73610a600ace897d02015)

2. In "gcc-13.2.0.sh", added: --with-libintl-prefix=/usr/lib --with-libintl-type=autostaticshared --with-included-gettext (https://github.com/niXman/mingw-builds/commit/247aa04640049b7c50efdedf5f20eb9ab60e4d0f)

3. Revised "gcc\intl.cc" in "gcc-13.2.0.tar.xz" and replaced "--disable-nls" with "--enable-nls" in all files. (https://github.com/niXman/mingw-builds/commit/2ecf0108ca70f60375556e1c73a67281d7ff53a5)

The revised intl.cc:

//----------yuanpeirong-add----------
#ifdef WIN32
#include <windows.h>

/* Return TRUE if lpszPath names an existing directory.  */
BOOL DirectoryExists(LPSTR lpszPath)
{
    WIN32_FIND_DATA wfd;
    BOOL bResult = FALSE;
    HANDLE hFind = FindFirstFile(lpszPath, &wfd);
    if (hFind != INVALID_HANDLE_VALUE)
    {
        if (wfd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
            bResult = TRUE;
        /* Only close a handle that was actually opened.  */
        FindClose(hFind);
    }
    return bResult;
}
#endif
//----------yuanpeirong-add----------

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "intl.h"

#ifdef HAVE_LANGINFO_CODESET
#include <langinfo.h>
#endif

/* Opening quotation mark for diagnostics.  */
const char *open_quote = "'";

/* Closing quotation mark for diagnostics.  */
const char *close_quote = "'";

/* The name of the locale encoding.  */
const char *locale_encoding = NULL;

/* Whether the locale is using UTF-8.  */
bool locale_utf8 = false;

#ifdef ENABLE_NLS

/* Initialize the translation library for GCC.  This performs the
   appropriate sequence of calls - setlocale, bindtextdomain,
   textdomain.  LC_CTYPE determines the character set used by the
   terminal, so it has be set to output messages correctly.  */

void
gcc_init_libintl (void)
{
#ifdef HAVE_LC_MESSAGES
  setlocale (LC_CTYPE, "");
  setlocale (LC_MESSAGES, "");
#else
  setlocale (LC_ALL, "");
#endif

  //----------yuanpeirong-del----------
  //(void) bindtextdomain ("gcc", LOCALEDIR);
  //----------yuanpeirong-del----------

    //----------yuanpeirong-add----------
    #ifndef WIN32
        (void) bindtextdomain ("gcc", LOCALEDIR);
    #else
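        /* GetModuleFileNameA returns the number of characters copied, and
           returns the buffer size itself when the path was truncated, so
           grow the buffer until the path fits.  The extra 20 bytes leave
           room for the "share\\locale" suffix appended below.  */
        DWORD dwSize = MAX_PATH;
        LPSTR lpszName = (LPSTR) xmalloc(dwSize + 20);
        lpszName[0] = 0;
        while (GetModuleFileNameA(NULL, lpszName, dwSize) >= dwSize)
        {
          dwSize *= 2;
          lpszName = (LPSTR) xrealloc(lpszName, dwSize + 20);
        }

        /* Strip the executable name, leaving its directory.  Stop at
           index 0 so the buffer is never written before its start.  */
        int l = strlen(lpszName);
        while (l > 0)
        {
          if (lpszName[l] == '\\')
              break;
          l--;
        }
        lpszName[l] = 0;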

        l = strlen(lpszName);
        while (l > 0)
        {
          if (lpszName[l] == '\\')
              break;
          l--;
        }

        if (lpszName[l] != '\\')
            (void) bindtextdomain ("gcc", LOCALEDIR);
        else
        {
            lpszName[l + 1] = 0;
            strcat(lpszName, "share\\locale");
            if (DirectoryExists(lpszName))
              (void) bindtextdomain ("gcc", lpszName);
            else
              (void) bindtextdomain ("gcc", LOCALEDIR);
        }

        free(lpszName);
    #endif
    //----------yuanpeirong-add----------

  (void) textdomain ("gcc");

  /* Opening quotation mark.  */
  open_quote = _("`");

  /* Closing quotation mark.  */
  close_quote = _("'");

#if defined HAVE_LANGINFO_CODESET
  locale_encoding = nl_langinfo (CODESET);
  if (locale_encoding != NULL
      && (!strcasecmp (locale_encoding, "utf-8")
      || !strcasecmp (locale_encoding, "utf8")))
    locale_utf8 = true;
#endif

  if (!strcmp (open_quote, "`") && !strcmp (close_quote, "'"))
    {
      /* Untranslated quotes that it may be possible to replace with
     U+2018 and U+2019; but otherwise use "'" instead of "`" as
     opening quote.  */
      open_quote = "'";
#if defined HAVE_LANGINFO_CODESET
      if (locale_utf8)
    {
      open_quote = "\xe2\x80\x98";
      close_quote = "\xe2\x80\x99";
    }
#endif
    }
}

#if defined HAVE_WCHAR_H && defined HAVE_WORKING_MBSTOWCS && defined HAVE_WCSWIDTH
#include <wchar.h>

/* Returns the width in columns of MSGSTR, which came from gettext.
   This is for indenting subsequent output.  */

size_t
gcc_gettext_width (const char *msgstr)
{
  size_t nwcs = mbstowcs (0, msgstr, 0);
  wchar_t *wmsgstr = XALLOCAVEC (wchar_t, nwcs + 1);

  mbstowcs (wmsgstr, msgstr, nwcs + 1);
  return wcswidth (wmsgstr, nwcs);
}

#else  /* no wcswidth */

/* We don't have any way of knowing how wide the string is.  Guess
   the length of the string.  */

size_t
gcc_gettext_width (const char *msgstr)
{
  return strlen (msgstr);
}

#endif

#endif /* ENABLE_NLS */

#ifndef ENABLE_NLS

const char *
fake_ngettext (const char *singular, const char *plural, unsigned long n)
{
  if (n == 1UL)
    return singular;

  return plural;
}

#endif

/* Return the indent for successive lines, using the width of
   the STR.  STR must have been translated already.  The string
   must be freed by the caller.  */

char *
get_spaces (const char *str)
{
   size_t len = gcc_gettext_width (str);
   char *spaces = XNEWVEC (char, len + 1);
   memset (spaces, ' ', len);
   spaces[len] = '\0';
   return spaces;
}

niXman commented 9 months ago

@yuanpeirong could you please just replace --disable-nls with --enable-nls here: https://github.com/niXman/mingw-builds/blob/develop/scripts/gcc-13.2.0.sh#L122 and rebuild the 13.2.0, and report the result back?

yuanpeirong commented 9 months ago

My test 1 was exactly that: in "gcc-13.2.0.sh", replace "--disable-nls" with "--enable-nls". https://github.com/niXman/mingw-builds/commit/dd9a546a3d595c26fcd73610a600ace897d02015

but it is failed.

niXman commented 9 months ago

> but it is failed.

what does this mean? please provide the error report.

yuanpeirong commented 9 months ago

For example, take this hello.cpp file:

//hello.cpp
#include <iostream>
using namespace std;
int main(){
    cout1 << "Hello World." << endl;
    return 0;
}

On Linux, g++ hello.cpp -o hello prints the error in my UI language (Chinese), like this:
错误:‘cout1’在此作用域中尚未声明

On Windows, I only replaced --disable-nls with --enable-nls here: https://github.com/niXman/mingw-builds/blob/develop/scripts/gcc-13.2.0.sh#L122 and rebuilt 13.2.0. Running g++ hello.cpp -o hello.exe there does not print the error in Chinese; it still prints:
error: 'cout1' was not declared in this scope

niXman commented 9 months ago

@yuanpeirong It wasn't clear from your first message that the compiler had been built successfully =)

In that case you need to inspect the GCC configuration log file. Please also post it here.

yuanpeirong commented 9 months ago

Here is the build: http://www.yuanpeirong.com/gcc/x86_64-13.2.0-release-posix-seh-ucrt-rt_v11-rev0.7z I only replaced --disable-nls with --enable-nls here: https://github.com/niXman/mingw-builds/blob/develop/scripts/gcc-13.2.0.sh#L122 and rebuilt 13.2.0.

niXman commented 9 months ago

ok, then please provide the GCC configuration log file.

yuanpeirong commented 9 months ago

Please check it out: build-info.txt Thank you.

niXman commented 9 months ago

This is not the configuration log but the configuration RESUME file. The configuration log is the file generated by the configure script, which is part of the GCC source code.

According to the provided RESUME file, I think it should be located under /c/buildroot/gcc/... Please give me the list of entries in the /c/buildroot directory.

yuanpeirong commented 9 months ago

It was built in GitHub Actions: https://github.com/yuanpeirong/mingw-builds/actions/runs/7384217528/job/20086722002 x86_64 _posix_seh ucrt.txt

niXman commented 9 months ago

unfortunately I don't know how to view the required file on Github Actions...

yuanpeirong commented 9 months ago

Could you help me rebuild it?

niXman commented 9 months ago

please read the README: https://github.com/niXman/mingw-builds/blob/develop/README.md

It looks up to date.

anbangli commented 9 months ago

We now have a build with --enable-nls. On a Chinese-language Windows 7, we got the following results:

Running "gcc", "gcc -v" or "gcc --help" now prints messages in Chinese. For example:
d:\mingw-zh_cn\mingw64>gcc
gcc: 致命错误:没有输入文件

When compiling a source file, the messages are still in English. For example:
d:\mingw-zh_cn\mingw64>gcc hello.cpp
hello.cpp: In function 'int main()':
hello.cpp:7:9: error: 'cout1' was not declared in this scope
7 | cout1 << "Hello, World!" << endl;
| ^~~~~

What do we need to do to get the compilation messages in Chinese too?

niXman commented 9 months ago

hmm... I will ask on GCC mailing list...

niXman commented 9 months ago

done: https://gcc.gnu.org/pipermail/gcc-help/2024-January/143169.html

niXman commented 9 months ago

@anbangli

please read: https://gcc.gnu.org/pipermail/gcc-help/2024-January/143170.html

Please answer the questions at the link.

anbangli commented 9 months ago

I have just read LIU Hao's comment at https://gcc.gnu.org/pipermail/gcc-help/2024-January/143172.html . I think we have found the reason, as follows.

For the commands "gcc -v" and "gcc --help", the messages are printed by the program bin/gcc.exe. Using the relative path "../share/locales/zh_CN/LC_MESSAGES/", this program finds zh_CN.po correctly.

When compiling, however, the messages are printed by the program libexec\gcc\x86_64-w64-mingw32\11.2.0\cc1plus.exe, which cannot find zh_CN.po through the relative path "../share/locales/zh_CN/LC_MESSAGES/".
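
For reference, a small sketch of how a gettext-style runtime composes the name of the catalog file it looks for. The installed catalog is normally the compiled `.mo` file named after the text domain (a `.po` file is its source form), and the domain "gcc" matches the textdomain ("gcc") call in intl.cc above. `catalog_path` is a hypothetical helper written for illustration; real implementations also try fallback locale names, which is left out here.

```cpp
#include <string>

// Hypothetical helper: the file a gettext-style runtime would open for a
// locale directory (the second argument of bindtextdomain), a locale name
// and a text domain.
std::string catalog_path(const std::string &localedir,
                         const std::string &locale,
                         const std::string &domain)
{
    return localedir + "/" + locale + "/LC_MESSAGES/" + domain + ".mo";
}
```

With a relative locale directory, the resulting path is only meaningful relative to some base directory, which is why gcc.exe and cc1plus.exe can end up looking in different places.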

lhmouse commented 9 months ago

> When compiling, however, the messages are printed by the program libexec\gcc\x86_64-w64-mingw32\11.2.0\cc1plus.exe, which cannot find zh_CN.po through the relative path "../share/locales/zh_CN/LC_MESSAGES/".

That was only a presumption. Please try copying the locale directory so it may be found, and see whether it solves the issue for you.

anbangli commented 9 months ago

> That was only a presumption. Please try copying the locale directory so it may be found, and see whether it solves the issue for you.

Good advice. I copied the directory share/locales/zh_CN/LC_MESSAGES/ to libexec/gcc/x86_64-w64-mingw32/11.2.0/share/locales/zh_CN/LC_MESSAGES/ and got the error message in Chinese:

d:\mingw-zh_cn\mingw64>gcc hello.cpp
hello.cpp: In function ‘int main()’:
hello.cpp:7:9: 错误:‘cout1’在此作用域中尚未声明
7 | cout1 << "Hello, World!" << endl;
| ^~~~~

So my presumption is confirmed.

lhmouse commented 9 months ago

I believe this has something to do with https://github.com/gcc-mirror/gcc/blob/4d31d6606201b339825c370c2e1969b2dcd17f39/libcpp/init.cc#L186.

It is practically bad to supply a relative path (second argument) to bindtextdomain(); maybe it is possible to get the absolute path of the current executable with GetModuleFileNameW(), then replace \lib\gcc\... up to the end with the locale dir, then feed the result to wbindtextdomain().
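
The path rewriting suggested above could be sketched as a pure string function, kept separate from any Windows API call so that it can be exercised on any platform. This is only an illustration under assumptions: `locale_dir_from_exe`, the list of markers, and the `share\locale` default are hypothetical and do not come from GCC or from this repository. On Windows the input would be the path obtained from GetModuleFileNameW(), converted as needed.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of GCC): derive the locale directory from
// the absolute path of the running executable by cutting the path at a
// known installation sub-directory and appending the locale directory.
// Returns an empty string when the layout is not recognised, so the
// caller can fall back to LOCALEDIR.
std::string locale_dir_from_exe(const std::string &exe_path,
                                const std::string &locale_subdir = "share\\locale")
{
    // Sub-directories of <prefix> where GCC executables are assumed to live.
    static const std::vector<std::string> markers = {
        "\\libexec\\gcc\\", "\\lib\\gcc\\", "\\bin\\"};

    // Use the marker occurrence closest to the end of the path, so that a
    // prefix which itself contains "\bin\" does not confuse the search.
    std::string::size_type best = std::string::npos;
    for (const std::string &m : markers) {
        std::string::size_type pos = exe_path.rfind(m);
        if (pos != std::string::npos && (best == std::string::npos || pos > best))
            best = pos;
    }
    if (best == std::string::npos)
        return std::string();

    // Everything before the marker is <prefix>.
    return exe_path.substr(0, best) + "\\" + locale_subdir;
}
```

Because the result is absolute, it no longer depends on the current directory or on how deep the executable sits below the prefix.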

anbangli commented 9 months ago

I found an article that fixes the problem: https://blog.csdn.net/hackpascal/article/details/15222083

@lhmouse, could you please get the source code from that article merged into GCC?

lhmouse commented 9 months ago

That code needs polishing a bit. For example, wide-char functions should be preferred to ANSI ones.

niXman commented 7 months ago

@anbangli @lhmouse guys, any progress on it?

lhmouse commented 7 months ago

.. almost forgot this one. This is gonna be a bit complex and requires some time to look into.

lhmouse commented 7 months ago

If copying the locale directory works, I think it will be the safest solution.

While it is possible to find the locale directory by replacing the executable name in <prefix>/lib/gcc/x86_64-w64-mingw32/11.2.0/cc1plus.exe with ../../../../share/locales, as the code is shared by all gcc executables, for gcc.exe it would run out of <prefix>. It's like accessing past the end of a buffer, which does not necessarily cause a crash, but it's not good anyway.
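
The concern can be illustrated with a short sketch that applies the same fixed relative path to two different executables. It uses POSIX-style paths and purely lexical normalisation so it runs anywhere; `replace_exe_name`, `is_under`, and the example prefixes are hypothetical and not taken from GCC.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: replace the executable name in exe_path with the
// relative path rel, then normalise lexically (no file system access).
std::string replace_exe_name(const std::string &exe_path, const std::string &rel)
{
    assert(!exe_path.empty());
    fs::path dir = fs::path(exe_path).parent_path();
    return (dir / rel).lexically_normal().generic_string();
}

// True if path lies inside the directory tree rooted at prefix.
bool is_under(const std::string &prefix, const std::string &path)
{
    return path.compare(0, prefix.size() + 1, prefix + "/") == 0;
}
```

For an executable four levels below the prefix, such as cc1plus.exe, `replace_exe_name(path, "../../../../share/locales")` stays inside the prefix. For bin/gcc.exe, which is only one level deep, the same relative path climbs above the prefix, so `is_under` reports false.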