Traceback Internationalization Proposal #60548

Closed 2134dcd1-7c52-4a75-af8d-a177d48a95ea closed 12 years ago

2134dcd1-7c52-4a75-af8d-a177d48a95ea commented 12 years ago
BPO 16344
Nosy @birkenfeld, @rhettinger, @terryjreedy, @pitrou, @ezio-melotti, @stevendaprano, @bitdancer
Files
  • traceback_internationalization_proposal.patch: initial patch to add Py_GETTEXT i18n (EXPERIMENTAL)
  • pep_i18n_traceback.txt: PEP draft for traceback internationalization proposal
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:
    ```python
    assignee = None
    closed_at =
    created_at =
    labels = ['interpreter-core', 'type-feature', 'expert-unicode']
    title = 'Traceback Internationalization Proposal'
    updated_at =
    user = 'https://bugs.python.org/reingart'
    ```
    bugs.python.org fields:
    ```python
    activity =
    actor = 'reingart'
    assignee = 'none'
    closed = True
    closed_date =
    closer = 'rhettinger'
    components = ['Interpreter Core', 'Unicode']
    creation =
    creator = 'reingart'
    dependencies = []
    files = ['27756', '27757']
    hgrepos = []
    issue_num = 16344
    keywords = ['patch']
    message_count = 30.0
    messages = ['173998', '174005', '174007', '174008', '174013', '174081', '174084', '174092', '174116', '174209', '174212', '174215', '174232', '174243', '174257', '174298', '174343', '174345', '174350', '174351', '174353', '174354', '174355', '174361', '174364', '174389', '174390', '174391', '242541', '242543']
    nosy_count = 10.0
    nosy_names = ['georg.brandl', 'rhettinger', 'terry.reedy', 'pitrou', 'ezio.melotti', 'steven.daprano', 'r.david.murray', 'neologix', 'Ramchandra Apte', 'reingart']
    pr_nums = []
    priority = 'normal'
    resolution = 'rejected'
    stage = None
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue16344'
    versions = ['Python 3.4']
    ```

    2134dcd1-7c52-4a75-af8d-a177d48a95ea commented 12 years ago

    I'm opening this ticket to organize patches for a proposal to add gettext-based translation of exception and traceback messages, as described in:

    http://python.org.ar/pyar/TracebackInternationalizationProposal

    This requires the patch in issue bpo-16343.

    Attached is a patch for a proof of concept, it includes:

    birkenfeld commented 12 years ago

    Has this been discussed on python-dev before? I see that your proposal is in PEP form; it would be a good idea to post it to python-dev since this is not a change that can be done without a PEP.

    2134dcd1-7c52-4a75-af8d-a177d48a95ea commented 12 years ago

    This was discussed on python-ideas two years ago (I've resurrected the thread there).

    Sadly I didn't have time for this before, but since we have a CPython sprint at PyCon Argentina 2012 in 15 days, maybe it would be a good idea to discuss this again.

    Sorry if I've made any mistakes; this is my second patch here, and my C skills are rusty, as I mentioned in the other issue.

    2134dcd1-7c52-4a75-af8d-a177d48a95ea commented 12 years ago

    BTW, I've written a draft PEP for this (attached); the online version is at http://python.org.ar/pyar/TracebackInternationalizationProposal

    Just let me know if it has to be uploaded or discussed elsewhere.

    ezio-melotti commented 12 years ago

    I'm -1 on the idea for the following reasons:

    2134dcd1-7c52-4a75-af8d-a177d48a95ea commented 12 years ago

    "Serious" developers? Sorry, but I think that is an unfortunate phrase that goes against the Python Diversity Statement. What about young pupils? What about non-programmers (e.g. accountants)? In some places (like public schools in my country), English is not taught formally until university.

    And I don't think non-English speakers are just a subset of users. Come on, English is not even the top native tongue (that is Mandarin Chinese). English may be one of the most spoken languages, but even by that metric it only reaches 1/7th of the total world population. Other languages like Spanish or Portuguese are also rising.

    http://en.wikipedia.org/wiki/Language#Linguistic_diversity

    BTW, as the draft says, Python is the offender here, as other error messages are already translated (including the OS ones, even inside Python!):

    C:\Python32>python
    Python 3.2.2 (default, Sep  4 2011, 09:51:08) [MSC v.1500 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os
    >>> os.listdir("J:\\")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    WindowsError: [Error 3] El sistema no puede encontrar la ruta especificada: 'J:\\*.*'

    PostgreSQL already translates error messages too:

    C:\Program Files\PostgreSQL\9.0\bin>psql
    psql (9.0.3)
    Digite «help» para obtener ayuda.

    Mariano=> SELECT FROM nowhere;
    ERROR: no existe la relación «nowhere»
    LÍNEA 1: SELECT FROM nowhere;
                         ^

    And Bash too:

    reingart@desktop:~/cpython$ ls /nowhere
    ls: no se puede acceder a /nowhere: No existe el archivo o el directorio

    Of course, there is no need to translate keywords or libraries (SQL statements and bash commands are not translated either, only messages). I don't see why this would cause confusion; on the contrary, I think Python would become more consistent with other tools and thus easier to use.

    The mechanism to restore the language is the common one (used by almost every other application that supports i18n):
    >>> locale.setlocale(locale.LC_MESSAGES, "C")
    It should not be difficult for "serious" programmers to handle that :-)
    If that is a concern, a command-line parameter, an environment variable or a shortcut in the locale module could be implemented.
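
A minimal sketch of the "shortcut" idea mentioned above, assuming a patched interpreter would honour `LC_MESSAGES` the way C-level gettext does. The helper name `use_untranslated_messages` is hypothetical and not part of the `locale` module or of the attached patch. Note that `locale.LC_MESSAGES` is not defined on every platform (notably Windows), so the sketch falls back to `LC_ALL`, which resets more than just messages.

```python
import locale


def use_untranslated_messages():
    """Switch message catalogs back to the untranslated "C" locale.

    Hypothetical helper: LC_MESSAGES is missing on some platforms
    (e.g. Windows), in which case LC_ALL is used as a coarser fallback.
    """
    category = getattr(locale, "LC_MESSAGES", locale.LC_ALL)
    # setlocale() returns the name of the locale that is now in effect.
    return locale.setlocale(category, "C")


print(use_untranslated_messages())
```

A command-line flag or environment variable would presumably end up calling something equivalent during interpreter start-up.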

    Anyway, people will not necessarily be faced with the localized version by default, and if, for example, a teacher has to jump on a student's machine, surely they could use it, as messages will probably be in the spoken language of the country (BTW, most of the operating system components will probably be localized too, not only Python). For advanced users or logging, it could be disabled entirely!

    Finally, you're correct that translation is not an easy job, and this proposal (traceback internationalization) is just the tip of the iceberg (even more work will be needed in other areas to get full localization). If PostgreSQL and other tools can do that, why couldn't Python?

    pitrou commented 12 years ago

    I think the PEP should be proposed on python-dev or python-ideas. Also, it's probably better if the PEP is encoded in utf-8, not latin-1.

    terryjreedy commented 12 years ago

    I am sympathetic to non-English speakers wanting a native-language translation. But I think the interpreter should *always* emit the standard message and that any translation should be an addition, not a replacement. This would maintain discoverability and help people learn the English version, not hinder it.

    The real question to me is how deep in the interpreter such support should go. Third-party shells can (and sometimes do) intercept tracebacks and reformat (and translate) them as they wish. But there would be advantages and disadvantages in adding the translation sooner.
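
To make the "addition, not a replacement" idea concrete, here is a rough sketch of how a third-party shell could do it today with `sys.excepthook`. It is an illustration under stated assumptions, not what the attached patch does: the `CATALOG_ES` dictionary and both function names are hypothetical, and a real tool would load its translations from gettext catalogs rather than a hard-coded dict.

```python
import sys
import traceback

# Hypothetical catalog; a real tool would read this from gettext .mo files.
CATALOG_ES = {
    "division by zero": "división por cero",
}


def format_with_translation(exc_type, exc, tb, catalog=CATALOG_ES):
    """Return the standard traceback text, plus a translated line if known."""
    text = "".join(traceback.format_exception(exc_type, exc, tb))
    translated = catalog.get(str(exc))
    if translated is not None:
        # The English message stays first; the translation is appended.
        text += f"{exc_type.__name__}: {translated}\n"
    return text


def dual_language_hook(exc_type, exc, tb):
    """excepthook replacement that prints both versions to stderr."""
    sys.stderr.write(format_with_translation(exc_type, exc, tb))


sys.excepthook = dual_language_hook
```

Because the lookup only sees the already formatted message, this approach works for fixed strings but not for messages with interpolated values.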

    ezio-melotti commented 12 years ago

    > "Serious" developers? Sorry, but I think that is an unfortunate phrase that goes against the Python Diversity Statement.

    By "serious" I just mean anyone who wants to continue programming, as opposed to someone doing e.g. a one-off course at university (hence the quotes). The whole ecosystem around Python and most other programming languages is mostly in English, and anyone who doesn't know English will have to face many other problems later on (e.g. no localized documentation and blog posts).

    There are two solutions to this problem: 1) adapt the language to the users; 2) teach the users English.

    While the first (i.e. what you are proposing) works as a short-term solution, I believe the second is a much better long-term solution, because IMHO users will have to learn English sooner or later anyway.

    > In some places (like public schools in my country), English is not taught formally until university.

    This is very unfortunate -- I was under the impression that teaching English in middle/high schools was already common in most countries.

    > And I don't think non-English speakers are just a subset of users.

    Do you mean people who aren't native English speakers or people who don't even grasp enough English to understand the error messages?

    > BTW, as the draft says, Python is the offender here, as other error messages are already translated (including the OS ones, even inside Python!):

    This is another thing that I dislike, for the aforementioned reasons. I've seen buildbots reporting unintelligible error messages in German, and just a few days ago I even came across a Mercurial version in Russian. It makes some sense to translate OS error messages, because they are read by regular users who have a localized OS and expect localized messages. The same could be said for bash, even if the distinction between "regular users" and "developers" starts to fade a bit here.

    > I don't see why this would cause confusion; on the contrary, I think Python would become more consistent with other tools and thus easier to use.

    For example the other day I saw a student confused by this error message:
    >>> a, b = 1, 2, 3
    ValueError: too many values to unpack (expected 2)

    The offender here is most likely the word "unpack". "Unpack" is closely related to the concept of tuple unpacking, so even if the student is aware of what tuple unpacking is, he might fail to associate the problem with it if the error uses another word. In addition, I cannot think of any word that might be a suitable translation for "unpack" in my native language. In Spanish "desempaquetar" could maybe be used, but I'm not sure how well it works.

    > The mechanism to restore the language is the common one (used by almost every other application that supports i18n):
    >     >>> locale.setlocale(locale.LC_MESSAGES, "C")
    > It should not be difficult for "serious" programmers to handle that :-) If that is a concern, a command-line parameter, an environment variable or a shortcut in the locale module could be implemented.

    It's not difficult to change, but you would have to remember how to do it and which LC_* variable you should change. Assuming this gets implemented, it would most likely require a command-line parameter and an envvar too.

    > Anyway, people will not necessarily be faced with the localized version by default, and if, for example, a teacher has to jump on a student's machine, surely they could use it, as messages will probably be in the spoken language of the country (BTW, most of the operating system components will probably be localized too, not only Python).

    FWIW I've been in the situation where neither my students nor I could understand the local language -- luckily all the machines were using English.

    > If PostgreSQL and other tools can do that, why couldn't Python?

    Does any other popular programming language do it? And if so, how?

    918f67d7-4fec-4a8d-93e3-6530aeb1e57e commented 12 years ago

    Unless Python's grammar is translated into other languages I'm -1 on this. I don't see any use of this. You anyway have to know English to understand the docs and Python's grammar is English.

    @Ezio melotti

    > > In some places (like public schools in my country), English is not taught formally until university.

    > This is very unfortunate -- I was under the impression that teaching English in middle/high schools was already common in most countries.

    English is actually oppressing other languages. Schools put a priority on English but not on native languages. Languages must be preserved because they contain culture (in one language, the future is behind because you can't see it and the past is in front of you.)

    79528080-9d85-4d18-8a2a-8b1f07640dd7 commented 12 years ago

    > Schools put a priority on English but not on native languages. Languages must be preserved because they contain culture

    Of course, but the main goal of a language is to communicate.

    As it stands, English is the language which is the most likely to be understood by Python users and developers (I don't count Esperanto ;-).

    It'll get tricky if in a couple months, we start getting bug reports with tracebacks in Finnish or French...

    terryjreedy commented 12 years ago

    > It'll get tricky if in a couple months, we start getting bug reports with tracebacks in Finnish or French...

    That is another reason to *always* output the standard English message first. I think this was discussed a couple of years ago on PyDev, or maybe another issue. I think the proposal to *replace* English messages should be rejected.

    pitrou commented 12 years ago

    > You anyway have to know English to understand the docs and Python's grammar is English.

    I don't think Python's grammar is relevant. I took my first steps in programming when I was around 10 and I barely knew English at the time; it didn't stop me from mastering "FOR i = 1 TO N ;... NEXT i" :-)

    2134dcd1-7c52-4a75-af8d-a177d48a95ea commented 12 years ago

    Sorry for taking so long to reply, and for this long follow-up...

    Antoine Pitrou added the comment:

    > I think the PEP should be proposed on python-dev or python-ideas. Also, it's probably better if the PEP is encoded in utf-8, not latin-1.

    OK, I'll update, polish, encode in UTF-8 and send it to python-dev. It was already discussed on python-ideas (maybe not in detail), but it seems that no one has more to add there, or they are busy with the Async API :-).

    Terry J. Reedy added the comment:

    > I am sympathetic to non-English speakers wanting a native-language translation.

    Sympathetic is a kind of compassion? It may be the correct meaning here. I just read that almost every one of you hesitates because you could receive a message in a foreign language that you could not understand. Well, that is what is already happening to non-English-speaking users of Python, IMHO. And it is not just frustrating; sometimes it is also a waste of time because of the distractions and delays it produces.

    > But I think the interpreter should *always* emit the standard message and that any translation should be an addition, not a replacement. This would maintain discoverability and help people learn the English version, not hinder it.

    I'll explore the alternatives for showing both messages (original and translated), but I think that would be more confusing. I do not think that translation hinders the meaning, it just translates it, and I haven't seen any other language or tool that shows both messages, but I'll investigate more (maybe the untranslated exception name plus an error code, as in PostgreSQL, would be more helpful for discoverability).

    Learning English by showing both messages may be an interesting experiment but, for me, it's like traditional education's focus on "memorizing" things instead of understanding them; depending on the context, it can lead to good results or to misleading repetition.

    > The real question to me is how deep in the interpreter such support should go. Third-party shells can (and sometimes do) intercept tracebacks and reformat (and translate) them as they wish. But there would be advantages and disadvantages in adding the translation sooner.

    About the except hook approach: it doesn't work very reliably because you don't have the original unformatted message, so you have to reverse the interpolation on the formatted result to find the correct translation. Besides being slower and error-prone, the main problem is not technical but "social", as it could lead to duplicated translation effort, segregation and a proliferation of custom tools, with the aggravating factor that in some scenarios the except hook is not honoured:

    http://bugs.python.org/issue12643 (just an example)

    You can take a look at one of my attempts at translating via interpolation (my algorithm is a kind of brute-force "guessing" using regular expressions, just to test the idea):

    http://code.google.com/p/pydiversity/source/browse/__diversity__/__init__.py

    I think that leaving this approach "in the wild" (and/or "do it yourself") is not only more complex, it could also be more dangerous than having unified translation resources, where all messages are listed, a common infrastructure is used and general rules are agreed on.
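
The regular-expression "guessing" described above can be sketched roughly as follows. This is not the pydiversity code, only an illustration of the general technique: each English template is turned into a pattern so the interpolated values can be recovered from the final text. The `PATTERNS_ES` table, the Spanish wording and the function name `translate_message` are all hypothetical.

```python
import re

# Hypothetical table: (pattern recovering the interpolated values,
# translated template). A real tool would need one entry per message.
PATTERNS_ES = [
    (re.compile(r"^too many values to unpack \(expected (\d+)\)$"),
     "demasiados valores para desempaquetar (se esperaban {0})"),
    (re.compile(r"^name '(.+)' is not defined$"),
     "el nombre '{0}' no está definido"),
]


def translate_message(message, patterns=PATTERNS_ES):
    """Guess the translation of an already formatted message.

    Returns the message unchanged when no pattern matches.
    """
    for regex, template in patterns:
        match = regex.match(message)
        if match:
            # Re-insert the recovered values into the translated template.
            return template.format(*match.groups())
    return message


print(translate_message("too many values to unpack (expected 2)"))
print(translate_message("some message without a pattern"))
```

The sketch also shows the fragility the comment points out: every message needs a hand-written pattern, the linear scan gets slower as the table grows, and any change to the English wording silently stops the match.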

    Ezio Melotti added the comment:

    > There are two solutions to this problem: 1) adapt the language to the users; 2) teach the users English.

    > While the first (i.e. what you are proposing) works as a short-term solution, I believe the second is a much better long-term solution, because IMHO users will have to learn English sooner or later anyway.

    Teaching the users English may be an altruistic goal in the long term, but for many teachers (as in my case) it is a barrier right now that can tip the balance towards other "more friendly languages".

    Anyway, and don't get me wrong, but forcing novice users to learn a second language, besides being likely impractical, may sound at least rude, ethnocentric or like neocolonialism in some contexts (if we want to go further...). Education takes a lot of resources; I don't think it would happen just by showing some English messages (BTW, English may be one of the most difficult languages to learn as a second tongue, depending on the part of the world you live in... at least in my country you can only achieve an acceptable level after 6 to 9 years, depending on your age and other socio-economic factors).

    IMHO, a more encouraging message would be something like "we can help you in your first programming steps with Python localized for your language, but please consider learning English to better communicate in the international community".

    > I've seen buildbots reporting unintelligible error messages in German, and just a few days ago I even came across a Mercurial version in Russian.

    Well, I think this only reinforces my point. Ignoring that many people out there are localizing their products/projects will not solve that problem either.

    Being more aware of internationalization tools could do the trick, not only for Python tracebacks but also for third-party modules/libraries that are currently translating their messages.

    > It makes some sense to translate OS error messages, because they are read by regular users who have a localized OS and expect localized messages. The same could be said for bash, even if the distinction between "regular users" and "developers" starts to fade a bit here.

    Exactly. Please consider that 11-year-old pupils learning to program a robot, or 16-year-old students trying to understand a simple algorithm, are users too.

    Many may continue with an IT/CS career, if we do a good job ;-) Those who continue working in programming will surely be exposed sooner or later to a formal technical English course at university or similar. But if they don't continue their studies, or choose a different career, maybe their English skills will never be enough.

    > For example the other day I saw a student confused by this error message:
    >     >>> a, b = 1, 2, 3
    >     ValueError: too many values to unpack (expected 2)

    > The offender here is most likely the word "unpack". "Unpack" is closely related to the concept of tuple unpacking, so even if the student is aware of what tuple unpacking is, he might fail to associate the problem with it if the error uses another word. In addition, I cannot think of any word that might be a suitable translation for "unpack" in my native language. In Spanish "desempaquetar" could maybe be used, but I'm not sure how well it works.

    Enhancing some cryptic exception messages can be a parallel job, and it could benefit from the different points of view that internationalization into different languages opens up. Here, as you point out, translation offers a new perspective; why take that as a threat instead of an opportunity to bring better messages?

    I don't agree that this is a beginner-only problem. I remember an occasion where, at the end of a coding dojo, a syntax error was raised and the attendee could not complete their work. We (a university teacher, a teaching assistant and a core developer!) spent a lot of time discovering that it was caused by changing 07 to 08. Of course, the "SyntaxError: invalid token" was not helpful because the nice editor didn't print the traceback in a proper monospaced font (BTW, it took a while to understand why the ^ was pointing at the 8)... Anyway, a better error like "SyntaxError: invalid token for octal representation" would be more helpful, even in English ;-)
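
The leading-zero pitfall from this anecdote can still be reproduced. In Python 2, `07` was a valid octal literal and `08` was not; in Python 3 any decimal literal with a leading zero is rejected and octal is written with the `0o` prefix. The small sketch below (the helper name `check_literal` is made up for illustration) compiles a snippet and returns whatever message the running interpreter produces. The exact wording depends on the Python version, so no specific text is assumed here.

```python
def check_literal(source):
    """Compile an expression; return the SyntaxError message, or None if it compiles."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError as exc:
        return exc.msg
    return None


print(check_literal("0o7"))  # a valid octal literal in Python 3
print(check_literal("08"))   # rejected; the wording varies between versions
```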

    So the argument that "the exception must be in English" to be able to google it may be weak at best (apart from some incompleteness, an exception message depends on its context and is just a part of the whole traceback, messages change over time on some occasions, and they may be difficult to copy and paste correctly...).

    On the other hand, localized error messages could eventually produce better search results for non-English speakers if there is enough material written in their language.

    You can take a look at this and other nice examples I've collected so far (incomplete, of course):

    http://python.org.ar/pyar/MensajesExcepcionales

    > > The mechanism to restore the language is the common one ...

    > It's not difficult to change, but you would have to remember how to do it and which LC_* variable you should change. Assuming this gets implemented, it would most likely require a command-line parameter and an envvar too.

    First: this proposal doesn't enable translation by default.

    Second: if this proposal gets implemented, you don't even need to install the translation files (so translation cannot be turned on by accident)!

    Third: isn't this a small price to pay for advanced users (just changing a locale setting, if they are ever required to do so), compared with how it could open Python up to new languages and users?
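
The second point, that nothing gets translated when no catalog is installed, matches how gettext-style lookups usually behave. As an analogy only (the proposal targets C-level gettext inside the interpreter, not this module), Python's own `gettext` module shows the same fallback. The domain name `python_tracebacks` below is hypothetical and the locale directory is deliberately empty.

```python
import gettext
import tempfile

message = "too many values to unpack (expected 2)"

# An empty directory stands in for a system without translation files.
with tempfile.TemporaryDirectory() as empty_localedir:
    translator = gettext.translation(
        "python_tracebacks",      # hypothetical catalog (domain) name
        localedir=empty_localedir,
        languages=["es"],
        fallback=True,            # return NullTranslations instead of raising
    )

# With no catalog found, the original English message is returned as-is.
print(translator.gettext(message))
```

Without `fallback=True`, `gettext.translation` raises an error when the catalog is missing, so the silent-fallback behaviour is an explicit choice by the caller.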

    > If PostgreSQL and other tools could do that, why Python could not?

    > Does any other popular programming language do it? And if so, how?

    Does any other popular language use indentation instead of brackets?

    I doubt topics like this can be compared directly (due to some differences in communities, goals, etc.), but I did a quick search and found this:

    .NET supports internationalization with a special mechanism called Culture (similar to locale/gettext). AFAIK it is enabled by default and embedded in the platform and system libraries. They even have an "exception message design guideline" which says "Localizing the error message helps non-English speakers feel more comfortable on our platform." Some argue that it is not easy to disable this feature to get English-only messages. Another pitfall is that if the resource file is missing, it can produce incorrect error messages instead of showing the English one (neither should happen here, as gettext is somewhat more manageable). Other benefits/objections presented are similar to the ones discussed here, for example:

    http://blogs.msdn.com/b/brada/archive/2004/01/28/64255.aspx

    Java has Throwable.getLocalizedMessage(), hence the dual approach, but AFAIK it is mainly unimplemented (at least for system libraries and internal exceptions). It is not an automatic approach (it depends on ResourceBundle, MessageFormat, etc.), so it is not easy to implement anyway.

    Ramchandra Apte added the comment:

    > Unless Python's grammar is translated into other languages I'm -1 on this.

    I think Python grammar is not comparable to English grammar. I've already pointed to, for example, the similar approach in PostgreSQL, where SQL statements are not translated.

    > I don't see any use of this. You anyway have to know English to understand the docs and Python's grammar is English.

    Keywords are similar to English words, but that is all (some, like __rmul__, are not even English words). You don't need to know how to write correct English sentences to write Python code, or am I missing something? Punctuation is different too.

    That most of the documentation is not translated is not an excuse to me. At least some parts should be translated too; for example, the Python Tutorial was translated into Spanish by the local community:

    http://docs.python.org.ar/tutorial/contenido.html

    Terry J. Reedy added the comment:

    > > It'll get tricky if in a couple months, we start getting bug reports with tracebacks in Finnish or French...

    > That is another reason to *always* output the standard English message first. I think this was discussed a couple of years ago on PyDev, or maybe another issue. I think the proposal to *replace* English messages should be rejected.

    I think it is unlikely that we would receive a bug report from a person who turns on translation because they don't understand English. How could they even write the email/issue in the first place?

    Again, this can be seen as an opportunity to foster local/regional user groups, not as a disadvantage. It will be easier for a non-English speaker to communicate in their own tongue with regional advanced users, who can then find a way to help them submit the bug report in English, if appropriate.

    IMHO this can improve the language and its international community via cooperation and diversity.

    918f67d7-4fec-4a8d-93e3-6530aeb1e57e commented 12 years ago

    I said

    > Schools put a priority on English but not on native languages. Languages must be preserved because they contain culture

    Charles-François Natali said

    > Of course, but the main goal of a language is to communicate.
    >
    > As it stands, English is the language which is the most likely to be understood by Python users and developers (I don't count Esperanto ;-).
    >
    > It'll get tricky if in a couple months, we start getting bug reports with tracebacks in Finnish or French...

    I was saying in general.

    ezio-melotti commented 12 years ago

    > Teaching the users English may be an altruistic goal in the long term, but for many teachers (as in my case) it is a barrier right now that can tip the balance towards other "more friendly languages".

    Students are not required to learn English and English grammar before programming -- learning a few words is already enough to grasp the meaning of many error messages (and it's already necessary anyway for keywords, functions, methods, classes, modules, etc.).

    > Those who continue working in programming will surely be exposed sooner or later to a formal technical English course at university or similar.

    The sooner they get exposed to English the better it is. The best way to learn a language is by using it, and IMHO technical English is even easier than "normal" English (and often even than native language).

    > But if they don't continue their studies, or choose a different career, maybe their English skills will never be enough.

    I think that nowadays anyone should learn English anyway, and the more you translate the more you make their lives difficult, because you confine them to a restricted subset of all the available information (this is getting off-topic though).

    > Here, as you point out, translation offers a new perspective; why take that as a threat instead of an opportunity to bring better messages?

    This is a different problem though. Python (and programming in general) has its own jargon, and the jargon provides a concise way to refer to specific concepts (e.g. tuple-unpacking). While it certainly shouldn't be abused, it's often more convenient to use it. Creating a new localized jargon also doesn't help, and it only makes things more complicated.

    For example you mentioned the "invalid token" error message, which on the page you linked is translated as "token inválido". AFAIK "token" is not a Spanish word, so either you end up leaving the jargon untranslated, or you translate it into something that doesn't make much sense and doesn't match other names (e.g. the tokenize module). While improving the message is a good idea (if/when possible), translating it doesn't make things much better.

    > Localized error messages could eventually produce better search results for non-English speakers if there is enough material written in their language.

    IME it's the opposite. This might work a bit better for widespread languages like Spanish, but otherwise I often come across fairly bad results. First of all, localized documentation is not as up to date as the official one (which gets updated daily), and the translation might not be accurate. Given that the English community is the biggest one, it's also easier to find answers in English than it is in any other language. People who are not using English resources are often inexperienced users, and the solutions they provide are not always good (of course there are exceptions).

    > At least some parts should be translated too; for example, the Python Tutorial was translated into Spanish by the local community: http://docs.python.org.ar/tutorial/contenido.html

    But this is just a part, has not been updated in over 2 years, and doesn't even cover Python 3.

    pitrou commented 12 years ago

    > > Those who continue working in programming will surely be exposed sooner or later to a formal technical English course at university or similar.

    > The sooner they get exposed to English the better it is. The best way to learn a language is by using it, and IMHO technical English is even easier than "normal" English (and often even than native language).

    This sounds like wishful thinking to me. Regardless of whether it's *better* for a fledgling programmer to learn and improve their English, we can still improve Python right now for those who don't master English.

    > > But if they don't continue their studies, or choose a different career, maybe their English skills will never be enough.

    > I think that nowadays anyone should learn English anyway, and the more you translate the more you make their lives difficult, because you confine them to a restricted subset of all the available information (this is getting off-topic though).

    That will be true if translations are enabled by default, not if they need some explicit configuration switch to be enabled.

    > > Here, as you point out, translation offers a new perspective; why take that as a threat instead of an opportunity to bring better messages?

    > This is a different problem though. Python (and programming in general) has its own jargon, and the jargon provides a concise way to refer to specific concepts (e.g. tuple-unpacking). While it certainly shouldn't be abused, it's often more convenient to use it. Creating a new localized jargon also doesn't help, and it only makes things more complicated.

    Well, even technical tools like gcc or Mercurial have translations these days (not always very good ones, admittedly, but I don't see anyone advocating for these translations to be removed).

    > > At least some parts should be translated too; for example, the Python Tutorial was translated into Spanish by the local community: http://docs.python.org.ar/tutorial/contenido.html

    > But this is just a part, has not been updated in over 2 years, and doesn't even cover Python 3.

    That's not really a problem. People teaching Python in a language other than English can certainly create their own teaching resources (and, ideally, share them on the Internet :-)).

    ezio-melotti commented 12 years ago

    This sounds like wishful thinking to me. Regardless of whether it's *better* for a fledgling programmer to learn and improve their English, we can still improve Python right now for those who don't master English.

    True, but my point is that while this has some short-term benefits, it's worse in the long term, and potentially even more confusing.

    That will be true if translations are enabled by default, not if they need some explicit configuration switch to be enabled.

    They usually are (at least for other tools like Mercurial). I can see OS vendors enabling this by default.

    Well, even technical tools like gcc or Mercurial have translations these days (not always very good ones, admittedly, but I don't see anyone advocating for these translations to be removed).

    Maybe someone advocated for them not to be added? :)

    That's not really a problem. People teaching Python in a language other than English can certainly create their own teaching resources (and, ideally, share them on the Internet :-)).

    Or use the already existing resources found on the Internet (mostly in English :))?

    bitdancer commented 12 years ago

    It seems to me that if, as Terry suggests, both the English and the translation are output, then most of Ezio's concerns would be addressed. Maybe we could even start a new trend in error message internationalization :)

    An important question that needs to be addressed is what additional burden this places on maintaining Python (keep in mind that it looks like we'd be the first programming language to do this), and who is going to do the maintenance.

    Mariano, how committed are you to maintaining it long term?

    This is clearly going to need a PEP, so if the python-ideas discussion has already happened, that is really the next step. We aren't going to resolve the yes or no question (not to mention the details) for something this fundamental without a PEP, I think.

    birkenfeld commented 12 years ago

    How do you propose to output both versions? It will make tracebacks twice as long.

    ezio-melotti commented 12 years ago

    If only the error gets translated it might look more or less like:

    Traceback (most recent call last):
      File "form.py", line 78, in <module>
        f = Form("factura.csv")
      File "form.py", line 12, in __init__
        for linea in open(infile).readlines():
    IOError: [Errno 2] No such file or directory: 'factura.csv'
             [Errno 2] No existe el archivo o directorio: 'factura.csv'

    @RDM The PEP is already attached to the issue :)

    @Mariano Do you think it's possible to make this an external module? Otherwise we could provide some hooks, and have language packages maintained by the community on PyPI.
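
    A minimal sketch of how the "external module plus hook" idea could produce the two-line output shown above. Nothing here comes from the attached patch: `CATALOG_ES`, `translate_message`, `format_with_translation` and `bilingual_excepthook` are hypothetical names, and the catalog is a plain dict standing in for whatever file format a language package would use. Only `sys.excepthook` and `traceback.format_exception` are existing standard library APIs.

```python
import sys
import traceback

# Hypothetical catalog; a real language package would load this from a file.
CATALOG_ES = {
    "No such file or directory": "No existe el archivo o directorio",
}


def translate_message(message, catalog):
    """Return message with known English phrases replaced, or None if nothing matched."""
    translated = message
    for english, local in catalog.items():
        translated = translated.replace(english, local)
    return translated if translated != message else None


def format_with_translation(exc_type, exc_value, exc_tb, catalog=CATALOG_ES):
    """Format a traceback, adding the translated message below the English one."""
    lines = traceback.format_exception(exc_type, exc_value, exc_tb)
    last = lines[-1].rstrip("\n")
    name, sep, message = last.partition(": ")
    extra = translate_message(message, catalog) if sep else None
    if extra is not None:
        # Align the translation under the original message text.
        lines.append(" " * len(name + sep) + extra + "\n")
    return "".join(lines)


def bilingual_excepthook(exc_type, exc_value, exc_tb):
    """Drop-in replacement for the default hook; the English text is always kept."""
    sys.stderr.write(format_with_translation(exc_type, exc_value, exc_tb))


# Opt in explicitly, so the default behaviour is unchanged:
# sys.excepthook = bilingual_excepthook
```

    Because the hook only adds a line at display time, the exception object itself is untouched and untranslated messages fall through unchanged.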

    bitdancer commented 12 years ago

    I didn't look at the patch...I assumed we were only talking about the message. There seems little point in translating any other part of the traceback, other than maybe the boilerplate between traceback sections. My thought was that the translation text for a message X would look something like

    X (translatedX)

    or vice versa.

    Note that I'm not advocating in favor of either doing this at all, or doing the dual message display. I'm currently neutral on both.

    bitdancer commented 12 years ago

    Ezio: ah. I guess I don't read the Description column of attachments, only the filename :)

    terryjreedy commented 12 years ago

    What Ezio suggested, plus maybe the error name -- but people must learn those to write exception statements, and a translation of the exceptions section of the lib manual is really needed to understand them.

    The boilerplate lines "Traceback (most recent call last):" and "File {}, line {}, in {}" can be learned (or even translated, though 'traceback', 'call', 'file', 'line', and 'in' pretty much need to be learned anyway).

    Once a hook is added and a file format is defined, if such are, then I think core-developer responsibility ends and language-specific files should be on PyPI and maintained by language-specific groups.

    terryjreedy commented 12 years ago

    An additional reason to keep English error messages always is that they are not just printed. They are part of the args tuple attribute of exception objects. Code that processes exception messages could break if the English version is replaced. Doctests are only one example. This consideration suggests that translations, if done at exception creation, should be an optional last member of args or an optional attribute, such as .altmsg. I agree that any change should follow PEP approval.
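
    A small sketch of the optional-attribute variant, to make the compatibility point concrete. `altmsg` is the attribute name suggested above; `attach_translation` and the catalog dict are hypothetical and only illustrate the idea that `args` and `str(exc)` stay in English while the translation rides along separately.

```python
# Hypothetical catalog mapping English messages to translations.
CATALOG = {"empty separator": "separador vacío"}


def attach_translation(exc, catalog=CATALOG):
    """Store a translation on exc.altmsg (None if unknown); args is left untouched."""
    exc.altmsg = catalog.get(str(exc))
    return exc


def is_empty_separator_error(exc):
    """Example of existing code that inspects the English message text."""
    return isinstance(exc, ValueError) and "empty separator" in str(exc)


try:
    raise ValueError("empty separator")
except ValueError as err:
    attach_translation(err)
    # Message-processing code keeps working because the English text is unchanged.
    still_matches = is_empty_separator_error(err)
    original_args = err.args
    translation = err.altmsg
```

    The same check would fail if the translated text replaced the message in `args`, which is the breakage described above for doctests and similar code.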

    rhettinger commented 12 years ago

    I'm going to close this one. It *really* would need a PEP before going forward. There are many issues to consider: being able to google for an exception message, doctest issues, maintainability issues, etc.

    rhettinger commented 12 years ago

    The text is now correct and matches the spec: http://speleotrove.com/decimal/daops.html#refremnear

    The doctests should be expanded to show all of the examples listed in that document:

    remainder-near(’2.1’, ’3’) ==> ’-0.9’
    remainder-near(’10’, ’6’) ==> ’-2’
    remainder-near(’10’, ’3’) ==> ’1’
    remainder-near(’-10’, ’3’) ==> ’-1’
    remainder-near(’10.2’, ’1’) ==> ’0.2’
    remainder-near(’10’, ’0.3’) ==> ’0.1’
    remainder-near(’3.6’, ’1.3’) ==> ’-0.3’
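
    For reference, the listed examples can be checked against the `decimal` module with a short script like the following. This is only an illustrative check, not the doctest text itself; `CASES` and `check_remainder_near` are names made up here, while `Decimal.remainder_near` is the existing method.

```python
from decimal import Decimal

# (dividend, divisor, expected) triples copied from the examples listed above.
CASES = [
    ("2.1", "3", "-0.9"),
    ("10", "6", "-2"),
    ("10", "3", "1"),
    ("-10", "3", "-1"),
    ("10.2", "1", "0.2"),
    ("10", "0.3", "0.1"),
    ("3.6", "1.3", "-0.3"),
]


def check_remainder_near(cases=CASES):
    """Return (x, y, result, matches_expected) for each case."""
    results = []
    for x, y, expected in cases:
        result = Decimal(x).remainder_near(Decimal(y))
        results.append((x, y, str(result), str(result) == expected))
    return results


for x, y, result, ok in check_remainder_near():
    print(f"remainder_near({x}, {y}) = {result} ({'ok' if ok else 'MISMATCH'})")
```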

    terryjreedy commented 12 years ago

    Raymond, did you mean to send that to another issue?

    stevendaprano commented 9 years ago

    For what it's worth, there are at least two localised versions of Python: Teuton and ChinesePython. As far as I know, ChinesePython is still in active development. Both translate the keywords and builtins, to German and Chinese respectively. I don't have a link, but I recall that Guido gave his blessing for ChinesePython to call itself a Python implementation.

    There's a localised, Portuguese version of Stack Overflow:

    http://blog.stackoverflow.com/2014/02/cant-we-all-be-reasonable-and-speak-english/

    so I think that the days when all programmers must learn English are slowly fading away, just like the days when all mathematicians had to learn German.

    But, I agree with those who say that a PEP is necessary. There are a lot of factors to consider. (Although of course as a third-party library, no PEP would be needed.)

    2134dcd1-7c52-4a75-af8d-a177d48a95ea commented 9 years ago

    Just for the record, I've presented a "CPython Internationalization proposal" for this year's Google Summer of Code program:

    http://www.google-melange.com/gsoc/proposal/public/google/gsoc2015/reingart/5634387206995968

    Indeed, that was my third attempt to move it forward (you can look there for the implications, schedule, etc.)

    Anyway, it didn't get accepted (and I have no more feedback from GSoC than that), so I will not be able to focus on this and finish it in 3 months as planned, but I'll do my best.

    Currently I'm cloning the CPython repository under my GitHub account to work on it (as I would have done if the GSoC project was approved):

    https://github.com/reingart/python/

    It was exported using hg-git so it can be easily updated or get collaborations back with Mercurial, just using GitHub to publish it.

    BTW, the PEP has been hanging around since 2010 (see the file attached in 2012, for example); now I have uploaded it to GitHub so it can be collaboratively edited:

    https://github.com/reingart/python/wiki/PEP-i18n

    I will re-organize / re-base the patch and update the PEP ASAP.

    PS: Yes, it seems that "the days when all programmers must learn English" are fading away. Visual Basic 5 was internationalized around 20 years ago (indeed, I learned VB as one of my first "real" languages, as it had complete Spanish translations for errors and built-in F1 help, on a CD-ROM in those days). The first "logo" programming language my father brought to my home around 30 years ago was also in Spanish IIRC (for a Spectrum TK90). Even gcc and bash are internationalized nowadays :-)