rust-lang / www.rust-lang.org

The home of the Rust website
https://www.rust-lang.org
Apache License 2.0

Consider renaming "zh-TW" language label from "正體中文" to "傳統中文" to address potential controversies #2028

Closed by ghost 2 months ago

ghost commented 2 months ago

What needs to be fixed?

The term "正體中文" and "正體" could be problematic as a translation for "Traditional Chinese" because:

For example, Hong Kong's Traditional Chinese is viewed as "正體" (orthodox form) in Hong Kong, although Hong Kong's Traditional Chinese is slightly different from Taiwanese Traditional Chinese / Taiwanese Mandarin.

Similarly, American English, Canadian English, British English, etc. each have their own orthography, so labeling any one of them as "Orthographic English" would be controversial and potentially offensive to speakers of other varieties.

Existing Controversies

It's important to note that there are ongoing debates surrounding the terminology used for Chinese character sets:

These controversies highlight the sensitivity and complexity surrounding the naming of Chinese character sets, further underlining the need for a neutral term.

Page(s) Affected

The title of "zh-TW" in language-footer selection of every page.

Suggested Improvement

A more accurate and neutral translation would be "傳統中文". This is exactly the literal translation of "Traditional Chinese" and is more widely accepted and commonly used across nearly all Chinese-speaking regions. Choosing the term "傳統中文" acknowledges the historical aspect of these character forms without implying any superiority or standardization. This neutral term promotes better understanding and communication across different regions and cultural backgrounds. It also avoids the long-standing debate over traditional and simplified Chinese characters.

Manishearth commented 2 months ago

Previously: https://github.com/rust-lang/www.rust-lang.org/issues/1845, please don't open duplicate issues.

I'm somewhat open to changing this, but the core dilemma is that both 正體 and 繁體 express a value judgement, and people have opinions on that. I've not seen 傳統 often; the word means "traditional", but I've mostly seen it paired with 中文 in Cantonese contexts.

We picked 正體 because it is the endonym, and generally endonyms are preferred when talking about languages. We currently do not have active translator teams so we default to sticking to our original choices. Any change here would need sign-off from someone who is known to the Rust organization and is a native user of the language (including script choice) being relabeled.

What we do not want is to switch to something that is weird, unrecognizable, or insulting to the actual users of this language/script. The preference for standard endonyms avoids this, and generally with languages picking an exonym is a big no-no.

I might ask around, and I'll keep this issue open, but no guarantees.

ghost commented 2 months ago

Thank you for your thoughtful response. I agree that both "正體" and "繁體" can express value judgments, which complicates the naming issue. The diversity of Chinese language users adds further complexity.

To address this, I propose an alternative approach inspired by Apple's website:

  1. Instead of selecting a language, users choose their region.
  2. Under an "Asia Pacific" category (or similar), we could list options like "Singapore", "中国大陆", "台灣", "香港", etc. Here is the example structure:

Asia Pacific

  • Indonesia
  • Singapore
  • 台灣
  • 香港
  • 中国大陆
  • etc.

This approach:

While this requires more code changes, it could provide a long-term solution that addresses current and potential future issues with language/region labeling.

I'm interested in your thoughts on this approach. Do you see any potential benefits or drawbacks?

Thank you again for your consideration and ongoing efforts to make Rust inclusive and accessible to all users.

Manishearth commented 2 months ago

We do not have that number of translations, and also that goes against other best practices for listing languages.

Apple's website does this because that is actually a region selector. That website performs commerce, it needs to know what country you're in. The Rust website does not.

Manishearth commented 2 months ago

I've asked around and I'm electing to stick to the current choices.

Fundamentally, a strong guiding principle for choosing language names is that the name must first and foremost be acceptable to the users of that language. Ultimately, users of another language/writing system do not get to dictate what a language is called by its own users.

It's actually rather common for language names to mean something that speakers of another language will recognize as far broader than the language. My go-to example for this is "Bahasa" (Indonesian), which ... just means "language". Not just in Bahasa itself, but in most languages of India! The various Goidelic languages all have an endonym that mostly just means "Gaelic". A lot of languages are named "language", a lot of writing systems are just named "writing", and here we're seeing a similar phenomenon. Things need to be done more carefully in case of actual ambiguity (e.g. Bahasa Indonesia vs Bahasa Melayu), but there isn't sufficient ambiguity here, and it appears that there's basically only one accepted endonym for Traditional Chinese in Taiwan.

Now that I'm thinking back I do recall us explicitly discussing this when we started translating the website, and apparently this used to be a recurring topic of contention in Mozilla circles, but overall has a relatively established conclusion of using 正體 there.

ghost commented 2 months ago

Thank you for your detailed response. I respectfully disagree with some points and would like to clarify:

  1. Decision-making process: While "asking around" is a start, it may lead to biased results due to limited sample size and potential echo chamber effects. A more comprehensive, unbiased survey of diverse Chinese language users would provide a more accurate representation.
  2. Common Practice in Open Source: Upon investigation, the use of "正體" seems to be an exception rather than the norm. Many major open-source organizations and programming language websites use "繁體" or avoid translation altogether:
  3. User Base Overlap: Traditional and simplified Chinese users are not distinct groups. Many use both systems interchangeably, depending on context. They are different writing systems of the same language, often used by the same individuals.
  4. Meaning of "正體": Unlike "Bahasa" (meaning "language"), "正體" implies "standard", "orthodox" or "correct", which carries potential value judgments. This is fundamentally different from a neutral descriptor.
  5. Inclusivity in Open Source: As a global project, Rust has an opportunity to lead in inclusivity. This might mean reconsidering established practices if they could unintentionally exclude or offend some users.
  6. Alternative Approach: Consider using regional labels instead of language names, or adopt the more widely used "繁体中文". This approach:
    • Aligns with common practices in the open-source community
    • Avoids potentially controversial language labels
    • Offers flexibility for future additions without risking controversies
    • Aims for neutrality, especially where language names are sensitive

Given these points, especially the practices of other major open-source projects, would it be possible to reconsider this decision? Perhaps we could conduct a more comprehensive survey of Chinese language users from various regions to ensure we're making the most inclusive choice.

Manishearth commented 2 months ago

> Decision-making process: While "asking around" is a start, it may lead to biased results due to limited sample size and potential echo chamber effects. A more comprehensive, unbiased survey of diverse Chinese language users would provide a more accurate representation.

It's worth noting: we previously went through discussions about this as a project. This was me just checking back since I didn't remember everything and wanted to make sure I didn't miss something.

> A more comprehensive, unbiased survey of diverse Chinese language users would provide a more accurate representation.

No, I'm sorry, I disagree here. The people who get to decide what a language (or language/script pair) is called are the speakers of that language. This particular website translation is in Traditional Chinese using Taiwanese Mandarin, and that's the user community we check with.

There are many valid reasons within a language community to change the words that are in use. If the Taiwanese community expresses a desire to move away from this term, we can reflect that. However, I do not think there are valid reasons for other groups to dictate what words they should use. We're being descriptive here, not prescriptive.

If translation teams for other Sinophone regions wish to contribute websites for their regional lect, we could probably make something work; but that's distinct from what this specific language/script pair is called by its users.

> User Base Overlap: Traditional and simplified Chinese users are not distinct groups. Many use both systems interchangeably, depending on context. They are different writing systems of the same language, often used by the same individuals.

To some extent, I agree, but there's a lot more nuance here, and I still fall back to treating these as separate but overlapping communities based on my understanding of the situation.

In particular:

> different writing systems of the same language

has so many layers of nuance to it, having to do with the standardization of something-like-written-Mandarin as cross-lect "Chinese", and the relegation of other 方言/lects to being second-class citizens.

Manishearth commented 2 months ago

As far as prior art is concerned, one thing that was brought up in the original discussion was that Mozilla does this too, and it is something that has been discussed ad nauseam in Mozilla spaces, with the conclusion of using 正體. Knowing it was a conscious choice made after weighing options carries a lot more weight there.

ghost commented 2 months ago

It is a bit humorous to refer to a standpoint privately agreed upon only by some members of the Mozilla and Rust communities as “a conscious choice”, especially considering that almost all other open source communities and commercial platforms NEVER use the "正體" translation.

Besides, you've been avoiding the key question: "正體" means "standard", "orthodox" or "correct", which carries value judgments. This is fundamentally different from a neutral descriptor.

And you still choose to stick to a biased translation term, while other open source communities and commercial platforms NEVER use it. This behavior difference somehow demonstrates the community's stance, doesn't it?

Manishearth commented 2 months ago

> Besides you've been avoiding the key question:

No, I'm not. The value judgement is irrelevant if that is what most people who speak Taiwanese Mandarin in Traditional Chinese call it. It's irrelevant if a translation that is not primarily intended for you carries something you perceive as a value judgement. I've stated that I consider it irrelevant already, I'm not avoiding the question.

> It is a bit humorous to refer to a standpoint privately agreed upon only by some of Mozilla and Rust community as “a conscious choice”, especially considering that almost all other open source communities and commercial platforms NEVER use "正體" translation.

As I understand it, these were public phpBB board discussions from decades ago within Mozilla. I'm not sure where they are now, or if those forums are even online anymore, but they happened, and it was the same flamewar over and over again.

As for the Rust side, even if the standpoint was privately agreed on by Rust team members, that's still a conscious choice made weighing the tradeoffs.

I'd be happy to engage with choices made by other open source communities if their reasoning was clear. Until then, I don't have reason to believe it was anything other than people filing issues like this one, which make what I consider to be an insufficient argument since they fundamentally come from an external viewpoint.

Manishearth commented 2 months ago

I think I've said all I can say on this topic and will probably not be responding further, because I seem to be repeating myself. I'd be open to this being discussed further if we had robust translation teams known to the project again (we used to, and might in the future when the compiler is being translated).