mdn / yari

The platform code behind MDN Web Docs
Mozilla Public License 2.0

Create Traditional and Simplified Chinese conversion tools #4530

Closed kecrily closed 3 years ago

kecrily commented 3 years ago

Summary

The gap between Traditional and Simplified Chinese is not large. We could convert source content written in either script into Traditional Chinese or Simplified Chinese with a Traditional-Simplified conversion tool, as Chinese Wikipedia does.

Why

MDN currently has two separate Chinese branches, zh-cn and zh-tw. The translations of a given document are roughly the same in both.

A Traditional Chinese user can read documents in Simplified Chinese with almost no problems, and vice versa.

The complete segregation of the zh-cn and zh-tw translations results in unnecessary duplication of effort. We can avoid this waste of manpower by using a conversion tool: let the tool convert translated content that mixes Traditional and Simplified Chinese into content that is entirely zh-cn or entirely zh-tw.

What needs to be done?

Some projects that might help

OpenCC

tso1158687 commented 3 years ago

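The conclusion first: language is not as simple as converting between Traditional and Simplified characters. The harder problems are lexical gaps and differences in customs and culture. If you want to learn more, I suggest looking into linguistics, semantics, and pragmatics.

Translation is more than converting text. That applies not only to translating English into Chinese but also to converting between Traditional and Simplified Chinese. Some terms have a one-to-one correspondence and convert without trouble, for example 硬碟/硬盤 (hard disk), 記憶體/內存 (memory), 雪梨/悉尼 (Sydney), and 執行緒/線程 (thread). A lookup table is enough to convert these.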
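The lookup-table approach for one-to-one terms can be sketched as follows. This is only a minimal illustration, not OpenCC and not anything in yari; the names `TW_TO_CN`, `build_converter`, `tw_to_cn`, and `cn_to_tw` are hypothetical, and the table holds only the four pairs quoted in the comment.

```python
import re

# Term pairs quoted in the comment (Taiwan usage -> mainland usage).
# The mainland terms are kept in the script used in the thread.
TW_TO_CN = {
    "硬碟": "硬盤",      # hard disk
    "記憶體": "內存",    # memory
    "雪梨": "悉尼",      # Sydney
    "執行緒": "線程",    # thread
}


def build_converter(table):
    """Return a function that replaces every key of `table` with its value."""
    # Try longer terms first so a long entry wins over a shorter prefix.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))

    def convert(text):
        return pattern.sub(lambda m: table[m.group(0)], text)

    return convert


tw_to_cn = build_converter(TW_TO_CN)
# The reverse direction only works because these pairs are one-to-one.
cn_to_tw = build_converter({v: k for k, v in TW_TO_CN.items()})

print(tw_to_cn("硬碟和記憶體"))
print(cn_to_tw("線程"))
```

A table like this only handles exact one-to-one terms. A word such as 酒店 is not in the table and passes through unchanged, which is the kind of limitation the rest of this comment describes.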

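The following problems remain:

1. Lexical gaps. Some words carry a cultural background and exist only within that culture. Even if you want to translate them, the other culture simply has no corresponding word. This is called a lexical gap.

Take 程序猿 (literally "program ape") as an example. It is hard to map it to any term in the Traditional Chinese used in Taiwan. The formal term is 程序員 (programmer); using 猿 (ape) adds a teasing or self-deprecating tone. In addition, 員 is a kind of rank designation that originated in the ranks of the People's Liberation Army and emphasizes that everyone is of equal rank: a cook is called 炊事員 and the person commanding troops is called 司令員. In the ROC armed forces (國軍), by contrast, rank can be read from the suffix, so the terms are 伙房兵 and 指揮官. This is not simply a matter of Traditional-Simplified conversion; different cultural backgrounds lead to different wording. So, back to 程序猿: in Taiwan's Traditional Chinese it cannot be fully matched with 工程師 (engineer), and no word with the same connotation is available for conversion. Conversely, many Taiwanese terms also hit lexical gaps when converted to mainland usage.

This is not a problem that writing a conversion program can solve. Translation is a professional discipline and should be analyzed and treated seriously.

2. Context of use. Another problem is that the context of use differs. In China, "hotel" can be translated as 酒店, and converting that directly to Traditional Chinese also gives 酒店. In Taiwan, however, 酒店 refers to a place for drinking, not a place to stay. The same word has different meanings in different contexts, and how to map it cannot be settled by an entry in a dictionary, because the meaning of a word changes across time and place. This is the vitality of a word's internal and external meaning.

Moreover, even when a corresponding word exists, if it does not fit the local social context, the same word will not necessarily feel the same.

Even swear words differ. Take the translation in this image as an example: https://i.imgur.com/ZNCQ4s8.jpg A Simplified Chinese translation might be 握草、你他媽這什麼. Converted directly into Traditional Chinese, it would be understood by Taiwanese readers, but Taiwanese people do not normally swear that way, so the contextual feel is lost. Conversely, a Traditional Chinese translation might be 幹你娘、這三小, which readers of Simplified Chinese may not understand even after conversion, because only Taiwanese people swear like that.

In summary, given lexical gaps and differences in context of use, translation is not a matter of writing a conversion program; the problems above have to be overcome. Is there a solution to these problems?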

seventhmoon commented 3 years ago

I think we should consider localization here. Terminology differs by region, e.g. Taxi = 的士 (zh-Hant-HK) = 計程車 (zh-Hant-TW) = 出租車 (zh-Hans-CN)

To maintain or improve readability, we should instead move in the other direction.
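The regional variants in this example can be modelled as a per-locale glossary rather than a script conversion. The sketch below is only an illustration under that assumption; `GLOSSARY` and `localize` are hypothetical names, not part of yari, and the single entry copies the terms exactly as written in the comment.

```python
# Hypothetical per-locale glossary keyed by an English source term.
# Values are copied from the comment as written.
GLOSSARY = {
    "taxi": {
        "zh-Hant-HK": "的士",
        "zh-Hant-TW": "計程車",
        "zh-Hans-CN": "出租車",
    },
}


def localize(term, locale):
    """Return the regional term for `locale`, or None when no entry exists."""
    return GLOSSARY.get(term.lower(), {}).get(locale)


print(localize("Taxi", "zh-Hant-HK"))
print(localize("Taxi", "zh-Hant-TW"))
```

Returning `None` for a missing entry, instead of falling back to another region's term, keeps a gap visible to a human translator rather than silently substituting wording from a different locale.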

lubatang commented 3 years ago

You cannot translate between these two languages word for word. They differ not only in vocabulary but also in grammar and terminology.

sammyfung commented 3 years ago

I suggest the administrator close this issue because I think the proposal and discussions in this issue reflect different political mindsets (between traditional and simplified Chinese) more than a real-world issue.

The proposer is a native Simplified Chinese user, but he is also proposing something for another "language" (locale), namely Traditional Chinese, which he does not use in his daily life or in his city/state/region.

Traditional Chinese users in Taiwan and Hong Kong have already given their opinions and explained their concerns here. So we should not keep working on this issue and should simply close it.

brianwchh commented 3 years ago

Simplified characters are overly crude.

You can disagree with the idea of this issue, but please respect the culture and languages of others.

You can disagree with what the author of this issue says, but there is no need to attack a script or a culture.

Do you know how many good people were killed by Chairman Mao just because they opposed the so-called simplified Chinese character revolution? It was a disaster for Chinese culture! As a mainland Chinese, I think the simplified (truncated) characters are largely meaningless. One can tell a story from a traditional Chinese character, while the truncated and destroyed characters become meaningless symbols! They confused us a lot while we were learning them. To be honest, simplified Chinese is ugly compared to traditional Chinese. We should go back to the traditional script to get our culture back on the right track! It is a cultural disaster led by a political disaster, and the man who forced his own will on the whole nation was Chairman Mao, who was "a kind of gatekeeper" of a library before he joined the army. I mean he was definitely not qualified to make this decision for us, even if he considered himself a king! He was really, really, really not qualified! It is such a tragedy that we were born to learn and use simplified Chinese without a choice, just because of this man!
When people defend the existence of simplified Chinese, they usually say that it exists for a reason! The only reason is that a lot of good people were killed and the rest were silenced, so no one dared to oppose his decision, and people had to accept the so-called truncated Chinese without a choice!

zhusee2 commented 3 years ago

Please, just don't try to "merge" the two languages.

I rely heavily on MDN docs both in Traditional Chinese and in English as a web developer. The docs are really great, thanks to the huge MDN community.

Reading docs in Simplified Chinese confuses me sometimes, however, because the terms can be quite different between Taiwan and China. Machine translation doesn't help much.

I know the terms "Traditional Chinese" and "Simplified Chinese" make the relationship sound similar to that between "American English" and "British English", but the gap between the two "Chinese" languages is far more significant than the gap within the English pair.

I would even suggest renaming them "Chinese (Taiwan)" and "Chinese (China)", because the difference is more about region than about Traditional versus Simplified character forms.

shugen002 commented 3 years ago

I know the terms "Traditional Chinese" and "Simplified Chinese" make the relationship sound similar to that between "American English" and "British English", but the gap between the two "Chinese" languages is far more significant than the gap within the English pair.

I would even suggest renaming them "Chinese (Taiwan)" and "Chinese (China)", because the difference is more about region than about Traditional versus Simplified character forms.

If you rename it to "Chinese (Taiwan)", what about people in Hong Kong or Macao?

Traditional Chinese is used not only by people in Taiwan but also by other overseas Chinese communities.

Please, just don't try to "merge" the two languages.

I rely heavily on MDN docs both in Traditional Chinese and in English as a web developer. The docs are really great, thanks to the huge MDN community.

Reading docs in Simplified Chinese confuses me sometimes, however, because the terms can be quite different between Taiwan and China. Machine translation doesn't help much.

I agree with this part. As a Simplified Chinese user, when I have to read Traditional Chinese I also experience this kind of cultural difference. Yes, I can read most Traditional Chinese characters, but only because they look similar to characters in Simplified Chinese. And for some words, we have to guess what they mean.

shugen002 commented 3 years ago

(In reply to brianwchh's comment above.)

What you said does not help with this issue or with the problem you described. Please stop spamming this issue.

br90218 commented 3 years ago

(In reply to brianwchh's comment above.)

Very unprofessional

Rumyra commented 3 years ago

I'm closing this issue so we have a chance to review it from the mdn side 👍

ccshan commented 3 years ago

How can a Chinese term be ambiguous?

https://tw.news.yahoo.com/%E9%AB%98%E7%AD%89%E6%95%B8%E5%AD%B8%E8%BC%83%E9%9B%A3-%E5%A4%A7%E9%99%B8%E5%8F%B0%E7%94%9F%E7%9A%84%E7%97%9B-215008638--finance.html

It is still the same example of 行 and 列 being mismatched with column and row, if I understood correctly.

You got it wrong.

Rumyra commented 3 years ago

Last year we decided at MDN not to use automation for translations. For a full explanation see https://hacks.mozilla.org/2020/12/an-update-on-mdn-web-docs-localization-strategy/ in particular:

"Many folks we talked to said that automated translations wouldn’t be acceptable in their languages. Not only would they be substandard, but a lot of MDN Web Docs communities center around translating documents. If manual translations went away, those vibrant and highly involved communities would probably go away"

We're working on other improvements for all contributors and localizers, such as Markdown support. A feature for only a specific locale will not be a priority, so we'll be closing this issue permanently.