mdn / yari

Create Traditional and Simplified Chinese conversion tools #4530

Closed kecrily closed 3 years ago

kecrily commented 3 years ago

Summary

There is no big gap between Traditional and Simplified Chinese. We can take source content written in a mix of Traditional and Simplified Chinese and convert it into either Traditional Chinese or Simplified Chinese with a Traditional-Simplified conversion tool, just as the Chinese Wikipedia does.

Why

There are now two different branches of MDN zh, zh-cn and zh-tw. The translation results for the same document are roughly the same for both.

A Traditional Chinese user can read documents in Simplified Chinese almost without any problems. And vice versa.

The complete segregation of zh-cn and zh-tw translations results in unnecessary duplication of effort. We can avoid this waste of manpower by using a conversion tool: translated content composed of both Traditional and Simplified Chinese would be converted by the tool into entirely zh-cn or entirely zh-tw content.

What needs to be done?

Some projects that might help

OpenCC

ccshan commented 3 years ago

If this idea is accepted in any form, we should keep the Traditional Chinese (the complex version) variant and treat it as the single source of truth. Then we translate it into the Simplified Chinese variant.

The reason is pretty straightforward: only complex things can be simplified, not vice versa.

But there are terms that are distinguished in zh-cn but not in zh-tw. For example, 支援 and 支持 are both used in both zh-cn and zh-tw, but the line between the two terms is drawn differently: you cannot blindly convert 支援 in zh-tw to 支持 in zh-cn, or you would get nonsensical results like 請支持收銀.

The words "traditional" and "simplified" are misnomers, or broad generalizations that don't hold on close inspection. I am happy with treating zh-cn and zh-tw as two different languages.

dryman commented 3 years ago

This is a very offensive task for Traditional Chinese users, because it disrespects the culture developed in the Traditional Chinese communities. Please do not proceed.

tomchentw commented 3 years ago

But there are terms that are distinguished in zh-cn but not in zh-tw. For example, 支援 and 支持 are both used in both zh-cn and zh-tw, but the line between the two terms is drawn differently: you cannot blindly convert 支援 in zh-tw to 支持 in zh-cn, or you would get nonsensical results like 請支持收銀.

The words "traditional" and "simplified" are misnomers, or broad generalizations that don't hold on close inspection. I am happy with treating zh-cn and zh-tw as two different languages.

Exactly!

Things become easier when everyone can put themselves in someone else's shoes. ☺️

toppy368 commented 3 years ago

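I am opposed to converting between Traditional (正體) Chinese and Simplified Chinese by mechanical or programmatic means, because the two locales use different words for particular technical terms. For example, a USB flash drive is customarily called 隨身碟 in Traditional Chinese but U盘 in Simplified Chinese. Worse, because of how Simplified Chinese characters work, mechanical conversion of some text fails easily and causes misunderstandings. For example, 乾燥 is written 干燥 in Simplified Chinese, and the simplified character 干 is also used in many other places, so conversion easily goes wrong. For that reason I do not recommend using machine translation directly.

There are too many cases like this. For example, 土豆 (also the name of a Chinese video site) is taken to mean 花生 (peanut) in Taiwan but 馬鈴薯 (potato) in China. I think it would be bad if that caused misunderstandings.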
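The 乾燥 / 干燥 case is an instance of a one-to-many mapping: one simplified character can correspond to several traditional characters, so a character-by-character converter has to guess. The following is a minimal Python sketch of that failure. The table is a tiny hand-written assumption for illustration; it is not taken from OpenCC or any real conversion dictionary.

```python
# Hypothetical, hand-written table for illustration only.
# One simplified character may correspond to several traditional characters.
S2T_CANDIDATES = {
    "干": ["乾", "幹", "干"],  # dry / to do, trunk / to interfere
    "发": ["發", "髮"],        # to emit / hair
}


def naive_s2t(text: str) -> str:
    """Character-by-character conversion that always picks the first candidate."""
    return "".join(S2T_CANDIDATES.get(ch, [ch])[0] for ch in text)


def ambiguous_chars(text: str) -> list[str]:
    """Characters that need a human or a phrase-level dictionary to resolve."""
    return [ch for ch in text if len(S2T_CANDIDATES.get(ch, [ch])) > 1]


print(naive_s2t("干燥"))            # 乾燥 (correct)
print(naive_s2t("干部"))            # 乾部 (wrong; the expected word is 幹部)
print(ambiguous_chars("干部头发"))  # ['干', '发']
```

Picking a different default candidate only moves the error to other words, which is why real converters need phrase-level tables or human review.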

irvin commented 3 years ago

Here I would like to provide an example of the difference in translation practice of technology books between Simplified Chinese and Traditional Chinese.

This is the screenshot of the first page of "Algorithms to Live By: The Computer Science of Human Decisions" (a nice one!)

On the left is the Simplified Chinese edition published by 中信出版集團 in China; on the right is the Traditional Chinese edition published by 行路出版社 in Taiwan.

[Screenshot, 2021-08-23 1:32:08 AM]

I took the text, converted the Simplified Chinese characters into Traditional characters, and compared the results:

[Screenshot, 2021-08-23 1:44:30 AM]

Obviously, they are almost totally different. Across the whole page, only 67 characters appear in the same sequence.

The two editions also have very different Chinese titles:

This further supports my experience and opinion that Traditional Chinese and Simplified Chinese have different translation practices. The differences are not just in the terms, but also in the phrasing that comes out of different cultural backgrounds.

Yes, I'm sure that readers from China can read the Taiwanese book and vice versa, but it is easiest for them to read and learn in their native way. A good book publisher doesn't simply convert the content to publish a book, and we shouldn't either.

kancheng commented 3 years ago

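I am a Taiwanese currently studying for a master's degree at a top university in mainland China.

Speaking from my past work experience, I have taken part in development for multiple regional locales. As required, the system converted its text by region: British English, American English, Traditional Chinese (Taiwan), and Simplified Chinese (mainland China), for people from mainland China, Hong Kong, Macau, and Taiwan applying to study at British schools. Even British English and American English differ in quite a few details, and handling the system text took a lot of care. The gap between Taiwan's Traditional Chinese and mainland China's Simplified Chinese in usage, idiom, and technical terminology is even larger.

The British people I met at work actually cared a great deal about this, though that may have been because they worked at an educational institution.

When users accustomed to mainland usage talk to Taiwanese people, they habitually say, "Yes!! I can read Traditional Chinese." But the Traditional Chinese they have in mind and the Traditional Chinese used in Taiwan may well differ greatly in idiom. To take an example from university mathematics textbooks: for a matrix, the convention in Taiwan is 直行橫列 (行 is vertical, 列 is horizontal), and in mainland China it is exactly the reverse. The mathematical notation for permutations and combinations also differs.

In Taiwan, when the Chinese text cannot explain something rigorously, we habitually read the original English book so that we can grasp what the document originally meant, whereas in mainland China every term gets translated directly into its own Simplified Chinese term. Of course users in Taiwan can also read Simplified Chinese, but for many key sentences we spend more time reading than we would with a Traditional Chinese article.

As for Wikipedia content, I mostly treat the English version as authoritative. If I use the Chinese version for a formal occasion, I revise it first. And in Taiwan, a student who cites Wikipedia in an important assignment is considered lazy and irresponsible.

In my practical experience with multi-locale work, a good Chinese text that "支持" (supports) localization is split into mainland Simplified Chinese and Taiwan Traditional Chinese. Very likely, to match users' habits, after it is written and before it is published for each region, it is handed to people from that region, who revise the wording and sentences repeatedly (mainland China, Hong Kong, Taiwan, Macau).

Plain, simple conversion is only suitable for web pages with simple sentences. Using a simple linear conversion for technical documentation will cause serious disasters in user-facing development practice.

Also, regarding British and American English: be aware that on some occasions, using incorrect British English or mistakenly using American English in front of British people in certain fields will offend them or strike them as impolite.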

flaing commented 3 years ago

Amusing. Someone is making his points.

chianuo commented 3 years ago

This suggestion reeks of political motivations and China's attempts at cultural assimilation and domination of the Chinese speaking world, in particular Taiwan, a country that China still threatens with violence and military conquest.

It also seems that there is a poor understanding of the linguistic differences involved. I would like to echo and support the points made by @kancheng , @kuanyui , @t7yang , @irvin , et al.

This is a terrible and quite frankly offensive idea, please do not allow this.

milkcask commented 3 years ago

Zh-wikipedia-5wlogo

So why Wikipedia can do, Mozilla can't?

I'm a veteran Wikipedian and one of the original authors of Chinese Wikipedia's conversion pipeline (and had a huge influence on i18n/l10n web standards) back in the 2000s. I believe we made a mistake in creating that mechanism. Current arrangements on Wikipedia require active fine-tuning every day (see https://zh.wikipedia.org/wiki/Wikipedia:字词转换 ), manual application of conversion themes (see https://zh.wikipedia.org/wiki/Wikipedia:字詞轉換處理/公共轉換組 ) and in-article overrides. It's a mess. It's simply impractical for smaller communities.

vkedwardli commented 3 years ago

There is no big gap between Traditional and Simplified Chinese.

Just FYR, a Hongkonger would have difficulty reading a zh-Hant-TW technical article (we use English terms a lot), while zh-Hans-CN always reads like an alien language (火星文).

zh-Hant-HK / yue-Hant-HK(粵文) exist for a reason.

thinkthink09 commented 3 years ago

lol, other companies work towards scalability and extensibility, while Mozilla wants limitations and constraints? Or is this propaganda for something?

lydian commented 3 years ago

I don't see any benefit in this proposal; I see only a way of disrespecting the culture of Traditional Chinese readers. As a Taiwanese, if there is no Traditional Chinese version, I would prefer English documents over Simplified Chinese ones, because it is exhausting for me to translate so many different terms. I can see a trend, led by the PRC government, to dominate the whole of Chinese-speaking and related culture, and we are working really hard against it. It's really sad that our culture is being gradually killed by Google Translate and many mainstream tools, because those companies decide to use terms from the Simplified Chinese world, whether intentionally or not.

Technology should be used to help everyone understand and respect different cultures. But this issue does the complete opposite: it conquers another culture in the name of convenience. This issue seems very evil to me, whether the author intended it or not.

RJHsiao commented 3 years ago

@kecrily said:

Thank you to those who do not participate in the discussion, but will vote. They let us know that there is such a large group of people in the world who are concerned about us. Thank you to the two hundred users who suddenly appeared this afternoon who did good deeds (maybe?) without leaving their names.

So I leave my name here as you want. 😝 Let me say something so that I look more conscientious about the discussion: if someone says "Traditional and Simplified Chinese are almost the same; the difference is only in characters and vocabulary that can be converted one-to-one", that means they understand Chinese "just a little bit", that is, they don't understand Chinese at all, even if they are Chinese and/or can speak Chinese.

@KevinZonda said:

So why Wikipedia can do, Mozilla can't?

The most important thing is that your premise is totally wrong: it is not really that Wikipedia "can" do it; they just did it and left developers and editors to face the chaos caused by THAT DECISION. You will know what I mean if you have ever edited a Chinese Wikipedia article that was basically written in the "other" Chinese rather than the one you normally use.

LLLgoyour commented 3 years ago

This suggestion reeks of political motivations and China's attempts at cultural assimilation and domination of the Chinese speaking world, in particular Taiwan, a country that China still threatens with violence and military conquest.

It also seems that there is a poor understanding of the linguistic differences involved. I would like to echo and support the points made by @kancheng , @kuanyui , @t7yang , @irvin , et al.

This is a terrible and quite frankly offensive idea, please do not allow this.

You can disagree with the idea in this issue, but you don't have to build conspiracy theories about another region or regime. There's no need to bring politics into the problem.

minipai commented 3 years ago

Yeah, there's no need to associate the problem with politics. Attempting to erase a language is itself already a political action.

yurenju commented 3 years ago

Thank you to those who do not participate in the discussion, but will vote. They let us know that there is such a large group of people in the world who are concerned about us. Thank you to the two hundred users who suddenly appeared this afternoon who did good deeds (maybe?) without leaving their names.

That is the worst decision we could make in this discussion. Here is a very offensive example: imagine letting everyone vote on whether gay people can get married. We don't do that in the real world, do we? Voting is not a good way to decide the rights of a minority.

gugod commented 3 years ago

Hi, another lurker here. :) In short, I semi-support incorporating a conversion tool of some kind, but not in the same way that the author @kecrily proposed.

While I support the development of conversion tools such as OpenCC as well as conversion dictionaries, I would like to point out that you don't want the capability of the conversion tool to determine the quality of your documentation. It took a long time for machine translators to finally stop always producing jokes :)

We should, however, take the tool and make it a machine-aided writing tool, because it could really help produce a lot of content. Tools that can batch-convert documentation from language A to language B can really help bootstrap the volume of content in language B. Perhaps this is similar to what @peterbe drafted with that piece of pseudocode.

However, you want the machine editor to be an aid or assistant, not a boss that decides the final output, as machine translations still contain errors. Unlike machines, we humans do not want errors. :)

While Wikipedia did choose to go with machine conversion and arguably made a "success" of boosting the volume (number of distinct titles) of their Chinese site, they implemented their own conversion tools and have been paying the price of having to fix all the conversion errors made by the machine, as we knew would happen. There are many cases in which conversion goes wrong, and the result is a document of suboptimal quality that human editors have to fix, to the extent that they extended the wiki syntax so that human editors can mark a region of text as non-convertible. That should be enough to demonstrate that there is an obvious ceiling. The tool exists and "works", but it does not work as nicely as it would in the Tower of Babel.
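To make the "mark a region as non-convertible" idea concrete, here is a minimal Python sketch. It assumes the `-{ ... }-` markers that, as far as I know, Chinese Wikipedia uses for this purpose; `toy_convert` and its single-entry term table are hypothetical stand-ins for a real converter, not any project's actual code.

```python
import re
from typing import Callable

# Text wrapped in -{ ... }- is copied through untouched.
PROTECTED = re.compile(r"-\{(.*?)\}-", re.DOTALL)


def convert_with_overrides(text: str, convert: Callable[[str], str]) -> str:
    """Apply `convert` only outside regions a human editor has protected."""
    out = []
    pos = 0
    for match in PROTECTED.finditer(text):
        out.append(convert(text[pos:match.start()]))  # convertible span
        out.append(match.group(1))                    # protected span, markers dropped
        pos = match.end()
    out.append(convert(text[pos:]))
    return "".join(out)


def toy_convert(text: str) -> str:
    # Hypothetical one-entry term table: zh-cn 內存 -> zh-tw 記憶體.
    return text.replace("內存", "記憶體")


print(toy_convert("海內存知己,內存不足"))
# 海記憶體知己,記憶體不足  (the first replacement is wrong)
print(convert_with_overrides("海-{內存}-知己,內存不足", toy_convert))
# 海內存知己,記憶體不足
```

Every such marker is a manual correction that someone has to add and maintain in the source text, which is the ongoing cost described above.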

For most of their work in character/phrase conversion, follow: https://zh.wikipedia.org/wiki/Wikipedia:%E5%AD%97%E8%A9%9E%E8%BD%89%E6%8F%9B%E8%99%95%E7%90%86

To get a sense of how human editors suffer (if I may say), you could take a peek at how many conflicts there are on this page: https://zh.wikipedia.org/zh-tw/Wikipedia:%E5%AD%97%E8%AF%8D%E8%BD%AC%E6%8D%A2 . It's not just that they still have to resolve errors every day; it's basically an endless task.

Unless we Mozillians are as committed as Wikipedia editors to perfecting the conversion tool (OpenCC or in-house) and its ecosystem (how the tool is used in MDN), I feel that the price we pay for using a conversion tool in the final stage of documentation is pretty much the same as if there were none, just shifted to a completely different domain.

I would suggest changing the scope of this issue to building a tool that makes it easier for human editors to copy a page from language A, convert it to language B with the conversion tool, and then start editing from there before committing the result.

Merging languages A and B together and letting the machine dictate the end result still feels like driving a double truck backward: you only get a tiny mirror to see what's wrong and can only fix the result indirectly. I don't think this is what you want.
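The copy, convert, then edit workflow suggested in this comment could look roughly like the Python sketch below. Every name here (`Draft`, `make_draft`, `approve`, `publish`, `toy_convert`) is hypothetical and only illustrates the idea that machine output is a draft that cannot be published until a human has edited and approved it; it is not MDN's or Yari's actual tooling.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Draft:
    target_locale: str
    body: str
    reviewed_by: Optional[str] = None  # stays None until a human signs off


def make_draft(source_body: str, target_locale: str,
               convert: Callable[[str], str]) -> Draft:
    """Machine-convert a page into an unreviewed draft."""
    return Draft(target_locale=target_locale, body=convert(source_body))


def approve(draft: Draft, editor: str, edited_body: str) -> Draft:
    """Record the human editor's corrected text and sign-off."""
    return Draft(draft.target_locale, edited_body, reviewed_by=editor)


def publish(draft: Draft) -> str:
    """Refuse to publish anything a human has not reviewed."""
    if draft.reviewed_by is None:
        raise ValueError("draft has not been reviewed by a human editor")
    return draft.body


def toy_convert(text: str) -> str:
    # Hypothetical one-entry term table standing in for a real converter.
    return text.replace("鼠标", "滑鼠")


draft = make_draft("点击鼠标", "zh-TW", toy_convert)  # body is "点击滑鼠": only partly converted
final = approve(draft, editor="some-editor", edited_body="點擊滑鼠")
print(publish(final))
```

The design choice is that the converter only ever produces the starting point; the text that gets committed is always the human editor's.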

brianwchh commented 3 years ago

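Are you sure the Simplified-Traditional conversion you provide is correct? Is it authoritative? Just do your job as a robot programmer. Simplified and traditional orthodox Chinese are two separate languages; don't try to play the role of some expert on writing! I type so-called Traditional characters with the 搜狗 pinyin input method, but as a Chinese person I do not trust at all that what it gives me is 100% correct traditional orthodox characters. I still have to ask people from Hong Kong or Taiwan who were educated in traditional Chinese: did this money-grubbing robot programmer's machine translation get it right? You can provide a translation tool that makes communication convenient (never mind whether it is accurate; it doesn't need to reach the level of philology, as long as it doesn't get in the way of making money), but if you are trying to set a standard that unifies everything, you are not qualified! The cultural difference between traditional orthodox Chinese and the Simplified-brainwashed Putonghua of us mainlanders is not as simple as the f(鼠标)=滑鼠 that a simple programmer has in mind! So I don't understand at all what the point is of wasting your time on this so-called standard! Computer memory and disks are big enough to hold both, and there are enough people more professional than you! Your standard is the only thing that is redundant! And even as a tool, I don't think it is necessarily authoritative enough! It's just like how nobody dares to publish a book translated from English to Chinese with Google Translate! At most it is an aid for guessing at the meaning!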

ThinkerYzu commented 3 years ago

It is a serious misunderstanding that the differences between Taiwanese Mandarin and China's Mandarin are like what is happening between British English and American English. I don't read any technical articles translated from China's Mandarin to Taiwanese Mandarin if possible. It is even far more difficult than reading technical articles in English for me. Two Mandarins are so similar, but are also different enough to block my brain. The major problem is when you think you know what it is, but it is in different meanings.

Another problem is that there is no separator between terms. There is no perfect solution for segmentation. In English there are spaces between words, but written Mandarin has none. Even if you have a table mapping terms to each other, you still don't know where the boundaries of the terms are in an article.
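The boundary problem can be shown with a short Python sketch. The one-entry term table is a hypothetical example (zh-cn 內存, "memory", mapped to zh-tw 記憶體), and the token list in the last call is segmented by hand, which is exactly the step that cannot be fully automated.

```python
# Hypothetical term table: zh-cn term -> zh-tw term.
TERMS = {"內存": "記憶體"}


def replace_substrings(text: str, table: dict[str, str]) -> str:
    """Replace every occurrence of a term, ignoring word boundaries."""
    for src, dst in table.items():
        text = text.replace(src, dst)
    return text


def replace_tokens(tokens: list[str], table: dict[str, str]) -> str:
    """Replace only whole tokens; requires the text to be segmented first."""
    return "".join(table.get(tok, tok) for tok in tokens)


print(replace_substrings("內存不足", TERMS))        # 記憶體不足 (correct)
print(replace_substrings("海內存知己", TERMS))      # 海記憶體知己 (wrong)
print(replace_tokens(["海內", "存", "知己"], TERMS))  # 海內存知己 (correct)
```

In 海內存知己 the characters 內 and 存 belong to different words (海內 / 存 / 知己), so the term table should not fire, but a substring search cannot know that.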

yurenju commented 3 years ago

@gugod's approach sounds good since we can make a tool to assist a human editor to not only boost translation speed but also have a quality translation.

LLLgoyour commented 3 years ago

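Are you sure the Simplified-Traditional conversion you provide is correct? Is it authoritative? Just do your job as a robot programmer. Simplified and traditional orthodox Chinese are two separate languages; don't try to play the role of some expert on writing! I type so-called Traditional characters with the 搜狗 pinyin input method, but as a Chinese person I do not trust at all that what it gives me is 100% correct traditional orthodox characters. I still have to ask people from Hong Kong or Taiwan who were educated in traditional Chinese: did this money-grubbing robot programmer's machine translation get it right? You can provide a translation tool that makes communication convenient (never mind whether it is accurate; it doesn't need to reach the level of philology, as long as it doesn't get in the way of making money), but if you are trying to set a standard that unifies everything, you are not qualified! The cultural difference between traditional orthodox Chinese and the Simplified-brainwashed Putonghua of us mainlanders is not as simple as the f(鼠标)=滑鼠 that a simple programmer has in mind! So I don't understand at all what the point is of wasting your time on this so-called standard! Computer memory and disks are big enough to hold both, and there are enough people more professional than you! Your standard is the only thing that is redundant! And even as a tool, I don't think it is necessarily authoritative enough! It's just like how nobody dares to publish a book translated from English to Chinese with Google Translate! At most it is an aid for guessing at the meaning!

And this reveals the real problem: Simplified or Traditional, the two differ in some way everywhere. Right now we are discussing a solution for this problem, and it doesn't help just to hear someone say "what is the point of wasting your time on this so-called standard". It would be better if you could give some suggestions that really help make the conversion solution better.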

lulalala commented 3 years ago

The author said that one should not just vote but instead show their name. I will try that (while trying to add something into the discussion)

| | Simplified Chinese user | Traditional Chinese user |
| --- | --- | --- |
| keep as is now | happy | mostly happy (the few who are unhappy have a choice) |
| makes the change | happy | unhappy (there is no choice) |

Right now we're discussing a solution for this problem here

I think his point is exactly that we are creating an imaginary problem instead of solving a real one.

ccshan commented 3 years ago

Many people have commented here suggesting that nothing be done. If the original proposal would have made the world worse, then that is a concrete, actionable, constructive suggestion.

@gugod's proposal of a machine-aided writing tool might help zh-tw contributors produce more content by incorporating machine-converted material from a variety of languages, not just zh-cn.

brianwchh commented 3 years ago

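Are you sure the Simplified-Traditional conversion you provide is correct? Is it authoritative? Just do your job as a robot programmer. Simplified and traditional orthodox Chinese are two separate languages; don't try to play the role of some expert on writing! I type so-called Traditional characters with the 搜狗 pinyin input method, but as a Chinese person I do not trust at all that what it gives me is 100% correct traditional orthodox characters. I still have to ask people from Hong Kong or Taiwan who were educated in traditional Chinese: did this money-grubbing robot programmer's machine translation get it right? You can provide a translation tool that makes communication convenient (never mind whether it is accurate; it doesn't need to reach the level of philology, as long as it doesn't get in the way of making money), but if you are trying to set a standard that unifies everything, you are not qualified! The cultural difference between traditional orthodox Chinese and the Simplified-brainwashed Putonghua of us mainlanders is not as simple as the f(鼠标)=滑鼠 that a simple programmer has in mind! So I don't understand at all what the point is of wasting your time on this so-called standard! Computer memory and disks are big enough to hold both, and there are enough people more professional than you! Your standard is the only thing that is redundant! And even as a tool, I don't think it is necessarily authoritative enough! It's just like how nobody dares to publish a book translated from English to Chinese with Google Translate! At most it is an aid for guessing at the meaning!

And this reveals the real problem: Simplified or Traditional, the two differ in some way everywhere. Right now we are discussing a solution for this problem, and it doesn't help just to hear someone say "what is the point of wasting your time on this so-called standard". It would be better if you could give some suggestions that really help make the conversion solution better.

Your so-called ambition to remove either one is not a good solution. There will never be a good conversion tool between traditional and truncated / simplified Chinese. Put simply, no one from China would dare to publish a book in Taiwan written with the 搜狗繁體 input method, because they would be unsure whether the so-called translation is correct. So should Chinese people trust 搜狗 or openCCCCCCC more? My point is that there is really no need for an unnecessary, less trustworthy duplicate of what 搜狗 already does.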

abbychau commented 3 years ago

I am used to Traditional Chinese (which is my native language), and it has also been my system language for more than 20 years. I work almost only in English, and sometimes Japanese, and I usually interact with the Simplified Chinese community.

So I can basically read all the terms.

In my opinion, differing "usage of terms" across communities is not a blocker that should lead to the conclusion "do not convert characters" (I don't want to get involved in the war over whether or not to convert terms). Indeed, one can read mainland terms such as 程序, 線程, and 鼠標 in Traditional Chinese characters without problems. Even if the reader does not know one of the terms, it won't cause a wrong meaning, but just an unknown, which is safe. The usage of terms is not necessarily bound to the character set.

Microsoft even provided Chinese versions derived from the English originals on MSDN. What is important is just to give a notice: "This document is machine translated. (xxx (link here) is the origin)". We can add something like "[link] help translate it" as well.

I noticed that Wikipedia was given as an example of automatic translation. My opinion is that, because the phrase boundaries of Chinese cannot be determined without human knowledge, automatic translation from Simplified Chinese to Traditional Chinese is technically impossible (although sometimes practically doable), especially when you want to involve terms and phrases.

Wikipedia's position is to record the truth. For MDN, though, I think providing an automatically character-converted Traditional Chinese version would help me navigate many of the documents, simply because Traditional Chinese is much more comfortable for me: I grew up in this language, I can read it faster, and it gives me greater psychological comfort.

One may argue that a simple browser plugin could do the job, but I strongly believe that is off topic.

Dobatymo commented 3 years ago

It's good to have tools which aid manual conversion. Of course both scripts should be kept as they are different scripts and largely represent different cultures.

@peterbe From a computational point of view, it would only make sense to keep Traditional only, not Simplified only, as only traditional can be converted to simplified losslessly, whereas converting from Simplified to Traditional is a complex task because ambiguities must be resolved.

dennischen commented 3 years ago

Dear Sir, regardless of the above opinions, this is a matter of human respect, not a machine coding issue. How about I ask you to drop Simplified Chinese and just generate it by code from Traditional Chinese?

LLLgoyour commented 3 years ago

It's good to have tools which aid manual conversion. Of course both scripts should be kept as they are different scripts and largely represent different cultures.

@peterbe From a computational point of view, it would only make sense to keep Traditional only, not Simplified only, as only traditional can be converted to simplified losslessly, whereas converting from Simplified to Traditional is a complex task because ambiguities must be resolved.

Just a thought: after converting sentences from Simplified Chinese to Traditional Chinese, use an engine such as Google's Neural Machine Translation to transform them into a localized language structure, and vice versa. It would be more efficient than aided manual conversion, and it could serve Chinese readers well in whichever direction they want to convert.

ccshan commented 3 years ago

it won't cause a wrong meaning, but just an unknown, which is safe

One counterexample: the rows and columns of a matrix.

only traditional can be converted to simplified losslessly

In fact, traditional cannot be converted to simplified losslessly either.
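A few well-known characters show why the Traditional-to-Simplified direction also needs context: 著 becomes 着 in 看著 but stays 著 in 著名, 乾 becomes 干 in 乾燥 but stays 乾 in 乾坤, and 瞭 becomes 了 in 瞭解 but stays 瞭 in 瞭望. The Python sketch below uses tiny hand-written tables as an assumption for illustration; they are not taken from any real conversion dictionary.

```python
# Hypothetical, hand-written tables for illustration only.
T2S_CHAR = {"著": "着", "乾": "干", "瞭": "了"}
# Phrases in which the character must NOT be converted.
T2S_PHRASE = {"著名": "著名", "乾坤": "乾坤", "瞭望": "瞭望"}


def t2s(text: str, use_phrases: bool) -> str:
    """Convert left to right, optionally letting phrase entries override characters."""
    out = []
    i = 0
    while i < len(text):
        if use_phrases:
            hit = next((p for p in T2S_PHRASE if text.startswith(p, i)), None)
            if hit is not None:
                out.append(T2S_PHRASE[hit])
                i += len(hit)
                continue
        out.append(T2S_CHAR.get(text[i], text[i]))
        i += 1
    return "".join(out)


print(t2s("看著著名的乾坤", use_phrases=False))  # 看着着名的干坤 (着名 and 干坤 are wrong)
print(t2s("看著著名的乾坤", use_phrases=True))   # 看着著名的乾坤
```

The phrase table fixes these cases only because someone listed them by hand, and it inherits the segmentation problem for any phrase that happens to straddle a word boundary.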

brianwchh commented 3 years ago

It is a serious misunderstanding that the differences between Taiwanese Mandarin and China's Mandarin are like what is happening between British English and American English. I don't read any technical articles translated from China's Mandarin to Taiwanese Mandarin if possible. It is even far more difficult than reading technical articles in English for me. Two Mandarins are so similar, but are also different enough to block my brain. The major problem is when you think you know what it is, but it is in different meanings.

Another problem is that there is no separator between terms. There is no perfect solution for segmentation. In English there are spaces between words, but written Mandarin has none. Even if you have a table mapping terms to each other, you still don't know where the boundaries of the terms are in an article.

Say you want to publish a book in Taiwan: would you trust the 搜狗繁體 input method? Are you sure about the conversion between Simplified Chinese and Traditional Chinese? I seriously doubt that anyone would dare to do that without professional consultation from a local person. If a native mainland Chinese speaker is not sure about the so-called conversion tool, imagine the confusion for foreigners. IT IS A RIDICULOUS IDEA TO REPLACE EITHER ONE. If there is really a need to remove one, I would prefer to remove the truncated version of Chinese characters, which has proved to be a cultural disaster. Changing a language is not like killing a few people, something you can decide on a whim. If that library doorkeeper had not become the chairman of the state, would you still treat crippled Chinese as holy scripture!?

dennischen commented 3 years ago

You can disagree with the idea in this issue, but you don't have to build conspiracy theories about another region or regime. There's no need to bring politics into the problem.

When you try to erase the words and characters of a country, that is a HUGE political issue. If Americans wanted to remove Simplified Chinese and replace it with AI translation, and your great China had no problem with that, then we could come back here and discuss whether this is a political issue.

abbychau commented 3 years ago

it won't cause a wrong meaning, but just an unknown, which is safe

When one is reading 列 and 欄 , instead of coverting them into "column / row"(regardless of order), due to his/her education. One rather imagines that it is either property of a grid system.

e.g. https://sealnote.net/tech/excel%E7%9A%84%E6%AC%84%E5%88%97%E4%B8%80%E5%BC%B5%E5%9C%96%E7%A7%92%E6%87%82%E4%B8%8D%E5%BF%98/

Misreading 列 (as used in China) as "row" is totally acceptable (though it may not be acceptable in a technical doc). Either way, with the notice "it is converted from Simplified Chinese", and with gradual correction helps from the community, this is not a big problem in practice.

only traditional can be converted to simplified losslessly

In fact, traditional cannot be converted to simplified losslessly either.

True, but it is a non-argument.

ccshan commented 3 years ago

When one is reading 列 and 欄 , instead of coverting them into "column / row"(regardless of order), due to his/her education. One rather imagines that it is either property of a grid system.

I'm not sure what solution you are proposing. This may be in part because I'm having trouble understanding your English. Perhaps you are suggesting that all zh-cn users be re-educated in a certain way that does not match how they currently are used to talking about matrices? By the way, I am not talking about 欄

abbychau commented 3 years ago

When one is reading 列 and 欄 , instead of coverting them into "column / row"(regardless of order), due to his/her education. One rather imagines that it is either property of a grid system.

I'm not sure what solution you are proposing. This may be in part because I'm having trouble understanding your English. Perhaps you are suggesting that all zh-cn users be re-educated in a certain way that does not match how they currently are used to talking about matrices? By the way, I am not talking about 欄

my solution is just to keep 欄 as 欄 , 列as 列, column as column, row as row; even if the meaning behind is wrong.

komali2 commented 3 years ago

When one is reading 列 and 欄 , instead of coverting them into "column / row"(regardless of order), due to his/her education. One rather imagines that it is either property of a grid system.

I'm not sure what solution you are proposing. This may be in part because I'm having trouble understanding your English. Perhaps you are suggesting that all zh-cn users be re-educated in a certain way that does not match how they currently are used to talking about matrices? By the way, I am not talking about 欄

If your interpretation of their suggestion feels absurd, then you'll understand why others believe the OP's suggestion also feels absurd. Traditional character users shouldn't be obligated to be re-educated to use simplified "just because a lot of people use it."

LLLgoyour commented 3 years ago

When one is reading 列 and 欄 , instead of coverting them into "column / row"(regardless of order), due to his/her education. One rather imagines that it is either property of a grid system.

I'm not sure what solution you are proposing. This may be in part because I'm having trouble understanding your English. Perhaps you are suggesting that all zh-cn users be re-educated in a certain way that does not match how they currently are used to talking about matrices? By the way, I am not talking about 欄

my solution is just to keep 欄 as 欄 , 列as 列, column as column, row as row; even if the meaning behind is wrong.

Or do you mean keep them as "common knowledge"? since you mention "and with gradual correction helps from the community". I'm a little bit confused.

ccshan commented 3 years ago

my solution is just to keep 欄 as 欄 , 列as 列, column as column, row as row; even if the meaning behind is wrong.

Again, I am not talking about 欄, which in the MDN context probably means a field on a form. Neither zh-cn or zh-tw uses 欄 to refer to a part of a matrix.

Still, you seem to have come to agree that there are counterexamples to your earlier claim that "it won't cause a wrong meaning".

abbychau commented 3 years ago

my solution is just to keep 欄 as 欄 , 列as 列, column as column, row as row; even if the meaning behind is wrong.

Again, I am not talking about 欄, which in the MDN context probably means a field on a form. Neither zh-cn or zh-tw uses 欄 to refer to a part of a matrix.

Still, you seem to have come to agree that there are counterexamples to your earlier claim that "it won't cause a wrong meaning".

do you have one more to make it plural?

ccshan commented 3 years ago

my solution is just to keep 欄 as 欄 , 列as 列, column as column, row as row; even if the meaning behind is wrong.

Again, I am not talking about 欄, which in the MDN context probably means a field on a form. Neither zh-cn or zh-tw uses 欄 to refer to a part of a matrix. Still, you seem to have come to agree that there are counterexamples to your earlier claim that "it won't cause a wrong meaning".

do you have one more to make it plural?

Sure, it's the very next sentence in the original message I linked to: "排列組合的數學表達也不一樣。" ("The mathematical notation for permutations and combinations also differs.")

brianwchh commented 3 years ago

This suggestion reeks of political motivations and China's attempts at cultural assimilation and domination of the Chinese speaking world, in particular Taiwan, a country that China still threatens with violence and military conquest. It also seems that there is a poor understanding of the linguistic differences involved. I would like to echo and support the points made by @kancheng , @kuanyui , @t7yang , @irvin , et al. This is a terrible and quite frankly offensive idea, please do not allow this.

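You can disagree with the idea in this issue, but you don't have to build conspiracy theories about another region or regime. There's no need to bring politics into the problem.

I want to say: talking about politics is not a crime, nor is it shameful; talking about politics is a responsibility. Don't keep holding up the sign that says "talk only of romance, not of affairs of state" (只談風月,不談國事). This is not a brothel! Anything can be discussed! And does the regime across the strait even need anyone to build conspiracy theories about it? Are you stupid or malicious? Playing innocent and cute! China's system makes every Chinese person a political tool. It has done more bad things in broad daylight than can be recorded, so it is normal for others to be suspicious! Those who are not suspicious are the naive ones!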

duncanhsieh commented 3 years ago

I'm a Traditional Chinese user, but I can't read documents in Simplified Chinese. Traditional Chinese has a complete meaning of words, so traditional Chinese should be considered as the only Chinese document.

abbychau commented 3 years ago

my solution is just to keep 欄 as 欄 , 列as 列, column as column, row as row; even if the meaning behind is wrong.

Again, I am not talking about 欄, which in the MDN context probably means a field on a form. Neither zh-cn or zh-tw uses 欄 to refer to a part of a matrix. Still, you seem to have come to agree that there are counterexamples to your earlier claim that "it won't cause a wrong meaning".

do you have one more to make it plural?

Sure, it's the very next sentence in the original message I linked to: "排列組合的數學表達也不一樣。" ("The mathematical notation for permutations and combinations also differs.")

I couldn't get it. Isn't it the same example of 直行橫列?

ccshan commented 3 years ago

I couldn't get it. Isn't it the same example of 直行橫列?

No. How did you learn to notate your 排列組合?

adaam commented 3 years ago

I don't think conversion is needed. If people can read both Simplified Chinese and Traditional Chinese, there is no reason to convert, right? Having pages in the two languages just shows that they are two different languages. People who want to know the content and can read both will choose whichever one they understand.

abbychau commented 3 years ago

I couldn't get it. Isn't it the same example of 直行橫列?

No. How did you learn to notate your 排列組合?

I see. I learnt Math in English. I suppose you are referring to Combination and Permutation.

How can one of the Chinese terms be ambiguous?

ccshan commented 3 years ago

How can one of the Chinese terms be ambiguous?

https://tw.news.yahoo.com/%E9%AB%98%E7%AD%89%E6%95%B8%E5%AD%B8%E8%BC%83%E9%9B%A3-%E5%A4%A7%E9%99%B8%E5%8F%B0%E7%94%9F%E7%9A%84%E7%97%9B-215008638--finance.html

kuanyui commented 3 years ago

tonyhhyip commented 3 years ago

Let me cite some journal articles and studies on the differences between Traditional and Simplified Chinese, in order to provide more information about the harm of this proposal.

abbychau commented 3 years ago

How can one of the Chinese terms be ambiguous?

https://tw.news.yahoo.com/%E9%AB%98%E7%AD%89%E6%95%B8%E5%AD%B8%E8%BC%83%E9%9B%A3-%E5%A4%A7%E9%99%B8%E5%8F%B0%E7%94%9F%E7%9A%84%E7%97%9B-215008638--finance.html

If I understand it correctly, it is still the same example: the mismatch in how 行 and 列 map to column and row.

huichiaotsou commented 3 years ago

If Simplified Chinese users can read Traditional, let's write all the documentation in Traditional Chinese 🤗 Traditional Chinese supports more characters, so the sentences will be clearer and more precise.