w3c / clreq

Requirements for Chinese Text Layout
https://www.w3.org/International/clreq/

Feedback on Chinese Layout Requirements Links (Draft) #619

Open eisoch opened 3 months ago

eisoch commented 3 months ago

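In section 2, Chinese Script Overview, of the document Chinese Layout Requirements Links, there are some issues worth correcting.

  1. “In its 'main' category, CLDR lists 2,210 characters for the Simplified Chinese orthography, and 2,180 for Traditional Chinese. Combined, this includes 3,026 unique characters, and an overlap of 1,064 characters. A working set of characters for modern Chinese may include 3 times this number, and the Unicode Standard includes approaching 100,000 Han characters, many of which are archaic or esoteric.” Generally speaking, mainland China, Taiwan, the Hong Kong SAR, and other regions each have their own official standards. For mainland China, see Tier 1 of the 《通用规范汉字表》 (Table of General Standard Chinese Characters), which has 3,500 characters; the kTGH property in the Unihan DB can also be consulted. For Taiwan, see the new edition of the 《常用“国字”标准字体表》 (Chart of Standard Forms of Common National Characters), which has 4,808 characters; the Unihan DB currently lacks a corresponding property, and one could be added if necessary. For the Hong Kong SAR, see the 《常用字字形表》 (1997 revised edition), which has 4,759 characters (excluding variant forms); the Unihan DB has a kHKGlyph property, but it still has some issues that need to be corrected. The Macao SAR has not yet published a character list of its own. In addition, Singapore has published some primary-school character lists, such as 《〈欢乐伙伴〉小学华文生字表》 and 《〈欢乐伙伴〉小学高级华文生字表》, but the exact number of characters they cover is unclear.

  2. “Chinese has no combining marks,” In textbooks used by university Chinese departments, in Chinese opera (especially Kunqu), and in similar contexts, the four tone-marking combining marks U+302A to U+302D are often needed. For the sake of rigor, a qualifier such as "commonly" or "in general use" could be added.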

xfq commented 3 months ago

Discussions in yesterday's meeting: https://www.w3.org/2024/05/08-clreq-minutes.html#t01

xfq commented 3 months ago

cc @r12a

xfq commented 3 months ago

Trying to translate the first comment:

In section 2, Chinese Script Overview, of the document Chinese Layout Requirements Links, there are some issues worth correcting.

“In its 'main' category, CLDR lists 2,210 characters for the Simplified Chinese orthography, and 2,180 for Traditional Chinese. Combined, this includes 3,026 unique characters, and an overlap of 1,064 characters. A working set of characters for modern Chinese may include 3 times this number, and the Unicode Standard includes approaching 100,000 Han characters, many of which are archaic or esoteric.”

Generally speaking, mainland China, Taiwan, the Hong Kong SAR, and other regions each have their own official standards.

For mainland China, refer to Tier 1 of the Table of General Standard Chinese Characters (Chinese: 通用规范汉字表), which has a total of 3,500 characters. The kTGH property in the Unicode Han Database can also be consulted.

For Taiwan, refer to the new edition of the Chart of Standard Forms of Common National Characters (Chinese: 常用國字標準字體表), which has a total of 4,808 characters. The Unicode Han Database currently lacks a corresponding property; one could be added if necessary.

For the Hong Kong SAR, refer to the List of Graphemes of Commonly-Used Chinese Characters, 1997 revised edition (Chinese: 常用字字形表), which has a total of 4,759 characters (excluding variant forms). The Unicode Han Database has a kHKGlyph property, but it still has some problems that need to be corrected.

The Macao SAR has not yet released its standard list of Chinese characters.

In addition, Singapore has released some character lists for primary schools, such as the "'Happy Buddy' Chinese Character List for Primary Schools" (Chinese: 《欢乐伙伴》小学华文生字表) and the "'Happy Buddy' Advanced Chinese Character List for Primary Schools" (Chinese: 《欢乐伙伴》小学高级华文生字表), but the exact number of characters they cover is not clear.
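
As a rough illustration of how the kTGH data mentioned above could be used to pick out the Tier 1 characters, here is a minimal Python sketch. It assumes each Unihan line is tab-separated in the shape `U+XXXX<TAB>kTGH<TAB>year:index`, and that Tier 1 corresponds to indices 1 through 3,500; both should be checked against the Unihan documentation. The function name `tier1_characters` and the sample lines are hypothetical and made up for illustration, not copied from the database.

```python
import re

# Assumed line shape: "U+4E00<TAB>kTGH<TAB>2013:1"
# (code point, property name, "year:index"). Verify against the Unihan docs.
KTGH_LINE = re.compile(r"^U\+([0-9A-F]{4,6})\tkTGH\t(\d{4}):(\d+)$")

# Tier 1 size as stated in the comment above (3,500 characters).
TIER1_MAX_INDEX = 3500


def tier1_characters(lines):
    """Return the characters whose kTGH index falls within 1..3500."""
    result = []
    for line in lines:
        match = KTGH_LINE.match(line.strip())
        if match is None:
            continue  # skip comments, blank lines, and other properties
        code_point, _year, index = match.groups()
        if 1 <= int(index) <= TIER1_MAX_INDEX:
            result.append(chr(int(code_point, 16)))
    return result


# Made-up sample lines for illustration only; the index values are invented.
sample = [
    "# comment line",
    "U+4E00\tkTGH\t2013:1",
    "U+4E01\tkTGH\t2013:3600",
    "U+4E00\tkMandarin\tyī",
]
print(tier1_characters(sample))
```

Run against the real Unihan file that carries kTGH, the length of the returned list could be compared with the 3,500 figure quoted for Tier 1.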

“Chinese has no combining marks,”

In textbooks used by university Chinese departments, in Chinese opera (especially Kunqu), and in similar contexts, the four tone-marking combining marks U+302A to U+302D are often needed. For the sake of rigor, a qualifier such as "commonly" or "in general use" could be added to this sentence.
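
To make the point about U+302A to U+302D concrete, the following small Python sketch looks the four characters up in the standard library's `unicodedata` module and prints each one's name, general category, and canonical combining class. The helper `is_combining_mark` is a hypothetical name introduced here; it treats any general category starting with "M" as a combining mark, which is a simplification.

```python
import unicodedata


def is_combining_mark(ch):
    """True if the character's general category is a Mark (Mn, Mc, or Me)."""
    return unicodedata.category(ch).startswith("M")


# U+302A..U+302D: the four tone marks named in the comment above.
for cp in range(0x302A, 0x302E):
    ch = chr(cp)
    print(
        f"U+{cp:04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.combining(ch),  # non-zero means it reorders as a combining mark
        is_combining_mark(ch),
    )
```

A non-zero combining class together with a Mark category is what supports hedging "Chinese has no combining marks" with a phrase like "in general use".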

xfq commented 1 month ago

PR in https://github.com/w3c/clreq/pull/627