w3c / clreq

Requirements for Chinese Text Layout
https://www.w3.org/International/clreq/

How to handle multisyllable/multiword ruby annotations in Chinese #125

Open r12a opened 7 years ago

r12a commented 7 years ago

In the examples shown in clreq all the word-based annotations, such as píngshéng or xiàohuà, have no space between the syllables. Is this always the case, or do you sometimes see the pinyin for a word as two (or more) syllables with space between?

The example that includes Keith Emerson seems to attach those two names to separate runs of hanzi characters.

Are there cases (presumably there are, for bilingual annotations or interlinear comments) where more than one Latin-script word is included in a single annotation, with a space between them? If such an annotation is short and appears over a long base text, what happens to the words?

Here I'm modifying the Keith Emerson example to show what I mean. The base text is wide, the annotation is much narrower, and the single annotation contains two words.

Should they be set solid and centred, like this: [image: keithemerson-centred]

Or should they follow the CSS space-around setting, which leaves a large gap in the middle (this is Firefox's default): [image: keithemerson-saround]

(I'm assuming that it would be wrong to equally space all the characters across the ruby text box, as you would for hanzi annotations.)
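To make the difference between the two options concrete, here is a minimal Python sketch that computes where each word of a two-word annotation would start. It is only an illustration, not any browser's implementation: the function names, the widths, and the assumption that the word space is the only place where extra room can go in Latin text are all hypothetical. The space-around variant treats the leftover room as one share per word gap plus one extra share split half before and half after the annotation.

```python
def centred(base_width, word_widths, space_width):
    """Set the annotation solid and centre it over the base.

    Returns the x offset at which each word starts.
    """
    natural = sum(word_widths) + space_width * (len(word_widths) - 1)
    x = (base_width - natural) / 2  # equal margin on both sides
    starts = []
    for w in word_widths:
        starts.append(x)
        x += w + space_width
    return starts


def space_around(base_width, word_widths, space_width):
    """Put the extra room at the word gaps, with a half share at each end.

    Assumption: only the spaces between words may be stretched.
    """
    gaps = len(word_widths) - 1
    if gaps == 0:
        # A single word has nothing to stretch, so it is simply centred.
        return centred(base_width, word_widths, space_width)
    natural = sum(word_widths) + space_width * gaps
    extra = base_width - natural
    share = extra / (gaps + 1)  # one share per gap, one more split over both ends
    x = share / 2
    starts = []
    for w in word_widths:
        starts.append(x)
        x += w + space_width + share
    return starts


# Hypothetical widths: a 120-unit base, words of 30 and 50 units, a 4-unit space.
print(centred(120, [30, 50], 4))       # [18.0, 52.0]
print(space_around(120, [30, 50], 4))  # [9.0, 61.0]
```

With these numbers the centred version keeps the normal 4-unit word space, while the space-around version widens the gap between the two words to 22 units, which is the "large gap in the middle" described above.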

xfq commented 5 years ago

I think multiword ruby annotations should be set solid and centered.

ryukeikun commented 5 years ago

> In the examples shown in clreq all the word-based annotations, such as píngshéng or xiàohuà, have no space between the syllables. Is this always the case, or do you sometimes see the pinyin for a word as two (or more) syllables with space between?

According to GB/T 16159-2012, “Basic rules of the Chinese phonetic alphabet orthography”, pinyin should in principle be spelled word by word (see 4.2 and 5.1), so there is no need to add a space between the syllables of a word. Spelling character by character should be treated as a special case (see 7.1).

The most frequently used style for ruby annotations in China is "centered". So in the case of "Keith Emerson", it should look like this: [image: screenshot]

That is, your first option is fine.

As for your second option, I think it simply follows the JIS 1-2-1 rule while counting the space as a character. The JIS 1-2-1 rule was designed for ruby made up of kana or ideographs, where by default every character occupies a square.

[image: rubi_123]

But when you apply it to proportional Latin annotations, especially ones that include a word space, JIS 1-2-1 renders them like this, which looks quite odd to Chinese readers. In this case it is better to annotate the two words separately. [image: screenshot]
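For reference, the 1-2-1 distribution mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every ruby character is taken to be a square of the same width, and the function name and the widths are hypothetical. The space before the first character and after the last one is half the space between adjacent characters, hence the 1:2:1 ratio.

```python
def one_two_one(base_width, char_count, char_width):
    """Place equal-width ruby characters over a wider base in a 1:2:1 ratio.

    Returns the x offset at which each ruby character starts.
    Assumes the base is at least as wide as the ruby text.
    """
    extra = base_width - char_count * char_width
    gap = extra / char_count  # full gap between neighbouring characters
    edge = gap / 2            # half gap before the first and after the last
    return [edge + i * (char_width + gap) for i in range(char_count)]


# Hypothetical widths: three 10-unit ruby characters over a 60-unit base.
print(one_two_one(60, 3, 10))  # [5.0, 25.0, 45.0]
```

The rule works well when every glyph fills the same square. Proportional Latin letters and word spaces do not, which is why the result looks uneven when the same rule is applied to an annotation such as "Keith Emerson".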

Note 3.3.1 c and Fig. 3.56 in JLreq also explain this:

> when the base text or ruby text is Latin word, the word is set with western solid setting, and no inter-character space will be added to any ruby or base text in Latin characters no matter how different the ruby and base text look in length (see Fig. 3.56).

In any case, our editors think "centered" should be treated as the principal method in CLreq.