Open kojiishi opened 10 months ago
3 opinions in the Japanese discussion, all agreed:
@kojiishi https://github.com/w3c/csswg-drafts/pull/9503 tries to address this (and some other things), can you have a look?
/cc @clqsin45 @nt1m
cc @fantasai @vitorroriz for WebKit input.
As pointed out in https://github.com/w3c/csswg-drafts/issues/9501#issuecomment-1779704918, we might have a related question about Bopomofo. By the way, how about Hangul? Maybe it should be treated similarly to Bopomofo?
Should these be made to fit in the ideographs category, or non-ideographic letters, or neither like half-width katakana?
What about other East Asian scripts, such as:
All these have East_Asian_Width set to 'Wide', but not 'Fullwidth'.
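For anyone who wants to check the property values themselves, here is a minimal sketch using Python's standard `unicodedata` module. The sample characters and the `east_asian_width_of` helper are my own picks for illustration, not something from the spec, and the result for newer scripts depends on the Unicode version bundled with your Python build.

```python
import unicodedata

# Sample characters chosen for illustration only (an assumption, not a spec list).
SAMPLES = {
    "Han": "\u6f22",                  # 漢
    "Hangul syllable": "\ud55c",      # 한
    "Bopomofo": "\u3105",             # ㄅ
    "Fullwidth Latin": "\uff21",      # Ａ
    "Halfwidth Katakana": "\uff76",   # ｶ
    "Tangut": "\U00017000",           # needs a Unicode database that knows Tangut
}

def east_asian_width_of(ch: str) -> str:
    """Return the short East_Asian_Width value: 'W', 'F', 'H', 'Na', 'A' or 'N'."""
    return unicodedata.east_asian_width(ch)

if __name__ == "__main__":
    for label, ch in SAMPLES.items():
        print(f"U+{ord(ch):04X} {label}: {east_asian_width_of(ch)}")
```

'W' is Wide and 'F' is Fullwidth, so this makes it easy to see which characters fall into which bucket.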
how about Hangul?
@jungshik Please see the comment above. Would Korean readers want 1/8em auto-spacing between Hangul and Han Ideograph?
What about other East Asian scripts, such as:
I would prefer not to go too far. This isn't a logical choice but just a pattern that has been used for a long time, so there is no single right answer, and the set can likely be updated later based on feedback.
@frivoal Khitan, Nushu, and Tangut should be categorized as ideographic. In fact, all Wide characters should be categorized this way. I don't think users would expect spacing between them and Han ideographs.
I agree with Khitan and Tangut (and maybe Jurchen and Classical Yi).
However, Nüshu is vastly different from Han characters, and its ideal character frame is rectangular, with the width of the character less than the height of the character. We will discuss this in the clreq group.
I'm no expert on Khitan Small Script, but I think it is a bit special since characters are arranged in 2-dimensional groups, separated by spaces. Here's a sample (the blue highlight shows a space):
https://r12a.github.io/scripts/samples/index.html?script=kits
The Khitan small script sequences do have spaces between them, and the spaces are there to terminate the words.
The problem with the Noto font, though, is that it deviates from the aesthetic and design of the script. The width of the top part and the bottom part should be the same. The following screenshot is a correct example:
If there is a zero width space or no space between runs of text in Khitan small script and non-ideographic letters/numerals, the layout engine should add the extra spacing. If there is already a space (U+0020) between the text in Khitan small script and non-ideographic letters/numerals, there is no need for the layout engine to add extra spacing.
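To make the rule above concrete, here is a rough sketch. Everything in it is an assumption for illustration: the helper names are hypothetical, Khitan small script is detected by its block range (U+18B00 to U+18CFF), and "non-ideographic letters/numerals" is approximated as any letter or number that is not East Asian Wide, Fullwidth, or Halfwidth. It is not the spec's actual definition.

```python
import unicodedata

KHITAN_SMALL_SCRIPT = range(0x18B00, 0x18D00)  # Khitan Small Script block
ZWSP = "\u200b"                                # ZERO WIDTH SPACE

def is_khitan_small(ch: str) -> bool:
    return ord(ch) in KHITAN_SMALL_SCRIPT

def is_non_ideographic_letter_or_numeral(ch: str) -> bool:
    # Rough stand-in (assumption): a letter or number that is not East Asian
    # Wide, Fullwidth, or Halfwidth.
    is_letter_or_number = unicodedata.category(ch)[0] in "LN"
    return is_letter_or_number and unicodedata.east_asian_width(ch) not in ("W", "F", "H")

def needs_extra_spacing(before: str, between: str, after: str) -> bool:
    """`between` is whatever sits between the two characters: '', ZWSP, ' ', ..."""
    is_boundary = (
        (is_khitan_small(before) and is_non_ideographic_letter_or_numeral(after))
        or (is_non_ideographic_letter_or_numeral(before) and is_khitan_small(after))
    )
    # Add spacing only when nothing visible separates the runs;
    # an existing U+0020 already provides the gap.
    return is_boundary and between in ("", ZWSP)

if __name__ == "__main__":
    khitan = "\U00018B00"
    print(needs_extra_spacing(khitan, "", "A"))    # no separator
    print(needs_extra_spacing(khitan, ZWSP, "A"))  # zero width space
    print(needs_extra_spacing(khitan, " ", "A"))   # already has U+0020
```

A real layout engine would work on runs and shaping results rather than single characters, so treat this only as a statement of the decision logic.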
FWIW, there is also a proposal in Unicode about this: https://www.unicode.org/L2/L2023/23283-auto-spacing-prop.pdf
According to 8.5.2. Text Spacing Character Classes, halfwidth Kana belongs to "non-ideographic letters". This means that there will be an auto-space between ideographs and halfwidth Kana. This seems like an oversight.
The question is whether they should be "ideograph" or not. Terminology-wise, Kana is derived from Han, so it's closer to ideograph than to non-ideograph. Behavior-wise, I don't think authors expect auto-space between ideograph/fullwidth Kana and halfwidth Kana.
So I think we should classify them as belonging to neither "ideographs" nor "non-ideographic letters". I'll check with JLTF about this separately.
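As a sketch of what "neither" would mean in practice, the snippet below classifies characters into three buckets and only auto-spaces an "ideograph" against a "non-ideographic" character. The bucket names, the `spacing_class` and `wants_auto_space` helpers, and the name-prefix matching are simplifications I am assuming for illustration; the real classes in the spec are defined differently and cover more characters.

```python
import unicodedata

def spacing_class(ch: str) -> str:
    """Very simplified classification (assumption, not the spec's definition)."""
    name = unicodedata.name(ch, "")
    # Proposed behavior: halfwidth Kana belongs to neither class.
    if name.startswith("HALFWIDTH KATAKANA"):
        return "neither"
    if name.startswith(("CJK UNIFIED IDEOGRAPH", "HIRAGANA", "KATAKANA")):
        return "ideograph"
    if unicodedata.category(ch)[0] in "LN":
        return "non-ideographic"
    return "neither"

def wants_auto_space(a: str, b: str) -> bool:
    # Auto-space only at an ideograph / non-ideographic boundary.
    return {spacing_class(a), spacing_class(b)} == {"ideograph", "non-ideographic"}

if __name__ == "__main__":
    print(wants_auto_space("\u6f22", "A"))       # Han next to Latin
    print(wants_auto_space("\u6f22", "\uff76"))  # Han next to halfwidth Kana
    print(wants_auto_space("\u30ab", "\uff76"))  # fullwidth Kana next to halfwidth Kana
```

With halfwidth Kana in the "neither" bucket, it gets no auto-space against Han or fullwidth Kana, and also none against Latin letters or digits.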