adobe-fonts / source-han-sans

Source Han Sans | 思源黑体 | 思源黑體 | 思源黑體 香港 | 源ノ角ゴシック | 본고딕

Issue with U+5E90(庐) #371

Open KathrynCG opened 1 year ago

KathrynCG commented 1 year ago

image

I think the glyph (CID16909) is unnecessary because U+5E90 isn't in JIS0212, JIS0213, Adobe-Japan1 or Adobe-KR9.

Marcus98T commented 1 year ago

Related comments for reference (text relating to 庐 (U+5E90) is in bold):

Some dot strokes become a straight line in the HK/TW version of the font, which is inconsistent with the Unicode standard. These are basically characters used in simplified Chinese, but despite having no JP/KR source they appear in the JP/KR style. For example, U+4EB2 亲 and U+5E90 庐:

image

I know these are in CNS 11643 Plane 3 and are not present in HKSCS, so you would say that they're "outside the Traditional Chinese scope of this project", but they seem inconsistent with other characters that have such components in the HK/TW standard.

On the other hand, for U+5E9D 庝, which is also in CNS 11643 Plane 3, has no JP/KR source and is not present in HKSCS, the dot is preserved; or rather, one glyph is shared among all 5 fonts:

image

I hope this can be an exception to the "out-of-scope" policy, and that a glyph shared by the HK/TW fonts can be created with the dot replacing the vertical bar.

Originally posted by @S-Asakoto in https://github.com/adobe-fonts/source-han-sans/issues/204#issuecomment-562414389

@S-Asakoto I think it isn't very accurate to say that “dot strokes become a straight line” for the characters you mentioned. A better way to say it would be that the characters are considered “out of scope”, so adherence to the corresponding regional convention is not guaranteed.

As you know, the scope of Traditional Chinese for Taiwan in this project is limited to the characters defined in Big5 (i.e. CNS 11643 Planes 1 & 2), which means that only Big5 characters are guaranteed to adhere (mostly) to Taiwan MoE's conventions in the TW version. But the codepoint coverage of Source Han Sans is not limited to Big5, so it is very easy to find a character beyond Big5 that does not adhere to Taiwan MoE's standard.

It looks to me as if the developer tries to find the best match from other regions for any out-of-scope character. Take 擵 as an example:

U64F5

擵 has two glyphs, namely uni64F5-JP (for the JP region) and uni64F5-CN (for CN). This character falls beyond Big5, so there is no dedicated glyph for TW or HK, and unfortunately none of the existing glyphs conform to the TW standard (the JP glyph uses a straight line for the dot; in the CN glyph 𣏟 is unified with 林). The developer decided to map the JP glyph rather than the CN one to TW, probably because the difference in the “𣏟” component is more apparent, so the JP glyph is considered the closest match.

Now, back to the three characters you mentioned.

threechars

U+4EB2 亲: JP and CN glyphs exist. JP is chosen for TW, probably because the design difference in 木 is more apparent (with which I agree).

U+5E90 庐: JP and CN glyphs exist. JP is chosen for TW, probably because the design difference in 戶 is more apparent (with which I also agree for TW. But HK should use the CN glyph instead. It doesn't, probably for the historical reason that a separate HK version didn't exist before v2.000. Before that, TW was named TWHK, so the same mapping for out-of-scope characters was used).

U+5E9D 庝: Only the CN glyph exists. There is no choice but to map all other regions to the CN glyph. Since the design of the “dot” is the same for CN and TW, the effect is that 庝 adheres to the MoE standard even though it isn't in the supported range.

So yes, the dot is kind of “preserved” for U+5E9D 庝, but this is purely coincidental. The same goes for why U+5E0D 帍 adheres to the MoE standard (戶's first stroke is 丿) but U+5554 啔 and U+623B 戻 don't, even though all of these are out of scope: U+5E0D 帍 has a JP glyph to map from, and coincidentally the JP design of 帍 is the same as (or very close to) that required by TW.

4ex

Originally posted by @tamcy in https://github.com/adobe-fonts/source-han-sans/issues/204#issuecomment-562813917
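To make the fallback described in the quoted comment easier to follow, here is a minimal Python sketch of it. Everything in it is an assumption for illustration: the `AVAILABLE` table only lists the glyphs named in the comment, the "JP before CN" default order is a simplification of what looks like a per-character judgment by the developer, and `pick_glyph` and its `overrides` argument are hypothetical names, not part of the Source Han Sans build tooling.

```python
# Hypothetical inventory of regional glyphs per code point, taken from the
# examples in the quoted comment; not read from the actual font sources.
AVAILABLE = {
    0x4EB2: ("JP", "CN"),  # 亲
    0x5E90: ("JP", "CN"),  # 庐
    0x5E9D: ("CN",),       # 庝
    0x64F5: ("JP", "CN"),  # 擵
}

# Assumed default preference for out-of-scope characters: JP before CN.
DEFAULT_ORDER = ("JP", "CN")


def pick_glyph(codepoint, region, overrides=None):
    """Return a glyph name for a region that has no dedicated glyph."""
    overrides = overrides or {}
    available = AVAILABLE[codepoint]
    # A per-character override wins if that glyph actually exists.
    forced = overrides.get((codepoint, region))
    if forced in available:
        return f"uni{codepoint:04X}-{forced}"
    # Otherwise take the first glyph in the default order that exists.
    for source in DEFAULT_ORDER:
        if source in available:
            return f"uni{codepoint:04X}-{source}"
    raise LookupError(f"no glyph for U+{codepoint:04X}")


print(pick_glyph(0x5E90, "TW"))  # uni5E90-JP
print(pick_glyph(0x5E9D, "TW"))  # uni5E9D-CN
print(pick_glyph(0x5E90, "HK", {(0x5E90, "HK"): "CN"}))  # uni5E90-CN
```

The last call shows the HK remapping suggested in the quoted comment as a per-character override, while TW keeps the JP glyph.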

Personally, I would like Adobe to keep the JP glyph, because that character is also outside CNS Planes 1 and 2 (aka Big5), as mentioned in the comments I quoted, and the JP glyph is there because it's the closest to the handwritten CNS Plane 3 reference. But HK should be remapped to the CN glyph for consistency's sake.

However, if there's still no space even after merging the unnecessary regional components, and the JP glyph must be removed in the next release of Source Han Sans, then CN will be the only glyph for all regions, and that would probably be fine as well because 庐 (U+5E90) is not in any basic Japanese, Korean, Taiwanese or Hong Kong standard, only in the basic Mainland Chinese standard (GB 2312).
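For anyone who wants a quick sanity check of the coverage claim, the sketch below uses Python's built-in codecs as a rough stand-in for the basic regional standards. This is only an approximation and an assumption on my part: the `big5`, `shift_jis` and `euc_kr` codecs roughly correspond to Big5, JIS X 0208 and KS X 1001, and they say nothing about Adobe-Japan1, Adobe-KR9 or CNS 11643 Plane 3. The helper name `in_charset` is hypothetical.

```python
def in_charset(ch, codec):
    """Return True if Python's codec for a legacy charset can encode ch."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Rough proxies for the basic CN, TW, JP and KR character sets.
for codec in ("gb2312", "big5", "shift_jis", "euc_kr"):
    print(codec, in_charset("\u5e90", codec))  # U+5E90 庐
```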