Khmer Consonant Shifter bugs

marekjez86 commented 4 years ago

@sovichet:

This was copied from https://github.com/googlefonts/noto-source/issues/207

The Noto Khmer typefaces have a shaping bug with the consonant shifter 'KHMER SIGN MUUSIKATOAN' (U+17C9). It occurs in this sequence: Consonant + Consonant Shifter + Vowel Sign AA (U+17B6) + Nikahit (U+17C6).

I believe the consonant below is U+1789

[Image: https://i.imgur.com/Wx7U4q9.png]
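For anyone who wants to reproduce the report, the sequence described above can be built as a test string. This is a minimal sketch using only Python's standard library; the helper names `build_sequence` and `describe` are made up for illustration, and the choice of U+1789 as the base follows the guess in the report rather than a confirmed fact.

```python
import unicodedata

# Code points named in the report. U+1789 is the reporter's guess at the base.
CONSONANT = "\u1789"    # KHMER LETTER NYO
MUUSIKATOAN = "\u17C9"  # consonant shifter named in the report
TRIISAP = "\u17CA"      # the other consonant shifter, discussed later in the thread
VOWEL_AA = "\u17B6"
NIKAHIT = "\u17C6"


def build_sequence(consonant: str, shifter: str) -> str:
    """Return Consonant + Consonant Shifter + Vowel Sign AA + Nikahit."""
    return consonant + shifter + VOWEL_AA + NIKAHIT


def describe(text: str) -> list:
    """List each code point with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


reported = build_sequence(CONSONANT, MUUSIKATOAN)
with_triisap = build_sequence(CONSONANT, TRIISAP)

for line in describe(reported):
    print(line)
```

Pasting `reported` and `with_triisap` into a document set in each font under test gives a side-by-side comparison of how the two shifters are rendered.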

nizarsq commented 4 years ago

@marekjez86 I think #207 has some discrepancy. See https://github.com/googlefonts/noto-source/issues/207#issuecomment-650803620. I have generated a test case; the output is below.

NorbertLindenberg commented 4 years ago

ញ is a second series consonant, which can be converted to first series using muusikatoan. Applying triisap to it is essentially a typo. Whether triisap should be converted to the below-base form in the context of a typo isn’t really defined, but it hides the typo and makes incorrect input look as if it were correct.

sovichet commented 4 years ago

@nizarsq the last typeface in your output is Khmer Sangam MN from Apple, not Khmer OS. Both Khmer Sangam and Khmer Sangam MN incorrectly substitute the register shift. Converting triisap to uMark in the case of ញ is not valid.

@NorbertLindenberg that could be the case, Norbert. I remember there was a discussion about it a long time ago. However, I don't think fonts should hide the typo when users enter a wrong input sequence.

nizarsq commented 4 years ago

That's true, it is Khmer MN, not Khmer OS. My bad.

nizarsq commented 4 years ago

MakaraSok commented 1 year ago

The text rendered in red is not desirable. The PDF and ODT files of this screenshot are attached.

rendering issues J"A vs J:A.pdf

rendering issues J"A vs J:A.odt

simoncozens commented 1 year ago

OK, I am unsure about how to fix this. Are there any situations where the triisap should go below base?

@NorbertLindenberg do you have a handy UTN for Implementing Khmer? :-)

simoncozens commented 1 year ago

I guess the issue here is that the Unicode Standard says:

In the presence of other superscript glyphs, both of these signs are usually rendered with the same glyph shape as that of U+17BB khmer vowel sign u

This is what is happening in this font, so is TUS wrong?

r12a commented 1 year ago

It's not specific enough. Does this help? https://r12a.github.io/scripts/khmr/km.html#consonant_shift_posn

The Unicode Standard gives the impression that both of these diacritics are moved below the consonant any time a vowel appears over that consonant. However, in reality only certain consonants cause this behaviour. The behaviour varies a little by font, but in general ... TRIISAP is lowered for these characters.

(Read the subsection.)
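The difference between the two descriptions can be sketched as a small decision function. This is only an illustration, not how the Noto fonts implement it: the name `shifter_goes_below`, the `ABOVE_SIGNS` set, and the single-consonant demo set are all assumptions made for the example. The real list of consonants is the one in the linked subsection and is deliberately left as a parameter here. The sketch also assumes the simplest cluster shape, a base consonant immediately followed by the shifter.

```python
from typing import AbstractSet

MUUSIKATOAN = "\u17C9"
TRIISAP = "\u17CA"
SHIFTERS = {MUUSIKATOAN, TRIISAP}

# Assumption for this sketch: signs treated as "superscript" marks
# (vowel signs I, II, Y, YY, OE and the nikahit).
ABOVE_SIGNS = {"\u17B7", "\u17B8", "\u17B9", "\u17BA", "\u17BE", "\u17C6"}


def shifter_goes_below(cluster: str, lowering_bases: AbstractSet[str]) -> bool:
    """Return True if the shifter in `cluster` would take the below-base form.

    `lowering_bases` is supplied by the caller (one set per shifter); a rule
    that ignores the base consonant corresponds to passing every consonant.
    """
    if len(cluster) < 3:
        return False
    base, shifter, rest = cluster[0], cluster[1], cluster[2:]
    if shifter not in SHIFTERS:
        return False
    has_above_sign = any(ch in ABOVE_SIGNS for ch in rest)
    return has_above_sign and base in lowering_bases


# Hypothetical demo set, chosen for illustration only.
DEMO_TRIISAP_BASES = {"\u179F"}  # KHMER LETTER SA

print(shifter_goes_below("\u179F\u17CA\u17B8", DEMO_TRIISAP_BASES))
print(shifter_goes_below("\u1789\u17CA\u17B8", DEMO_TRIISAP_BASES))
```

With this shape, the objection raised earlier in the thread (triisap on ញ should not be lowered) amounts to keeping U+1789 out of the set passed for triisap.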

simoncozens commented 1 year ago

Perfect, thank you.

NorbertLindenberg commented 1 year ago

@NorbertLindenberg do you have a handy UTN for Implementing Khmer? :-)

Not me, and not a UTN yet: https://www.unicode.org/L2/L2022/22290-khmer-encoding.pdf

This document discusses many of the flaws in the Unicode encoding of Khmer, and proposes a new encoding order. It includes a thorough discussion of consonant shifters. It was developed in collaboration with several Khmer government organizations, but has received lukewarm feedback from SAH/UTC: https://www.unicode.org/L2/L2023/23012-script-adhoc-rept.pdf https://www.unicode.org/L2/L2023/23083-script-adhoc-rept.pdf

notofonts / khmer

Khmer Consonant Shifter bugs #7