n8willis / opentype-shaping-documents

Documentation of OpenType shaping behavior

[Khmer] Undefined non-terminals in syllable regexp #128

Closed adrianwong closed 3 years ago

adrianwong commented 3 years ago

We have:

MATRA_GROUP   = Z? M N?
SYLLABLE_TAIL = (SM SM?)?

where M and SM are not defined.
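A check like this can be automated. Below is a minimal sketch, assuming the rules are available as plain strings; the `RULES` excerpt holds only the two rules quoted above, and the `DEFINED` set (treating `Z` and `N` as defined elsewhere in the document) and the helper name `undefined_symbols` are hypothetical, chosen for illustration.

```python
import re

# Hypothetical excerpt: only the two rules quoted in this issue.
RULES = {
    "MATRA_GROUP": "Z? M N?",
    "SYLLABLE_TAIL": "(SM SM?)?",
}

# Assumption for illustration: Z and N are defined elsewhere in the document.
DEFINED = {"Z", "N"}


def undefined_symbols(rules, defined):
    """Return the sorted symbols used in rule bodies that are neither
    rule names nor members of the defined set."""
    known = set(rules) | set(defined)
    missing = set()
    for body in rules.values():
        # Treat every run of letters/underscores as a symbol reference.
        for symbol in re.findall(r"[A-Za-z_]+", body):
            if symbol not in known:
                missing.add(symbol)
    return sorted(missing)


print(undefined_symbols(RULES, DEFINED))  # ['M', 'SM']
```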

n8willis commented 3 years ago

Yeah; I was reading up on the syllable structure last week and noticed that. Almost certainly _M_ is meant to be _matra_ and _SM_ is meant to be _syllablemodifier_ (both from the identification classes above). I'll double-check, but that should be a simple fix.

The bigger concern, as in #126, is that there are at least four different upstream regex definitions for Khmer syllables. Obviously the W3C text/community group is working on a way to iron out the inconsistencies, but I'm opening a standalone issue here to have a bird's-eye strategy discussion (for these docs) in one place.

n8willis commented 3 years ago

Fixed via 068b1f4.

As per last week's W3C-cg Khmer meeting, whatever other issues remain to be resolved in the regular expressions, the consensus is clear that the limit is one dependent vowel and a maximum of two modifier signs, so this change should be stable.
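For anyone who wants to sanity-check that limit, here is a rough sketch. It is not the regular expression from the Khmer document: it ignores `Z` and `N`, and the one-letter codes, the `TAIL` pattern, and the helper `tail_is_valid` are all assumptions made up for illustration. It only encodes "at most one dependent vowel, then at most two modifier signs".

```python
import re

# Illustrative one-letter codes for two shaping classes. The codes and this
# encoding are assumptions, not taken from the Khmer document.
CODES = {"matra": "M", "syllablemodifier": "S"}

# Simplified tail: at most one dependent vowel (matra), followed by at most
# two syllable modifiers. Z and N from MATRA_GROUP are deliberately omitted.
TAIL = re.compile(r"M?S{0,2}")


def tail_is_valid(classes):
    """Return True if the class sequence fits the simplified tail pattern."""
    encoded = "".join(CODES[c] for c in classes)
    return TAIL.fullmatch(encoded) is not None


print(tail_is_valid(["matra", "syllablemodifier", "syllablemodifier"]))  # True
print(tail_is_valid(["matra", "matra"]))                                 # False
print(tail_is_valid(["syllablemodifier"] * 3))                           # False
```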