n8willis / opentype-shaping-documents

Documentation of OpenType shaping behavior

[Indic] Disallow matras after "Consonant, Halant, ZWJ" sequences #72

Closed: adrianwong closed this issue 2 years ago

adrianwong commented 5 years ago

Sometime last year, HarfBuzz reverted a change from 2012 which permitted matras to occur after a "Consonant, Halant, ZWJ" sequence.

The relevant discussion can be found here. The gist of it is that a dotted circle should be inserted between the ZWJ and the matra to indicate that said matra doesn't have a valid base.
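As a rough illustration of the behavior described above, here is a minimal Python sketch for Devanagari. It works on raw code points, whereas a real shaper applies this logic to classified glyphs during syllable identification. The helper names (`is_consonant`, `insert_dotted_circles`) and the small matra set are assumptions made for this example, not anything taken from HarfBuzz or these docs.

```python
HALANT = "\u094D"         # DEVANAGARI SIGN VIRAMA
ZWJ = "\u200D"            # ZERO WIDTH JOINER
DOTTED_CIRCLE = "\u25CC"  # DOTTED CIRCLE placeholder base

# Illustrative subset of dependent vowel signs (AA, I, II, U, UU); not exhaustive.
MATRAS = {"\u093E", "\u093F", "\u0940", "\u0941", "\u0942"}


def is_consonant(ch):
    """Hypothetical helper: true for Devanagari KA..HA (U+0915..U+0939)."""
    return "\u0915" <= ch <= "\u0939"


def insert_dotted_circles(text):
    """Insert U+25CC before a matra that directly follows Consonant, Halant, ZWJ."""
    out = []
    for ch in text:
        follows_half_form = (
            len(out) >= 3
            and out[-1] == ZWJ
            and out[-2] == HALANT
            and is_consonant(out[-3])
        )
        if ch in MATRAS and follows_half_form:
            # The matra has no valid base, so give it a visible placeholder.
            out.append(DOTTED_CIRCLE)
        out.append(ch)
    return "".join(out)
```

For example, passing KA, HALANT, ZWJ, VOWEL SIGN I (`"\u0915\u094D\u200D\u093F"`) yields the same sequence with U+25CC placed between the ZWJ and the matra, while a plain KA + VOWEL SIGN I is returned unchanged.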

n8willis commented 3 years ago

Great catch; with the improved dotted-circle logic in these docs, this is a clear case for updating the regexes and noting that the no-dotted-circle-insertionism is a Uniscribe "quirk."

PR #142 alters the regular expressions and adds the Uniscribe behavior to the compatibility notes. Based on the HB discussion, it looks like the actual uncaught sequence in Uniscribe is "half_form,half_form,matra".
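To make the effect of such a regex change concrete, here is a toy comparison in Python. These are not the regular expressions from the docs or from PR #142; the one-letter classes (`C` consonant, `H` halant, `J` ZWJ, `M` matra) and both patterns are hypothetical simplifications, chosen only to show how a "permissive" grammar accepts a matra after a run of half forms while a "strict" one requires a base consonant first.

```python
import re

# Hypothetical one-letter classes: C consonant, H halant, J ZWJ, M matra.
HALF = r"(?:CHJ?)"  # one half form: consonant + halant, optionally followed by ZWJ

# Toy "permissive" grammar: matras may directly follow a run of half forms.
PERMISSIVE = re.compile(rf"(?:{HALF}+M*|{HALF}*CM*)")

# Toy "strict" grammar: matras are only allowed after a base consonant.
STRICT = re.compile(rf"(?:{HALF}+|{HALF}*CM*)")


def accepts(pattern, classes):
    """Return True if the whole class string is one valid syllable under `pattern`."""
    return pattern.fullmatch(classes) is not None
```

Under this sketch, `"CHJM"` (Consonant, Halant, ZWJ, matra) and `"CHCHM"` (half form, half form, matra) are accepted by `PERMISSIVE` and rejected by `STRICT`, whereas an ordinary `"CHJCM"` is accepted by both. A rejected sequence is where a shaper would fall back to dotted-circle insertion.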

The change to the regular expressions is small, and only affects the vanishingly-unlikely sure-to-be-not-a-real-word condition above. But, of course, any pair of eyes looking it over would be welcome!