n8willis / opentype-shaping-documents

Documentation of OpenType shaping behavior

[Indic] Pre-base matra reordering exceptions for Malayalam and Tamil #68

Closed adrianwong closed 4 years ago

adrianwong commented 5 years ago

Malayalam "Consonant, Halant, ZWJ" sequences form Chillus; Tamil "Consonant, Halant, ZWJ" sequences form ligated explicit Halant forms.

There should be a note in both specifications to say that pre-base matras should be moved after these glyphs.
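As a rough illustration of the placement requested above, here is a minimal Python sketch that works on code points rather than glyphs. That is a deliberate simplification, since a real shaping engine makes this decision on glyphs after GSUB substitution. The function name `reorder_prebase_matra`, the consonant ranges, and the assumption that a syllable contains at most one pre-base matra are all hypothetical choices made for this example; they are not taken from the specifications or from HarfBuzz.

```python
ZWJ = 0x200D
HALANT = {0x0D4D, 0x0BCD}  # Malayalam virama, Tamil pulli
# Pre-base (left-side) vowel signs E, EE, AI for Malayalam and Tamil.
PREBASE_MATRAS = {0x0D46, 0x0D47, 0x0D48, 0x0BC6, 0x0BC7, 0x0BC8}


def is_consonant(cp):
    # Assumption: coarse block ranges are good enough for illustration.
    return 0x0D15 <= cp <= 0x0D3A or 0x0B95 <= cp <= 0x0BB9


def reorder_prebase_matra(syllable):
    """Move the pre-base matra to just after the last
    "Consonant, Halant, ZWJ" sequence that precedes it, or to the
    start of the syllable when no such sequence exists."""
    matra_idx = next(
        (i for i, cp in enumerate(syllable) if cp in PREBASE_MATRAS), None
    )
    if matra_idx is None:
        return list(syllable)

    target = 0
    # Only consider triples that lie entirely before the matra.
    for i in range(matra_idx - 2):
        if (is_consonant(syllable[i])
                and syllable[i + 1] in HALANT
                and syllable[i + 2] == ZWJ):
            target = i + 3

    rest = syllable[:matra_idx] + syllable[matra_idx + 1:]
    return rest[:target] + [syllable[matra_idx]] + rest[target:]
```

For example, the Malayalam input NA, Halant, ZWJ, KA, vowel sign E comes out with the matra between the ZWJ and KA, while KA, Halant, KA, vowel sign E (no ZWJ) comes out with the matra at the very start.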

lianghai commented 5 years ago

> There should be a note in both specifications to say that pre-base matras should be moved after these glyphs.

Yes, for chillus, because they’re usually written syllables on their own (except when they’re combined with following consonant signs…).

But what is a Tamil “ligated explicit Halant form” though? How is a halant form ligated?

adrianwong commented 5 years ago

> But what is a Tamil “ligated explicit Halant form” though? How is a halant form ligated?

My understanding is that pre-base Tamil "Consonant, Halant, (ZWJ)" sequences can be ligated to form a composite glyph that looks no different from the unligated sequence. This occurs in some of the Tamil fonts we've tested against.

Relevant HarfBuzz comment can be found here.

n8willis commented 5 years ago

@adrianwong Can you provide a sample sequence/image of the Tamil ligated-explicit-Halant form?

Although I guess if the ligature is indistinguishable from the unligated form, it might not be too illuminating....

adrianwong commented 5 years ago

> Although I guess if the ligature is indistinguishable from the unligated form, it might not be too illuminating....

Haha, this is true! Let me try a different approach...

Head over to this site and load up the latest NotoSansTamil-Regular.ttf font. You'll see that glyphs 77-99 are composite glyphs of glyphs 18-40 with an added Halant (glyph 52), formed via ligature substitution.
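To make the substitution concrete, here is a small Python model of a ligature lookup built from the glyph IDs quoted above. It rests on a loud assumption: that the two ranges pair up in order (18 with 77, 19 with 78, and so on up to 40 with 99). Both ranges contain 23 glyphs, but the comment does not state the pairing, and glyph IDs vary between font versions. The names `LIGATURES` and `apply_ligatures` are made up for this sketch.

```python
HALANT_GID = 52

# ASSUMPTION: in-order pairing, so consonant N ligates to N + 59.
LIGATURES = {(base, HALANT_GID): base + 59 for base in range(18, 41)}


def apply_ligatures(glyphs):
    """Replace each (consonant, halant) glyph pair with its composite glyph."""
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in LIGATURES:
            out.append(LIGATURES[pair])
            i += 2  # both input glyphs are consumed by the ligature
        else:
            out.append(glyphs[i])
            i += 1
    return out
```

After this step the Halant no longer exists as a separate glyph, so any later logic that searches the glyph run for a Halant will not find one inside the composite.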

n8willis commented 4 years ago

Based on @lianghai's comment, it sounds to me like:

  1. the intent in Unicode is that the Halant blocks the reordering (as a rule), which in turn means that...

  2. for the Tamil case, as noted here, the expectation is that...

  3. the font is using ligation for stylistic presentation reasons and the ligature still includes the Halant.

That is, if the Tamil ligature did not show a visible Halant, Unicode's R16 would expect the pre-base matra to sit to the left of the ligature. (That case is arguably moot, since a ligature that hides the Halant sounds wrong in the first place.) However, the shaping engine can't actually know whether the Tamil ligature includes a visible Halant, since that is potentially a question left to human eyes.

Therefore, encoding this as a spec rule is drawing on an assumption that fonts are not being pathological.

n8willis commented 4 years ago

I believe this to be fixed now through #98. Feel free to reopen if necessary.