n8willis / opentype-shaping-documents

Documentation of OpenType shaping behavior

[Indic] REPH_MODE_EXPLICIT standalone Reph #81

Closed adrianwong closed 2 years ago

adrianwong commented 5 years ago

Both (newer) Telugu and Sinhala have the REPH_MODE_EXPLICIT characteristic. Both scripts' specifications state (in section 2.6) that:

an initial "Ra, Halant, ZWJ" sequence will always become a "Reph" unless the "Ra" is the only consonant in the syllable

This matches OpenType's specifications. In practice, however, Uniscribe, HarfBuzz, and CoreText all allow the Sinhala "Repaya" to be formed when "Ra" is the only consonant in the syllable.

Is this the correct behaviour to expect from REPH_MODE_EXPLICIT scripts? Unless I've grossly misunderstood @lianghai, this post implies that it is, for Sinhala at least.

Is this true for Telugu also? Practically speaking, is there even a need to be concerned with it, given that modern Telugu text supposedly rarely uses "Reph"?
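To make the two readings concrete, here is a minimal Python sketch of the check in question. It is not taken from any shaper's source: the function names, the `allow_standalone` flag, and the simplified consonant ranges are all assumptions for illustration, and the input is assumed to be a single syllable that has already been segmented. With `allow_standalone=False` it follows the wording quoted from section 2.6; with `allow_standalone=True` it follows the behaviour observed in Uniscribe, HarfBuzz and CoreText.

```python
ZWJ = "\u200D"

# (Ra, Halant) code points for the two REPH_MODE_EXPLICIT scripts.
EXPLICIT_REPH_SCRIPTS = {
    "telugu": ("\u0C30", "\u0C4D"),   # TELUGU LETTER RA, TELUGU SIGN VIRAMA
    "sinhala": ("\u0DBB", "\u0DCA"),  # SINHALA LETTER RAYANNA, SINHALA SIGN AL-LAKUNA
}

# Assumption: a deliberately simplified view of each script's consonants
# as one contiguous range; a real shaper uses per-character categories.
CONSONANT_RANGES = {
    "telugu": (0x0C15, 0x0C39),
    "sinhala": (0x0D9A, 0x0DC6),
}


def is_consonant(ch, script):
    """Return True if ch falls in the simplified consonant range."""
    lo, hi = CONSONANT_RANGES[script]
    return lo <= ord(ch) <= hi


def forms_explicit_reph(syllable, script, allow_standalone=True):
    """Decide whether an initial Ra, Halant, ZWJ sequence becomes a Reph.

    allow_standalone=False: the rule as written, where the Reph is not
    formed if Ra is the only consonant in the syllable.
    allow_standalone=True: the behaviour observed in practice, where the
    explicit sequence always forms a Reph.
    """
    ra, halant = EXPLICIT_REPH_SCRIPTS[script]
    prefix = ra + halant + ZWJ
    if not syllable.startswith(prefix):
        return False
    if allow_standalone:
        return True
    # Strict reading: require at least one more consonant after the prefix.
    rest = syllable[len(prefix):]
    return any(is_consonant(ch, script) for ch in rest)
```

The only difference between the two readings is the final check for a following consonant, so the whole question comes down to whether that check should exist.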

n8willis commented 3 years ago

I haven't had any luck finding answers for the Telugu-specific side of this, but HarfBuzz allows standalone explicit Reph in Telugu, and Liang's logic about explicit-sequence characters taking higher precedence makes conceptual sense to me, not just for Sinhala but elsewhere. Even if somebody doesn't like it when they encounter it, it's defensible.

Since these are the only two REPH_MODE_EXPLICIT scripts, there's also not a lot of value in having them diverge on the one point where they're defined to be in the same class.

And, as you mention, since the explicit Reph is only there to support old orthography, it's not going to interfere with a lot of people's daily life.

So #138 changes that for both scripts (and drops in some fixes about the explicit Reph to the Indic-General doc, which probably ought to go in no matter what). I decided to just leave the question of "should you insert a dotted circle" out entirely, lest that further complicate matters for future new readers; AFAICT that's really the only remaining unresolved question. I can't imagine that special-casing dotted-circle handling for the "Ra, Halant, ZWJ" sequence would be worth it, particularly for an old orthography. If it turns out that the Telugu literary community really does want that, I suppose it could be added, but in the absence of further input I think those users could be expected to manually insert a dotted circle.

Still, I would as always be interested in any feedback from Telugu readers — and, in this case, from Sinhala readers as well.

n8willis commented 2 years ago

Considering this closed by #138; anyone who finds later concerns in Telugu please do open a new issue for them.