Pre-reordering discretionary features

tiroj commented 3 years ago

This issue relates to Indic scripts and others in which shaping involves a glyph reordering stage that is based on output of pre-reordering features, and in particular the reordering of reph and ikar dependent vowel relative to an explicit virama, i.e. the Indic2 shaping model for Devanagari etc.

The model currently used in Indic2 fonts uses the pre-reordering cjct feature (and optionally akhn) to form conjunct ligatures that involve virama in the input sequence, e.g. for traditional vertical conjunct forms of retroflex consonants such as ड्ड or द्द which do not have half form variants. Then the post-reordering pres feature is used to form conjunct ligatures from half form sequences (i.e. using output from the pre-reordering half feature). The purpose of this split in the formation of conjunct ligatures is to enable shaping engines to reorder reph and ikar relative to an explicit virama not swallowed by akhn and cjct.

A persistent limitation of this model is that we have no easy way to apply stylistic behaviour variation at the feature level that would enable users to activate global preferences in text presentation around use or non-use of vertical conjuncts. Many modern Hindi users are more familiar with retroflex consonant conjuncts being displayed in horizontal layout with explicit virama (halant), and some Bureau of Indian Standards documents specify this form for Hindi. But in the present Indic2 model this distinction is only reliably made by use of the ZWNJ formatting control character to force horizontal display with explicit virama: ड्‍‌ड द्‌द‌. This requires either user knowledge and deliberate interaction in each affected conjunct sequence or document editing with search/replace or similar method to systematically insert formatting control characters in desired places.
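To make the search/replace workaround concrete, here is a minimal sketch of what such document editing might look like. The choice of target consonants (DDA and DA), the function name, and the restriction to the basic consonant range U+0915..U+0939 are assumptions made for illustration, not anything a particular tool or standard prescribes.

```python
import re

VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER

# Hypothetical selection: consonants whose conjuncts should be displayed
# horizontally with explicit virama. Here DDA (U+0921) and DA (U+0926).
TARGETS = "\u0921\u0926"

# Match target consonant + virama only when a consonant follows directly,
# so sequences that already carry ZWNJ or ZWJ are left untouched.
# Simplification: only the basic consonant range U+0915..U+0939 is checked.
PATTERN = re.compile(f"([{TARGETS}]{VIRAMA})(?=[\u0915-\u0939])")


def force_explicit_virama(text: str) -> str:
    """Insert ZWNJ after the virama in every targeted conjunct sequence."""
    return PATTERN.sub(lambda m: m.group(1) + ZWNJ, text)
```

Running the function twice gives the same result as running it once, because the lookahead no longer matches after ZWNJ has been inserted. The cost described above remains: the text itself is changed to express what is really a display preference.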

It would be preferable, since this is a stylistic display matter, for users to be able to set conjunct display preferences at the glyph processing feature level, as they can for other kinds of stylistic display variants. However, this seems not possible at present because discretionary typographic features such as the Stylistic Set features, which would seem appropriate for this purpose (the modes of conjunct display map fairly readily to a small number of discrete sets), are all processed post-reordering. This produces problems regardless of the default conjunct display in a given font:

If the default display is a cjct ligature, breaking that ligature apart in a discretionary feature—into a horizontal sequence with explicit virama—post-reordering will result in inconsistent ordering of reph and ikar, since these will already have been ordered, respectively, towards the end and beginning of the ligature cluster.

If the default display is a horizontal sequence with explicit virama, reordering will have left a reph and/or ikar in the middle of the sequence, preventing later ligation in a discretionary feature.
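Both failure modes can be seen in a toy model. The sketch below is not a shaping engine: it works on lists of made-up glyph names, handles only ikar (reph is omitted), and assumes one simplified reordering rule, namely that ikar moves to just after the last explicit virama, or to the start of the cluster if there is none.

```python
def reorder_ikar(glyphs):
    """Toy reordering: place 'ikar' after the last explicit 'virama',
    or at the start of the cluster when no explicit virama is present."""
    if "ikar" not in glyphs:
        return list(glyphs)
    rest = [g for g in glyphs if g != "ikar"]
    pos = max((i + 1 for i, g in enumerate(rest) if g == "virama"), default=0)
    return rest[:pos] + ["ikar"] + rest[pos:]


def ligate(glyphs):
    """Toy cjct-style lookup: dda virama dda -> dda_dda."""
    out, i = [], 0
    while i < len(glyphs):
        if glyphs[i:i + 3] == ["dda", "virama", "dda"]:
            out.append("dda_dda")
            i += 3
        else:
            out.append(glyphs[i])
            i += 1
    return out


def break_apart(glyphs):
    """Toy discretionary lookup: dda_dda -> dda virama dda."""
    out = []
    for g in glyphs:
        out.extend(["dda", "virama", "dda"] if g == "dda_dda" else [g])
    return out


cluster = ["dda", "virama", "dda", "ikar"]  # logical (encoded) order

# What a horizontal form with explicit virama should look like.
expected_horizontal = reorder_ikar(cluster)

# Case 1: ligature by default, broken apart after reordering.
case1 = break_apart(reorder_ikar(ligate(cluster)))

# Case 2: horizontal by default, ligation attempted after reordering.
case2 = ligate(reorder_ikar(cluster))

# Pre-reordering discretionary lookup: break apart, then reorder.
pre_reordering = reorder_ikar(break_apart(ligate(cluster)))
```

In this model `case1` leaves ikar at the front of the whole sequence instead of after the explicit virama, and `case2` never forms the ligature because ikar sits between virama and the second consonant. Only `pre_reordering` matches `expected_horizontal`.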

There are complex methods that could be used to resolve either of those problems (the word ‘hack’ seems appropriate here), in which contextual rules and careful mark filtering could be used to reorder glyphs in sequential GSUB manoeuvres. And given the ‘stability’ of a lot of OTL shaping implementations (stable to the point of petrification), it may be that such methods are considered the most viable option. But I would like to at least explore the idea of pre-reordering discretionary features. I understand that there is an extra level of care that would need to be taken with lookups for such features, to ensure that they do not interfere with orthographic unit shaping features in a way that would break reordering, but I don’t think there is any a priori reason they should not be permitted.

NorbertLindenberg commented 3 years ago

A similar problem exists for Kawi, a Brahmic script that has been approved for a future version of the Unicode Standard based on the proposal L2/20-284R. In this script, the conjunct form of the consonant ra as second or later consonant in a cluster can take on a pre-base shape or a below-base shape, depending on the presence of other marks in the cluster (see page 6 of the proposal). There are no clear-cut rules for when to use which shape, and usage may vary even within one inscription. For a precise transcription, scholars may want to control the shape using a stylistic set or other discretionary feature.

The Universal Shaping Engine, which will handle Kawi in OpenType, proposes the following order for the operations involved:

1) Apply the pref feature to identify the conjunct form as a pre-base glyph, if that’s desired.
2) Apply the cjct feature to create the conjunct form from virama and consonant ra.
3) Reorder the glyph before the base consonant, if it has been identified in step 1 as pre-base.
4) Apply a discretionary feature such as a stylistic set to let the user override the font’s default choice for the shape of the conjunct form.
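The four steps can be sketched as a toy pipeline to show where the user's choice arrives too late. Everything here is illustrative: the glyph names (`ra.pre`, `ra.below`), the use of `ss01` as the discretionary feature, and the way pref merely sets a flag are assumptions of the sketch, not a description of how USE is implemented.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    name: str
    prebase: bool = False  # set in step 1, consumed in step 3


def apply_pref(glyphs):
    """Step 1: flag virama + ra as a pre-base conjunct form."""
    for i in range(len(glyphs) - 1):
        if glyphs[i].name == "virama" and glyphs[i + 1].name == "ra":
            glyphs[i].prebase = glyphs[i + 1].prebase = True
    return glyphs


def apply_cjct(glyphs):
    """Step 2: virama + ra -> conjunct form, keeping the pre-base flag."""
    out, i = [], 0
    while i < len(glyphs):
        if (i + 1 < len(glyphs) and glyphs[i].name == "virama"
                and glyphs[i + 1].name == "ra"):
            out.append(Glyph("ra.pre", prebase=glyphs[i].prebase))
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


def reorder(glyphs):
    """Step 3: move flagged glyphs in front of everything else."""
    return ([g for g in glyphs if g.prebase]
            + [g for g in glyphs if not g.prebase])


def apply_ss01(glyphs):
    """Step 4: user asks for the below-base shape instead."""
    return [Glyph("ra.below", g.prebase) if g.name == "ra.pre" else g
            for g in glyphs]


def shape(want_below_base: bool):
    cluster = [Glyph("ka"), Glyph("virama"), Glyph("ra")]
    glyphs = reorder(apply_cjct(apply_pref(cluster)))
    if want_below_base:
        glyphs = apply_ss01(glyphs)
    return [g.name for g in glyphs]
```

With `want_below_base=True` the result has the below-base shape sitting in the pre-base position, because the position was fixed in step 3 before the stylistic set was consulted in step 4.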

Step 1 now is in the tricky situation of having to make a decision for which information won’t be fully available until step 4, about a glyph that won’t be introduced until step 2. The latter problem can be worked around by using an earlier feature such as ccmp to create the conjunct form. The former problem, however, is harder because of the lack of an early discretionary feature, as John described.

tiroj commented 3 years ago

Excellent example. Thank you, Norbert.

Perhaps the most generalised way to describe this problem is to say that there are discretionary, stylistic forms that affect ordering.

NorbertLindenberg commented 3 years ago

Note that this is an OpenType-specific problem, not a general shaping problem. Apple Advanced Typography doesn’t constrain the order in which a font applies discretionary features within its morx table. I was able to implement a stylistic set to control the shape of the conjunct form in my Kawi prototype font (which is implemented using AAT), using a morx subtable that’s applied before the subtable for reordering the pre-base conjunct form. Testing in several apps and browsers was successful.

tiroj commented 3 years ago

Yes, it is a feature of the OTL model in which ordering is an engine operation at a fixed point in the processing of features rather than a font operation.

khaledhosny commented 3 years ago

This is an OpenType self-inflicted problem and could be fixed by applying all the lookups in their font order, instead of applying them in groups with predefined order, an opportunity that USE missed.

IIUC, this is what early OpenType implementations did, but some fonts had the wrong lookup order and, instead of fixing the fonts, it was decided that the engine would use a predefined order based on feature tags, and this decision has been carried over ever since.
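The difference between the two strategies can be sketched as follows. The lookup names and the heavily reduced two-stage grouping are assumptions for illustration; real engines use longer, script-specific feature lists.

```python
# Hypothetical font: lookups in the order the font lists them,
# each paired with the feature tag that references it.
FONT_LOOKUPS = [
    ("ss01", "split_conjunct"),
    ("half", "half_forms"),
    ("cjct", "form_conjunct"),
    ("pres", "presentation_ligatures"),
]

# Assumed, reduced engine grouping: stage 0 runs before reordering,
# stage 1 (including discretionary features) runs after it.
ENGINE_STAGES = [{"half", "cjct"}, {"pres", "ss01"}]


def font_order(lookups, enabled):
    """Apply every enabled lookup in the order the font lists it."""
    return [name for tag, name in lookups if tag in enabled]


def grouped_order(lookups, stages, enabled):
    """Apply lookups stage by stage; font order only matters within a stage."""
    order = []
    for stage in stages:
        order += [name for tag, name in lookups
                  if tag in stage and tag in enabled]
    return order


enabled = {"ss01", "half", "cjct", "pres"}
by_font = font_order(FONT_LOOKUPS, enabled)
by_engine = grouped_order(FONT_LOOKUPS, ENGINE_STAGES, enabled)
```

Although the font lists `split_conjunct` first, the grouped strategy defers it to the second stage, which is the behaviour this thread is trying to work around.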

NorbertLindenberg commented 3 years ago

Applying all lookups in font order would likely require some way for the font to indicate where to insert the magic steps of OpenType: split vowel decomposition, dotted circle insertion, reordering.

khaledhosny commented 3 years ago

That is something I haven't considered, to be honest, since I don't work with scripts that require reordering or this kind of decomposition, and I'm not qualified to make any suggestions. Though I note there is/was some movement for supporting reordering in OpenType lookups (by allowing many-to-many substitutions IIRC), so moving this logic to the fonts might be one way to do it (but again I don't know anything so this might be complete nonsense).

devosb commented 3 years ago

OpenType shapers also reorder a sequence of RA, VIRAMA that occurs at the beginning of a syllable in Devanagari, Bengali, and some other Indic scripts into a repha form which (at least for Devanagari and Bengali) occurs visually towards the end of the syllable. So to do the reordering you need to understand the syllable structure, which is more complicated than just swapping a left vowel from logical (as encoded) to visual (for displayed glyphs) order.

tiroj commented 3 years ago

A mechanism could be as simple as this: completion of the last lookup associated with a particular feature (e.g. cjct), or with the last of some explicitly pre-reordering features, triggers the reordering action. That would enable lookups for discretionary features to be ordered before the reordering action and processed together with the pre-reordering feature lookups. In other words, rather than having discrete pre- and post-reordering features, have only discrete pre-reordering features, completion of which triggers the reordering.
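One way to picture the proposed trigger is the scheduling sketch below. It is only an illustration of the idea: the lookup names, the set of pre-reordering tags, and the fallback of reordering first when a font has no such lookups are all assumptions, not part of any specification.

```python
# Assumed set of explicitly pre-reordering features for this sketch.
PRE_REORDERING_TAGS = {"half", "cjct"}

# Hypothetical font lookup list, in font order.
FONT_LOOKUPS = [
    ("ss01", "split_conjunct"),
    ("half", "half_forms"),
    ("cjct", "form_conjunct"),
    ("pres", "presentation_ligatures"),
]


def schedule(font_lookups, enabled):
    """Return the processing steps in order, with 'REORDER' inserted
    right after the last lookup of a pre-reordering feature."""
    last_pre = max(
        (i for i, (tag, _) in enumerate(font_lookups)
         if tag in PRE_REORDERING_TAGS),
        default=-1,
    )
    steps = []
    if last_pre == -1:
        # Assumption: with no pre-reordering lookups, reorder up front.
        steps.append("REORDER")
    for i, (tag, name) in enumerate(font_lookups):
        if tag in enabled:
            steps.append(name)
        if i == last_pre:
            steps.append("REORDER")
    return steps
```

Under this scheme the font decides whether a discretionary lookup runs before or after reordering simply by where it places that lookup relative to the last cjct lookup. Turning `ss01` off removes its step without moving the reordering point.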

w3c / font-text-cg

Pre-reordering discretionary features #49