Closed r12a closed 6 years ago
The Working Group just discussed "Upright orientation involves more than just glyph orientation" (ttml2#281), and agreed to the following resolutions:
RESOLUTION: Change "glyph" to "glyph area" in the quoted text.
[i passed this by Fantasai for a sense check before posting, and she said LGTM]
I traced the links again, but still find the definition of 'glyph area' very vague. Given https://github.com/w3c/ttml2/issues/236#issuecomment-275459765 I see it as meaning a grapheme cluster such as é (when decomposed) or a Tamil conjunct, but also as representing a whole word in joined-up Arabic. The latter is odd.
What if the Arabic word contains one or more letters that don't join on the left side, e.g. التدويل?
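To make the question concrete, here is a minimal sketch of how such a word breaks into separately connected runs. The `RIGHT_JOINING` set and the `joined_runs` helper are hypothetical names introduced for illustration; the set is a small hand-picked subset of right-joining letters, not the full Unicode joining data, and the word is spelled out as code points on the assumption that this is how it is encoded above. Combining marks and non-Arabic characters are ignored.

```python
from typing import List

# Illustrative subset of Arabic letters that connect only to the preceding
# letter (Unicode joining type "right-joining"). Not a complete list.
RIGHT_JOINING = {
    "\u0622", "\u0623", "\u0625", "\u0627",  # alef and alef variants
    "\u062F", "\u0630",                      # dal, thal
    "\u0631", "\u0632",                      # reh, zain
    "\u0624", "\u0648",                      # waw with hamza, waw
    "\u0629",                                # teh marbuta
}


def joined_runs(word: str) -> List[str]:
    """Split a word into runs of letters that visually connect to each other."""
    runs: List[str] = []
    current = ""
    for ch in word:
        current += ch
        # A right-joining letter never connects to the letter that follows it,
        # so the connected run ends here.
        if ch in RIGHT_JOINING:
            runs.append(current)
            current = ""
    if current:
        runs.append(current)
    return runs


# The example word, written as code points (assumed spelling):
# alef, lam, teh, dal, waw, yeh, lam.
WORD = "\u0627\u0644\u062A\u062F\u0648\u064A\u0644"

for run in joined_runs(WORD):
    print([f"U+{ord(c):04X}" for c in run])
```

Under these assumptions the word yields four connected runs rather than one, so "a whole word" is not a single joined unit even in Arabic.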
What about northern Indic scripts such as Devanagari, where a top line joins most of the characters in a word, in a similar way to the Arabic joining, e.g. अंतर्राष्ट्रीयकरण?
The latter example is relevant here. Although one could argue that upright Arabic text is rare, upright Devanagari text is less so (see for example https://github.com/w3c/type-samples/issues/52). The important point in the Devanagari example just pointed to is that the word is not simply split at letter boundaries: it is split at syllable boundaries (which in that particular case coincide with grapheme cluster boundaries).
I think it may be time to define a glyph area as corresponding to a 'typographic character unit' as defined at https://drafts.csswg.org/css-text-3/#typographic-character-unit, which generally equates to a grapheme cluster, though it perhaps covers more for some complex conjuncts (of which there are many in Indic scripts).
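As a rough illustration of the difference, the sketch below groups each combining mark with the preceding base character. This is a deliberately simplified approximation, not the Unicode text segmentation algorithm and not the CSS definition; `approx_clusters` is a hypothetical helper name.

```python
import unicodedata
from typing import List


def approx_clusters(text: str) -> List[str]:
    """Attach every mark (general category M*) to the preceding character.

    Simplified stand-in for grapheme cluster segmentation, for illustration only.
    """
    clusters: List[str] = []
    for ch in text:
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# Decomposed e-acute: 'e' followed by U+0301 COMBINING ACUTE ACCENT.
e_acute = unicodedata.normalize("NFD", "\u00E9")
print(len(e_acute), len(approx_clusters(e_acute)))  # 2 code points, 1 unit

# Devanagari conjunct ksha: KA + VIRAMA + SSA, normally drawn as one shape.
ksha = "\u0915\u094D\u0937"
print(len(ksha), len(approx_clusters(ksha)))  # 3 code points, 2 units
```

With this approximation the decomposed é stays together, but the conjunct is split after the virama even though it is usually rendered as a single shape, which is the kind of case where a typographic unit may need to cover more than this simple grouping.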
Btw:
Glenn: You would never set arabic in upright form as he describes there...
It's likely that this is not at all common (we are trying to ascertain whether it might be more common for Uighur), but see https://w3c.github.io/alreq/#h_vertical_upright for a picture showing it (and following the CSS rules).
Glenn: Actually there's a language in one of the Maldive islands that uses arabic letters only in their isolated form to write their language.
You are perhaps referring to Dhivehi written in the Thaana script. For more information see http://r12a.github.io/scripts/thaana/
For the record, there is one occurrence of "grapheme cluster" in the TTML2 ED (as of today, commit 492604f), in the <emphasis-style> definition.
I traced the links again, but still find the definition of 'glyph area' very vague.
For the record, XSL 1.1 defines glyph area as follows:
A glyph-area is a special kind of inline-area which has no child areas, and has a single glyph image as its content.
"Glyph image" is not defined.
XSL 1.1 also says:
The most common inline-area is a glyph-area, which contains the representation for a character (or characters) in a particular font.
[Meeting 2018-02-15] The WG has resolved not to expand the definition of "glyph area" further, nor to adopt "grapheme cluster" or "typographic character unit", but notes that all three concepts may be coincident from an implementation perspective. The group is willing to revisit this later.
10.2.46 tts:textOrientation http://w3c.github.io/ttml2/spec/ttml2.html#style-attribute-textOrientation
There are additional things to bear in mind here. Treating the text as strong left-to-right will put Arabic script characters in the correct visual order down the vertical line, but the text should also say that the characters use their isolated forms. Furthermore, the rotations should be applied to groups of glyphs that constitute a grapheme cluster, so that, for example, Indic syllables remain together. (Although in scripts like Devanagari, some consonant clusters are not fully encompassed even by grapheme clusters.)
I think there's some wording to this effect in the CSS spec that you could look at.
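To show what "isolated form" means at the character level, here is a small sketch that looks up the isolated presentation form of an Arabic letter by its Unicode character name. `isolated_form` is a hypothetical helper; a real renderer would normally select isolated glyphs during font shaping rather than substitute presentation-form code points, so treat this purely as an illustration.

```python
import unicodedata


def isolated_form(ch: str) -> str:
    """Return the '... ISOLATED FORM' presentation character for ch, if any.

    Falls back to ch itself when no such named character exists.
    """
    try:
        return unicodedata.lookup(unicodedata.name(ch) + " ISOLATED FORM")
    except (KeyError, ValueError):
        # ValueError: ch has no name; KeyError: no isolated-form character.
        return ch


# Alef, lam, teh, dal, waw, yeh, lam (assumed spelling of the example word
# used earlier in this thread).
word = "\u0627\u0644\u062A\u062F\u0648\u064A\u0644"

# One isolated character per line, top to bottom, in logical order.
for ch in word:
    iso = isolated_form(ch)
    print(f"U+{ord(ch):04X} -> U+{ord(iso):04X} {unicodedata.name(iso)}")
```

Characters without an isolated presentation form, such as Latin letters, are returned unchanged.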