w3c / iip

Documenting gaps and requirements for support of Indic languages on the Web and in eBooks.
https://w3c.github.io/iip/

Conjuncts are not selected as a single unit when styling initials #111

Open r12a opened 4 years ago

r12a commented 4 years ago

This issue is applicable to most languages that form conjuncts from consonant clusters using an invisible virama.

Because of the problems associated with grapheme cluster boundaries (see above), first-letter selection in CSS doesn't work well for conjuncts. For example, Chrome fails to style the whole conjunct in પ્રૌદ્યોગીકી when ::first-letter is used in a selector: it styles only પ્ instead of પ્રૌ. Internet Explorer behaves similarly, styling only પ. This is problematic for many words in a script such as Gujarati, and forces content authors to use explicit spans rather than the proper mechanism for selecting the initial letter.

Indian Layout Requirements provides a grammar for Indian orthographic syllable boundaries that works for Gujarati. The CSS specs use the concept of a 'typographic character unit' rather than a grapheme cluster, explaining that these cases are beyond the scope of the grapheme cluster concept and that implementations should provide appropriate support. In addition, a modification to the concept of grapheme cluster is currently in development at the Unicode Consortium, which is likely to resolve the problem for a script like Gujarati.
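To make the boundary problem concrete, here is a minimal Python sketch of how the first orthographic syllable of a Gujarati word could be found, and how an author-side span workaround could be generated from it. This is not the grammar from Indian Layout Requirements and not what any browser implements. It assumes a heavily simplified rule: a base letter, any combining marks, and any further letter that directly follows a virama. The function names and the `initial` class name are hypothetical.

```python
import unicodedata

GUJARATI_VIRAMA = "\u0ACD"

# The example word from this issue, written as escapes to avoid ambiguity:
# PA, VIRAMA, RA, VOWEL SIGN AU, DA, VIRAMA, YA, VOWEL SIGN O,
# GA, VOWEL SIGN II, KA, VOWEL SIGN II
WORD = "\u0AAA\u0ACD\u0AB0\u0ACC\u0AA6\u0ACD\u0AAF\u0ACB\u0A97\u0AC0\u0A95\u0AC0"


def first_orthographic_syllable(text: str) -> str:
    """Return the leading syllable of `text` under a simplified rule.

    Assumption: a syllable is a base character followed by any run of
    combining marks (Mn/Mc), extended across each virama that is
    immediately followed by another letter (Lo). ZWJ/ZWNJ, leading
    punctuation and other scripts' conventions are not handled.
    """
    if not text:
        return ""
    end = 1  # always take the first character as the base
    while end < len(text):
        category = unicodedata.category(text[end])
        if category in ("Mn", "Mc"):
            # Vowel signs, nukta, virama, etc. attach to the syllable.
            end += 1
        elif category == "Lo" and text[end - 1] == GUJARATI_VIRAMA:
            # A letter right after a virama continues the conjunct.
            end += 1
        else:
            break
    return text[:end]


def wrap_initial(text: str, css_class: str = "initial") -> str:
    """Wrap the first syllable in an explicit span (author-side workaround)."""
    head = first_orthographic_syllable(text)
    return f'<span class="{css_class}">{head}</span>{text[len(head):]}'


if __name__ == "__main__":
    print(first_orthographic_syllable(WORD))
    print(wrap_initial(WORD))
```

For the example word, the function returns the four code points U+0AAA U+0ACD U+0AB0 U+0ACC (પ્રૌ). A cluster rule that does not join across the virama stops after U+0AAA U+0ACD (પ્), which matches the Chrome behaviour described above.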

Separately, the alignment of styled initial-letter glyphs with the rest of the text is not clearly specified or implemented.

For more details, see this GitHub issue, which is being used to track this gap.

r12a commented 4 years ago

The first comment in this issue contains text that will automatically appear in one or more gap-analysis documents as a subsection with the same title as this issue. Any edits made to that comment will be immediately available in the document. Proposals for changes or discussion of the content can be made in comments below this point.

Relevant gap analysis documents include: Bengali, Devanagari, Gujarati