Open issue, opened by r12a 4 years ago
The first comment in this issue contains text that will automatically appear in one or more gap-analysis documents as a subsection with the same title as this issue. Any edits made to that comment will be immediately available in the document. Proposals for changes or discussion of the content can be made in comments below this point.
Relevant gap analysis documents include: _Bengali_
This issue is applicable to Bengali and Assamese.
The issues _Letter-spacing splits conjuncts_ and _Conjuncts are not selected as a single unit when styling initials_ describe how conjuncts should not be split by letter-spacing or by first-letter selection. See those issues for more details.
This topic builds on that for some specific cases in Bengali.
There are two cases in Bengali where hasant (virama) is preceded by an independent vowel, rather than a consonant. These are অ্যা and এ্যা.
(In both cases this produces the sound æ, used for non-native words, such as 'application', 'administration' etc.)
This combination should not be split either, even though it doesn't fit the typical CvC structure of a conjunct (where 'v' is the virama).
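To make the structure of these sequences concrete, here is a minimal Python sketch that lists the code points of অ্যা and এ্যা and checks whether a hasant directly follows an independent vowel. The helper names (`describe`, `hasant_after_independent_vowel`) and the `INDEPENDENT_VOWELS` set are assumptions made for this illustration, not part of any spec or browser implementation.

```python
import unicodedata

HASANT = "\u09CD"  # BENGALI SIGN VIRAMA

# The two sequences discussed in this issue: vowel + hasant + ya + aa.
SEQUENCES = ["\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE"]

# Assumption for this sketch: the Bengali independent vowels U+0985..U+0994
# (only the assigned code points in that range).
INDEPENDENT_VOWELS = set(
    "\u0985\u0986\u0987\u0988\u0989\u098A\u098B\u098C\u098F\u0990\u0993\u0994"
)

def describe(text):
    """Return (code point, Unicode name) for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

def hasant_after_independent_vowel(text):
    """True if some hasant in text is immediately preceded by an independent vowel."""
    return any(
        ch == HASANT and prev in INDEPENDENT_VOWELS
        for prev, ch in zip(text, text[1:])
    )

for seq in SEQUENCES:
    for code_point, name in describe(seq):
        print(code_point, name)
    print("hasant follows independent vowel:", hasant_after_independent_vowel(seq))
```

For an ordinary CvC conjunct such as ক্ষ (U+0995 U+09CD U+09B7) the check returns `False`, because there the hasant follows a consonant.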
Specs: css-text-3. CSS uses the concept of 'typographic character unit', rather than grapheme cluster, in its specs, with the explanation that cases like those just described go beyond the scope of the grapheme cluster concept and that implementations should provide appropriate support. The spec doesn't provide details about the support needed for each language.
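As a rough illustration of the kind of language-specific support this implies, the sketch below groups text into units in two ways: attaching only combining marks to their base, and additionally continuing a unit across a hasant. This is a deliberately simplified toy, not the UAX #29 algorithm and not what any browser does. The function name `segment` and its `join_across_hasant` flag are hypothetical, and ZWJ/ZWNJ handling is ignored.

```python
import unicodedata

HASANT = "\u09CD"  # BENGALI SIGN VIRAMA

def is_mark(ch):
    """True for combining marks (general categories Mn, Mc, Me)."""
    return unicodedata.category(ch).startswith("M")

def segment(text, join_across_hasant):
    """Split text into units.

    Marks always attach to the preceding unit. If join_across_hasant is
    true, a character that follows a hasant also stays in the same unit,
    whether the hasant was preceded by a consonant or an independent vowel.
    """
    units = []
    for ch in text:
        if units and is_mark(ch):
            units[-1] += ch
        elif units and join_across_hasant and units[-1].endswith(HASANT):
            units[-1] += ch
        else:
            units.append(ch)
    return units

word = "\u0985\u09CD\u09AF\u09BE\u09AA"  # অ্যাপ
print(segment(word, join_across_hasant=False))
print(segment(word, join_across_hasant=True))
```

With `join_across_hasant=False` the word is split after the hasant, which is the behaviour this issue complains about. With `join_across_hasant=True` the sequence অ্যা stays together as one unit, followed by প.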
Tests & results: Both of the tests below were run with the following pre-installed fonts:
Windows: Shonar Bangla, Arial Unicode MS, Nirmala UI, Vrinda
Mac: Bangla MN, Bangla Sangam MN, Kohinoor Bangla, Tiro Bangla, Baloo Da
Also tested with Noto Sans Bengali and Noto Serif Bengali on the Mac.
Interactive test, Bengali অ্যা and এ্যা (æ) are selected as a single grapheme by ::first-letter.
Note that Blink and WebKit do handle the more usual CvC conjunct arrangement (see this test).
Interactive test, Bengali অ্যা and এ্যা (æ) are treated as a single grapheme for letter-spacing.
Gecko, Blink, and WebKit all fail to treat the sequence as a single grapheme, even though Blink and WebKit do handle the more usual CvC conjunct arrangement (see this test).
Browser bug reports: Gecko • Blink • WebKit
Priority: Keeping such sequences together is a pretty basic requirement. That said, first-letter selection and letter-spacing are not essential for content authoring, although Bengali content authors should have the same access to these styling features as Western authors. Content authors could work around the first-letter problem by adding markup (though that's not ideal), but for letter-spacing there is no real alternative, and adding spaces between letters ruins the semantics. The priority was set to advanced.
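The markup workaround for the first-letter case could look something like the following sketch, which wraps the first unit of a paragraph in a span that an author would then style directly instead of relying on `::first-letter`. The function names, the `initial` class name, and the simplified rule for finding the first unit are all assumptions made for illustration. It does nothing for the letter-spacing problem.

```python
import html
import unicodedata

HASANT = "\u09CD"  # BENGALI SIGN VIRAMA

def first_unit(text):
    """Return the leading unit of text: the first character, any combining
    marks that follow it, and any character that directly follows a hasant.
    Simplified rule for this sketch; ZWJ/ZWNJ are not handled."""
    end = 0
    for i, ch in enumerate(text):
        if i == 0:
            end = 1
        elif unicodedata.category(ch).startswith("M") or text[i - 1] == HASANT:
            end = i + 1
        else:
            break
    return text[:end]

def wrap_initial(text, class_name="initial"):
    """Wrap the first unit of text in a span with a hypothetical class name."""
    if not text:
        return ""
    unit = first_unit(text)
    rest = text[len(unit):]
    return f'<span class="{class_name}">{html.escape(unit)}</span>{html.escape(rest)}'

print(wrap_initial("\u0985\u09CD\u09AF\u09BE\u09AA"))  # অ্যাপ
```

The cost of this approach is that every affected paragraph needs extra markup, which is why the issue describes it as not ideal.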