w3c / iip

Documenting gaps and requirements for support of Indic languages on the Web and in eBooks.
https://w3c.github.io/iip/

Word stretching is not applied to justified Tamil text when large gaps appear between words. #80

Open r12a opened 4 years ago

r12a commented 4 years ago

Tamil words can be quite long, which can cause problems for justified text, especially in narrow columns, because large gaps can appear between words, or at the end of a line if only one word fits on that line. To mitigate this, especially in the absence of hyphenation, lines that are justified in Tamil should automatically stretch words to fit in the following cases.

When only one word fits on a line, that word should be stretched to fit the whole line. Here is an example:

[Image: justification_one_word]

Where a small number of words appears on a line, the words on that line may also be stretched, so as to reduce the inter-word spacing. Here is an example:

[Image: justification_in_newsprint]

Note that justification doesn't stretch words unless one of these cases applies.
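The two cases above can be sketched as a per-line decision. This is only an illustration, not what any browser implements: the function name `plan_line`, the `max_gap_ratio` threshold, and the idea of capping the gap at a multiple of the normal space width are all assumptions, since the issue does not say how large a gap must be before words are stretched.

```python
def plan_line(word_widths, line_width, space_width, max_gap_ratio=2.0):
    """Split a justified line's slack between word stretching and word gaps.

    Returns (total_word_stretch, gap_width). Assumes the words already fit
    on the line. max_gap_ratio is a hypothetical tuning knob: gaps wider
    than max_gap_ratio * space_width count as "large".
    """
    if not word_widths:
        raise ValueError("a line needs at least one word")
    slack = line_width - sum(word_widths)
    gaps = len(word_widths) - 1
    if gaps == 0:
        # Case 1: a single word is stretched to fill the whole line.
        return float(slack), 0.0
    natural_gap = slack / gaps
    max_gap = max_gap_ratio * space_width
    if natural_gap <= max_gap:
        # Gaps are acceptable: ordinary inter-word justification, no stretching.
        return 0.0, natural_gap
    # Case 2: cap the gaps and let the words absorb the rest of the slack.
    return slack - max_gap * gaps, max_gap


print(plan_line([80], 100, 5))          # (20.0, 0.0)
print(plan_line([30, 30, 30], 100, 5))  # (0.0, 5.0)
print(plan_line([30, 30], 100, 5))      # (30.0, 10.0)
```

Widths are in arbitrary units. The total word stretch returned here would then have to be distributed inside the words themselves.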

However, a distinctive feature of Tamil is that the adjustments applied should equally expand the space between all unconnected, spacing glyphs (including the space between various vowel-signs and their base), rather than solely putting space around syllables, grapheme clusters or even code points. Here is an example:

[Image: partridge]
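As a rough sketch of the equal-expansion rule described above, the following distributes a given amount of extra width across every boundary that precedes an unconnected spacing glyph. The `Glyph` model and its `spacing` and `attached` flags are hypothetical: in practice that information would come from the font and shaping engine, and the advances used in the example are made-up numbers.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    name: str
    advance: float
    spacing: bool = True    # False for non-spacing combining marks
    attached: bool = False  # True if ligated or connected to the previous glyph


def stretch_word(glyphs, extra):
    """Return the x position of each glyph after adding `extra` width.

    The extra width is shared equally between all gaps that precede an
    unconnected spacing glyph. Non-spacing marks and attached glyphs stay
    with whatever comes before them.
    """
    gaps = {i for i, g in enumerate(glyphs)
            if i > 0 and g.spacing and not g.attached}
    per_gap = extra / len(gaps) if gaps else 0.0
    positions = []
    x = 0.0
    for i, g in enumerate(glyphs):
        if i in gaps:
            x += per_gap
        positions.append(x)
        x += g.advance
    return positions


# Glyph run for a syllable with a two-part vowel sign, followed by a
# consonant carrying a non-spacing mark (illustrative advances).
run = [
    Glyph("vowel-sign left part", 6),
    Glyph("ka", 10),
    Glyph("vowel-sign right part", 6),
    Glyph("ka", 10),
    Glyph("pulli", 0, spacing=False),
]
print(stretch_word(run, 9))  # [0.0, 9.0, 22.0, 31.0, 41.0]
```

In this run the two parts of the vowel sign are each separated from the base consonant by the same amount as the gap between the two syllables, while the non-spacing mark is not moved away from its base.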

More information:

Specs: css-text provides an auto value for the text-justify property, which relies on the UA to determine the justification algorithm to follow, based on a balance between performance and adequate presentation quality, and taking into account writing system and language.

The spec says: For example, the UA could use by default a justification method that is a simple universal compromise for all writing systems—such as primarily expanding word separators and between CJK typographic letter units along with secondarily expanding between Southeast Asian typographic letter units. Then, in cases where the content language of the paragraph is known, it could choose a more language-tailored justification behavior e.g. following the Requirements for Japanese Text Layout for Japanese [JLREQ], using cursive elongation for Arabic, using inter-word for German, etc.

Tests & results:

Interactive test: with CSS set to text-align:justify; text-justify:auto, when large gaps appear between justified words in Tamil, the browser will automatically reduce the gaps by stretching words on the affected line.

Interactive test: with CSS set to text-align:justify; text-justify:auto, when a narrow column means that only one Tamil word fits on a line, the word will be stretched to fit the whole width of the column.

Interactive test: when inter-character spacing is applied to Tamil, equal space is added between all unligated spacing glyphs, including between the glyphs forming vowel-signs and their base characters, but ligated glyphs and non-spacing combining characters are not separated from the base.

Browser bug reports: Gecko, Blink, WebKit

Priority: The impact of this is marked as advanced, because oversized gaps in text seem to be fairly common in Tamil printed materials, but perhaps this should be basic instead? Availability of hyphenation for Tamil would also ease this issue, but that is not yet supported by browsers.

r12a commented 4 years ago

The first comment in this issue contains text that will automatically appear in one or more gap-analysis documents as a subsection with the same title as this issue. Any edits made to that comment will be immediately available in the document. Proposals for changes or discussion of the content can be made in comments below this point.

Relevant gap analysis documents include: Tamil

xfq commented 11 months ago

Added a link to Tamil Layout Requirements.