w3c / alreq

Documenting gaps and requirements for support of Arabic and Persian on the Web and in eBooks.

Are vertically-misaligned characters joined? #203

Open r12a opened 4 years ago

r12a commented 4 years ago

This issue seeks to clarify requirements in support of the discussion at https://github.com/w3c/svgwg/issues/631

SVG allows a content author to automatically break a word into typographic units, which are then displayed at different positions. See https://www.w3.org/TR/SVG2/text.html#TSpanAttributes, where it says:

If a comma- or space-separated list of n <length>s is provided, then the values represent new absolute X (Y) coordinates for the current text position for rendering the glyphs corresponding to each of the first n addressable characters within this element or any of its descendants.

For example, if you have <text y="1em" dy="20 20 20 20 20">peace</text>, you'd expect to see something like:

 p
    e
        a
            c
                e
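To make the arithmetic behind that picture concrete, here is a minimal Python sketch of how the dy values accumulate. Note that dy is a relative shift of the current text position, unlike the absolute x/y lists quoted above. The function name `glyph_y_positions` is hypothetical, and the sketch assumes one glyph per character and that `1em` resolves to 16 user units; neither assumption comes from the SVG spec.

```python
from itertools import accumulate


def glyph_y_positions(text, y_start, dy_values):
    """Return (character, y) pairs after applying per-character dy shifts.

    Each dy value moves the current text position relative to where the
    previous character left it, so the shifts add up along the string.
    Assumes one glyph per character (true for simple Latin text).
    """
    # Use at most one dy value per character; characters beyond the end
    # of the list get no further shift.
    shifts = list(dy_values[:len(text)])
    shifts += [0] * (len(text) - len(shifts))
    return list(zip(text, (y_start + s for s in accumulate(shifts))))


# Assumed: y="1em" resolves to 16 user units.
for char, y in glyph_y_positions("peace", 16, [20, 20, 20, 20, 20]):
    print(char, y)
```

Each letter lands 20 units lower than the one before it, which is the staircase shown above.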

The question is: what should happen for a cursive script such as Arabic, N'Ko, Adlam, or Syriac? Should the separated letters retain their cursive (joining) shapes? I spoke with several native users of Persian and Arabic at TPAC, and they said that the letters should be shown in their isolated forms, in the same way that letters are isolated in crosswords and, occasionally, in other contexts.

There was also the expectation that the placement of characters would cascade from right to left automatically.

Therefore, given <text y="1em" dy="20 20 20 20 20">بغداد</text> you would expect to see:

                    ب
                غ
            د 
        ا
    د

Note that the SVG requirement makes allowances for things such as not splitting grapheme clusters and not splitting required ligatures. So <text y="1em" dy="20 20 20 20 20">سلام</text>, where لا is the required lam-alef ligature, would give:

            س 
        لا
    م
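As a rough illustration of the grouping shown above, the following Python sketch splits a string into units that should not be pulled apart. It is a simplification, not the SVG algorithm: the function name `typographic_units` is hypothetical, it treats only lam followed by an alef variant as a required ligature, and it approximates grapheme clusters by attaching combining marks to the preceding letter.

```python
import unicodedata

LAM = "\u0644"
# Alef variants assumed here to form a required ligature with a preceding lam.
ALEFS = {"\u0622", "\u0623", "\u0625", "\u0627"}


def typographic_units(text):
    """Split text into units that should stay together when repositioned."""
    units = []
    for ch in text:
        prev = units[-1] if units else ""
        if prev and unicodedata.category(ch).startswith("M"):
            # Combining mark: keep it with the letter it belongs to.
            units[-1] += ch
        elif (prev.startswith(LAM) and ch in ALEFS
              and not any(c in ALEFS for c in prev)):
            # Lam followed by alef: keep the pair as one ligature unit.
            units[-1] += ch
        else:
            units.append(ch)
    return units


print(typographic_units("سلام"))   # three units: seen, lam-alef, meem
print(typographic_units("بغداد"))  # five units, one per letter
```

A real implementation would rely on the font's shaping results and on full Unicode grapheme cluster segmentation rather than a hard-coded list like this one.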

In discussion with the SVG group, we were coming to the conclusion that the same thing would happen if the letters were spaced out horizontally rather than vertically. If a content author actually wants letter-spacing, where the baseline is stretched between characters, this automatic approach is not appropriate, and a proper letter-spacing feature should be used instead.

Any further observations?