w3c / sealreq

Southeast Asian layout task force

Need a standard way to add wider spaces between Thai sentences #49

Open r12a opened 3 years ago

r12a commented 3 years ago

This issue is probably also applicable to other languages.

In principle, Thai uses large spaces between sentences, and small spaces in other places (e.g. for separating sub-clauses).

There is no standard way to achieve the different space widths, and the reduction of multiple white-space characters on the Web to a single space has not caused major objections. Still, it would be useful to propose a standard approach, so that people have the means to make the distinction easily when they want to (e.g. by regularly providing additional keys on a keyboard), and to facilitate parsing and segmentation of natural language text.

It appears that some people use U+2003 EM SPACE for the wider space. Apparently, minority languages using the Khmer script do the same (although they also appear to use U+2006 SIX-PER-EM SPACE for the smaller space, which doesn't appear to be a requirement here).
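To illustrate how a distinct sentence separator could help segmentation, here is a minimal Python sketch. It assumes the convention described above (U+2003 EM SPACE between sentences, U+0020 SPACE between smaller units); that convention is not standardised anywhere. The function name `split_sentences` and the placeholder tokens are hypothetical, made up for this example.

```python
EM_SPACE = "\u2003"  # assumed sentence separator (the convention some people use)
SPACE = "\u0020"     # assumed separator for smaller units such as sub-clauses


def split_sentences(text: str) -> list[list[str]]:
    """Split text into sentences on EM SPACE, then into clauses on SPACE.

    Empty pieces (from doubled separators) are dropped.
    """
    sentences = [s for s in text.split(EM_SPACE) if s]
    return [[c for c in s.split(SPACE) if c] for s in sentences]


# Placeholder tokens stand in for Thai clauses; they are not real text.
sample = "A1 A2\u2003B1 B2 B3"
print(split_sentences(sample))  # [['A1', 'A2'], ['B1', 'B2', 'B3']]
```

If text only ever contains U+0020, the same function returns a single "sentence", which is exactly the ambiguity this issue is about.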

See this discussion thread: Variable sized spaces in Thai

Specs: Specifications currently offer no advice on this topic.

Need to check whether the necessary white-space rules apply to EM SPACE.

Tests & results: interactive test, Font support for EM SPACE
The font support test shows that, among the Thai fonts on Mac OS X and Windows 10, only Arial Unicode MS and Tahoma have a glyph for EM SPACE.

Priority: As mentioned before, people writing Thai are getting by without the wide space at the moment, but don't know how best to produce one when they want one. Usage is likely to be associated with careful typography much of the time, so advanced priority seems appropriate.

r12a commented 3 years ago

The first comment in this issue contains text that will automatically appear in one or more gap-analysis documents as a subsection with the same title as this issue. Any edits made to that comment will be immediately available in the document. Proposals for changes or discussion of the content can be made in comments below this point.

Relevant gap analysis documents include: _Thai_

xfq commented 3 years ago

In https://drafts.csswg.org/css-text-3/#white-space-rules :

Except where specified otherwise, white space processing in CSS affects only the document white space characters: spaces (U+0020), tabs (U+0009), and segment breaks.

It does not seem to affect EM SPACE, but we should check whether anything is "specified otherwise" elsewhere in the spec.
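As a rough illustration of that sentence, the Python sketch below collapses runs of document white space only. It is a simplified approximation and not the actual CSS algorithm: segment breaks are assumed here to be `\n` and `\r`, they are simply folded into a single space, and the other phases of white space processing are ignored. The name `collapse_document_white_space` is hypothetical.

```python
import re

# Document white space per the quoted sentence: spaces (U+0020), tabs (U+0009),
# and segment breaks (approximated here as \n and \r, which is an assumption).
DOCUMENT_WHITE_SPACE = re.compile("[ \t\n\r]+")


def collapse_document_white_space(text: str) -> str:
    """Replace each run of document white space with a single U+0020.

    Characters outside that set, such as U+2003 EM SPACE, are left untouched.
    """
    return DOCUMENT_WHITE_SPACE.sub(" ", text)


collapsed = collapse_document_white_space("a  \t b\u2003\u2003c")
print(repr(collapsed))  # 'a b\u2003\u2003c'
```

Under this reading, two consecutive EM SPACE characters survive as two, while any run of ordinary spaces and tabs becomes one space. Whether browsers behave this way in every `white-space` mode is what still needs checking.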

r12a commented 3 years ago

The other thing to check is the impact on justification. I believe EM SPACE doesn't expand, which may be an issue. Perhaps in some cases the regular spaces will expand to be as wide as or wider than the EM SPACE.

r12a commented 3 years ago

I propose that we continue these technical discussions at https://github.com/w3c/sealreq/issues/46.