Open fantasai opened 5 years ago
Tibetan intersyllabic tsheg?
@asmusf I was thinking about that, yes. Also the Ogham space mark.
Seems to me that there are a number of characters that are or were (in archaic script use) used in place of spaces. But I'm surprised that this is an issue. @fantasai could you point to the part of the spec that is in question?
@r12a https://www.w3.org/TR/css-text-3/#line-break-transform Where we handle ZWSP, it might make sense to handle other word separators that aren't spaces.
I suspect that what distinguishes ZWSP and TSEK in these circumstances is that [thai etc character][zwsp][whitespace] is likely to be an error, whereas [tibetan character][tsek][white space] is not (even when spaces in tibetan would theoretically use NBSP), or if it is an error this can only be detected by understanding the text and/or the intention of the author. Same goes for ethiopic word space.
I suspect that, mostly, content authors just need to be careful about how they compose the source text, so that spans of text that shouldn't include spaces don't, even if they are using an editor or tool that wraps lines automatically. It seems to me that that's also the approach you'd need to take when composing text in archaic hangul styles, where they didn't use spaces between words.
(Btw, this probably has implications for some aspects of Semantic linefeeds if the language of the text used doesn't employ spaces as word separators.)
I think the goal should be that other languages are not at a significant disadvantage in how they organize their source code, i.e. make semantic linefeeds possible for all languages where we can plausibly do so without breaking existing content.
I'm not sure what that means for what characters we should consider... I'm pretty sure that the Ogham space mark and Ethiopic word space should collapse with subsequent spaces, it doesn't make sense to want both. But for Tibetan, I'm not sure, does it really use spaces after tsek marks? (I know they do after shad, but that's a different character.) @r12a
The CSS Working Group just discussed "Collapsible breaks adjacent to word separators".
The CSS Working Group just discussed "Removing collapsible linebreaks", and agreed to the following:
RESOLVED: Punt "removing collapsible linebreaks adjacent to word separators" to level 4
We have rules in place that eliminate line breaks if they are adjacent to ZWSP, leaving behind the ZWSP when assembling the paragraph text from multiple lines of source text. However, we didn't consider explicit word separators such as the Ethiopic word space. Probably all “word separators” (other than space and nbsp) should have the same behavior as ZWSP here.
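To make the proposal concrete, here is a rough Python sketch of the behavior being discussed. It is a simplification, not the spec algorithm: it only models the ZWSP rule and the fallback of turning a line break into a space, and it ignores the East Asian width rules entirely. The function name, the `extra_separators` parameter, and the `CANDIDATE_WORD_SEPARATORS` set are made up for illustration; which characters would actually belong in that set is exactly what is still open in this issue.

```python
import re

ZWSP = "\u200b"  # ZERO WIDTH SPACE, already handled by the current rules

# Hypothetical candidate set, taken from characters named in this thread.
# Whether any of these should really be treated like ZWSP is undecided.
CANDIDATE_WORD_SEPARATORS = frozenset({
    "\u1361",  # ETHIOPIC WORDSPACE
    "\u1680",  # OGHAM SPACE MARK
})


def transform_segment_breaks(text, extra_separators=frozenset()):
    """Simplified model of collapsing line breaks in source text.

    A line break adjacent to ZWSP (or to any character in
    extra_separators) is removed, leaving the separator behind.
    Any other line break becomes a single space.
    """
    removable = {ZWSP} | set(extra_separators)

    # Drop spaces and tabs around each line break, then merge runs of breaks.
    text = re.sub(r"[ \t]*\n[ \t]*", "\n", text)
    text = re.sub(r"\n+", "\n", text)

    out = []
    for i, ch in enumerate(text):
        if ch != "\n":
            out.append(ch)
            continue
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        if prev_ch in removable or next_ch in removable:
            continue  # remove the break, keep the separator
        out.append(" ")  # ordinary break collapses to a space
    return "".join(out)
```

With the default arguments, a source line ending in an Ethiopic word space followed by a line break ends up with both the word space and a regular space. Passing `extra_separators=CANDIDATE_WORD_SEPARATORS` drops the break instead, which is the ZWSP-like behavior suggested above.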