w3c / csswg-drafts

CSS Working Group Editor Drafts
https://drafts.csswg.org/

[css-text-3] 'word-break: keep-all' needs to be defined more precisely #1619

Closed kojiishi closed 6 years ago

kojiishi commented 7 years ago

The current 'word-break: keep-all' has:

Breaking is forbidden within “words”: implicit soft wrap opportunities between typographic letter units are suppressed, i.e. breaks are prohibited between pairs of letters (regardless of line-break settings) except where opportunities exist due to dictionary-based breaking. Otherwise this option is equivalent to normal. In this style, sequences of CJK characters do not break.

But it isn't very clear which break opportunities should actually be suppressed. Blink has an implementation, but I'm afraid it isn't interoperable.

One idea we could borrow from the ICU discussion in ICU notes F2 is:

For "keepall", if the class is Hangul (H2, H3, JL, JV, JT) or ID, remap to AL

This is a bit different from what the current wording says, but it suits the original purpose and is easy to make interoperable.
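
As a rough illustration of the remapping idea quoted above, here is a minimal Python sketch. It assumes the caller already has the UAX #14 line break class of a character as a string; the helper name `effective_line_break_class` and the constant `KEEP_ALL_REMAPPED` are hypothetical and come from neither ICU nor any browser engine.

```python
# Line break classes that the quoted ICU note remaps when keep-all is in effect.
KEEP_ALL_REMAPPED = frozenset({"H2", "H3", "JL", "JV", "JT", "ID"})


def effective_line_break_class(lb_class: str, word_break: str) -> str:
    """Return the class to feed into the ordinary line breaking rules.

    Hypothetical helper: lb_class is a UAX #14 class name such as "ID",
    word_break is the computed 'word-break' value such as "keep-all".
    """
    if word_break == "keep-all" and lb_class in KEEP_ALL_REMAPPED:
        # Treat Hangul and ideographs like alphabetic letters, so the
        # pair rules no longer produce a break between them.
        return "AL"
    # Every other class, and every other 'word-break' value, is unchanged.
    return lb_class
```

With this approach the rest of the line breaking algorithm stays untouched; only the class lookup changes, which is what would make the behavior easy to specify and test across implementations.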

As I remember it, this text was written before we generally tried to move to the Unicode line break property, and it has not been updated since, while other values such as 'break-all' were changed. Is my memory correct, @fantasai?

litherum commented 7 years ago

Retitling due to Betteridge's law of headlines

fantasai commented 6 years ago

I think the spec is clear: Hangul is covered under the definition of “typographic letter unit”; I don't think there are any Hangul characters that are not Letters.

Breaking is forbidden within “words”: implicit soft wrap opportunities between typographic letter units (or other typographic character units belonging to the NU, AL, AI, or ID Unicode line breaking classes [UAX14]) ...
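
The claim that Hangul characters are Letters can be spot-checked with Python's standard `unicodedata` module. This is only a sketch: it assumes "Letter" means a Unicode general category beginning with `L`, it covers just the Hangul Syllables and Hangul Jamo blocks rather than the whole script, and the function names are hypothetical.

```python
import unicodedata
from typing import List

# Block ranges checked here (assumption: these two blocks are a
# representative sample, not the complete set of Hangul characters).
HANGUL_SYLLABLES = (0xAC00, 0xD7A3)
HANGUL_JAMO = (0x1100, 0x11FF)


def is_letter(ch: str) -> bool:
    """True if the general category of ch starts with "L" (Lu, Ll, Lo, ...)."""
    return unicodedata.category(ch).startswith("L")


def non_letters_in_range(first: int, last: int) -> List[str]:
    """List the code points in [first, last] whose category is not a Letter."""
    return [
        f"U+{cp:04X}"
        for cp in range(first, last + 1)
        if not is_letter(chr(cp))
    ]
```

Calling `non_letters_in_range(*HANGUL_SYLLABLES)` and `non_letters_in_range(*HANGUL_JAMO)` reports any code point in those blocks that the local Unicode database does not classify as a Letter.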

@kojiishi Let me know if this answers your question.

fantasai commented 6 years ago

The behavior for Symbols and Punctuation is clarified in https://github.com/w3c/csswg-drafts/commit/b3b36847b440c20b39db3987a3df68e9a892b55a