w3c / csswg-drafts

CSS Working Group Editor Drafts
https://drafts.csswg.org/

[css-text-4] Dealing with unusual line-break/hyphenation rules #2976

Open r12a opened 6 years ago

r12a commented 6 years ago
  1. Breaking Within Words https://drafts.csswg.org/css-text-4/#hyphenation

Certain features associated with line-breaking are unusual by Western standards, and I'm not sure whether to class them as line-break rules or as specialised hyphenation rules.

Two examples are given in the last section, "Other special rules", of the following article.

The Javanese example might be closer to what we normally think of as hyphenation: a glyph that only appears when a word is broken, and isn't in the text stream. However, it only occurs in a specific situation.

The Tibetan example relates to situations where a single syllable followed by a shad is wrapped. In my mind, this may be more aligned with line-break rules than hyphenation.

The purpose of this issue is to ask whether anyone has any suggestion about how to treat things of this kind, and whether we should add anything to the CSS spec to either refer to or cover this kind of behaviour.

r12a commented 6 years ago

Considering that Dutch hyphenation has to cope with things like cafeetje → café-tje, autootje → auto-tje, or skiërs → ski-ers (not ski-ërs), perhaps the Javanese case is a similar feature?
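
To make the Dutch cases concrete, here is a minimal sketch of what "hyphenation that changes the spelling" means for an implementation: the fragments on either side of the break cannot simply be sliced out of the original word. The table and the function names (`SPELLING_CHANGING_BREAKS`, `break_word`, `changes_spelling`) are hypothetical and only cover the three examples above; this is not how any browser or hyphenation library is known to do it.

```python
# Hypothetical lookup table: unbroken word -> (text before break, text after break).
# Only the three Dutch examples from this comment are included.
SPELLING_CHANGING_BREAKS = {
    "cafeetje": ("café", "tje"),
    "autootje": ("auto", "tje"),
    "skiërs": ("ski", "ers"),
}


def break_word(word, hyphen="-"):
    """Return (end of first line, start of next line), or None if the word is not in the table."""
    parts = SPELLING_CHANGING_BREAKS.get(word)
    if parts is None:
        return None
    before, after = parts
    # The hyphen is inserted at the break; it is not part of the text stream.
    return before + hyphen, after


def changes_spelling(word):
    """True if joining the broken fragments does not give back the original word."""
    parts = SPELLING_CHANGING_BREAKS.get(word)
    return parts is not None and "".join(parts) != word


if __name__ == "__main__":
    for w in SPELLING_CHANGING_BREAKS:
        print(w, "->", break_word(w), "| spelling changes:", changes_spelling(w))
```

In all three cases the joined fragments differ from the unbroken word, which is the property the Javanese example would share if it is treated as hyphenation.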

fantasai commented 1 year ago

This is considered normal line-breaking behavior in these languages, right? I think it would be appropriate for the CSS spec to require these behaviors, but we'd need something to cite. If Unicode or i18n can provide a definitive reference, we can cite that.

Assuming it's normal line-breaking, I wouldn't call the Javanese behavior hyphenation; hyphenation is breaks within “words” that usually would otherwise stay together, is enabled only sometimes, is de-prioritized compared to regular breaks, and typically involves inserting a visual marker indicating the break.