[css-text] Preventing too-short final lines of blocks (Last Line Minimum Length)

s10wen commented 5 years ago

CodePen example: https://codepen.io/s10wen/pen/GPWWyP?editors=1100#0

Tweet + replies: https://twitter.com/s10wen/status/1076079575506083840

Wikipedia explanation: https://en.wikipedia.org/wiki/Widows_and_orphans

CSS Text 3 w3 Spec: https://www.w3.org/TR/css-text-3/

The above links have led me here to pursue this further. I'm wondering if anything currently exists, or could be implemented, to handle this. My idea is that orphan: 2 would always keep two strings of text together; please see the CodePen for an example. Or it could be that orphan: true would mean that orphans always have at least 2 words.

frivoal commented 5 years ago

There are already controls for widows and orphan lines and page/column breaks in https://drafts.csswg.org/css-break/#widows-orphans.

A control for widowed words on the last line could be useful, but it doesn't exist yet. However, I suspect it needs to be paired with a better line breaking algorithm than the current greedy one to achieve good results. If all the lines of the paragraph are re-balanced to push the needed word(s) to the last line, all may be fine, but just pulling from the line before last is probably going to lead to sub-optimal results. Also, we'd need to figure out what this means in languages that do not separate words with spaces, such as Japanese or Chinese.

I suppose there's prior art in other software, and we should have a look at what they do there. InDesign maybe?

astearns commented 5 years ago

We have previously talked about the idea of being able to specify a minimum last line length, but in characters or a percentage of width, not in the number of words:

https://drafts.csswg.org/css-text-4/#last-line-limits

jonjohnjohnson commented 5 years ago

@s10wen chiming in on current workarounds. If you're willing to put presentational matters in HTML, like your demo's use of <br>, it's better to place a &nbsp; character between the last two words, so that they don't have to start a new line if they fit on the current line together.
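
For a whole document, the no-break-space workaround could be applied at build time rather than by hand. A minimal Python sketch, assuming plain-text paragraphs with no inline markup; the helper name bind_last_words and its words parameter are hypothetical, not part of any library:

```python
import re

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE, the character that &nbsp; stands for


def bind_last_words(text: str, words: int = 2) -> str:
    """Glue the last `words` words together with no-break spaces.

    Hypothetical helper for illustration only.
    """
    text = text.rstrip()
    for _ in range(words - 1):
        # Find the last remaining run of ordinary whitespace (NBSP excluded).
        matches = list(re.finditer(r"[ \t\r\n]+", text))
        if not matches:
            break  # a single word: nothing to bind
        last = matches[-1]
        text = text[:last.start()] + NBSP + text[last.end():]
    return text
```

The bound words then wrap as a unit, which is the same behaviour the manual &nbsp; gives; it does nothing to rebalance the earlier lines.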

aaaxx commented 5 years ago

Wouldn't a <span> with white-space: nowrap be a cleaner solution?

s10wen commented 5 years ago

Hey all, thanks for the conversation around this.

https://drafts.csswg.org/css-break/#widows-orphans seems to be most likely what I'd like to see. Is there anywhere I can follow the progress of this being implemented in browsers, and test it?

AmeliaBR commented 4 years ago

I've just changed the title of this issue to make it clearer that we're talking about orphaned words on a line, not orphaned lines on the top/bottom of a page or column (which is what the widows and orphans properties are about).

I came here because of this thread by the CSS-Tricks team about workarounds to avoid bad-looking breaks. This is clearly a common situation where people are modifying their markup to get typographically pleasant results, and that is really a problem that CSS should try to solve.

CSS Text 4 has a heading for this topic ("Last Line Minimum Length"), with an issue summary but no solution, with a cross-reference to a 2015 mailing list discussion. Copying over the current issue text from the spec:

Issue is about requiring a minimum length for lines. Common measures seem to be

At least as long as the text-indent.

At least X characters.

Percentage-based.

Suggestion for value space is ''match-indent | <length> | <percentage>'' (with Xch given as an example to make that use case clear). Alternately could actually count the characters.

It’s unclear how this would interact with text balancing (above); one earlier proposal had them be the same property (with 100% meaning full balancing).

People have requested word-based limits, but since this is really dependent on the length of the word, character-based is better.

My own opinions to get the discussion started:

I think the property should support a minimum number of characters (as an alternative to minimum % of inline size or minimum line length) for the final line. That covers most typographic style guides while still avoiding any discussion about what is or isn't a word across different languages, or when words are broken by hyphenation.

Since we're assuming that most implementations will be using a greedy line-breaking algorithm, maybe the property could accept a second value that would be the minimum number of characters for previous lines, at which point no more attempt at removing the widow should be made. (If the second value isn't specified, an auto behavior applies: don't make the second-to-last line shorter than the last line while trying to make the last line longer! In fact, this should apply regardless of whether you also specify a minimum length for the second-to-last line!) If a more complicated text-wrap justification algorithm applies, the rules about "second-to-last line" apply to all previous lines in the block.

min-last-line: 8 / 20; /* make sure the last line is at least 8 characters long,
               unless doing so would make the previous line less than 20 characters long */
min-last-line: 3em; /* make sure the last line is at least 3em long,
               unless doing so would make the second-to-last line shorter */
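
One possible reading of the two-value form above, as a minimal Python sketch. It counts characters only (the 3em form would need real font metrics), assumes one space between words, and the function name may_pull_word is hypothetical:

```python
from typing import Optional


def may_pull_word(prev_len: int, last_len: int, word_len: int,
                  min_last: int, min_prev: Optional[int] = None) -> bool:
    """Decide whether the final word of the second-to-last line may move down.

    All lengths are in characters. Hypothetical helper for illustration.
    """
    if last_len >= min_last:
        return False  # the last line is already long enough
    # Moving a word also moves the space that separated it from its neighbour.
    new_prev = max(prev_len - word_len - 1, 0)
    new_last = last_len + word_len + 1
    if min_prev is not None and new_prev < min_prev:
        return False  # explicit floor for the previous line (the "/ 20" part)
    if new_prev < new_last:
        return False  # "auto" rule: never leave the previous line shorter than the last
    return True
```

With min-last-line: 8 / 20, a 40-character line above a 3-character last line can give up a 6-character word, but a 24-character line cannot, because it would drop to 17 characters.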

Or maybe it would be better to have two properties: min-line-last which is specifically about avoiding the orphaned short line at the end of the block, versus min-line that is a hint about the ideal minimum length for any lines except the last. In addition to avoiding over-compensation when padding the last line, it could define a trigger at which point smarter/more expensive text-wrap and hyphenation strategies should be employed.

PS, I think it would also be helpful if there were some figures in the spec about widows and orphans to make it clear that those properties aren't about bad line breaking, but about bad block breaking. If anyone wants to create that, I'm sure a PR would be welcome!

ayyash commented 2 years ago

I want to add my two cents to push this conversation further. The following HTML after being "prettified" will break:

<blockquote>
    Lorem ipsum then prettify will push closing tag to a new line
</blockquote>

And in CSS, this quotation mark might appear on a new line, and there is no way around it except to disable code prettifying and make sure the blockquote ends on the same line of HTML. I wish there was a way to target :last-line the way we target :first-line.

blockquote:after {
    content: '"'
}

nigelmegitt commented 1 year ago

There's a similar looking problem around too-short first lines, if the in-line alignment does not match the reading direction.

See the BBC Subtitle Guidelines section on Breaks in justified subtitles for example:

tabatkins commented 1 year ago

That only happens with explicit breaks tho, correct? We'll otherwise always fill the first line approximately the same as subsequent lines.

clagnut commented 1 year ago

Further to my comments in the above thread, and following @kojiishi's response, I talked at length with other designers at Clearleft and it was surprisingly difficult to come to a definitive conclusion, particularly around the exceptions.

By which I mean: put simply one doesn't want just a single word on the final line of a block, but what's the effect of bringing down a word from the previous line in order to address that? If you were fixing this manually, there might be a ripple effect back up the paragraph until the best overall text shape is achieved. I doubt that's something a browser could afford to do, given the (understandable) reluctance to implement any justification routines beyond the crudest greedy method.

The best conclusion we could come up with was something similar to the solution proposed by @AmeliaBR. Set a minimum character length for the final line along with a maximum number of characters to bring down from the previous line. This would be conceptually similar to the hyphenate-limit-chars property.

min-last-line: 12 6

where 12 is the minimum line length in characters, and 6 is the maximum number of characters that can be brought down from the previous line to make that so. If the 6 is omitted, it would be assumed to be equal to the 12.

It might be useful to some people for the same approach to be expressed as percentages of box width instead:

min-last-line: 20% 10%

where 20% is the minimum length of the final line in terms of percentage of box width, and 10% is the maximum length that can be removed from the previous line.

It might be that the two methods (chars and %) could be mixed.
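
To make the character-based form concrete, here is a rough Python sketch of greedy wrapping followed by a last-line fix-up. It is an illustration under stated assumptions, not a proposed algorithm: widths are counted in characters of a monospaced font, the pull budget counts word characters only (not the spaces), the fix-up is best effort (it keeps whatever it managed to pull even if the minimum is not reached), and only the penultimate line is touched. The function names are hypothetical.

```python
from typing import List, Optional


def greedy_wrap(words: List[str], width: int) -> List[List[str]]:
    """First-fit line breaking: start a new line when the next word won't fit."""
    lines: List[List[str]] = []
    current: List[str] = []
    for word in words:
        if current and len(" ".join(current + [word])) > width:
            lines.append(current)
            current = [word]
        else:
            current.append(word)
    if current:
        lines.append(current)
    return lines


def wrap_min_last_line(text: str, width: int, min_last: int,
                       max_pull: Optional[int] = None) -> List[str]:
    """Greedy wrap, then lengthen a too-short last line within a pull budget."""
    if max_pull is None:
        max_pull = min_last  # omitted second value defaults to the first
    lines = greedy_wrap(text.split(), width)
    if len(lines) >= 2:
        prev, last = lines[-2], lines[-1]
        pulled = 0  # characters brought down so far
        while len(" ".join(last)) < min_last and len(prev) > 1:
            word = prev[-1]
            if pulled + len(word) > max_pull:
                break  # would exceed the budget for the previous line
            if len(" ".join([word] + last)) > width:
                break  # the last line would overflow
            last.insert(0, prev.pop())
            pulled += len(word)
    return [" ".join(line) for line in lines]
```

For example, wrap_min_last_line("The quick brown fox jumps over the lazy dog", 40, 8, 6) brings "lazy" down to join "dog", while a previous line that ends in a word longer than the budget is left alone.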

clagnut commented 1 year ago

I've put together a (very) rough-and-ready proof of concept here.

The idea is to have something to test out the concept of a minimum final line length and maximum amount of text that can be brought down from the line above to address that.

Please feel free to have a play, copy, adapt and generally improve. Comments very welcome, here preferably.

kojiishi commented 1 year ago

A question on an edge case came up in mind: what to do with a paragraph with "[short word] [long word] [short word]"? An example (there might be better examples but...):

It's uncopyrightable,
no?

or

It's
uncopyrightable, no?

The former is better, no?

clagnut commented 1 year ago

A question on an edge case came up in mind: what to do with a paragraph with "[short word] [long word] [short word]"? An example (there might be better examples but...):
It's uncopyrightable,
no?
or
It's
uncopyrightable, no?
The former is better, no?

Agreed, the former is better. This would be handled by the min-last-line: 20% 10% rule which says that 10% is the maximum length that can be brought down from the penultimate line.

litherum commented 1 year ago

I dup'ed https://github.com/w3c/csswg-drafts/issues/2396 to this issue.

From that other issue, I said:

I actually implemented this years ago (named -apple-trailing-word: -apple-partially-balanced) as a nonstandard property because we got some internal requests for this. (I then removed support once our internal teams stopped using it). And now we're getting more internal requests for this.

It's not just internal requests, though:

https://stackoverflow.com/questions/4823722/how-can-i-avoid-one-word-on-the-last-line-with-css

https://stackoverflow.com/questions/31974448/how-can-i-prevent-having-just-one-hanging-word-on-a-new-line-in-an-html-element/31974553

This is something we'd like to see added to CSS.

We could either do it the way I did years ago (a simple on/off switch) or we could mirror the design of orphans and widows and have it take an integer value.

We are now getting even more requests for this. (Every time this comes up, I always go searching for which property controls this, only to be surprised yet again that there is no way to do this and it's impossible.)

It's probably also worth noting that the requests we have for this feature are not about the number of "words" on that last line, as that doesn't necessarily solve the visual problem when the last n words are short (or you're writing in Chinese and the last few words are each just single characters). The request, instead, is to say "the last line is at least x% of the width of the block container."

xfq commented 1 year ago

I suppose there's prior art in other software, and we should have a look at what they do there. InDesign maybe?

In InDesign, because what you see is what you get, adjusting some inconspicuous gaps between characters in the penultimate line should work.

Another approach is applying a GREP style, indicating that the last few characters/words in a paragraph cannot be broken into two lines.

Wouldn't a <span> with white-space: nowrap be a cleaner solution

If there are not many paragraphs, using <span>s with white-space: nowrap or Zero Width Joiners (for writing systems like Chinese, Japanese, Batak, Tai Le, Khmer, Thai, etc.) or non-breaking spaces between the last few words probably works fine (although ZWJs might have an impact on glyph rendering), but if we want this for the entire document, then it is too much trouble.

css-meeting-bot commented 1 year ago

The CSS Working Group just discussed [css-text] Preventing too-short final lines of blocks (Last Line Minimum Length), and agreed to the following:

RESOLVED: Add a control that is either a property or a value that causes UAs to make the last line longer than it would've originally done unless that was a bad idea

The full IRC log of that discussion

<fantasai> myles: this is a small issue ....
<fantasai> myles: We have had many requests throughout the years where typographers and designers have come to us and show us a paragraph on the web page
<fantasai> myles: they'll point to last line and say, 'this last line is tooooo narrow"
<fantasai> myles: this has a name, it's called orphans and widows
<fantasai> myles: also term has two meanings
<fantasai> myles: CSS has support for the other meaning (pagination)
<fantasai> myles: but for this one, doesn't
<fantasai> myles: this is one of our highest requested text-related features
<fantasai> myles: so it would be cool if CSS could solve
<fantasai> myles: problem here is that last line is too narrow, so get wide paragraph and maybe one word on last line. Looks bad
<fantasai> myles: more nuance, but what I will say is, I think there's two potential solutions
<fantasai> myles: one is a new property, and one is a change to value space of `text-wrap: pretty`
<Rossen_> q?
<fantasai> myles: so could invent a new property or add a thing that when you do pretty, try to focus on the last line
<Rossen_> ack emilio
<fremy> lol
<florian> q+
<iank_> q+
<fantasai> florian: text-wrap: pretty solves this and more, and is expensive
<fantasai> florian: and that's important
<fantasai> florian: if it wasn't expensive, just using pretty would be fine
<fantasai> florian: there are terrible solutions to this problem
<fantasai> florian: if you implement one that is "good enough"
<Rossen_> ack florian
<fantasai> florian: is there enough perf difference with pretty that it's worth a separate control
<fantasai> hober: very significant perf difference
<fantasai> myles: naive implementation of pretty is exponentially bad perf
<fantasai> myles: whereas an algorithm that just focuses on this problem would be at worst linear, but almost constant time
<nicole> q+
<Rossen_> ack iank_
<fantasai> iank_: yes, expensive, but we might have different perspectives on how expensive we're willing to tolerate for pretty
<fantasai> iank_: lot of nuance there, let's not get into it now
<fantasai> iank_: from our perspective, if there is a control, it would be nice if it could also control pretty
<fantasai> iank_: fundamentally, pretty does have a lot of knobs like "how much to bias for x consideration"
<fantasai> myles: if independent control for last line and pretty, browser could see and modify pretty to focus on last line
<fantasai> iank_: potentially
<fantasai> myles: that sounds great
<Rossen_> ack nicole
<fantasai> nicole: I wanted to ask, how many lines would be impacted by that
<fantasai> myles: I think we're flexible here, not super clear what the spec should say
<astearns> -1 to only stealing words from one line
<fantasai> myles: if we were to implement this, first version would start at 1 line and then iterate from there and see if need to increase
<fantasai> nicole: similar to headline balancing?
<fantasai> iank_: not really
<florian> q+
<fantasai> hober: taking a first pass would only use one line, but I can imagine empirically discovering that we can tolerate 3-4 and might have a spec, but let's not prematurely decide
<Rossen_> ack florian
<fantasai> florian: are you aiming for a yes/no property or are you thinking of giving author control like at least 3em or at least 30%
<fantasai> myles: flexible
<fantasai> myles: Firstly, we now using words is long
<fantasai> myles: not clear, in i18n context, what exactly a word is
<astearns> s/long/wrong/
<fantasai> myles: from implementation perspective, can make a boolean
<fantasai> myles: if authors need more control, can add
<fantasai> myles: when authors request this, they usually request a percentage
<fantasai> myles: e.g. at least 15%
<florian> q?
<Rossen_> s/15%/50%/
<fantasai> iank_: is that what they actually want, or think that's the tool that indesign provides...
<fantasai> myles: I don't know, but my proposal is a boolean switch
<florian> q+
<astearns> q+
<fantasai> myles: and as implementations progress, we can see if that makes sense or not
<florian> q- later
<fantasai> smfr: should we just resolve on adding a property without specifying if boolean or not?
<Rossen_> ack astearns
<fantasai> astearns: for a property that does only this one thing
<fantasai> astearns: I would advocate strongly for just a boolean switch
<fantasai> astearns: anything more finely grained is really going to have to be weighed against other line-breaking considerations
<fantasai> astearns: and needs to modify results of pretty
<fantasai> astearns: if we're separating the two, then the simple thing, should just be a boolean
<fremy> q+
<Rossen_> ack florian
<fantasai> florian: I think I support this because of the perf difference
<fantasai> florian: but even then it does feel like it's a variant of pretty, you've decided what you care about
<fantasai> florian: so is it really separate from pretty?
<fremy> wanted to say the same thing
<fremy> q-
<emilio> fantasai: There are a lot of knobs that factor into pretty and a lot of them are already separate knobs
<fremy> q+
<emilio> ... In level 4 or some other we had word-spacing and letter-spacing give you the optimal value but we also give you a range for the line-breaker to play with
<iank_> q+
<emilio> ... that's a factor into the line breaker
<emilio> ... same for hyphenation controls
<emilio> ... these are already split out into multiple controls
<emilio> ... turning on pretty shouldn't need to redeclare your controls
<emilio> ... so should cascade separately
<emilio> ... I agree with myles, we should have this tweak
<florian> [Florian is convinced]
<emilio> ... I'd like something that can be applied to the html spec and not spin for 10 minutes
<emilio> ... I also agree with astearns that you have to look at more than one line
<nicole> q+
<emilio> ... A boolean switch is fine but we should define this thinking of extending it to percentages too
<myles> sooooo `minimum-last-line-length: normal | auto [maybe more in the future]`
<emilio> fantasai: If we spec it out we need to pick a name that's gonna work for both
<emilio> ack fantasai
<florian> q+
<fantasai> fremy: I have a similar thought, that just looking at one line doesn't cut it
<Rossen_> ack fremy
<fantasai> fremy: you will end up with a triangle. If you have only two lines it's fine, if you have three, you'll have a long first line, then a smaller second line, and even smaller third line. Very strange
<fantasai> fremy: so in a way I'm struggling, if you have elements with more than two line, oh an implementation can produce good results without balancing lines from where it's stealing the words
<fantasai> fremy: I don't think I can see how to make it different from pretty
<fantasai> fremy: if you're doing that you're back with pretty
<fantasai> astearns: I am convinced that there is a faster linebreaking algorithm that would only do this and not look at the full "pretty" penalty values
<fantasai> astearns: it doesn't have to be as expensive as the full pretty implementation to do just this one thing
<fantasai> iank_: but you can bias the pretty implementation
<emilio> `text-balance: pretty-fast` :-)
<Rossen_> ack iank_
<fantasai> iank_: One thing on controls, it's not clear how this control would apply to `text-wrap: balance`
<fantasai> fantasai: it wouldn't
<astearns> but I absolutely agree that we should avoid introducing the line wrap triangle shapes that fremy described
<fantasai> myles: `balance` will win. it handles last line by itself
<fantasai> iank_: it feels to me that we have at least 3 line-wrapping algorithms, might have others in the future
<fantasai> iank_: this control wouldn't affect balance
<fantasai> fantasai: it also doesn't affect nowrap
<fantasai> iank_: sure
<hober> qq+
<fantasai> iank_: I somewhat agree with fremy that you might be getting into pretty territory
<ntim> q+
<fantasai> iank_: I'm not convinced by the global control argument
<fantasai> hober: if setting balance means it doesn't apply, then you aren't setting whatever this is
<ntim> +1 to what hober said
<fantasai> hober: you're not setting it if you're not setting `text-wrap: balance`
<Rossen_> ack hober
<Zakim> hober, you wanted to react to iank_
<Rossen_> ack nicole
<fantasai> nicole: does anyone want orphans? is anyone like "I want to turn off nice ending to my text"
<fantasai> astearns: yes, because you will get faster text composition
<Rossen_> ack florian
<fantasai> florian: I suspect we all agree, but will say explicitly, when that is turned on it is a request for the browser to *attempt* to make the last line not terrible
<fantasai> florian: but if can only do this by making some other line terrible, shouldn't do it
<fantasai> [agreed]
<fantasai> florian: also agreeing with Tess that it's also a text wrap value
<fantasai> florian: and can have it as an additional keyword
<fantasai> myles: requested resolution isn't any particular grammar
<fantasai> ntim: I want to echo what tess said, it feels like text-wrap extension
<SebastianZ> q+
<Rossen_> ack ntim
<Rossen_> ack fantasai
<astearns> fantasai: so all of this is why I did not want this in text-wrap
<ntim> q+
<astearns> fantasai: whether and how you are wrapping should be separate
<emilio> fantasai: because it needs to cascade separately
<florian> +1
<emilio> ... you want to set the controls in a single place in your stylesheet
<myles> q+
<astearns> and if we want this to be extensible as a text-wrap: pretty control it needs to be separate
<emilio> ... this needs to be a separate thing and honestly I think `pretty` should have as well
<emilio> ... I think this gets into how we're wrapping and that should only set once
<florian> q?
<ntim> q-
<fantasai> This is why I was against "text-wrap: pretty" as a syntax in the first place
<emilio> Rossen_: let's pause for a second. There's a clear proposal for a clear problem
<emilio> ... There are also ideas about how to do it performantly...
<emilio> ... we have plenty of engineers on the room, and we're getting into how to solve the issue, let's not do that
<emilio> ... let's go through the queue if you want to discuss syntax or how to solve it
<emilio> fantasai: what's the proposed solution?
<emilio> Rossen_: to have a property or a value that solves this problem in a more performant way to pretty
<Rossen_> ack SebastianZ
<emilio> SebastianZ: iterating on nicole's question
<emilio> ... if the algo can be made pretty fast, why can't it be the default?
<emilio> ... we also have text-decoration skip-ink
<emilio> ... but it was worthwhile having as a standard
<florian> [I think we should open a separate issue to move "balance | stable | pretty" out of text-wrap, and probably add "avoid-orphans" there]
<emilio> Rossen_: let's close that bridge when we get to it
<Rossen_> ack myles
<emilio> PROPOSED: Add a control that is either a property or a value that causes UAs to make the last line longer than it would've originally done unless that was a bad idea
<emilio> RESOLVED: Add a control that is either a property or a value that causes UAs to make the last line longer than it would've originally done unless that was a bad idea

r12a commented 1 year ago

Just an observation on the question of how to specify the length of the last line (and possibly also the gap on the previous line). It seems to me that using line length percentages, based on the rendered text, is better than counting characters.

Counting characters is problematic in a large number of non-Latin languages because they use (often multiple) combining marks, which are combined into the same 2-dimensional space as a base character. For example, 10 characters in some languages can be very short, compared to 10 characters in English, eg. أَنْتُنَّ contains 9 characters, but is only about 3-4 Latin characters in width. Similarly, an emoji such as 👨‍👩‍👧‍👦 contains 7 characters in about the width of a couple of english letters.
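
A small Python sketch of why a raw character count is a poor proxy for width. The strings are written as explicit escapes so the code points are unambiguous; the Arabic spelling is an assumption about the word quoted above, and count_base_characters is a hypothetical helper:

```python
import unicodedata

# Assumed spelling of the Arabic word: four letters carrying five combining marks.
ARABIC_WORD = "\u0623\u064E\u0646\u0652\u062A\u064F\u0646\u0651\u064E"
# Family emoji: four person emoji joined by three ZERO WIDTH JOINERs (U+200D).
FAMILY_EMOJI = "\U0001F468\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466"


def count_base_characters(text: str) -> int:
    """Count code points that are not combining marks (general category M*)."""
    return sum(1 for ch in text if not unicodedata.category(ch).startswith("M"))


print(len(ARABIC_WORD), count_base_characters(ARABIC_WORD))
print(len(FAMILY_EMOJI))
```

Neither number is the rendered width: that depends on shaping and the font, which is the argument for expressing the limit as a length or a percentage of the line.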

clagnut commented 1 year ago

It seems to me that using line length percentages, based on the rendered text, is better than counting characters.

Agreed. See prior comments: https://github.com/w3c/csswg-drafts/issues/3473#issuecomment-1474837736

litherum commented 1 year ago

Yep.

The request, instead, is to say "the last line is at least x% of the width of the block container."

LeaVerou commented 1 year ago

Hi folks. Writing with my TAG 🎩 on.

Given that Chromium is very keen to ship text-wrap: pretty and orphan control is one of the main heuristics employed in it (in fact, it was the only one when it was first submitted for TAG review), it would be good to finalize the name of this property plus a way to use it to avoid orphans (even if it gets more syntax in the future), to prevent text-wrap: pretty being evangelized further as a way to avoid orphans.

Switching to my CSS WG 🎩 now to discuss specific syntax:

What about breaking text-wrap into longhands to allow customizing some or all aspects of line breaking, and having keywords like balance or pretty correspond to certain values for these longhands? Orphan control could then be achieved via text-wrap-orphans, with high-level values like avoid and normal for an MVP, while we debate syntax for giving authors more control. text-wrap: pretty would then correspond to that semi-magical value.

frivoal commented 1 year ago

I am unconvinced about decomposing balance or pretty into a bunch of individual knobs.

The interaction between these knobs is about as interesting as the knobs themselves: if pretty implies "avoid orphans" and "avoid rivers" and "avoid several hyphenated lines in a row", do we then need not just these three, but also a choice between "avoid orphans unless it would create rivers" vs "avoid orphans even if it creates rivers"? "how about avoid orphans even if it creates rivers or consecutive lines with hyphenations, as long as it doesn't create both, but either way, don't be more than 250% slower than brute force line breaking"? A good algorithm for pretty needs to balance a whole bunch of tradeoffs.

Yes, there's some measure of subjectivity in those tradeoffs, so providing author control is tempting, but:

designing the right knobs to enable meaningful choices without being overwhelming seems challenging
the existence of specific knobs will constrain what the underlying algorithm can possibly be.
depending on the specific knobs we come up with, it seems fairly likely that some combinations that aren't actually desired by anyone will become possible to express, causing unnecessary complexity in implementations.

clagnut commented 1 year ago

The interaction between these knobs is about as interesting as the knobs themselves: if pretty implies "avoid orphans" and "avoid rivers" and "avoid several hyphenated lines in a row"

especially as there is already - in theory - control for limiting consecutive hyphens with hyphenate-limit-lines https://www.w3.org/TR/css-text-4/#propdef-hyphenate-limit-lines

frivoal commented 10 months ago

RESOLVED: Add a control that is either a property or a value that causes UAs to make the last line longer than it would've originally done unless that was a bad idea

Like text-wrap-style: pretty, this opts into a different line breaking algorithm aimed to make things look better, for some definition of better. And arguably, since the definition of pretty is quite open ended, a user agent could choose to implement pretty in just the right way to make nice last lines. But they might also do a whole lot more, and nothing requires that they pay special attention to the last line, and the performance profile between something that just cares about the last line and something that cares about the whole text is likely different. So we're basically looking at another variant of pretty but with a different bias in terms of what tradeoffs to make.

For the sake of the argument, let's call that text-wrap-style: pretty-last-line.

Both pretty and pretty-last-line:

SHOULD bias for better layout over pure speed when making line breaking decisions
SHOULD look at multiple lines for doing so
are otherwise equivalent to auto

pretty:

SHOULD make the whole text pretty (which may involve making the last line not too short, avoiding rivers, avoiding starting 3 lines in a row with the word "the", or whatever else the browser wants to do for good typography's sake)
MAY compromise performance while doing so

pretty-last-line:

SHOULD make the last line pretty (i.e. not too short)
SHOULD avoid making the rest ugly in the process (e.g. avoid excessive under filling of previous lines)
SHOULD prioritize performance over whole-text-prettiness

gwern commented 7 months ago

If I may leave a note as an author: we noticed text-wrap: pretty had unexpectedly shipped in Chrome/Chromium, ones we had installed too, and, excited by the prospect of real Knuth-Plass layout in >60% of browsers (according to CanIUse), took a look at how it affects our website.

Unfortunately, pretty in Chrome seems to not really do K-P as one would expect, and to be highly opinionated. While we like how it handles orphaned-words to bump them from 1 to 2, usually, we do not like the extremely aggressive removal of hyphenation, which results in drastic s t r e t c h e d lines with our justified text*. (Here are 34 before-after pairs on Windows Chrome 123 on my site: https://share.obormot.net/temp/text-wrap_screenshots.zip Most of them show this behavior. This is on a wide desktop screen; presumably, if we checked thoroughly on very narrow screens like mobile, the stretching behavior would be far worse, because it usually is and that is why we had to add Hyphenopoly to fix bad browser hyphenation.)

If we could opt into the orphaned-words bit, we would, and it's not obvious to me why bumping a word to the next line would have to change all the hyphenations & multiple lines as its tradeoff, but pretty seems to be a package deal right now, so we can't use it. I also can't seem to find any CSS property, under any name, which would achieve the same effect: the CSS widow/orphan line properties do totally different things, most people just burble enthusiastically about how pretty solves all your word-orphan problems (which it may, but little mention of drawbacks), and the suggestions after that seem to be either 'manually insert no-break space at the end of paragraphs everywhere forever' or 'run some wacky JS'.

The opinionated package deal is also worrisome because given how aggressive the treatment of hyphenation is, and how this is not apparently part of any standard, why expect other browsers to adopt this opinion? We definitely wouldn't want to adopt it for Chrome, and then 5 years later discover (or worse, not discover) that it looks terrible on Firefox or Safari after they ship their particular flavor...

(Looking into this was not assisted by the very confusing terminology. For example the official Chrome blog which is some of the only documentation on what the shipped pretty does mentions that its treatment of orphan-words differs from the CSS orphans... which isn't about words at all, it's about lines.)

* almost all of the discussion & design doc seems to assume left-justification only, and ignore center, right, & fully-justified text, so perhaps this was just an oversight in the tuning?

astearns commented 7 months ago

Thanks for all the examples, @gwern. I’m not sure I agree they are all failures of the current implementation. Most of the after results appear better or at least as good as the befores to me. For instance, 49/50 does have wider spacing but I think it’s arguably better (spacing is still consistent across the paragraph and a 2-hyphen ladder is removed). And 53/54 does have wider spacing but it might be a case where justification in general is causing problems (there is an unaffected line starting with “reader mode” that looks bad both before and after).

The ones I do see a problem with are

65/66 is a bad result, I agree
69/70 the sidenote is worse, but the main column change looks OK
71/72 is worse, likely because the current implementation isn’t looking back enough lines
81/82 sidenote 31 is worse, but the rest of the changes look OK
89/90 is worse, but the narrow columns make this a hard case. Perhaps the weighting against a short last line should not have resulted in any change for the first two paragraphs, but the third seems fine

gwern commented 7 months ago

I didn't say the current Chrome implementation of pretty had failed - just that it was opinionated, and we had other opinions. (You and the Chrome dev Ishii perhaps have more tolerance of spacing and more dislike of hyphens than we do, and there's no arguing taste there; but we've been burned by some extremely bad looking lines on mobile when line-stretching happens, so we worry about it a lot, while hyphens are so ordinary as to not be a big deal or worth incurring costs like stretched lines to minimize.)

But if a lot of the instances aren't clearly better even by your assessment, you can see why we wouldn't be too eager to go out of our way to add & debug fairly exotic new CSS to opt into this brand new, possibly buggy, non-standardized-cross-browser, opinionated package of changes, when there's just one part we are sure we want.

And so I am simply mentioning, in the context of the discussion of whether to add knobs, that we would like a knob for that part, and would not like to have to use pretty as a take-it-or-leave-it deal, because as it stands now - we would have to leave it.

gwern commented 7 months ago

One further thing I would mention after looking at the mobile screenshot pairs too, which look better than I expected: https://share.obormot.net/temp/text-wrap_screenshots_mobile.zip It's hard to predict what Chrome pretty currently does!

I've stared at many of these, but still can't look at the 'before' and predict what will happen in the 'after', particularly how the word-orphans are treated. As best as I can tell, word-orphan fixes are strictly subordinate to the hyphenation changes: it doesn't seem like the word-orphans ever get changed unless there is a previous hyphenation change already being made, regardless of how easy & straightforward a word-orphan change might be. It seems like pretty treats word-orphans as an add-on or afterthought, to be modified only if it's already doing a change, otherwise, not modified at all...?

This is confusing, and I don't think users understand it - I don't recall any of the people advertising pretty as 'solving your word-orphan problems' on Twitter/StackOverflow/Reddit as including the caveat 'but only if that word-orphan is part of a paragraph whose hyphenation is being changed already, otherwise all your word-orphans are still there'. And I don't think anyone would request a feature like "fix my word-orphans but only some of the time, dependent on a usually unrelated problem being fixed". ("What Would Knuth Do?" Probably not that.)

frivoal commented 2 months ago

Agenda+ to give the WG a heads up that I've implemented the resolution in https://github.com/w3c/csswg-drafts/issues/3473#issuecomment-1646267487 by adding text-wrap-style: avoid-orphans.
