Closed litherum closed 5 years ago
Do you think there is anything in the specification currently preventing browsers from implementing paragraph-level layout already? I know they don't, but if they could, maybe we don't need an opt in at all, and any browser willing to go through the necessary implementation complexity can just turn it on by default.
I think he meant it to be opt-in, since turning this on would hurt performance quite severely. Adobe InDesign has a concept of pluggable composers and provides 4 different ones: combinations of English/Japanese and line/paragraph levels. AFAIU, a composer in InDesign provides the line breaker and the justification algorithm.
I like the idea, and I hope @astearns does as well. I prefer a separate property, something like text-rendering, but I'm open to other ideas too.
This is exciting and something I'd be very supportive of. Given the performance concerns though it'll likely have to be opt-in as suggested by both @litherum and @kojiishi. It might also be possible to implement support of this on top of custom layout to allow for custom pluggable composers.
Do you think there is anything in the specification currently preventing browsers from implementing paragraph-level layout already?
@kojiishi is correct. It would be incorrect to do this automatically for both performance and correctness. Opting everything in automatically would be a performance regression, and the web would be upset if we just start changing every line break on every page to be dramatically different.
Line breaks aren't controlled directly by CSS anyway - line breaks change browser to browser and platform to platform and that's OK. So I think it would be fine for browsers to incrementally improve their line breaking without an opt-in.
But I agree that the performance characteristics of full-paragraph composers are likely to require an opt-in. InDesign doesn't actually go full-paragraph, it only does a set number of lines at a time. I think browsers should be able to experiment with composers, and even disagree about what works best. So an opt-in would probably need to be pretty generic, perhaps a single-line composer as a default and a multi-line composer as the opt-in.
Point taken about performance. However I think if we introduce a switch, it should be between auto and on, not between off and on: CSS UAs generally do greedy line breaking as the default, and that's ok, but if an implementer wants to provide a better default (e.g. it is a print UA not concerned with performance, or it has a not-great-but-better-than-greedy algorithm that performs fine...), that should be allowed.
With that in mind: auto | on, or auto | on | off, where off explicitly requires greedy line breaking? I don't think off is needed, but I could be wrong. Authors do not know enough about the performance tradeoff that this implies to make an informed decision (other than by assuming a particular version of a particular browser), and if they need to depend on a certain wrapping behavior, there's white-space: pre.
Houdini should not be required to have beautiful paragraphs.
I also don't see a use case for off. Making auto the initial value is a good idea so browsers can improve their line breaking over time.
Eventually, adopting more customizability for smart paragraph breaking is totally reasonable in a later level, but it's too early to spec something like that in this level. The keyword "on" is forward-compatible, though: in later levels, we could list additional keywords which would appear after the "on" value.
It sounds like Florian is leaning toward a new property. Therefore, the current proposal is:
paragraph-layout: auto | on
initial value: auto
Obviously we can bike-shed the name too.
Right, that's kind of what I have in mind. At the same time, your suggestion to use text-wrap is also quite reasonable, as it seems the separate property would not do anything when text-wrap is nowrap or balance. So we could have a new value along the lines of text-wrap: smart, or text-wrap: smart-wrap, or text-wrap: contextual-wrap, or text-wrap: paragraph-wrap, ...
So, to decide whether we need more values on the existing property or a new one:
:root { text-wrap: smart; }
and selectively apply nowrap where it belongs.
All in all, I think I could go either way, but I now lean a bit more towards a value for text-wrap. And given that text-wrap has not, I think, been implemented anywhere yet, we can bikeshed the property name and existing values if needed to make everything fit better together.
I'd like to take back the "prefer a separate property" part of my comments above; i.e., I'm fine with a new value for text-wrap.
I see @litherum already expressed that either is fine with him. @frivoal @astearns, do you have a preference?
I admit I misunderstood how text-wrap is currently defined, but now I understand it already behaves differently for inline and block.
So having the block-level text-wrap choose the line breaker while the inline-level text-wrap turns wrapping on or off looks reasonable to me.
And given @astearns's suggestion above, should the keyword be multiline rather than paragraph? Or do we prefer an even more generic name, such as smart, as suggested by @frivoal?
I missed the last paragraph of @frivoal's comment: he already prefers a new value for text-wrap, so we're all good?
I think we're all good.
"Smart" is not a good name because it doesn't have any semantic meaning. I like Koji's "multi-line" idea because it is more accurate than "paragraph" (because I expect most implementations will do this in a sliding window rather than for the entire paragraph)
So the new proposal is: text-wrap: wrap | nowrap | balance | multi-line initial: wrap where "multi-line" is a synonym for "wrap" when applied to inline elements. Similar to "balance," the exact algorithm is UA-defined.
The conceptual difference between "balance" and "multi-line" is that "balance" is intended for titles where "multi-line" is intended for body text.
I don't think we need conf call time on this - I'll just edit the value in.
Alan, I think this does need a WG resolution to add. It's not a bugfix, and it's not a trivial feature, either.
The proposal sounds sane to me. I’d also like to make it clear that browsers should be allowed (but not required) to make wrap do the same kind of multi-line processing as multi-line. The alternative would be requiring that wrap invoke the greedy line breaking behavior, which I don’t think would be good.
Basically, I think that the two values should have the following meaning:
- wrap: lines do wrap, algorithm is UA defined, UA may take multiple lines into account, UA may bias for speed over good layout
- multi-line: lines do wrap, algorithm is UA defined, UA should take multiple lines into account, UA should bias for good layout over speed
And I agree with @fantasai that adding new values is typically something we should ask the WG about, even though this is an early draft. The WG might not yet need to discuss all the details, but it should be in the loop.
@frivoal To properly support multi-line, implementations must consider multiple lines. If they do not support the value they will fall back to wrap anyway.
Well, to do anything useful, sure, but since the value implies a UA defined algorithm, they can do whatever they want anyway, and we cannot test the difference. So putting must on a non testable statement isn't doing very much.
If an implementation does not support the value, it will fall back to whatever the cascade says it should fall back to, which may or may not be wrap, depending on how the stylesheet is written.
Allowing implementations to support multi-line while not doing anything smart (but recommending that they do something smart) may provide a more robust fallback story: when a browser knows the value, even if it doesn't have any particular smart line-breaking logic, it falls back on something that wraps, rather than on whatever the cascade says.
It will probably be possible to come up with a very basic test - given a width constraint of about ten characters and some content that looks like:
a a a a a bbb ccccccccc
a multi-line composer is going to choose to break before the last 'a' in order to avoid the short second line length.
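The test case above can be sketched in a few lines of Python. This is only an illustration, not any browser's algorithm: greedy_wrap is plain first-fit, and optimal_wrap minimizes the sum of squared trailing space over every line but the last. That cost function, the character-count widths, and all function names are assumptions made for this example.

```python
def line_length(words):
    """Width of a line in characters, with single spaces between words."""
    return sum(len(w) for w in words) + max(len(words) - 1, 0)


def greedy_wrap(words, width):
    """First-fit: keep adding words to the current line while they fit."""
    lines, current = [], []
    for word in words:
        if current and line_length(current + [word]) > width:
            lines.append(" ".join(current))
            current = [word]
        else:
            current.append(word)
    if current:
        lines.append(" ".join(current))
    return lines


def raggedness(lines, width):
    """Assumed cost: squared trailing space, ignoring the last line."""
    return sum((width - len(line)) ** 2 for line in lines[:-1])


def optimal_wrap(words, width):
    """Choose breaks that minimize raggedness() over the whole paragraph."""
    n = len(words)
    best = [float("inf")] * (n + 1)  # best[i]: minimal cost for words[i:]
    next_break = [n] * (n + 1)       # next_break[i]: end of the line starting at i
    best[n] = 0
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n + 1):
            length = line_length(words[i:j])
            if length > width and j > i + 1:
                break  # overflow; a lone overlong word is still allowed
            cost = 0 if j == n else (width - length) ** 2
            if cost + best[j] < best[i]:
                best[i] = cost + best[j]
                next_break[i] = j
    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:next_break[i]]))
        i = next_break[i]
    return lines


words = "a a a a a bbb ccccccccc".split()
greedy = greedy_wrap(words, 10)
optimal = optimal_wrap(words, 10)
print(greedy, raggedness(greedy, 10))
print(optimal, raggedness(optimal, 10))
```

Greedy leaves "bbb" alone on the second line, while the paragraph-level version moves at least one "a" down to join it. With this particular cost, breaking before the last "a" and breaking before the last two "a"s score the same, so the exact break depends on the cost function and on tie-breaking.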
And I disagree that we need to get a resolution on a conference call to add something to a working draft. I see enough consensus in this thread to make the change. Informing those in the group not following this thread that there's something new is worthwhile, but that's best done with the edits in place. Asking anyone to resolve on something that's not yet written down isn't fair. If there is anyone following this thread who would object to the edit on technical merits, please speak up.
It will probably be possible to come up with a very basic test [...]
It ought to, but this particular situation might cause a heuristic fluke. I suppose if we mark that test and similar ones as "should", that's alright.
Informing those in the group not following this thread that there's something new is worthwhile, but that's best done with the edits in place.
Fair enough, I can buy into that.
I had a proposal adopted by CSS at one point to add a couple of properties for this...
One was to say whether you expect text to be editable in the future. The reason for this is that some line-breaking algorithms are not suitable for interactive use. They might be slow, or, worse, adding a word might affect the position of the insertion point, moving it backwards or forwards by one or more (horizontal or vertical) lines. So you need a hint to say, although this text isn't marked as editable now, the app might change that, and if it does, don't reflow the page to make it editable!
The other was to let designers specify a preferred algorithm and give parameters to it; this might be better left not done for now, because I think we need experiments.
My own research in the past has suggested the best compromise in many circumstances is a modified first-fit that operates on an n-line window. This works massively better than Knuth-Plass for unattended operation because it doesn't have the poor edge-case behaviours of Knuth-Plass found e.g. in TeX.
Liam
One was to say whether you expect text to be editable in the future. The reason for this is that some line-breaking algorithms are not suitable for interactive use. They might be slow, or, worse, adding a word might affect the position of the insertion point, moving it backwards or forwards by one or more (horizontal or vertical) lines.
If we want to follow through with this, we either need to define the initial value (wrap) to be that value, or to have 3 values:
- one that's guaranteed stable and fast (wrap?)
- one that promises nice layout (multi-line?)
- an initial value that lets the browser pick where it wants to be on the stability-performance-niceness spectrum (auto?)
My own research in the past has suggested the best compromise in many circumstances is a modified first-fit that operates on an n-line window. This works massively better than Knuth-Plass for unattended operation because it doesn't have the poor edge-case behaviours of Knuth-Plass found e.g. in TeX.
That's for the multi-line case where we're trying to get the nicest result, right? This sounds like an area where we should allow browsers to do whatever innovative thing they want, but we could still define this as a non-normative suggestion of how to implement it.
For the fast-and-stable case, should we just leave it up to browsers, or require a particular approach, or leave it up to browsers and suggest a particular approach? If we suggest/require something, should it be greedy line breaking, or some variant of greedy line breaking that allows for prioritization of soft wrap opportunities, or something else?
I don't yet have a strong opinion on what the best answer is, but I'm leaning toward auto | wrap | multi-line (or wrap [fast | nice]? or wrap [stable | nice]?):
- auto (wrap in the alternative naming):
- wrap (or wrap fast or wrap stable in the alternative naming)
hyphenate-size-limits or a similar property generalizing the concept, the UA may choose which of several wrap opportunities it wants to break on.
- multi-line (or wrap nice in the alternative naming)
On Tue, 2016-11-29 at 17:23 -0800, Florian Rivoal wrote:
One was to say whether you expect text to be editable in the future.
If we want to follow through with this, we either need to define the initial value (wrap) to be that value, or to have 3 values:
- one that's guaranteed stable and fast (wrap?)
- one that promises nice layout (multi-line?)
- an initial value that lets the browser pick where it wants to be on the stability-performance-niceness spectrum (auto?)
I think that future-editability should probably not be conflated with wrapping. An alternate approach might be for HTML to offer values for content-editable that included "never" (read-only DOM subtree), "scripted" to mean it could happen, and "yesbaby". But that makes it harder to write an application for editing arbitrary documents. So previously I'd proposed a separate property for it.
But yes, all I really care about here is that browsers can improve paragraph layout without screwing up Web editing applications.
My original proposal: https://lists.w3.org/Archives/Public/www-style/2013Mar/0183.html
As adopted: https://lists.w3.org/Archives/Public/www-style/2013Apr/0246.html
I didn't do this in the end mostly because of Houdini (and talking to people) but looks like it's time.
What's important seems to be (1) authors can demand a stable algorithm for an element/subtree so that starting to edit doesn't cause a reflow;
(2) page authors/designers can express that good line-breaking is important
(3) people can experiment with algorithms (e.g. polyfills, prefixed values) until we learn better what works. There's been a lot of research in the past on this stuff, but the area of automatic layout (i.e. the page author isn't there to make changes and try again) combined with interactive screen display makes line-breaking very different than for paper, and very different from e.g. TeX's environment.
[...] modified first-fit that operates on an n-line window.
That's for the multi-line case where we're trying to get the nicest result, right?
Yes. It's almost as fast as simple wrapping, slightly worse than linear in the number of words in the paragraph, but also slightly harder to code :-) as you need to handle n previous lines (an n as small as 2 or 3 makes a huge improvement over "first fit/greedy", though).
For the fast-and-stable case, should we just leave it up to browsers, or require a particular approach, or leave it up to browsers and suggest a particular approach?
I think the latter. As Fantasai said when it was discussed before, authors should not have to opt in to higher quality.
I suggest an n-line optimization. It's fast and does a good job, but it is not monotonic: adding a word at the end (or anywhere) can in some cases reduce the number of lines in a paragraph. That means the insertion point can jump backwards as you're editing.
That's only mildly disruptive on a graphic designer's 30" page layout display but no good on a mobile device or in a Web app where there might be limited display space for the text, hence wanting for authors to be able to opt out. Another approach that's been tried is to add a delay, so you don't reflow the paragraph while it's being edited, but that's a nightmare for copy-fitting and also has accessibility problems for people who lose their place when it does finally reflow.
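A toy composer is enough to reproduce this kind of instability. The sketch below rests on several assumptions and is not the n-line algorithm itself: paragraph_wrap is a hypothetical whole-paragraph optimizer (squared trailing space, last line free, widths counted in characters), compared against plain first-fit. It shows a related effect rather than a change in line count: typing one more word at the end moves an earlier line break.

```python
from functools import lru_cache


def first_fit(words, width):
    """Greedy wrapping: a line is final once the next word does not fit."""
    lines = []
    for word in words:
        if lines and len(lines[-1]) + 1 + len(word) <= width:
            lines[-1] += " " + word
        else:
            lines.append(word)
    return lines


def paragraph_wrap(words, width):
    """Whole-paragraph layout minimizing squared trailing space."""
    words = tuple(words)
    n = len(words)

    @lru_cache(maxsize=None)
    def solve(start):
        """Return (cost, lines) for the best layout of words[start:]."""
        if start == n:
            return 0, ()
        best = None
        for end in range(start + 1, n + 1):
            line = " ".join(words[start:end])
            if len(line) > width and end > start + 1:
                break  # overflow; a lone overlong word is still allowed
            cost = 0 if end == n else (width - len(line)) ** 2
            rest_cost, rest_lines = solve(end)
            if best is None or cost + rest_cost < best[0]:
                best = (cost + rest_cost, (line,) + rest_lines)
        return best

    return list(solve(0)[1])


before = "a a a a a bbb".split()
after = before + ["ccccccccc"]  # the user types one more word at the end
for wrap in (first_fit, paragraph_wrap):
    print(wrap.__name__, wrap(before, 10), "->", wrap(after, 10))
```

First-fit keeps every existing line when the word is appended. The optimizer re-breaks the first line, which is what would move an insertion point sitting in the earlier text.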
Note that the TeX Knuth-Plass algorithm is worse than polynomial (they say it's NP-complete) on the number of words in the paragraph, which might be risky for a Web browser.
[...]
- the algorithm is UA defined. It should bias towards nice layout, if necessary at the expense of speed and stability. Liam's favorite suggestion is offered in a note as one reasonable approach to do that.
OK.
Liam
Hmmm. I'm a bit confused. What you're saying seems to be arguments supporting what I proposed (“leaning toward[...]”) or something close, but I can't really tell if you are indeed supporting it or explaining why you think we need something else.
(1) authors can demand a stable algorithm for an element/subtree so that starting to edit doesn't cause a reflow;
That'd be text-wrap: wrap, for which we'd require greedy line breaking.
(2) page authors/designers can express that good line-breaking is important
That'd be text-wrap: multi-line, for which we'd recommend the algo you're suggesting, but also allow UAs to do other stuff (Knuth-Plass or what have you).
As Fantasai said when it was discussed before, authors should not have to opt in to higher quality.
That'd be text-wrap: auto, the initial value, for which the UA could do anything, but we could suggest doing the same as text-wrap: wrap on editable elements and the same as text-wrap: multi-line elsewhere.
[...] because of Houdini [...] & people can experiment with algorithms (e.g. polyfills, prefixed values) until we learn better what works
What I suggested doesn't really do anything specific about that, but it seems perfectly compatible with doing that via Houdini.
I've added multi-line with a note on intent and another on the need to solve the editing issue. I'm closing this issue, but please add new issues as needed if there's anything I've missed from this thread or any problems with what I added.
@astearns I have two issues with your edit. It seems to ignore the part of the discussion after Liam chimed in.
You did not include an informative note giving Liam's concrete suggestion of a better algorithm for multi-line. I think this would increase the likelihood of implementation by giving a starting point to implementers who're willing to give it a shot without being experts on the topic, so it’s worth including.
The discussion above suggested that explicitly opting into stable layout was desirable. I proposed doing that by switching the initial value to auto, and requiring a stable algo for wrap. The way you specified it does not include that (and would prevent us from doing it later, since you have the same value with different semantics), so I suppose you disagree. I know you have a note, but if anyone implements the spec as it is now, that only leaves the possibility of a new property open, not the one of a different set of values.
Unfortunately, you haven't replied to the comments proposing these, so it is difficult to know from what angle to argue.
I can open new issues and repeat the argument there if you want, but at the same time, the context is here, so it seems easier to discuss here.
I agree with Liam that "future-editability should probably not be conflated with wrapping," and more strongly that editability concerns should not constrain wrapping choices. In some cases you want to edit with stable upstream line breaks, and in other cases you want to edit with the best line breaks over all the content. So a separate property expressing editing preferences is warranted. It should be possible to satisfy both text-wrap:multiline and an edit preference set to stable by starting an n-line window with the line just above the cursor.
I didn't consider adding a note with algorithm suggestions. I expect that Liam's n-line optimization is the right suggestion, but maybe someone can come up with something better. Perhaps a separate issue would help?
In some cases you want to edit with stable upstream line breaks, and in other cases you want to edit with the best line breaks over all the content.
I agree.
So a separate property expressing editing preferences is warranted.
I don't see how this follows from that. A separate value would do as well.
I didn't consider adding a note with algorithm suggestions. I expect that Liam's n-line optimization is the right suggestion, but maybe someone can come up with something better. Perhaps a separate issue would help?
Reopening for discussion. I stand by the position that new values here should get WG discussion and consensus. Consensus in an issue is enough only if the only people who care are paying attention in the issue, and I don't believe that's the case here and it usually isn't for adding new features.
I'm trying to get my head around where/when/if text-wrap interacts with text-align and text-justify, and hyphenation for that matter. Let's say a browser is capable of implementing something like Knuth–Plass multi-line justification; my understanding of the proposal is that I'd need to do this:
p {
  text-align: justify;
  text-wrap: multi-line;
}
Because line breaking and multi-line justification methods are inherently linked, I turn on justification with text-align but request the method using text-wrap.
That said, text-justify is also available to say whether the justification method separates words only or can also separate characters. The text-justify spec as written also goes on to say:
The guidelines in this level of CSS do not describe a complete justification algorithm. They are merely a minimum set of requirements that a complete algorithm should meet. Limiting the set of requirements gives UAs some latitude in choosing a justification algorithm that meets their needs and desired balance of quality, speed, and complexity.
...which to my mind implies that text-justify could be expected to specify the justification (and hence line breaking) algorithm in the future.
I think what I'm trying to ask is: should text-wrap affect the way text is justified? If so, this implies that text-wrap determines the justification algorithm (perhaps in conflict with text-justify).
The Working Group just discussed Allow for paragraph-level line breaking.
On Wed, 2018-04-11 at 09:02 -0700, CSS Meeting Bot wrote:
The Working Group just discussed Allow for paragraph-level line breaking.
Just a couple of points to add that may help: (1) Knuth-Plass is not suitable for content-editable text without a LOT of work on UX, because editing a word even slightly might make the text format into a different number of lines and move the insertion point somewhat distressingly. It's been done e.g. years ago in InterViews, but it's really best just to use the normal first-fit algorithm for editable content.
I'd suggested at one point a property to say that an element's content might be edited in the future even if it doesn't have content-editable set, so that setting content-editable wouldn't need to trigger a reflow. That suggestion was adopted by the CSS WG but never made it into Text 4, and I didn't push because the Houdini work was just getting started.
(2) Using first fit with a 2-line buffer, and moving a single word down from a tight line onto the next line if the next line is looser, is really super fast, makes a huge improvement, and plays much more nicely with content-editable.
(2b) You can extend the 2-line buffer to n lines and do a better job of averaging out the line lengths (or the space sizes for justified text), but it's not necessary to use large values of n (7 is probably enough), especially if it's a floating window (lines 1...n, then 2...n+1, and so on, or move down by max(n/2 - 1, 1) lines each time). It's still linear in the number of lines, whereas Knuth-Plass is NP-complete on the number of words in the block and can quickly get expensive.
-- Liam Quin, W3C, http://www.w3.org/People/Quin/ Staff contact for Verifiable Claims WG, SVG WG, XQuery WG Improving Web Advertising: https://www.w3.org/community/web-adv/ Personal: awesome vintage art: http://www.fromoldbooks.org/
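Point (2) above could be sketched roughly as follows. This is one reading of the description, not Liam's actual implementation: the name two_line_fixup, the use of character counts as widths, and the rule "move the word only if it narrows the length gap between the two lines" are all assumptions made for the example.

```python
def first_fit(words, width):
    """Greedy wrapping; each line is returned as a list of words."""
    lines = [[]]
    for word in words:
        trial = lines[-1] + [word]
        if lines[-1] and len(" ".join(trial)) > width:
            lines.append([word])
        else:
            lines[-1] = trial
    return [line for line in lines if line]


def two_line_fixup(lines, width):
    """One pass over adjacent line pairs, moving at most one word per pair.

    The last word of the upper line moves down when the lower line still
    fits and the two lines end up closer in length than before.
    """
    lines = [list(line) for line in lines]  # work on a copy
    for upper, lower in zip(lines, lines[1:]):
        if len(upper) < 2:
            continue  # never empty a line
        new_upper, new_lower = upper[:-1], [upper[-1]] + lower
        if len(" ".join(new_lower)) > width:
            continue  # the word does not fit on the next line
        gap_before = abs(len(" ".join(upper)) - len(" ".join(lower)))
        gap_after = abs(len(" ".join(new_upper)) - len(" ".join(new_lower)))
        if gap_after < gap_before:
            upper[:] = new_upper  # in-place, so later pairs see the change
            lower[:] = new_lower
    return lines


words = "a a a a a bbb ccccccccc".split()
fixed = [" ".join(line) for line in two_line_fixup(first_fit(words, 10), 10)]
print(fixed)
```

On the test string suggested earlier in the thread, with a width of ten characters, this moves the last "a" down next to "bbb". The sketch treats the final line like any other and makes a single pass, both of which are simplifications.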
I'm currently experimenting with doing paragraph-level line-breaking in userland in the tex-linebreak package. There is a bookmarklet that you can use to try it out on existing content.
I'm using the Knuth-Plass algorithm with only some small modifications. The main issues and sources of overhead I've encountered with a user-space solution are:
<p>
in the Wikipedia article on London). In either case, the library is repeating a lot of work that the browser has already done.
What I'm planning to do next is to take a look at the CSS Layout API to understand whether it is possible to use that to reduce the duplication of work.
@liamquin Thanks very much for your detailed comment on line-breaking algorithms in https://github.com/w3c/csswg-drafts/issues/672#issuecomment-380902415 ! Do you have any good references we can link to from the spec, or should I try to summarize your points in a note?
On Wed, 2018-08-08 at 18:14 -0700, fantasai wrote:
@liamquin Thanks very much for your detailed comment on line-breaking algorithms in https://github.com/w3c/csswg-drafts/issues/672#issuecomment-380902415 ! Do you have any good references we can link to from the spec, or should I try to summarize your points in a note?
I don't have references, I'm afraid. I've never seen any on the approach I describe - I came up with it myself based on how hand composition worked, but I'm sure many other people have too. I understand that InDesign uses an n-line moving window with n probably 9 or so.
I certainly don't mind reviewing a note, but no longer have anyone paying my way to participate in the CSS WG.
There are plenty of references on problems with Knuth-Plass and interactive editing, and people have tried lots of experiments to try and make it work, but the combination of (1) it not being stable (the insertion point moves around distressingly) and (2) corner cases that for TeX-for-print the author has to correct by hand, make it not really ideal. In an editor the insertion point can be kept stable most of the time by not reflowing the paragraph immediately, e.g. not until a paragraph loses focus, and that compromise might work in a browser. It's mostly a problem when you are editing in the middle of existing text.
Knuth and Plass published a paper, probably early 1980s, proving their modified algorithm was NP-complete.
-- Liam Quin, https://www.holoweb.net/liam/cv/ Web slave for vintage clipart http://www.fromoldbooks.org/ Available for XML/Document/Information Architecture/ XSL/XQuery/Web/Text Processing/A11Y work & consulting.
The CSS Working Group just discussed text-wrap: multi-line, and agreed to the following:
RESOLVED: Add the value back in
RESOLVED: add the stable value
Prince has a CSS property that provides some control over line breaking.
prince-line-break-choices: body | heading | title | body-lookahead | heading-lookahead | title-lookahead | fast
Edited in. @dauwhe Please file new issues if there's other things we should add?
There are a few algorithms which attempt to choose line breaks within a paragraph in order to maximize the beauty of the paragraph as a whole. Such algorithms try to do some subset of:
Obviously, it is impossible to satisfy all these desires simultaneously for arbitrary paragraphs. The exact algorithm should not be specced for these reasons:
Instead, there should be a way for a web author to opt-in to paragraph-level layout for beautiful paragraphs.
I'm not sure what the best mechanism for this is. Perhaps a new value to the text-wrap property? Perhaps a new property? Perhaps something else?