go-text / typesetting

High quality text shaping in pure Go.

How to implement line wrapping parameters? #28

Closed whereswaldon closed 1 year ago

whereswaldon commented 1 year ago

So I think we're clearly going to want to be able to configure the line wrapper's behavior. Things like:

I'm not asking for an inventory of these parameters or anything, but rather for thoughts on how they should be provided. My inclination is to add something like:

type WrapConfig struct {
    BreakWordPolicy // Type encapsulating the heuristics for how to break within a word
    Hyphenate bool
    Hyphen Output // The actual hyphen glyph to use, the result of a previous shape
    // maybe other fields later
}

This could then become a new parameter to LineWrapper.Prepare and LineWrapper.WrapParagraph. That would create a constraint that the configuration must be constant for the duration of a paragraph, but I think that's probably desirable. Wrapping individual lines with different choices seems... strange?

The other obvious API choice would be adding this as a field on the LineWrapper, but I think that makes reusing a single LineWrapper instance for many different pieces of content unnecessarily hard.
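A minimal sketch of the per-call configuration shape described above, not the library's actual API. The `WrapConfig` and `LineWrapper` here are toy stand-ins: `BreakWords` is a hypothetical simplification of `BreakWordPolicy`, and widths are counted in runes rather than shaped glyph advances. It shows one wrapper instance being reused with different configurations because the wrapper stores none itself.

```go
package main

import "strings"

// WrapConfig is a toy stand-in for the proposed struct.
// BreakWords is a hypothetical simplification of BreakWordPolicy.
type WrapConfig struct {
	BreakWords bool // allow breaking inside a word that is wider than the line
}

// LineWrapper holds no configuration, so one instance can serve many texts.
type LineWrapper struct{}

// WrapParagraph wraps text to at most maxWidth runes per line.
// The same cfg applies to every line of the paragraph.
func (LineWrapper) WrapParagraph(cfg WrapConfig, maxWidth int, text string) []string {
	if maxWidth < 1 {
		maxWidth = 1 // guard so the word-breaking loop always makes progress
	}
	var lines []string
	line := ""
	for _, word := range strings.Fields(text) {
		candidate := word
		if line != "" {
			candidate = line + " " + word
		}
		if len([]rune(candidate)) <= maxWidth {
			line = candidate
			continue
		}
		// The word does not fit on the current line: flush the line first.
		if line != "" {
			lines = append(lines, line)
		}
		r := []rune(word)
		// Optionally split an over-long word into maxWidth-sized pieces.
		for cfg.BreakWords && len(r) > maxWidth {
			lines = append(lines, string(r[:maxWidth]))
			r = r[maxWidth:]
		}
		line = string(r)
	}
	if line != "" {
		lines = append(lines, line)
	}
	return lines
}
```

With `BreakWords` unset, an over-long word is kept whole and overflows its line, which roughly corresponds to the by-word behaviour described for narrow spaces.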

I think the most urgently needed option here is enabling the ability to break within a word. Right now Gio simply can't do it (because we don't surface it), so strings in extremely narrow spaces are just visually truncated by-word. You get to see the beginning of each word in the string. It's pretty sub-optimal. :D

benoitkugler commented 1 year ago

Another possible parameter would be related to ellipsis (that is, add ... and truncate instead of wrapping).

Besides, I'm currently thinking about two possible sources of inspiration: the flutter text API (which somehow enables precise configuration while remaining rather ergonomic), and the semantics defined by the CSS specification. I'll take a deeper look to see if there are concepts we could use here..

benoitkugler commented 1 year ago

Otherwise, I like the general plan

whereswaldon commented 1 year ago

> Another possible parameter would be related to ellipsis (that is, add ... and truncate instead of wrapping).

I was thinking that we'd address this use-case by setting the BreakWordPolicy to always break the final word, setting Hyphenate to true, and setting Hyphen to a shaped ellipsis. I think that has the correct outcome, though I'll admit that it may not be super intuitive.

> Besides, I'm currently thinking about two possible sources of inspiration: the flutter text API (which somehow enables precise configuration while remaining rather ergonomic), and the semantics defined by the CSS specification. I'll take a deeper look to see if there are concepts we could use here..

That would be super helpful, thanks!

benoitkugler commented 1 year ago

> I was thinking that we'd address this use-case by setting the BreakWordPolicy to always break the final word, setting Hyphenate to true, and setting Hyphen to a shaped ellipsis. I think that has the correct outcome, though I'll admit that it may not be super intuitive.

Oh, yes, pretty clever! In this case, maybe the wording for Hyphen and Hyphenate could be adjusted? Something like Separator, to convey the idea that the field is not necessarily a "-".

whereswaldon commented 1 year ago

> > I was thinking that we'd address this use-case by setting the BreakWordPolicy to always break the final word, setting Hyphenate to true, and setting Hyphen to a shaped ellipsis. I think that has the correct outcome, though I'll admit that it may not be super intuitive.

> Oh, yes, pretty clever! In this case, maybe the wording for Hyphen and Hyphenate could be adjusted? Something like Separator, to convey the idea that the field is not necessarily a "-".

I spent part of the weekend thinking about this, and I now disagree with myself. Hyphenation is a typographic convention used to indicate words that have been artificially broken by typesetting. It only occurs when you break a word. Continuation (ellipses) is a typographic convention indicating that there is more text. It occurs regardless of whether a word was broken, and it indicates that there is more (omitted) text.

If we used the same mechanism to hyphenate broken words and to add continuations, we'd either fail to add a continuation if we didn't break a word, or we'd add hyphens to the ends of words that we didn't break. It's probably best to manage these as two distinct concepts.
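To make the distinction concrete, here is a small sketch that treats the two marks as independent conditions. The `lineEnd` type and `marker` function are hypothetical names for illustration only, and letting truncation take precedence when both conditions hold is an assumption, not something settled in this thread.

```go
package main

// lineEnd describes how a wrapped line ended (hypothetical type).
type lineEnd struct {
	brokeWord bool // the break fell inside a word
	moreText  bool // text remains after this line and will be omitted
}

// marker returns the mark to append to a line.
// Hyphenation depends only on whether a word was broken;
// truncation depends only on whether text was omitted.
func marker(e lineEnd, hyphenate, truncate bool) string {
	switch {
	case truncate && e.moreText:
		return "…" // assumption: truncation wins if both apply
	case hyphenate && e.brokeWord:
		return "-"
	default:
		return ""
	}
}
```

A single shared flag could not express the case where text is omitted at a word boundary (ellipsis, no hyphen) separately from a word broken mid-paragraph (hyphen, no ellipsis).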

Perhaps this would be better:


type WrapConfig struct {
    BreakWordPolicy // Type encapsulating the heuristics for how to break within a word
    Hyphenate bool // If we create a break within a word, insert a hyphen afterwards.
    Hyphen Output // The actual hyphen glyph to use, the result of a previous shape.
    Truncate bool // If there is more text at the end of the line, insert the truncation glyph at the end.
    Truncation Output // The shaped glyph to use to indicate truncation, usually an ellipsis
    // maybe other fields later
}

benoitkugler commented 1 year ago

I think there could be interest for clients to specify the hyphen (or ellipsis) as a rune instead of a glyph output. To make their life easier, we could probably export helper functions (or maybe a method on WrapConfig?) to select the glyph and shape it so that clients do not have to. I think it is totally compatible with the proposed API, so it could be added later.

whereswaldon commented 1 year ago

> I think there could be interest for clients to specify the hyphen (or ellipsis) as a rune instead of a glyph output. To make their life easier, we could probably export helper functions (or maybe a method on WrapConfig?) to select the glyph and shape it so that clients do not have to. I think it is totally compatible with the proposed API, so it could be added later.

That is certainly more ergonomic, but would require access to a shaping.Shaper. As shaping.LineWrapper doesn't have a shaper as a field, I figured we'd need to do that work in advance, or define a way for the line wrapper to reinvoke the shaper (possibly also useful for situations in which we want to break in the middle of a glyph cluster).

But yeah, let's maybe define:

func (w *WrapConfig) SetHyphen(r rune, shaper Shaper, faces []font.Face)
func (w *WrapConfig) SetTruncation(r rune, shaper Shaper, faces []font.Face)

That would allow the user to provide everything we need to shape it for them with minimal friction.

As usual, I'd be happy for these fields/methods to have better names if you have ideas.

andydotxyz commented 1 year ago

> I spent part of the weekend thinking about this, and I now disagree with myself. Hyphenation is a typographic convention used to indicate words that have been artificially broken by typesetting. It only occurs when you break a word. Continuation (ellipses) is a typographic convention indicating that there is more text. It occurs regardless of whether a word was broken, and it indicates that there is more (omitted) text.

I'm glad you said this, I was about to put something like that in here :).

I would be tempted not to include Hyphen and Truncation at the moment. Do we have use cases for setting them to custom elements?

P.S. Maybe Truncator instead of Truncation: a less obvious word, but it is a noun instead of a verb.

whereswaldon commented 1 year ago

I want to be able to create a self-truncating text label in GUI code. Our current line wrapper implementation doesn't make that easy. Let's say I want to display "The quick brown fox", but I only have room for the first 10 or so characters (ignore variable width). Right now, I can shape a single line to get "The quick ". If I want an ellipsis at the end, I need to shape it in advance and subtract its width from the available width before wrapping the line. This creates a pathological case in which the entire text would have fit if I hadn't removed space to hold the ellipsis.

Having the line wrapper actually implement the Truncator (it's a better name) allows it to intelligently choose whether or not to reserve space for that symbol at the end of the line based on context it has.
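The pathological case can be sketched with a toy model in which every rune has width 1. Both function names below are hypothetical and neither reflects the actual LineWrapper implementation; they only contrast reserving space up front with deciding after checking whether the text fits.

```go
package main

// naiveTruncate reserves room for the truncator before fitting the text,
// so it can truncate text that would have fit in maxWidth untouched.
func naiveTruncate(text, truncator string, maxWidth int) string {
	r := []rune(text)
	budget := maxWidth - len([]rune(truncator))
	if budget < 0 {
		budget = 0
	}
	if len(r) <= budget {
		return text
	}
	return string(r[:budget]) + truncator
}

// truncateLine only reserves room for the truncator when the text
// does not fit, which is the context-aware behaviour described above.
// If the truncator alone is wider than maxWidth, the result overflows.
func truncateLine(text, truncator string, maxWidth int) string {
	r := []rune(text)
	if len(r) <= maxWidth {
		return text // everything fits, no truncator needed
	}
	keep := maxWidth - len([]rune(truncator))
	if keep < 0 {
		keep = 0
	}
	return string(r[:keep]) + truncator
}
```

For a width of 10 and a three-rune truncator, `naiveTruncate("The quick", "...", 10)` shortens a nine-rune string that fits, while `truncateLine` returns it unchanged and only truncates longer input such as "The quick brown fox".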

As for allowing the runes used to be customized, do non-Latin-script languages use hyphens and ellipses? I assume they have their own continuation and truncation marks. I suppose they may even be multiple code points long, in which case we should accept []rune.

Some quick research indicates that other scripts have wildly different hyphenation rules. I have no idea how to go about supporting a broad range of those, but I think it's clear that always using the Latin hyphen isn't a great way forward for broad support of scripts.

Hyphenation is pretty complex, and I honestly have less need for it right now. Truncation, on the other hand, is extremely useful and important for UI elements, and Gio can't do it efficiently right now. I'd really like to find an acceptable approach to implement truncation.

We can even leave aside BreakWordPolicy, as it's probably a rat's nest of complexity all on its own.

How does this sound?

type WrapConfig struct {
    Truncate bool // If there is more text at the end of the line, insert the truncation glyph(s) at the end.
    Truncator Output // The shaped glyph(s) to use to indicate truncation, usually an ellipsis
}

// SetTruncator configures the WrapConfig's Truncator field by shaping the provided runes with
// the provided faces.
func (w *WrapConfig) SetTruncator(r []rune, shaper Shaper, faces []font.Face)

Then shaping.LineWrapper's Prepare and WrapParagraph methods are extended to accept a WrapConfig.

Thoughts?

andydotxyz commented 1 year ago

> As for allowing the runes used to be customized, do non-Latin-script languages use hyphens and ellipses? I assume they have their own continuation and truncation marks.

I guess what we should do here is find the answer so there are no assumptions influencing the API design. From reading the article you linked it seems that whilst the rules vary a lot the glyph is consistent, when present.

> I'd really like to find an acceptable approach to implement truncation.

Yeah, that does sound like the more compelling case. The proposed API makes sense, if the Truncator is deemed necessary at this time. With the WrapConfig, could it be added later? We may find that setting it to an Output is less necessary than a ShowTruncator bool if it turns out to be an on/off thing. But I am no expert on any of this!

whereswaldon commented 1 year ago

I think this issue isn't doing us much good now. We have basic wrapping configuration that we can extend to support new features. It would be best to discuss specific extensions in dedicated PRs/issues.