go-text / typesetting

High quality text shaping in pure Go.

Float representation #8

Closed benoitkugler closed 1 year ago

benoitkugler commented 2 years ago

I would like to discuss the choice of the actual representation for floats. I think there are two options: float32 or fixed.Int26_6. The x/image/font/sfnt package uses fixed.Int26_6, but I'm not sure I understand why. Do we really need a fixed-point representation?

From what I've understood, the C libraries Harfbuzz and Pango use regular floats (or doubles), and it seems that Fyne also favors float32.

It is not a fundamental question, but using float32 would simplify the implementation of go-text/shaping and go-text/font.

whereswaldon commented 2 years ago

I honestly have never worked with fixed-point representations before, but my understanding is that they are faster? I really don't know how much faster though. If harfbuzz is using floats, I imagine that we can get away with it too.

The general consensus seems to be that fixed point no longer offers significant advantages unless the hardware lacks an FPU.

sbinet commented 2 years ago

I was under the assumption it was also for accuracy? (But 6 bits for the fractional part is probably too few.)
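To make the accuracy question concrete, here is a minimal sketch of what 6 fractional bits can represent. The `Int26_6` type below is a local stand-in for `fixed.Int26_6`, and `fromFloat` / `toFloat` are hypothetical helpers written for this illustration, not functions from any package.

```go
package main

import (
	"fmt"
	"math"
)

// Int26_6 is a local stand-in for golang.org/x/image/math/fixed.Int26_6:
// a signed 32-bit integer that counts 1/64ths (26 integer bits, 6 fractional bits).
type Int26_6 int32

// fromFloat rounds f to the nearest 1/64 (hypothetical helper).
func fromFloat(f float64) Int26_6 { return Int26_6(math.Round(f * 64)) }

// toFloat converts a 26.6 value back to a float (hypothetical helper).
func toFloat(x Int26_6) float64 { return float64(x) / 64 }

func main() {
	for _, f := range []float64{1.5, 0.1, 12.34} {
		x := fromFloat(f)
		fmt.Printf("%v -> %d/64 -> %v\n", f, int32(x), toFloat(x))
	}
}
```

With steps of 1/64, the worst-case rounding error is 1/128 of a unit, which is under 0.01 px when the unit is a pixel.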

Perhaps @nigeltao is the person to ask for input (or pointers) about the rationale for using fixed-point types in x/image/font.

andydotxyz commented 2 years ago

Fyne uses a real float (float32) so that it is more manageable for developers using the library - fixed point precision is hard to work with in comparison. If this library is to be purely internal (i.e. behind toolkit libs) I guess it does not matter. I agree that fixed is expected to be faster, and text is notoriously slow so this may be worth the optimisations - or we may find that it's negligible alongside the actual maths of text measuring...?

sbinet commented 2 years ago

I guess we could devise a little benchmark with x/image/font.Drawer.DrawString("...") and see what gives?
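A much smaller experiment than benchmarking `DrawString` could look like the sketch below. It swaps in a micro-benchmark that only sums advances, so it says nothing about a full drawing path. The function names and the synthetic data are assumptions made for this illustration. It uses `testing.Benchmark` from the standard library so it can run as a plain program.

```go
package main

import (
	"fmt"
	"testing"
)

// sumFixed adds advances stored as 26.6 fixed-point values (plain int32 here).
func sumFixed(adv []int32) int32 {
	var total int32
	for _, a := range adv {
		total += a
	}
	return total
}

// sumFloat adds the same advances stored as float32.
func sumFloat(adv []float32) float32 {
	var total float32
	for _, a := range adv {
		total += a
	}
	return total
}

// Package-level sinks keep the compiler from discarding the results.
var (
	sinkFixed int32
	sinkFloat float32
)

func main() {
	const n = 1 << 12
	fx := make([]int32, n)
	fl := make([]float32, n)
	for i := range fx {
		fx[i] = int32(400 + i%64) // synthetic advances, roughly 6 to 7 px in 1/64ths
		fl[i] = float32(fx[i]) / 64
	}

	rFixed := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sinkFixed = sumFixed(fx)
		}
	})
	rFloat := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sinkFloat = sumFloat(fl)
		}
	})
	fmt.Println("fixed:", rFixed)
	fmt.Println("float:", rFloat)
}
```

The numbers depend heavily on the CPU and compiler version, so any result from this kind of loop is only a first hint.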

nigeltao commented 2 years ago

One reason for fixed point is that computing a floating point expression can give different results on different hardware, even for the same lines of Go code. This fact complicates writing "compare to golden output" unit tests:
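The sketch below does not reproduce a hardware difference. It only shows the underlying sensitivity: floating-point results depend on how operations are grouped and rounded, while integer (fixed-point) addition is exact. The Go spec also allows an implementation to fuse floating-point operations such as `x*y + z`, which is one way the same source can round differently across architectures. The function names are made up for this illustration.

```go
package main

import "fmt"

// floatGroupings adds the same three float64 values in two different orders.
func floatGroupings(a, b, c float64) (left, right float64) {
	return (a + b) + c, a + (b + c)
}

// fixedGroupings does the same with integers, e.g. values counted in 1/64ths.
func fixedGroupings(a, b, c int32) (left, right int32) {
	return (a + b) + c, a + (b + c)
}

func main() {
	l, r := floatGroupings(0.1, 0.2, 0.3)
	fmt.Println(l == r, l, r) // false 0.6000000000000001 0.6

	x, y := fixedGroupings(6, 13, 19)
	fmt.Println(x == y, x, y) // true 38 38
}
```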

Another reason is that fixed point computation can still be noticeably faster than floating point, with or without SIMD:

nigeltao commented 2 years ago

It's not Go related, but Dolphin had a recent bug report (that became an interesting story) that came down to FMA (Fused Multiply-Add) and how, in floating point, +0.0 is different from -0.0.
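For readers who have not run into signed zeros, here is a small Go sketch of the behaviour. It is unrelated to the Dolphin code itself, and the helper names are invented for this example.

```go
package main

import (
	"fmt"
	"math"
)

// signedZeros returns +0.0 and -0.0 as run-time values.
func signedZeros() (pos, neg float64) {
	return 0.0, math.Copysign(0, -1)
}

// differInSignOnly reports whether a and b compare equal yet carry different sign bits.
func differInSignOnly(a, b float64) bool {
	return a == b && math.Signbit(a) != math.Signbit(b)
}

func main() {
	pos, neg := signedZeros()
	fmt.Println(pos == neg)                 // true: the two zeros compare equal
	fmt.Println(differInSignOnly(pos, neg)) // true: but their sign bits differ
	fmt.Println(1/pos, 1/neg)               // +Inf -Inf
}
```

An integer-backed fixed-point type has a single zero, so this class of surprise does not come up there.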

stuartmscott commented 2 years ago

Fixed point is slightly more difficult to work with, but many of the existing font & text libraries in Go already use it so go-text/shaping#1 used fixed.Int26_6.

Any performance gains should be considered carefully - this code is likely to be called at least once for measuring, and once for rendering pretty much every frame, and at 60 fps we only have ~16ms to play with.

benoitkugler commented 2 years ago

Thinking about it a bit more, I noticed that Harfbuzz expresses the positions of the output glyphs in integer coordinates. It makes sense because they are scaled by a user-provided scale parameter, with the following formula: outPosition = scale * fontUnit / faceUpem. (For instance, the width of a glyph in font units is typically around 500, with faceUpem = 1000 (or 2000).)

As a consequence

All that to say that we should maybe consider the same approach: express Advance(), Baseline(), Bounds() as integers (int32, say), as well as Input.Size(). This would solve the question of float representation, since the bulk of the operations would then actually be performed on (true) integers.
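The scaling formula quoted in this comment can be sketched with integers only. `scaleFontUnits` is a hypothetical helper, and rounding to nearest is an assumption of this sketch rather than a statement about what Harfbuzz does internally.

```go
package main

import "fmt"

// scaleFontUnits applies outPosition = scale * fontUnit / faceUpem using
// integer arithmetic only. Rounding to nearest is an assumption of this sketch.
func scaleFontUnits(fontUnit, scale, faceUpem int32) int32 {
	n := int64(fontUnit) * int64(scale)
	d := int64(faceUpem)
	if n >= 0 {
		return int32((n + d/2) / d)
	}
	return int32((n - d/2) / d)
}

func main() {
	const faceUpem = 1000
	const glyphWidth = 500 // width in font units

	fmt.Println(scaleFontUnits(glyphWidth, 16, faceUpem))   // 8
	fmt.Println(scaleFontUnits(glyphWidth, 2048, faceUpem)) // 1024
}
```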

andydotxyz commented 2 years ago

Scaling up and down may be OK for the calculations, but won't that be problematic when rendering the output? As far as I can see it would create text too large that then needs to be scaled back, which will create graphical artefacts.

whereswaldon commented 2 years ago

@andydotxyz I could be wrong here, but I don't think that expressing Advance, Baseline, or Bounds as integers would impact rendering in any way. At the end of the day, all three of those values are referring to pixel coordinates. Advance is how far the text rendering dot advances when displaying this output. I don't think that the dot generally advances by partial pixels. Similarly, Bounds is (for raster toolkits) the dimensions of the output texture in pixels. It can't be partial pixels there either. I also can't envision how a fractional pixel baseline would be useful.

I think that all three of these factors are ultimately about positioning the text on screen, not about rendering it, which is why making them integers is probably safe. That being said, I'm a novice at all of this.

Using the fixed representations has been a huge pain and the source of several errors in go-text/shaping#5, but that's not a great argument for getting rid of it.
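Two mistakes that are easy to make with 26.6 values are sketched below: multiplying raw values without shifting back, and adding a plain integer as if it were whole pixels. The `Int26_6` type is a local stand-in, and `I` and `Mul` follow the same idea as `fixed.I` and the `Mul` method in golang.org/x/image/math/fixed. This is only an illustration, not a description of the actual bugs in go-text/shaping#5.

```go
package main

import "fmt"

// Int26_6 is a local stand-in for fixed.Int26_6 (units of 1/64).
type Int26_6 int32

// I converts whole units to 26.6.
func I(i int) Int26_6 { return Int26_6(i << 6) }

// Mul multiplies two 26.6 values, rounding to nearest.
func (x Int26_6) Mul(y Int26_6) Int26_6 {
	return Int26_6((int64(x)*int64(y) + 1<<5) >> 6)
}

func main() {
	width := I(10)       // 10 px
	scale := Int26_6(96) // 1.5

	wrong := width * scale    // raw product: 64 times too large
	right := width.Mul(scale) // 15 px
	fmt.Println(int32(wrong)>>6, int32(right)>>6) // 960 15

	padded := width + 2      // adds 2/64 px, not 2 px
	intended := width + I(2) // 12 px
	fmt.Println(padded, intended) // 642 768
}
```

Both mistakes compile without warnings, which may be part of why they are painful to track down.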

andydotxyz commented 2 years ago

I don't think that the dot generally advances by partial pixels.

I am no expert on this, but if it is only ever whole pixels then why does golang.org/x/image/font use Int26_6 for this and other values? That package also uses fixed.Rectangle26_6 for bounds, instead of int-based rectangles.

I guess they are not using pixel based values, so should we be matching them for a better drop-in replacement instead? The question perhaps is whether shaping is about a font or about a rasterised output. The latter could be pixel based, but the former probably should not. In Fyne the number of pixels a font requires can change over the life of a window if it moves monitor, for example, or if the user changes scale parameters.

nigeltao commented 2 years ago

golang.org/x/image/font uses Int26_6 so that it can represent sub-pixel positioning (where the dot can advance by partial pixels).
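As a rough sketch of what advancing the dot by partial pixels means, the code below accumulates a 7.25 px advance in 26.6 units and splits each pen position into a whole pixel and a sub-pixel offset. The type and the `layout` and `frac` helpers are written for this example and are not part of x/image.

```go
package main

import "fmt"

// Int26_6 is a local stand-in for fixed.Int26_6 (units of 1/64 px).
type Int26_6 int32

// Floor returns the whole-pixel part of x.
func (x Int26_6) Floor() int { return int(x >> 6) }

// frac returns the remaining 1/64ths after Floor (hypothetical helper).
func (x Int26_6) frac() int { return int(x & 0x3f) }

// layout returns the pen position before each glyph, given per-glyph advances.
func layout(advances []Int26_6) []Int26_6 {
	dots := make([]Int26_6, len(advances))
	var dot Int26_6
	for i, a := range advances {
		dots[i] = dot
		dot += a
	}
	return dots
}

func main() {
	adv := Int26_6(7*64 + 16) // 7.25 px
	for i, dot := range layout([]Int26_6{adv, adv, adv, adv}) {
		fmt.Printf("glyph %d: pixel %d + %d/64\n", i, dot.Floor(), dot.frac())
	}
}
```

The fractional part carries over from glyph to glyph instead of being dropped at every step, and a rasterizer can use it to shift the glyph within its pixel.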

http://agg.sourceforge.net/antigrain.com/research/font_rasterization/ is one article about SPP. There are undoubtedly others.

whereswaldon commented 2 years ago

@nigeltao Thanks for lending your expertise here! That was an informative read.

However, as @benoitkugler points out above:

I noticed that Harfbuzz express the positions of the output glyphs in integer coordinates.

<snip>

[if] you need a higher precision, you just give a higher scale factor and you divide back afterwards.

Our text shaper does not emit fractional values for any of these parameters. If a given toolkit wants to invoke things with higher resolution to take advantage of this, it simply needs to scale the ppem appropriately. It seems silly to me for our output format to be in a non-integer unit when we know that the output data itself will always be an integer.
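A sketch of the "higher scale factor" idea: if the caller passes 64 times the ppem as the scale, the integer output is already in 1/64 px. The `scaled` helper, the truncating division, and the sample numbers are assumptions for this illustration.

```go
package main

import "fmt"

// scaled returns scale * fontUnit / upem with truncating integer division
// (hypothetical helper, non-negative inputs assumed).
func scaled(fontUnit, scale, upem int64) int64 {
	return fontUnit * scale / upem
}

func main() {
	const upem, ppem = 2048, 12
	const advance = 1139 // font units, illustrative value

	whole := scaled(advance, ppem, upem)   // whole pixels
	fine := scaled(advance, ppem*64, upem) // 1/64 px, still an integer

	fmt.Println(whole, fine, float64(fine)/64) // 6 427 6.671875
}
```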

andydotxyz commented 2 years ago

However, as @benoitkugler points out above:

I noticed that Harfbuzz express the positions of the output glyphs in integer coordinates.

If we follow this then the whole of go-text may become harfbuzz specific, at which point creating these abstractions seems unnecessary. I think we need to, as a group, decide if we are building the abstractions so the implementation is a hidden detail, or if we should just depend on the harfbuzz APIs, tie ourselves to that and save a lot of work.

sbinet commented 2 years ago

just a drive-by comment: I was considering using go-text APIs for star-tex (an attempt at a pure-Go TeX engine). IIRC, TeX (at least when using type1 fonts and DVI output) is using something like Int12_20 for font metrics.
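If a 12.20 layout like the one mentioned above had to meet a 26.6 API, the conversion would be a shift by 14 bits. Both types and the `toInt26_6` helper below are hypothetical and written only for this sketch.

```go
package main

import "fmt"

// Int12_20 is a hypothetical type with 12 integer bits and 20 fractional bits.
type Int12_20 int32

// Int26_6 is a local stand-in for fixed.Int26_6.
type Int26_6 int32

// toInt26_6 drops 14 fractional bits, rounding to nearest.
func toInt26_6(x Int12_20) Int26_6 {
	return Int26_6((int64(x) + 1<<13) >> 14)
}

func main() {
	x := Int12_20(3<<20 + 1<<19) // 3.5
	fmt.Println(toInt26_6(x))    // 224, i.e. 3.5 * 64
}
```

The conversion discards 14 bits of precision, which is one way to see why a typesetting engine might not want 26.6 as its only unit.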

npillmayer commented 2 years ago

I'm late to the party, so everyone is fully entitled to ignore my opinion.

I'm not sure the early decision to move to a fractional unit system will support a larger vision of go-text/shaping.

If we follow this then the whole of go-text may become harfbuzz specific, at which point creating these abstractions seems unnecessary.

IMHO this mistakes a property of Harfbuzz as an implementation quirk instead of a more fundamental domain challenge. As I wrote today on the Slack channel:

Usually there are several layers of domains in typesetting. Shaping and rendering live in different domains/spaces: the outline font is a creature of the design space, while a UI is in the space of rendering. Both have different views on what ‘precision’ means. Usually the design space operates with a (much) higher precision. That’s reflected by using an integer type, while render space has to deal with fractions (there are usually more design units than pixels, which makes quantizing necessary: font hints, rounding, …)

The answer that go-text/shaping is an abstraction on top of Harfbuzz is valid, but unfortunately does not help. Opting for fractional values is still a move that influences later stages of typesetting pipelines in an unfortunate way.

What is currently taking shape (sorry for the pun) in go-text/shaping is probably correct for UIs: the quickest path along font->shaping->line wrap->render. That's what @nigeltao has demonstrated in an impressively clever way in the Go x/font section. And what you guys have accomplished is pretty cool.

A more general typesetting pipeline, however, will include a number of additional steps, which are better carried out in design space, i.e. with "infinitesimally small integer units". TeX uses 1/65536 of a point. Fonts in the wild may well choose a design grid of 4000 units. Having to use fractions is an early descent into the harsh reality of limited pixel resolution. If it's of any use I can elaborate on this, but after all I'm more or less an idiot when it comes to graphic UIs (I'm more of a CLI and backend guy).

andydotxyz commented 2 years ago

What you say makes sense, we basically have a choice between integer and scaling up and down, or float with the potential "harsh reality", though I don't fully understand what those harsh realities are.

IMHO this mistakes a property of Harfbuzz as an implementation quirk instead of a more fundamental domain challenge.

If taken in that context alone I suppose that is true, but elsewhere we saw that pango and others use float/fractional, so that was what made me think it was implementation specific. And as noted at the top, the Go packages that existed in x/image/font seemed to be Int26_6. If we want to use implementation details in discussion of an abstraction like this we probably need to compare 2 or 3, and as you say they should come from the same domain.

One thing is surely true - we need to be fully int or fully fractional, what I was wanting most to get fixed was the inconsistency in some of the PRs.

benoitkugler commented 2 years ago

@npillmayer Thanks for your input! Your point about the internals of a full shaping pipeline is interesting. We could probably adopt the following scheme: keep the integer representation as long as possible, and convert to floats at the highest level of the pipeline, so that UI toolkits can consume it the easiest way. For now, that is somewhat the case, in the sense that we have one internal layer (Harfbuzz) and one exposed layer (Shaper). We should keep your advice in mind when adding more internal layers.
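One way to see why late conversion helps is to compare summing in font units and converting once against converting every advance and summing the rounded results. The helper names and the sample numbers are assumptions of this sketch, and the output unit here is 1/64 px, though the same reasoning applies to a float output.

```go
package main

import "fmt"

// toFixed26_6 converts font units to 1/64 px at the given ppem,
// rounding to nearest (non-negative inputs assumed).
func toFixed26_6(fontUnits, ppem, upem int64) int64 {
	return (fontUnits*ppem*64 + upem/2) / upem
}

// widthLate sums in font units and converts once at the boundary.
func widthLate(adv []int64, ppem, upem int64) int64 {
	var total int64
	for _, a := range adv {
		total += a
	}
	return toFixed26_6(total, ppem, upem)
}

// widthEarly converts every advance first and sums the rounded results.
func widthEarly(adv []int64, ppem, upem int64) int64 {
	var total int64
	for _, a := range adv {
		total += toFixed26_6(a, ppem, upem)
	}
	return total
}

func main() {
	adv := make([]int64, 10)
	for i := range adv {
		adv[i] = 501 // font units, illustrative value
	}
	fmt.Println(widthLate(adv, 10, 1000), widthEarly(adv, 10, 1000)) // 3206 3210
}
```

Here the early conversion drifts by 4/64 px over ten glyphs because each advance is rounded up by a little.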

npillmayer commented 2 years ago

Don't get me wrong: going Int26_6 all the way will certainly work. It may just be more inconvenient to do the layout work that way. As the layout-task for UI is simpler, consistent fractional values all the way may well be the right thing.

whereswaldon commented 2 years ago

Don't get me wrong: going Int26_6 all the way will certainly work. It may just be more inconvenient to do the layout work that way. As the layout-task for UI is simpler, consistent fractional values all the way may well be the right thing.

It would be a shame to prevent go-text from being used in typesetting contexts because of this API decision. I agree with @benoitkugler that perhaps the right thing is to maintain high fidelity until the GUI API boundary, and to potentially provide a different API surface for typesetting applications that want the granular control.

andydotxyz commented 2 years ago

To be able to know what the right way forward is I think we will have to define the areas of responsibility of each repository in this project. We have also discussed whether they should all be merged to one, which I think makes this even more complicated.

Should we clarify the aim of each area and the types of code that will use them? From my perspective this was all about getting better text rendering so I am lost with all of the different layers and which parts of go-text would focus on other use-cases.

npillmayer commented 2 years ago

Should we clarify the aim of each area and the types of code that will use them? From my perspective this was all about getting better text rendering so I am lost with all of the different layers and which parts of go-text would focus on other use-cases.

Not trying to intervene, but I put out a blog post which may or may not help define the context of go-text:

What is a Typesetting-Stack?

whereswaldon commented 2 years ago

Thanks @npillmayer for the write-up. To borrow terms from there, I think go-text should aim for:

andydotxyz commented 2 years ago

Just a few thoughts/questions:

whereswaldon commented 2 years ago

Authoring: if we accept an []Input that may, or may not, be single-direction runs then it seems we will have to parse the string to check. Would it not be better to specify that either it is or is not inclusive of this processing? Maybe you meant that we should internally handle the bi-di parsing? This is a change of scope from earlier discussions, though not something I am against.

Yeah, sorry I was unclear. Higher-level code will need to give us styled runs of text, each represented as an Input. It's up to us whether they're responsible for doing bidi during the creation of those runs or not. I'd tentatively suggest that we offer a function that accepts []Input that are not guaranteed to be single-direction, and that we apply the bidi algorithm to yield []Input where every element is single-direction. Toolkits can choose to invoke this helper (or not) as a preprocessing step before using the shaper.
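The shape of such a preprocessing helper might look like the sketch below. It is deliberately naive: it splits on strong directionality only, treats just Hebrew and Arabic as RTL, and is not an implementation of the Unicode bidi algorithm (UAX #9). The `Input` type, `Direction` constants, and `SplitDirections` function are hypothetical stand-ins, not the go-text API.

```go
package main

import (
	"fmt"
	"unicode"
)

// Direction is a hypothetical text direction type.
type Direction int

const (
	LTR Direction = iota
	RTL
)

// Input is a hypothetical, pared-down stand-in for the shaper's input type.
type Input struct {
	Text      []rune
	Direction Direction
}

// strongDirection reports the direction of r, or ok=false for neutral runes.
// It only knows Hebrew and Arabic as RTL scripts; UAX #9 covers far more.
func strongDirection(r rune) (dir Direction, ok bool) {
	switch {
	case unicode.Is(unicode.Hebrew, r), unicode.Is(unicode.Arabic, r):
		return RTL, true
	case unicode.IsLetter(r):
		return LTR, true
	}
	return LTR, false
}

// SplitDirections breaks each Input into runs with a single direction.
// Neutral runes (spaces, digits, punctuation) stay attached to the preceding run.
func SplitDirections(inputs []Input) []Input {
	var out []Input
	for _, in := range inputs {
		start, cur := 0, in.Direction
		for i, r := range in.Text {
			dir, ok := strongDirection(r)
			if !ok || dir == cur {
				continue
			}
			if i > start {
				out = append(out, Input{Text: in.Text[start:i], Direction: cur})
				start = i
			}
			cur = dir
		}
		if start < len(in.Text) {
			out = append(out, Input{Text: in.Text[start:], Direction: cur})
		}
	}
	return out
}

func main() {
	in := Input{Text: []rune("abc \u05d0\u05d1\u05d2 def"), Direction: LTR}
	for _, run := range SplitDirections([]Input{in}) {
		fmt.Printf("%q dir=%d\n", string(run.Text), run.Direction)
	}
}
```

A real helper would more likely build on an existing bidi implementation, but the calling convention (mixed `[]Input` in, single-direction `[]Input` out) would be the same.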

I totally agree that we don't want to add logic to the shaper that tries to verify that each Input is single-direction.

Line breaking: We have an implementation for this in Fyne, but unfortunately it cannot be contributed because the original authors are not here to donate it to the public domain license.

Does your line breaking algorithm handle RTL text? I think we'll definitely need one that does, which is why I've been working on one. I've been meaning to PR it into here, but I haven't gotten to it yet.

Rasterisation: I don't understand quite why this is application or toolkit dependent - can you explain what you mean here please? I had thought that going from text vectors to pixels is a pretty standard operation - there is a golang.org package that seems to manage it without platform considerations?

This is common, sure. It's just that there's a rat's nest of complexity in sub-pixel hinting, stem widening, gamma correction, and other parameters that applications might want to make different choices on. I'm okay with this being in scope for go-text if everyone sees it as a common need, but it's an area in which it seems difficult to create the "one rasterizer that will serve all use cases". Maybe that's okay though. I don't have strong feelings on this. I suggested that it might be out of scope before in an effort to simplify the scope of the overall project.

andydotxyz commented 2 years ago

Does your line breaking algorithm handle RTL text? I think we'll definitely need one that does, which is why I've been working on one. I've been meaning to PR it into here, but I haven't gotten to it yet.

No, not yet - this go-text project is our exploration into RTL land. We have not made commitments to have it delivered until a future release that is not yet scheduled.

I suggested that it might be out of scope before in an effort to simplify the scope of the overall project.

Given the aim of this project was to create a single place that Go projects could handle text in a graphical context I'm not sure it would be easy to say that rendering the pixels is out of scope. It could be a different repository/package but it really feels in-scope for go-text

eliasnaur commented 2 years ago

I suggested that it might be out of scope before in an effort to simplify the scope of the overall project.

Given the aim of this project was to create a single place that Go projects could handle text in a graphical context I'm not sure it would be easy to say that rendering the pixels is out of scope. It could be a different repository/package but it really feels in-scope for go-text

Several graphical contexts don't need rasterization: SVG/PDF/PS output, web, and Gio are some examples. To be fair, Gio does need rasterization, but a Go implementation is not efficient enough, and our GPU implementation is very likely not in scope for go-text.

whereswaldon commented 2 years ago

Let's simply plan to provide a raster package for that purpose. I don't really know what we'll put in there, but it's definitely fine to offer it.

andydotxyz commented 2 years ago

I think I would prefer that, thanks @whereswaldon. I see that Gio and some applications won't use a rasterizer, but I am pretty sure that many others could want one. Even if Fyne moves to vector fonts we will still want to rasterize in software for unit tests etc :).

whereswaldon commented 1 year ago

I think this issue wandered quite far from the initial question, and I don't know if there are any outstanding todos from this conversation. We have a rendering package now, and we use fixed point values in the API. I'm going to close this for now.