love2d / love

LÖVE is an awesome 2D game framework for Lua.
https://love2d.org
Other
5.22k stars 405 forks source link

Text width is calculated incorrectly when highpdi is enabled in Love 12.0 #1923

Open modcos opened 1 year ago

modcos commented 1 year ago

The text width is calculated incorrectly when highpdi is enabled. Here are the examples I tested with the default font:

print(font:getWidth('a')) -- 7
print(font:getWidth(string.byte('a'))) -- 8

local rasterizer = love.font.newRasterizer()
print(math.floor(rasterizer:getGlyphData('a'):getAdvance() / love.window.getDPIScale() + 0.5)) -- 8 (7.6 float)

And an example of another mistake:

print(font:getKerning('f', 'f')) -- 0 (Ensure the kerning is zero)

love.graphics.print(font:getWidth('ffff'), 0, 0) -- 18
love.graphics.print(font:getWidth('f'), 0, 10) -- 4

In computeGlyphPositions function string 'ffff' splits into 'ff' pairs, so when calculating the width love code calculates and rounds the value for a pair of characters like floorf((glyphpos.x_advance >> 6) / dpiScales[0] + 0.5f). That's why there is a difference if you calculate each character individually.

slime73 commented 1 year ago

In the first example I believe 7 is more correct than 8 - the text shaper (Harfbuzz) produces an un-rounded value of 7.2. GlyphData:getAdvance and the other variant of Font:getWidth don't go through the text shaper.

I'll do more investigation when I have more time but it's possible the end result to fix this issue will be using the (new) text shaper in more places as well as making APIs to encourage people to use it rather than going behind its back and trying to calculate things manually.

modcos commented 1 year ago

Thanks for the reply. I want to implement ui for text input, text selection and moving the cursor through the text. I don't know if love has a function that can return the character and the position of the character, relative to a given width (from the beginning of the text).

text_offset

Here's an example of what I use glyph data for. For the same reason, I only use getting the size for a single character.

I'm sorry if I don't explain well and make mistakes in the text. I do not speak English very well.

slime73 commented 1 year ago

I don't know if love has a function that can return the character and the position of the character, relative to a given width (from the beginning of the text).

It doesn't yet, but it's planned for 12.0. :)

modcos commented 1 year ago

I figured out the reason why 'ffff' splits into 'ff' pairs in Harfbuzz.

"This is the result of ligature glyph substitutions. OpenType spec recommends that Standard Ligatures be applied by default, that's exactly what Harfbuzz is doing."

Here's more information: https://stackoverflow.com/questions/62552451/why-harfbuzz-shape-2-single-char-into-one-glyph https://harfbuzz.github.io/shaping-opentype-features.html

modcos commented 1 year ago

I found another strange behavior:

local font = love.graphics.getFont()
local _, wrapped = font:getWrap('Приветff', 1000)
print(#wrapped) -- 2
local _, wrapped = font:getWrap('Приветfff', 1000)
print(#wrapped) -- 1
slime73 commented 1 year ago

I found another strange behavior:

Just documenting this here for the future since I haven't figured out the right solution yet: this is caused by the HarfbuzzShaper::computeWordWrapIndex code using harfbuzz' cluster API to get back the original codepoint indices for a string when determining where to wrap, without accounting for clusters that are the result of multiple codepoint characters combined into one glyph (like the ff in the above code). The trouble is that harfbuzz doesn't have anything I can find to easily determine the last codepoint index in a cluster, only the first...

khaledhosny commented 1 year ago

The number of character in a cluster can be clavulated by checking the cluster of the next glyph, see https://harfbuzz.github.io/a-clustering-example-for-levels-0-and-1.html

khaledhosny commented 1 year ago

and https://harfbuzz.github.io/clusters.html

slime73 commented 1 year ago

Yeah, it gets a little complicated in user code when there are multiple fonts and multiple runs of text though, because the next glyph isn't available from an index increment in that case. It would be way simpler (and easier to understand what's going on in general) if harfbuzz provided a start/count or begin/end range instead of just the start, but alas...

khaledhosny commented 1 year ago

It is little complicated, but I don’t see why multiple fonts makes it more so. HarfBuzz works on one run at time and all output values are local to that, the other runs don’t concern the current run.

khaledhosny commented 1 year ago

Providing a range wouldn’t be any easier because the relationship is more complicated than that. When there is decomposition, multiple glyphs can have the same cluster, and with re-ordering you can have a multi-character multi-glyph cluster (glyphs G to G+N1 belong to characters C to C+N2).

slime73 commented 1 year ago

HarfBuzz works on one run at time and all output values are local to that, the other runs don’t concern the current run.

That's not the case in this situation, where you can't know the total number of input unicode values in the last glyph of a run without knowing where the next run starts. Unless I'm missing an API?

khaledhosny commented 1 year ago

That is correct, so the mapping of glyphs to input character indices should ideally be done right after shaping where all the needed information is present.

slime73 commented 1 year ago

Right, this adds complexity to user code where it doesn't seem like it would be strictly necessary if harfbuzz had slight changes in its APIs. Harfbuzz provides an incomplete mapping of glyphs to character indices, so I have to add user code to make the mapping complete.

As you can see, this has already resulted in bugs :(

khaledhosny commented 1 year ago

HarfBuzz is API and ABI stable, so no change is possible (only new APIs), but also there is no better solution. The glyph <-> character relationship can be one to one, one to many, many to one and many to many. The current API can already handle all of this at the expense of slightly complicated user code. Any new API would need to handle all of it as well.

khaledhosny commented 1 year ago
# one to one
G1 G2 G3 G4
C1 C2 C3 C4

# one to many
G1 G2
C1 C4

# many to one
G1 G2 G3 G4
C1 C1 C2 C3

# many to many
G1 G2 G3 G4
C1 C1 C1 C5
slime73 commented 1 year ago

I understand that released code is hard to change, but that doesn't mean an API is the best version of what it can be, just that there's more friction to any improvement.

The current cluster API is confusing and complicated to work with, which ultimately makes for buggier programs. I can spend time solving those bugs in my code, but to me the fact that I have to is a sign that something in the library is worth improving. I can think of a few possible changes to harfbuzz' APIs that would probably work in all your examples, if I think a bit broader than what's there right now. :)

khaledhosny commented 1 year ago

Please forward you suggestions to https://github.com/harfbuzz/harfbuzz/issues