whatwg / html

HTML Standard
https://html.spec.whatwg.org/multipage/

editorial issues with Canvas TextMetrics #4073

Closed dbaron closed 4 years ago

dbaron commented 5 years ago

This is filed based on a subset of the comments in https://github.com/w3ctag/design-reviews/issues/302#issuecomment-422999075 about the definitions of TextMetrics. In particular:

  • there are multiple references to a "line box", yet I don't see anything that says what this line box is. Do these mean "inline box" instead?
  • all of the "positive numbers indicating that the given baseline is (below/above) the X" seem confusing to me, and seem like they'd be clearer if they were swapped to "positive numbers indicating that X is (above/below) the given baseline" (note that the order is swapped and the above/below are swapped so that the statement is still true). That's probably just preference, though. (But it does make them more consistent with the preceding 4 definitions.)
  • the "half the font size if the given baseline is the middle of that em square" comment only seems true if there's only a single font or if the em squares of all fonts used line up (at least assuming my previous assumption was correct)

yiyix commented 5 years ago

We are trying to re-launch TextMetrics now.

there are multiple references to a "line box", yet I don't see anything that says what this line box is. Do these mean "inline box" instead?

No, a line box is different from an inline box. A line box captures the entire line (see https://drafts.csswg.org/css2/visuren.html#line-box). We have linked "line box" to its definition in the spec. If it still looks confusing, I can add a definition to make it clearer.

the "half the font size if the given baseline is the middle of that em square" comment only seems true if there's only a single font or if the em squares of all fonts used line up (at least assuming my previous assumption was correct)

Yes, you are correct. I will update the spec to add the phrase "of all the fonts used to render the text" to "half the font size if the given baseline is the middle of that em square".
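
A minimal sketch of why the "half the font size" remark only holds when there is a single em square (or when all em squares line up). Everything here is a hypothetical model, not part of the canvas API: `EmSquare`, `emSquareAroundMiddle`, and `emHeightAscent` are illustrative names, and the 3px shift of the fallback font is an invented number.

```typescript
// Hypothetical model: an em square is described by the y-offsets (in CSS px,
// positive = above) of its top and bottom edges relative to the chosen baseline.
interface EmSquare {
  top: number;
  bottom: number;
}

// Em square of a font of the given size whose middle sits `middleOffset` px
// above the chosen baseline (0 when the baseline is that font's own middle).
function emSquareAroundMiddle(fontSize: number, middleOffset = 0): EmSquare {
  return { top: middleOffset + fontSize / 2, bottom: middleOffset - fontSize / 2 };
}

// "Highest top of the em squares": distance from the baseline to the highest top.
function emHeightAscent(squares: EmSquare[]): number {
  return Math.max(...squares.map((s) => s.top));
}

const primary = emSquareAroundMiddle(20);     // baseline is this font's middle
const fallback = emSquareAroundMiddle(20, 3); // assumed: fallback em square sits 3px higher

const singleFontAscent = emHeightAscent([primary]);          // 10, half the font size
const mixedFontAscent = emHeightAscent([primary, fallback]); // 13, no longer half the font size
```

With one font the result is exactly half the font size; as soon as a second em square at a different position is counted, it is not.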

annevk commented 5 years ago

cc @whatwg/canvas

yiyix commented 5 years ago

As updated in https://github.com/whatwg/html/issues/4071, the last comment about "half the font size if given baseline is the middle of that em square" will be updated to "half the font size if given baseline is in the middle of em square of the font specified to use to render text".

dbaron commented 5 years ago

Is the "line box" referring to the one created in step (5) of the text preparation algorithm? (I missed that last time; it's a bit remote from this text.) If so, it should probably be clearer that that's the one.

dbaron commented 5 years ago

Actually, that's still not clear, since nothing says what CSS properties the block containing that line box has, and that affects a number of these results. In particular, the current spec text says that the given font is set on the inline box. If the line box acts as though it's in a block with the default font, that would produce substantially different (and probably unexpected) results relative to the behavior you'd get if the line box acts as though it's in a block with the given font specified.

yiyix commented 5 years ago

The definition of the line box is better specified here on CSS2. That's why we think it is properly defined.

I'm not sure I understand what you mean by the block containing that line box. According to the spec, the line box contains the inline box that has the properties, not the other way around. The line box is not inside anything.

dbaron commented 5 years ago

In CSS, line boxes are always contained inside a block box (their containing block). The rules on line height calculation explain that the block influences the height of the line box through the creation of a strut (see the first paragraph under line-height). The CSS model doesn't have a concept of line boxes existing in isolation; they're always within a block. The font and line-height styles of that block affect the height of the line, and thus affect the results here.

fserb commented 5 years ago

dbaron, it could be the case that the whatwg spec is not describing what it wants to. But let's try to untangle this:

  1. I understand that the line box is contained by the strut, but the line box has (according to step 5 of the text preparation) "all the properties at their initial values except the 'font' property of the inline box set to font, the 'direction' property of the inline box set to direction, and the 'white-space' property set to 'pre'.". Therefore the 'vertical-align' of the line box is always "baseline".

  2. For the height of the container box, we have this definition for non-replaced inline elements that doesn't specify the height, but suggests it. Is this the problem you are referring to? If so, we can specify on canvas that the containing block must follow the same rule that the UA chooses for this. Would that address the issue you are trying to highlight?

  3. If that's not what you are bringing attention to, are you actually talking about the fact that the 'containing box' spec linked above is at fault for not defining other CSS properties of the containing box? If so, is this an issue for CSSWG?

dbaron commented 5 years ago

Replies to those points:

  1. line boxes don't have properties, and step 5 of the text preparation rules says those things are true about the inline box, not the line box. A line box doesn't have a vertical-align since a line box doesn't have properties.

  2. not what I'm referring to, and not in any way involved in the definitions of emHeightAscent, emHeightDescent, hangingBaseline, alphabeticBaseline, and ideographicBaseline, since the height of the inline doesn't affect any of the things that those definitions refer to.

  3. no

The problem here is that you're introducing new definitions that depend on the characteristics of the line box. In order to do that, you need to change the canvas text preparation algorithm so that it assigns the font properties specified for the canvas to the block containing the line box. You should also fix the incidental problem already existing in the canvas text preparation algorithm so that it describes the block that contains the line box. This makes the fix for your problem easy since you can then just assign the properties to that block.

fserb commented 5 years ago

Thanks for sticking around. I'm sorry for asking more questions while I'm trying to get this right.

  1. I see. The text is funny on the spec "Form a hypothetical infinitely-wide CSS line box containing a single inline box containing ..." the first containing is about the line box, the second about the inline box. Fair.

  2. Okay. Let me try again to see if I get what you are saying.

For example, for emHeightAscent: The distance from the horizontal line indicated by the textBaseline attribute to the highest top of the em squares in the line box; positive numbers indicating that the given baseline is below the top of that em square.

There are two numbers involved here. The "top of the em square in the line box" and the "given baseline". And we are, hopefully, returning one in relation to the other. In that sense, I don't think the absolute values will matter, so I don't see how any property of the line box should matter.

I can see how claiming "the em square in the line box" as being a concept is a problem. Maybe that's close to what you are saying? I.e., what it should actually say is "the top of the em square of the selected font"?

I'm not sure I get the problem with the text preparation algorithm, but we can leave that aside for now.

dbaron commented 5 years ago

The two numbers for emHeightAscent are, quoting the spec:

Note that there are multiple em squares in the line box. They can vary in position when multiple fonts are used to render the text. For example, if you're drawing the text "China (中国)" with a font list where the first choice font doesn't have glyphs for the characters 中 or 国 then those two characters will be drawn in a different font whose em square has a different position. However, wording like "highest top of the em squares in the line box" would generally, in CSS, also include the metrics of the "strut" mentioned above. (Note that this ambiguity about whether the strut is counted only affects whether emHeightAscent and emHeightDescent are problematic here.) The strut uses the font metrics of the block that contains the line box, which might be substantially larger than the font specified (and used for the inline box), so the result given by the text in the spec might actually be unrelated to the font specified and the text given.

For the baseline properties this is more clearly problematic since I don't need to infer that the strut is included among the things counted for "highest top"; the baselines of the line box are unambiguously derived from the characteristics of the block that contains the line box.

(And again, I think all of this could also be avoided by just specifying that the inline box is used rather than the line box.)
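
The strut concern above can be sketched with a toy model. This is an illustration under stated assumptions, not how any engine computes metrics: `VerticalExtent`, `emExtent`, and `highestTop` are hypothetical names, the 75/25 split of the em square around the alphabetic baseline is invented, and the 10px and 16px sizes are only examples.

```typescript
// Hypothetical model: distances in CSS px from the alphabetic baseline,
// both expressed as positive numbers.
interface VerticalExtent {
  ascent: number;  // baseline to top of the em square
  descent: number; // baseline to bottom of the em square
}

// Assumption for illustration only: the em square is split 75/25 around
// the alphabetic baseline.
function emExtent(fontSizePx: number, ascentRatio = 0.75): VerticalExtent {
  return {
    ascent: fontSizePx * ascentRatio,
    descent: fontSizePx * (1 - ascentRatio),
  };
}

// "Highest top of the em squares" among everything that is counted.
function highestTop(extents: VerticalExtent[]): number {
  return Math.max(...extents.map((e) => e.ascent));
}

const inlineBoxFont = emExtent(10); // font set on the inline box
const blockStrut = emExtent(16);    // strut from a containing block with a larger font

const inlineOnly = highestTop([inlineBoxFont]);            // 7.5
const withStrut = highestTop([inlineBoxFont, blockStrut]); // 12
```

If the strut is counted, the result reflects the containing block's font rather than the font that was specified, which is why measuring against the inline box avoids the ambiguity.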

fserb commented 5 years ago

yeah, the baseline properties are more fundamentally broken. I see what you mean now.

I'm fine with specifying inline box for emHeight instead of line. Thanks for taking the time to explain it.

fserb commented 5 years ago

dbaron, one extra question that came up here:

we now have fontBoundingBoxAscent/Descent and emHeightAscent/Descent. I think what was originally intended by this spec is that one of those would be only for the selected font, and the other for all used fonts. But it seems that if we change the definition to "inline box", this won't be the case.

I was thinking we should specify that emHeight is for the top/bottom of all the em boxes of fonts used and 'fontBoundingBox' to be of the selected font only. Unless there's another definition of 'fontBoundingBox' that I'm missing somewhere. WDYT?

dbaron commented 5 years ago

There's a clear distinction between fontBoundingBoxAscent and emHeightAscent even for a single font; the font bounding box includes the vertical extents of all the glyphs in the font, whereas the em-height is just a design space, exactly the height of the font size, but some glyphs generally overflow it. So that wasn't at all how I saw the distinction.

(Unfortunately I've been unable to find a good diagram showing this; the ones I've found either don't distinguish between font metrics and glyph metrics clearly or focus on the latter rather than the former, or they're just wrong in major ways...)

dbaron commented 5 years ago

(there's a nice diagram in https://github.com/opentypejs/opentype.js/issues/367#issue-403535747)

fserb commented 5 years ago

This diagram is also on WhatWG, but I don't see how it helps that difference. We already have actualBoundingBox. The way I understood what you said is that fontBoundingBox is the highest of all the glyphs of the font (not only the used glyphs), right?

Anyway. I see the difference between fontBoundingBox and emHeight.

One issue is that the current implementations in Safari and Chrome return the emHeightAscent of the specified font. We could change the spec to that. No?

dbaron commented 5 years ago

actualBoundingBox is a function of the glyphs that are used; you'd get a smaller height for the text "o" than the text "d", and you'd get an even larger height for "È". fontBoundingBoxAscent would be from the relevant baseline to "top of bounding box" in that diagram, regardless of the glyphs. emHeightAscent would be from the relevant baseline to the "top of em square" in that diagram, also regardless of the glyphs. (Although not quite regardless of the glyphs, since you'd consider all the fonts needed to draw the glyphs given, but you wouldn't consider the glyphs beyond their influence on the set of fonts.)

At least that's my interpretation...
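
A rough sketch of that interpretation for a single font. The `FontModel` shape and all numbers are invented for illustration (they are not taken from a real font or from the canvas API); the point is only which inputs each value depends on.

```typescript
// Hypothetical single-font model; all values are in font units measured
// upward from the alphabetic baseline.
interface FontModel {
  unitsPerEm: number;
  emSquareTop: number;              // top of the em square
  boundingBoxTop: number;           // top of the font-wide bounding box (all glyphs)
  glyphTop: Record<string, number>; // top of each individual glyph's ink
}

function toPx(units: number, font: FontModel, fontSizePx: number): number {
  return (units * fontSizePx) / font.unitsPerEm;
}

// Depends on the glyphs actually drawn.
function actualBoundingBoxAscent(text: string, font: FontModel, fontSizePx: number): number {
  const tops = text.split("").map((ch) => font.glyphTop[ch] || 0);
  return toPx(Math.max(0, ...tops), font, fontSizePx);
}

// Depends only on the font: the extent of every glyph the font contains.
function fontBoundingBoxAscent(font: FontModel, fontSizePx: number): number {
  return toPx(font.boundingBoxTop, font, fontSizePx);
}

// Depends only on the font: the design space, which some glyphs overflow.
function emHeightAscent(font: FontModel, fontSizePx: number): number {
  return toPx(font.emSquareTop, font, fontSizePx);
}

// Invented numbers, chosen so that "È" overflows the em square.
const demoFont: FontModel = {
  unitsPerEm: 1000,
  emSquareTop: 800,
  boundingBoxTop: 1100,
  glyphTop: { o: 500, d: 700, "È": 950 },
};

const ascentO = actualBoundingBoxAscent("o", demoFont, 20); // 10
const ascentD = actualBoundingBoxAscent("d", demoFont, 20); // 14
const ascentE = actualBoundingBoxAscent("È", demoFont, 20); // 19
const fontBoxAscent = fontBoundingBoxAscent(demoFont, 20);  // 22, same for any text
const emAscent = emHeightAscent(demoFont, 20);              // 16, same for any text
```

In this model only the first function takes the text as input; the other two change only when the font (or set of fonts) changes.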

fserb commented 5 years ago

It's still not clear what's the difference between actualBoundingBox and fontBoundingBoxAscent. What is the "top of bounding box regardless of the glyphs"? I agree with the rest.

yiyix commented 5 years ago

let me try to understand this.

I think you mean the following: actualBoundingBoxAscent/Descent is from baseline to "the top of glyphs" that are used. fontBoundingBoxAscent is from baseline to "top of bounding box" in that diagram, regardless of the glyphs. And "top of bounding box regardless of the glyphs" meaning of all the fonts used to render the text?

I also have trouble understanding the difference between the top of glyphs and the top of the bounding box. Could you explain it more? Thank you.

fserb commented 4 years ago

Trying to sum up the discussion we had:

dbaron commented 4 years ago

  • We want to make fontBoundingBox and emHeight to explicitly talk about the currently selected font (similar to what CSS does)

Rather than "currently selected font", I think the best term to use is first available font, which is what the font relative length units reference.

  • Update the definition of fontBoundingBox* to include an implementation note that should follow the proper OpenFont table (I don't remember which one).

I think what @jfkthame suggested was the sTypoAscender and sTypoDescender from the OS/2 table.

jfkthame commented 4 years ago

I think what @jfkthame suggested was the sTypoAscender and sTypoDescender from the OS/2 table.

Just to confirm - yes, I think that's the most useful option.

Perhaps also with a note that if no OS/2 table is present, it should fall back to the ascender and descender from the hhea table -- historically, Apple at least used to ship fonts with no OS/2 table. (Note that OS/2 is not included in Table 2: The required tables in the Apple TrueType spec, although it is required by the Microsoft OpenType spec.)
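
The suggested fallback could look roughly like the sketch below. `ParsedFontTables` is a hypothetical shape for already-parsed font data, not the API of any real font library; only the field names `sTypoAscender`, `sTypoDescender` (OS/2) and `ascender`, `descender` (hhea) come from the font formats themselves, and the sample numbers are invented.

```typescript
// Hypothetical shape for already-parsed font tables (not a real library's API).
interface ParsedFontTables {
  unitsPerEm: number;
  os2?: { sTypoAscender: number; sTypoDescender: number }; // may be absent
  hhea: { ascender: number; descender: number };
}

interface VerticalMetricsPx {
  ascent: number;  // px above the baseline
  descent: number; // px below the baseline, as a positive number
  source: "OS/2" | "hhea";
}

// Prefer the OS/2 typographic metrics; fall back to hhea when there is no OS/2 table.
function verticalMetrics(font: ParsedFontTables, fontSizePx: number): VerticalMetricsPx {
  const usesOs2 = font.os2 !== undefined;
  const ascender = font.os2 ? font.os2.sTypoAscender : font.hhea.ascender;
  const descender = font.os2 ? font.os2.sTypoDescender : font.hhea.descender;
  return {
    ascent: (ascender * fontSizePx) / font.unitsPerEm,
    // Descender values are normally negative (below the baseline), so flip the sign.
    descent: (-descender * fontSizePx) / font.unitsPerEm,
    source: usesOs2 ? "OS/2" : "hhea",
  };
}

// Invented sample data.
const withOs2: ParsedFontTables = {
  unitsPerEm: 1000,
  os2: { sTypoAscender: 800, sTypoDescender: -200 },
  hhea: { ascender: 900, descender: -300 },
};
const withoutOs2: ParsedFontTables = {
  unitsPerEm: 1000,
  hhea: { ascender: 900, descender: -300 },
};

const fromOs2 = verticalMetrics(withOs2, 20);     // ascent 16, descent 4, source "OS/2"
const fromHhea = verticalMetrics(withoutOs2, 20); // ascent 18, descent 6, source "hhea"
```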