w3c / mathml-core

MathML Core draft
https://w3c.github.io/mathml-core

Ink ascent/descent #78

Open fred-wang opened 5 years ago

fred-wang commented 5 years ago

This one is a bit tricky... It was mentioned in w3c/mathml-core#139 and https://github.com/mathml-refresh/mathml/issues/52#issuecomment-514016715 at least, but it needs a clear explanation.

When drawing a box, one can distinguish its normal size (the space it occupies) from its "ink" size (smallest enclosing rect for the painted content). For example, a white space has nonzero width/height but an empty ink box.

First, the ink size is involved in the OpenType MATH spec when computing vertical gaps [1]. If I'm correct, it is not involved in horizontal placement during the layout of math boxes, so I think we don't need "ink left/right bearings" and can focus on ink ascent/descent below.

Using CSS baseline concepts, the MathML Core specification defines two new baselines, the "ink-over baseline" and the "ink-under baseline", which delimit the ink ascent/descent of a box [2]. All the boxes/baselines in the SVG schemas of the spec are drawn with dashes, but the ink boxes are suggested with a background.

The status in browsers is the following:

In order to understand this properly, we need a non-trivial test case. I uploaded a test case [4] with basic text/fraction/square root examples and the interesting one, which is AAA/√CCC (square root as a denominator).

The font is built like this [5]:

Exercise: Deduce the vertical positions of AAA, fraction bar, radical overbar and CCC.

Gecko, WebKit, Blink, XeLaTeX and LuaLaTeX render this last example inconsistently. Analysis is complicated by the fact that there are some unrelated bugs and that I was not able to make XeLaTeX/LuaLaTeX use my custom glyphs for the text.

Two questions for the CG:

(1) Is it important to have this concept of ink ascent/descent for all boxes? Or should we only make the ascent/descent of token elements match the text ink ascent/descent? As I understand the MATH spec, it is important to have it for all boxes...

(2) If it is important, is the MathML Core description correct? For example, is the "radical vertical gap" included in its ink box or not? I guess we should ask Microsoft people about this.

[1] https://docs.microsoft.com/en-us/typography/opentype/spec/math
[2] https://mathml-refresh.github.io/mathml-core/#box-model
[3] https://trac.webkit.org/browser/webkit/trunk/Source/WebCore/css/mathml.css#L19
[4] https://github.com/mathml-refresh/ink-metrics
[5] https://github.com/mathml-refresh/ink-metrics/blob/master/create-font.py

fred-wang commented 5 years ago

@SergeyMalkin Do you know how Microsoft Word would render something equivalent to https://github.com/mathml-refresh/ink-metrics/blob/master/index.tex#L53 or https://github.com/mathml-refresh/ink-metrics/blob/master/index.html#L105 with the font https://github.com/mathml-refresh/ink-metrics/blob/master/ink-ascent-descent-test.otf ? Can you explain how the different fraction / radical parameters are interpreted?

@khaledhosny @jfkthame Do you know why XeLaTeX (and LuaLaTeX) fails to use my custom glyph for A, B, C in https://github.com/mathml-refresh/ink-metrics/blob/master/index.tex ? Are you able to explain the rendering of https://github.com/mathml-refresh/ink-metrics/blob/master/xelatex.pdf (1.9 specifically)?

Thanks!

fred-wang commented 5 years ago

From left to right: Gecko, WebKit, Blink, XeLaTeX, LuaLaTeX (as I said I was not able to use my custom glyph in the latex document...)

[Image: mfrac-sqrt]

fred-wang commented 5 years ago

The gap between the fraction bar and the radical bar seems to be 3em (radical ascender + denominator gap) in WebKit, Blink and LuaLaTeX. It is only 1em (denominator gap) in Gecko, since the radical ascender is not considered part of the msqrt ink box. I don't understand XeLaTeX's result: the gap seems to be about 3.5em.
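The two readings can be compared with a small sketch. It only restates the arithmetic from the observation above (1em denominator gap, 2em radical ascender); the function name `bar_to_bar_gap` and its parameters are hypothetical and do not come from any engine's source.

```python
def bar_to_bar_gap(denominator_gap: float,
                   radical_ascender: float,
                   ascender_in_ink_box: bool) -> float:
    """Distance (in em) between the fraction bar and the radical overbar.

    Assumption: the denominator gap is measured from the fraction bar down
    to the top of the denominator's ink box. If the space above the overbar
    counts as part of the msqrt ink box, the overbar ends up that much
    further below the fraction bar.
    """
    if ascender_in_ink_box:
        return denominator_gap + radical_ascender
    return denominator_gap


# Values inferred from the measurements reported in this comment.
webkit_like = bar_to_bar_gap(1.0, 2.0, ascender_in_ink_box=True)
gecko_like = bar_to_bar_gap(1.0, 2.0, ascender_in_ink_box=False)
print(webkit_like, gecko_like)
```

Under these assumptions the first value matches the WebKit/Blink/LuaLaTeX measurement and the second matches Gecko; the XeLaTeX result is not explained by either reading.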

khaledhosny commented 5 years ago

> @khaledhosny @jfkthame Do you know why XeLaTeX (and LuaLatex) fails to use my custom glyph for A, B, C in https://github.com/mathml-refresh/ink-metrics/blob/master/index.tex ?

They need to be in the math italic slots (LaTeX by default uses ASCII letters in math mode as shorthand for math italic), or you need to switch the math alphabet to upright roman (\mathup I think).

khaledhosny commented 5 years ago

> switch math alphabet to upright roman (\mathup I think).

\symup actually, \mathup switches to the text font (don’t ask me why).

fred-wang commented 5 years ago

@khaledhosny Thanks. I had tried \mathup but that didn't help.

Here is the updated result (xelatex: left, lualatex: right). So it seems xelatex uses a 2em gap (radical ascender), which is larger than the 1em gap (denominator gap).

[Image: mfrac-sqrt-xelatex-lualatex]

fred-wang commented 5 years ago

Actually xelatex's behavior is weird: even after setting all the gaps/shifts to 0, I still have a 1em gap between the two bars. RadicalExtraAscender does not seem to have any effect.

SergeyMalkin commented 5 years ago

This is how Word renders this formula: [Image: MathRadicalDenominatorTest]

I can't immediately explain it, though. Will need a bit more time to give you exact calculations (correct or not) that produced this result.

Thanks, Sergey

fred-wang commented 5 years ago

Sergey

Thanks, this looks consistent with WebKit, Chromium and LuaLaTeX, except that I don't understand why the "radical vertical gap" between the blue and radical bar is not 1em. Looking forward to getting more information.

fred-wang commented 4 years ago

We discussed this during the two previous meetings, and Murray mentioned that Microsoft Word does not distinguish ink vs. non-ink boundaries for radicals.

Consensus seems to be that this difference is not fundamental and only Gecko does it. So probably we can remove it and keep the simple form: only use ink extents when calculating the metrics of text-only (only text content without forced line break or soft wrap opportunity) token elements.

fred-wang commented 4 years ago

I've committed some changes to remove the distinction between ink and non-ink metrics for most of the elements (including the radical elements mentioned above).

There are some things that I'm not sure about.

First, I wonder if we still need to keep a distinction for token elements (rather than claiming the metrics are just the ink ones). Take for example MINUS SIGN (U+2212) <mo>−</mo>, whose ink bounding box is typically a thin horizontal bar above the baseline, so the "ink descent" is negative. I think negative values still work well with the definition of baseline / alignment. However, the background won't be visible if ink and non-ink boxes match. This is the same for any rectangle glyph, which is probably why Gecko keeps the distinction.
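To make the negative "ink descent" concrete, here is a minimal sketch that derives ink ascent/descent from glyph bounding boxes. The helper `ink_ascent_descent` and the glyph extents are hypothetical values chosen for illustration; they are not taken from any real font or from the spec.

```python
from typing import Iterable, Optional, Tuple

# (y_min, y_max) of a glyph's painted outline relative to the baseline, in em.
# None stands for a glyph that paints nothing, such as a space.
GlyphInk = Optional[Tuple[float, float]]


def ink_ascent_descent(glyphs: Iterable[GlyphInk]) -> Optional[Tuple[float, float]]:
    """Return (ink ascent, ink descent) for a run of glyphs, or None if empty.

    Ascent is measured upward from the baseline and descent downward, so a
    run that sits entirely above the baseline has a negative ink descent.
    """
    painted = [g for g in glyphs if g is not None]
    if not painted:
        return None
    top = max(y_max for _, y_max in painted)
    bottom = min(y_min for y_min, _ in painted)
    return top, -bottom


# Hypothetical glyph extents, for illustration only.
MINUS = (0.25, 0.35)         # thin bar above the baseline
UNDERSCORE = (-0.15, -0.05)  # thin bar below the baseline
SPACE = None                 # occupies space but paints nothing

print(ink_ascent_descent([MINUS]))              # descent component is negative
print(ink_ascent_descent([UNDERSCORE, MINUS]))  # union of both ink boxes
print(ink_ascent_descent([SPACE]))              # empty ink box
```

With these made-up extents, the ink box of the minus sign alone is only as tall as the bar itself, which is why a background sized to the ink box would be almost entirely covered by the glyph.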

If we keep the distinction, I believe this has other consequences:

Additionally, a nonzero border increases ink size while nonzero margin/padding don't. This means that margin/padding can be used to force different ink and non-ink metrics for any element. This is a bit similar to mpadded but for that element the spec currently always says the ink and non-ink metrics match (even if you add some lspace for example).

fred-wang commented 2 years ago

To elaborate a bit more concretely, this is how browsers render the following example:

<p><math><mtext style="background: pink">−</mtext></math></p>
<p><math><mtext style="background: lightgreen">_</mtext></math></p>
<p><math><mtext style="background: lightblue">_−</mtext></math></p>

Gecko calculates both "normal" and ink metrics for all MathML boxes, so it can position math content properly with ink metrics while still using "normal" metrics for the background.

WebKit and Blink with the WIP patch make normal and ink metrics match for token elements, so the background may not be visible (or may be very close to the ink extent of the text).

Blink without the patch uses only normal metrics, so the background remains large but there are too many gaps in MathML formulas.
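The three behaviors can be summarized in a short sketch. This is a simplified model of the descriptions above, not code from any browser: the `Strategy` names, the `token_metrics` helper and the numeric extents are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class Extents:
    ascent: float   # in em, measured upward from the baseline
    descent: float  # in em, measured downward from the baseline


class Strategy(Enum):
    BOTH = "keep normal and ink metrics"         # Gecko, as described above
    INK_ONLY = "normal metrics set to ink"       # WebKit, Blink with the WIP patch
    NORMAL_ONLY = "only normal metrics are used" # Blink without the patch


def token_metrics(normal: Extents, ink: Extents, strategy: Strategy):
    """Return (extents used for math layout, extents covered by the background)."""
    if strategy is Strategy.BOTH:
        return ink, normal
    if strategy is Strategy.INK_ONLY:
        return ink, ink
    return normal, normal


# Hypothetical metrics for a minus sign inside an <mtext>.
normal = Extents(ascent=0.8, descent=0.2)
ink = Extents(ascent=0.35, descent=-0.25)

for strategy in Strategy:
    layout, background = token_metrics(normal, ink, strategy)
    print(strategy.name, layout, background)
```

In this model only the first strategy gives tight layout and a clearly visible background at the same time, at the cost of tracking two sets of metrics per box.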

[Image: mtext-with-minus-and-underscore]

davidcarlisle commented 2 years ago

@fred-wang thanks for the nice explanation and image...

Looking just at your image in the last comment, the Firefox one looks best, but if implementation-wise it adds a lot of complication, it may be a complication we can live without. Certainly in TeX usage people often don't like boxes that too closely mirror individual character height/depth, as you end up with very ragged highlighting, so you often end up with things like `\colorbox{\strut .} \colorbox{\strut A}`, where the `\strut` forces any content within a reasonable size range to have a minimal height and depth, so the boxed . and boxed A have the same vertical extents. If end users are explicitly hiding the character height to get even-sized background boxes, then the finer points of how the default box size is obtained from the metrics may (??) not be so important.

That said, the WebKit one, which is (from somewhere?) always showing a small part of the background, seems harder to explain but may confuse fewer end users than the Blink+patch version; I can see many people asking where their backgrounds have gone. But it is, I think, explainable that the background matches (and is obscured by) the character.

fred-wang commented 2 years ago

For completeness, this is how Blink currently renders https://people.igalia.com/fwang/gamma.html:

[Image: upstream]

and how it renders it when using ink metrics for token elements:

[Image: upstream-with-CLs]

Notice that the former contains large vertical gaps, which is definitely not desired. I believe fixing this is more important than avoiding the not-very-visible background issue for rectangle glyphs, so I will probably do that for an initial implementation anyway.

This is normally what WebKit does too, but I would need to check the details of the implementation to be sure: https://webkit-search.igalia.com/webkit/rev/c0c0524829fd2bebfa47a7ffad327558fd5f2672/Source/WebCore/css/mathml.css#46 https://webkit-search.igalia.com/webkit/rev/c0c0524829fd2bebfa47a7ffad327558fd5f2672/Source/WebCore/rendering/mathml/RenderMathMLToken.cpp#609 In any case, for WebKit the logical metrics and ink metrics match, so if it shows a small part of the background, that also means it adds slightly more gaps in math formulas, which we probably don't want. Gecko does not have this problem since it stores both metrics for each MathML box (which is what the core spec also does for now) and even has more complicated rules, as discussed above.

davidcarlisle commented 2 years ago

@fred-wang the second image is definitely much better and if that causes a few features about background colouring that need to be carefully explained I'd still say it looks like a win. The blink rendering is really starting to look nice. I'd support your initial implementation plan here.

clapierre commented 2 years ago

<p><math><mtext style="background: pink">−</mtext></math></p>
<p><math><mtext style="background: lightgreen">_</mtext></math></p>
<p><math><mtext style="background: lightblue">_−</mtext></math></p>

Be careful: inline styling is not recommended, as it cannot be overridden by a person who may be color blind or who needs to adjust these colors, font sizes, etc.

Could the same thing be done with CSS instead? I.e.:

<p><math><mtext class="background_pink">−</mtext></math></p>
<p><math><mtext class="background_lightgreen">_</mtext></math></p>
<p><math><mtext class="background_lightblue">_−</mtext></math></p>

[CSS]
.background_pink {background-color: pink;}
.background_lightgreen {background-color: lightgreen; }
.background_lightblue {background-color: lightblue;}

fred-wang commented 2 years ago

@clapierre yes sure, putting CSS in a separate style sheet is valid in MathML. I only used the inline style attribute to make the example less verbose.

dginev commented 1 year ago

re Fred's comment https://github.com/w3c/mathml-core/issues/78#issuecomment-1056874643

> Notice that the former contains large vertical gaps, which is definitely not desired.

I've stumbled on the flip side of that issue, where there is no gap at all, in the root case. Fred cross-referenced this issue in my Chromium report, so back-referencing that report here: https://bugs.chromium.org/p/chromium/issues/detail?id=1395834

I'll inline the image for convenience:

[Image: radical signs holding content without top padding]

I'm nowhere near informed enough to make any useful technical comments on how MathML Core should be improved here.

Jamesernator commented 1 year ago

I'm not sure if all of these problems are due to this cause, but it looks like the same issue. Basically pretty much all fonts (including the default) have weird behaviours with <msqrt> in the Chromium implementation:

Top and bottom of sqrts are not aligned:

[Image: sqrt-underlines-and-overlines]

Large gaps under sqrt are common:

[Image: sqrt-gaps-2]

The radical fails to line up with the overbar extender:

[Image: sqrt-overbar-misalign]

Both MathJax and Firefox (I have not checked Safari) have far more reasonable behaviour with their respective default fonts:

MathJax:

[Image: sqrt-mathjax]

Firefox:

[Image: sqrt-firefox]

(Live demo of these issues, though you will need the fonts installed locally)