w3c / mathml-core

MathML Core draft
https://w3c.github.io/mathml-core

Interpretation of DisplayOperatorMinHeight #126

Open fred-wang opened 5 years ago

fred-wang commented 5 years ago

Just converting comment from the appendix

It is not clear how DisplayOperatorMinHeight is supposed to be used or whether it is really reliable. More specifically, integral symbols (e.g. INTEGRAL U+222B) are typically taller than N-ary operators (e.g. N-ARY SUMMATION U+2211), so a single minimum height may not always be enough to determine the display size. Suppose for example that the sum has three size variants: the base size of height 1em, the display size of height 2em and a bigger variant of height 3em. Suppose that the integral has three sizes: a base size of height 1em, a larger size variant of height 2em and a display size of height 3em. If DisplayOperatorMinHeight is less than 3em then it does not force the display size of the integral to be selected. If it is more than 3em then none of the available sizes satisfies the condition. How should that be interpreted? Should we pick the largest as a fallback? If it is 3em, the desired size will be selected for the integral in display size but the one selected for the sum in display size will be too large. A heuristic is proposed in section 3.2.3 to work around the limitations of DisplayOperatorMinHeight.
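The dilemma can be checked mechanically. This is only a minimal sketch: it assumes the rule "smallest variant at least as tall as the threshold, otherwise the largest variant" (the fallback is one of the options asked about above, not a settled rule), and `select_variant`, `picks` and the height lists are illustrative names for the hypothetical font in the example.

```python
def select_variant(heights, min_height):
    """Smallest height >= min_height, else the largest height (assumed fallback)."""
    candidates = [h for h in sorted(heights) if h >= min_height]
    return candidates[0] if candidates else max(heights)

# Hypothetical variants from the example above, heights in em.
SUM_HEIGHTS = [1.0, 2.0, 3.0]       # desired display size: 2em
INTEGRAL_HEIGHTS = [1.0, 2.0, 3.0]  # desired display size: 3em

def picks(min_height):
    """Return (selected sum height, selected integral height) for one threshold."""
    return (select_variant(SUM_HEIGHTS, min_height),
            select_variant(INTEGRAL_HEIGHTS, min_height))

# Try thresholds below, at and above each variant height.
for threshold in (1.5, 2.0, 2.5, 3.0, 3.5):
    print(threshold, picks(threshold))
```

Because both operators expose the same list of heights, every threshold picks the same size for both, so no single value yields 2em for the sum and 3em for the integral.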

fred-wang commented 4 years ago

This is the Firefox discussion: https://bugzilla.mozilla.org/show_bug.cgi?id=407059#c8

fred-wang commented 4 years ago

This is the current text from core:

Issue 95: Clarify/improve this? How important is this MATH parameter? "integrals" is not defined.

1. Use the MathVariants table to try and find a glyph of height at least DisplayOperatorMinHeight.
2. If none is found, fallback to the largest non-base glyph.
3. If none is found, fallback to the layout algorithm of § 3.2.1.1 Layout of .

Because this parameter does not always give the best size, user agents may also use the following heuristic: ensure that the large variant height is at least 2 times as large as the base height for integrals and √2 times as large as the base height for other operators.
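The three fallback steps in the quoted text could be sketched roughly as follows. The function name, the `(glyph_name, height)` tuples and the sample values are hypothetical. Taking the smallest matching glyph, and letting the base glyph itself match in the first step, are assumptions; the quoted text does not say either way.

```python
def select_display_glyph(variants, display_operator_min_height):
    """Pick a display-size glyph from `variants`.

    `variants` is a list of (glyph_name, height) tuples, base glyph first,
    in increasing height. Returns a glyph name, or None when the caller
    should fall back to the ordinary text layout algorithm.
    """
    # Step 1: first (smallest) variant at least as tall as the threshold.
    for name, height in variants:
        if height >= display_operator_min_height:
            return name
    # Step 2: otherwise the largest non-base glyph, if there is one.
    non_base = variants[1:]
    if non_base:
        return max(non_base, key=lambda v: v[1])[0]
    # Step 3: nothing usable; signal the generic layout fallback.
    return None


# Made-up variant list for illustration only.
example = [("sum", 1000), ("sum.v1", 2000), ("sum.v2", 3000)]
print(select_display_glyph(example, 1500))
print(select_display_glyph(example, 5000))
print(select_display_glyph([("sum", 1000)], 5000))
```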

We discussed this during the two previous core meetings, but no clear conclusion so far.

fred-wang commented 4 years ago

I'm removing the "Because this parameter does not always give the best size, user agents may also use the following heuristic: ensure that the large variant height is at least 2 times as large as the base height for integrals and √2 times as large as the base height for other operators. " from the spec text for now.

faceless2 commented 4 years ago

That paragraph was useful for us when we were writing our implementation and trying to match the rendering from other user-agents. I'm sure you have your reasons for removing it, but if it were up to me I'd suggest perhaps keeping it as an explanatory note rather than removing it altogether.

fred-wang commented 4 years ago

@faceless2 I'm trying to clean up the spec. I agree this parameter from the OpenType MATH spec is really weird, but rather than introducing a hack in the spec (and implementing such a thing in Chromium) to "try to match the rendering from other user-agents", I'd prefer to get clarification from Microsoft first. Last time it was discussed with Murray in a MathML Core meeting, there was no clear conclusion.

fred-wang commented 1 year ago

Chromium issue with Cambria Math On Windows: https://bugs.chromium.org/p/chromium/issues/detail?id=1408038

davidcarlisle commented 1 year ago

@fred-wang is this related to the vertical bars having the wrong height in row 24 of your test file, only when Cambria is used (I suspect not)? Or should there be a new issue for that?

fred-wang commented 1 year ago

@davidcarlisle no, this is only for largeop. I also saw that other issue with the bar, but I haven't opened it yet and don't have any clue so far.

fred-wang commented 1 year ago

@davidcarlisle The other issue is https://bugs.chromium.org/p/chromium/issues/detail?id=1409380

fred-wang commented 1 year ago

As I mentioned 3 weeks ago, we are already getting reports about this bug, so let's discuss this again. Per the MATH spec, displayOperatorMinHeight is defined as follows:

Minimum height of n-ary operators (such as integral and summation) for formulas in display mode (that is, appearing as standalone page elements, not embedded inline within text).

In MathML Core, this is used in layout of operators to select a glyph of height at least DisplayOperatorMinHeight.

Note that the spec probably meant to take the smallest size with that constraint satisfied. Note that it also gives the opportunity to still find a larger variant if none is found, but this does not seem to be implemented in Chrome.

I dumped below the relevant part of the MATH table from Cambria Math. DisplayOperatorMinHeight is 2500, so if we take the minimum size satisfying this constraint, we obtain glyphs summation.vsize2 and integral.vsize1 for the summation and integral characters respectively. The former is OK but the latter is too small (according to user reports). Note that other existing math fonts show similar issues, but Cambria Math is the one distributed by Microsoft.

Khaled Hosny commented 9 years ago:

That is a known issue in both fonts, but MS implementation does not seem to use DisplayOperatorMinHeight so the issue does not show there, either that or we are not correctly understanding what this constant is for. XeTeX implementation has the following workaround:

    h1:=get_ot_math_constant(cur_f,displayOperatorMinHeight);
    if h1<(height(p)+depth(p))*5/4 then h1:=(height(p)+depth(p))*5/4;

‘p’ is the base glyph (the starting glyph) of the operator, so if DisplayOperatorMinHeight is less than 5/4 of its height plus depth, it is set to that value.

Instead of 5/4, WebKit uses the factor √2:

https://searchfox.org/wubkat/source/Source/WebCore/rendering/mathml/MathOperator.cpp#246

Firefox uses the factor √2 in general, but 2 for integrals (which requires introducing a list of characters considered "integrals"):

https://searchfox.org/wubkat/rev/3de3ef30cefada3014150b0591598345dc7458e3/Source/WebCore/rendering/mathml/MathOperator.cpp#247
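To compare the three workarounds side by side, here is a rough sketch of the clamping step. It assumes WebKit and Firefox apply their factors with the same "raise the minimum to factor × base size" structure as the XeTeX snippet, which the text above suggests but does not spell out. `INTEGRAL_CHARS` is an illustrative set, not the actual list used by any engine, and the base sizes are the base AdvanceMeasurement values from the Cambria Math dump in this thread.

```python
import math

# Illustrative only: not the real list of "integral" characters of any engine.
INTEGRAL_CHARS = {"\u222B", "\u222C", "\u222D", "\u222E"}

def effective_min_height(font_min_height, base_size, char, engine):
    """Clamp the font's DisplayOperatorMinHeight to at least factor * base size."""
    if engine == "xetex":
        factor = 5 / 4
    elif engine == "webkit":
        factor = math.sqrt(2)
    elif engine == "firefox":
        factor = 2.0 if char in INTEGRAL_CHARS else math.sqrt(2)
    else:
        raise ValueError(f"unknown engine: {engine}")
    return max(font_min_height, factor * base_size)

# Cambria Math: DisplayOperatorMinHeight=2500, base summation 1862, base integral 2218.
for engine in ("xetex", "webkit", "firefox"):
    print(engine,
          effective_min_height(2500, 1862, "\u2211", engine),
          effective_min_height(2500, 2218, "\u222B", engine))
```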

Another option is that "Height" means the "ink ascent" of the glyph rather than the "ink height", as supported by the following naming convention for math constants:

Height — Specifies a distance from the main baseline.

If so, that would make the algorithm slower, because we can't just read the AdvanceMeasurement value and would have to measure the glyph. Still, trying to measure the ink heights with FontForge tools, I get approximately integral.vsize1: ~1960, integral.vsize2: ~2790, summation.vsize1: ~1740, summation.vsize2: ~2540, so displayOperatorMinHeight=2500 would select the desired vsize2 in both cases.
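The two readings can be compared with the numbers quoted in this comment. This is a sketch under the assumption that the smallest variant meeting the threshold is chosen; the dictionaries only contain the vsize1 and vsize2 values mentioned here, and `first_at_least` is a hypothetical helper.

```python
MIN_HEIGHT = 2500  # DisplayOperatorMinHeight in Cambria Math

# AdvanceMeasurement values from the MATH table dump.
ADVANCE = {
    "integral": {"vsize1": 2769, "vsize2": 4406},
    "summation": {"vsize1": 2312, "vsize2": 3911},
}
# Approximate measurements reported above (FontForge).
INK_ASCENT = {
    "integral": {"vsize1": 1960, "vsize2": 2790},
    "summation": {"vsize1": 1740, "vsize2": 2540},
}

def first_at_least(sizes, threshold):
    """Return the first variant (dict order is smallest to largest) >= threshold."""
    for name, value in sizes.items():
        if value >= threshold:
            return name
    return None

for label, table in (("advance", ADVANCE), ("ink ascent", INK_ASCENT)):
    for op, sizes in table.items():
        print(label, op, first_at_least(sizes, MIN_HEIGHT))
```

With the advance values the integral stops at vsize1, while with the measured values both operators reach vsize2.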

In any case, it would be good to get clarification from Microsoft about how it is implemented in Microsoft Office before proposing an edit to the spec.

  <MathConstants>
    <DisplayOperatorMinHeight value="2500"/>
  </MathConstants>

  <MathVariants>
   <!-- summation -->
   <VertGlyphConstruction index="17">
      <!-- VariantCount=5 -->
      <MathGlyphVariantRecord index="0">
        <VariantGlyph value="summation"/>
        <AdvanceMeasurement value="1862"/>
      </MathGlyphVariantRecord>
      <MathGlyphVariantRecord index="1">
        <VariantGlyph value="summation.vsize1"/>
        <AdvanceMeasurement value="2312"/>
      </MathGlyphVariantRecord>
      <MathGlyphVariantRecord index="2">
        <VariantGlyph value="summation.vsize2"/>
        <AdvanceMeasurement value="3911"/>
      </MathGlyphVariantRecord>
      <MathGlyphVariantRecord index="3">
        <VariantGlyph value="summation.vsize3"/>
        <AdvanceMeasurement value="5911"/>
      </MathGlyphVariantRecord>
      <MathGlyphVariantRecord index="4">
        <VariantGlyph value="summation.vsize4"/>
        <AdvanceMeasurement value="8590"/>
      </MathGlyphVariantRecord>
    </VertGlyphConstruction>

    <!-- integral -->
    <VertGlyphConstruction index="21">
        <GlyphAssembly>
          <ItalicsCorrection>
            <Value value="1265"/>
          </ItalicsCorrection>
          <!-- PartCount=3 -->
          <PartRecords index="0">
            <glyph value="integralbt"/>
            <StartConnectorLength value="0"/>
            <EndConnectorLength value="200"/>
            <FullAdvance value="2934"/>
            <PartFlags value="0"/>
          </PartRecords>
          <PartRecords index="1">
            <glyph value="uni23AE"/>
            <StartConnectorLength value="1100"/>
            <EndConnectorLength value="1100"/>
            <FullAdvance value="2501"/>
            <PartFlags value="1"/>
          </PartRecords>
          <PartRecords index="2">
            <glyph value="integraltp"/>
            <StartConnectorLength value="200"/>
            <EndConnectorLength value="0"/>
            <FullAdvance value="2922"/>
            <PartFlags value="0"/>
          </PartRecords>
        </GlyphAssembly>
        <!-- VariantCount=5 -->
        <MathGlyphVariantRecord index="0">
          <VariantGlyph value="integral"/>
          <AdvanceMeasurement value="2218"/>
        </MathGlyphVariantRecord>
        <MathGlyphVariantRecord index="1">
          <VariantGlyph value="integral.vsize1"/>
          <AdvanceMeasurement value="2769"/>
        </MathGlyphVariantRecord>
        <MathGlyphVariantRecord index="2">
          <VariantGlyph value="integral.vsize2"/>
          <AdvanceMeasurement value="4406"/>
        </MathGlyphVariantRecord>
        <MathGlyphVariantRecord index="3">
          <VariantGlyph value="integral.vsize3"/>
          <AdvanceMeasurement value="6583"/>
        </MathGlyphVariantRecord>
        <MathGlyphVariantRecord index="4">
          <VariantGlyph value="integral.vsize4"/>
          <AdvanceMeasurement value="9460"/>
        </MathGlyphVariantRecord>
      </VertGlyphConstruction>
   </MathVariants>

fred-wang commented 1 year ago

For the record, there is a thread on www-math too:

https://lists.w3.org/Archives/Public/www-math/2023Feb/0020.html https://lists.w3.org/Archives/Public/www-math/2023Feb/0022.html https://lists.w3.org/Archives/Public/www-math/2023Feb/0023.html https://lists.w3.org/Archives/Public/www-math/2023Feb/0024.html

fred-wang commented 1 year ago

Checking again the thread from March, the information provided by Microsoft is still not enough to decide how to specify the use of DisplayOperatorMinHeight. I didn't get any reply to https://lists.w3.org/Archives/Public/www-math/2023Feb/0023.html

khaledhosny commented 8 months ago

I did some experiments and I think this is essentially a bug in the MS implementation. DisplayOperatorMinHeight is not used at all (whatever value it takes makes no difference, even 0). What is being used instead is DelimitedSubFormulaMinHeight.

I tested by having integrals of different heights and changing DisplayOperatorMinHeight repeatedly to match either the full height or the ascender of the desired glyph, but nothing changed. Doing the same with DelimitedSubFormulaMinHeight, however, always resulted in the desired glyph being used based on the full height of the glyph.

khaledhosny commented 8 months ago

This seems to be consistent with the glyph sizes in Cambria Math. DelimitedSubFormulaMinHeight is 3000, while the heights of the second integral and summation glyphs are 2768 and 2311 respectively, so they are not selected, and the third glyphs, whose heights are 4405 and 3910, are selected instead.

khaledhosny commented 7 months ago

I submitted this as an OpenType spec issue: https://github.com/MicrosoftDocs/typography-issues/issues/1136

fred-wang commented 6 months ago

@khaledhosny Thanks Khaled! So it looks like MathML Core's implementation of displayOperatorMinHeight is correct and this issue can be closed... however engines will probably have to keep some hack around for backward compatibility...

I wonder if such a hack should be done in HarfBuzz though, maybe https://harfbuzz.github.io/harfbuzz-hb-ot-math.html#hb-ot-math-get-constant should invert DisplayOperatorMinHeight and DelimitedSubFormulaMinHeight for some fonts that are known to have the bug (e.g. Cambria Math up to a version)?
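To make the idea concrete, here is a minimal sketch of such a swap, written as a standalone lookup rather than against the HarfBuzz API (it does not call HarfBuzz). `QUIRKY_FONTS`, `get_math_constant` and the dictionary-based font model are hypothetical, and the "up to a version" cutoff is not modelled. The constant values are the Cambria Math ones quoted in this thread.

```python
DISPLAY_OPERATOR_MIN_HEIGHT = "DisplayOperatorMinHeight"
DELIMITED_SUB_FORMULA_MIN_HEIGHT = "DelimitedSubFormulaMinHeight"

# The two constants that would be exchanged for affected fonts.
_SWAPPED = {
    DISPLAY_OPERATOR_MIN_HEIGHT: DELIMITED_SUB_FORMULA_MIN_HEIGHT,
    DELIMITED_SUB_FORMULA_MIN_HEIGHT: DISPLAY_OPERATOR_MIN_HEIGHT,
}

# Hypothetical quirk list; real entries and version cutoffs would need research.
QUIRKY_FONTS = {"Cambria Math"}

def get_math_constant(font_name, constants, which):
    """Look up a MATH constant, swapping the two constants for quirky fonts."""
    if font_name in QUIRKY_FONTS and which in _SWAPPED:
        which = _SWAPPED[which]
    return constants[which]

# Values quoted in this thread for Cambria Math.
cambria = {DISPLAY_OPERATOR_MIN_HEIGHT: 2500, DELIMITED_SUB_FORMULA_MIN_HEIGHT: 3000}

print(get_math_constant("Cambria Math", cambria, DISPLAY_OPERATOR_MIN_HEIGHT))
print(get_math_constant("Some Other Font", cambria, DISPLAY_OPERATOR_MIN_HEIGHT))
```

With the swap, a caller asking for DisplayOperatorMinHeight on Cambria Math would get 3000, which is above the second integral variant (2769) and so leads to the third one, as in the experiments described above.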

khaledhosny commented 6 months ago

I wonder if such a hack should be done in HarfBuzz though, maybe https://harfbuzz.github.io/harfbuzz-hb-ot-math.html#hb-ot-math-get-constant should invert DisplayOperatorMinHeight and DelimitedSubFormulaMinHeight for some fonts that are known to have the bug (e.g. Cambria Math up to a version)?

Probably. Please open an issue in HarfBuzz tracker to see what others think about it.