KaTeX / KaTeX

Fast math typesetting for the web.
https://katex.org
MIT License

MathML output produces unnecessary attributes: scriptlevel="0" and displaystyle = "true" or displaystyle = "false" #3342

Open universemaster opened 3 years ago

universemaster commented 3 years ago

Describe the bug: This follows on from the observations about <mstyle> elements in https://github.com/KaTeX/KaTeX/issues/3333, and more generally from https://github.com/KaTeX/KaTeX/issues/2194, which covers the need to output minimal bytes and DOM elements for performance reasons.

The KaTeX source for my math notes contains only 63 explicit overrides of the default font size or display style.

That is, if I grep my KaTeX source for the following pattern, I get only 63 results.

\\Huge|\\huge|\\LARGE|\\Large|\\large|\\normalsize|\\small|\\footnotesize|\\scriptsize|\\tiny|\\displaystyle|\\textstyle|\\scriptstyle|\\scriptscriptstyle|\\lim\\limits|\\lim\\nolimits|\\verb
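For anyone who wants to repeat this count without grep, here is a minimal sketch that applies the same pattern in JavaScript. The function name `countExplicitOverrides` and the sample source are made up for illustration.

```javascript
// Same alternation as the grep pattern above, as a global JavaScript regex.
const overridePattern =
  /\\Huge|\\huge|\\LARGE|\\Large|\\large|\\normalsize|\\small|\\footnotesize|\\scriptsize|\\tiny|\\displaystyle|\\textstyle|\\scriptstyle|\\scriptscriptstyle|\\lim\\limits|\\lim\\nolimits|\\verb/g;

// Hypothetical helper: count explicit size/style overrides in a TeX source string.
function countExplicitOverrides(source) {
  return (source.match(overridePattern) || []).length;
}

// Illustrative input only: one \displaystyle and one \small.
const sampleSource = String.raw`$x$ and $\displaystyle \sum_i x_i$ and {\small y}`;
console.log(countExplicitOverrides(sampleSource)); // 2
```

Like the grep, this counts occurrences of each command, not lines.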

However, KaTeX's MathML output gives:

These are currently attributes of <mstyle> tags. See https://github.com/KaTeX/KaTeX/issues/3333 for further discussion of whether <mstyle> should be used at all.

To give an idea of scale, this is from 16759 <math> elements (many of them as simple as $x$ or $y$) and, as stated above, from only 63 deliberate changes to scriptlevel or displaystyle.

To Reproduce: Steps to reproduce the behavior:

  1. Write some KaTeX source with minimal (or no) font size or display style overrides.
  2. Run KaTeX to generate MathML.
  3. Open the result in a browser (or otherwise) and use developer tools to count instances of scriptlevel or displaystyle.
  4. To delete all the scriptlevel="0" and displaystyle overrides, run the following (possibly multiple times) in the console:
// Remove all mstyle elements that carry scriptlevel="0". This method is hacky and suboptimal for many reasons. Don't use in production.
let mstyleScriptLevel0Selector =
  // display=block with displaystyle=true is usually redundant unless explicitly overridden in the KaTeX source.
  'math[display="block"] mstyle[scriptlevel= "0"][displaystyle = "true"],' +
  // I would never want to explicitly set displaystyle=false inside display=block.
  'math[display="block"] mstyle[scriptlevel= "0"][displaystyle = "false"],' +
  // I would never want to explicitly set displaystyle=false when inline.
  'math:not([display="block"]) mstyle[scriptlevel= "0"][displaystyle = "false"],' +
  // I have added my own stylesheet to set displaystyle=true at stylesheet level, so in effect I never need to override to displaystyle=true in the KaTeX source anyway.
  'math:not([display="block"]) mstyle[scriptlevel= "0"][displaystyle = "true"]';

let countOfMstyleScriptLevel0ToRemove = document.querySelectorAll(mstyleScriptLevel0Selector).length;

while (countOfMstyleScriptLevel0ToRemove) {
  document.querySelectorAll(mstyleScriptLevel0Selector).forEach(function (currentNode) {
    currentNode.outerHTML = currentNode.innerHTML
  });
  countOfMstyleScriptLevel0ToRemove = document.querySelectorAll(mstyleScriptLevel0Selector).length;
}
  5. Notice that the rendered MathML is either exactly the same, or differs only where you made deliberate overrides.
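The counting in step 3 can also be done outside the browser on a serialized MathML string, for example one produced by katex.renderToString with the output option set to "mathml". The following is a rough sketch: the helper name `countStyleAttributes` and the sample markup are assumptions for illustration, and the regex-based matching is approximate, not a real MathML parser.

```javascript
// Hypothetical helper: count scriptlevel / displaystyle attributes in a MathML string.
// Whitespace around "=" is tolerated, since both spellings appear in this issue.
function countStyleAttributes(mathml) {
  const count = (re) => (mathml.match(re) || []).length;
  return {
    scriptlevel: count(/\bscriptlevel\s*=\s*"[^"]*"/g),
    displaystyle: count(/\bdisplaystyle\s*=\s*"[^"]*"/g),
  };
}

// Illustrative markup only, not actual KaTeX output.
const sample =
  '<math><mstyle scriptlevel="0" displaystyle="false"><mi>x</mi></mstyle></math>';
console.log(countStyleAttributes(sample)); // { scriptlevel: 1, displaystyle: 1 }
```

Comparing these counts before and after the cleanup gives a quick measure of how many attributes were redundant.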

Expected behavior:

Environment (please complete the following information):

edemaine commented 3 years ago

Instead of specifying "write some KaTeX source", could you include a small example that demonstrates the problem? This would save time in testing any fix to this (which would presumably involve computing differences in style from the parent, instead of writing out the style every time).

I tried \small x{y{z}} and x_{x{y{z}}} and neither of them generated the attributes you mentioned (on the demo site). On the other hand, the latter example illustrated that the mtight class gets applied to all the HTML groups, which is probably excessive — especially given that there are no CSS rules for .mtight.
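The fix suggested above, computing differences in style from the parent instead of writing out the style every time, could look roughly like the following. This is only a sketch under stated assumptions: the style record shape and the function name `styleAttributeDiff` are hypothetical and do not reflect KaTeX's internal API.

```javascript
// Hypothetical style record: { displaystyle: boolean, scriptlevel: number }.
// Returns only the MathML attributes whose value differs from the parent's.
function styleAttributeDiff(parentStyle, childStyle) {
  const attrs = {};
  if (childStyle.displaystyle !== parentStyle.displaystyle) {
    attrs.displaystyle = String(childStyle.displaystyle);
  }
  if (childStyle.scriptlevel !== parentStyle.scriptlevel) {
    attrs.scriptlevel = String(childStyle.scriptlevel);
  }
  return attrs;
}

// Assumed defaults for an inline <math> element.
const inlineRoot = { displaystyle: false, scriptlevel: 0 };

// Same style as the parent: nothing to emit, so no <mstyle> wrapper would be needed.
console.log(styleAttributeDiff(inlineRoot, { displaystyle: false, scriptlevel: 0 }));

// Only displaystyle changed: scriptlevel is not repeated.
console.log(styleAttributeDiff(inlineRoot, { displaystyle: true, scriptlevel: 0 }));
```

With this approach an empty result would mean the wrapper element can be skipped entirely, which is the case the issue is mainly about.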

universemaster commented 3 years ago

An aside relating to other MathML attributes (which I'll create a new issue for when I get more time to add details):

I also very rarely want to explicitly set operators to not be stretchy, and yet I see "a lot" of operators with stretchy="false".

I've been stripping these out with:

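// Strip every explicit stretchy="false" attribute (including its leading space).
const stretchyRegex = / stretchy="false"/g;
// String.prototype.replace is synchronous, so no await is needed.
content = content.replace(stretchyRegex, '');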

I don't think this is causing any problems but I'll confirm when I create the full issue (I wanted to write this before I forgot about it).