mathjax / MathJax

Beautiful and accessible math in all browsers
http://www.mathjax.org/
Apache License 2.0

Wrong equation number when typesetting specific elements #2999

Open oanalavinia opened 1 year ago

oanalavinia commented 1 year ago

Issue Summary

My use case is that I want to typeset only some elements of a page, so I used the elements configuration option. The problem is that when one of those HTML elements contains multiple labeled equations defined with \begin{equation}, they are numbered incorrectly. This does not happen if the HTML element contains its equations in \begin{eqnarray}.

Steps to Reproduce:

  1. Configure mathjax using the configurations specified below
  2. Add 2 myMathjaxElement elements to your page, one that has multiple \begin{equation} equations and one that has only one. An example:

    <div class="myMathjaxElement">
    \begin{equation}
    \vec{F} = -F \hat{j} \label{eq1}
    \end{equation}
    
    \begin{equation}
    \vec{F} = -F \hat{j} \label{eq2}
    \end{equation}
    
    \begin{equation}
    \vec{F} = -F \hat{j} \label{eq3}
    \end{equation}
    </div>
    
    <div class="myMathjaxElement">
    \begin{equation}
    \vec{F} = -F \hat{j} \label{eq4}
    \end{equation}    
    </div>
  3. Observe the result

Expected result: equations are correctly numbered

Actual result: equations are wrongly numbered, see image:

[image]

Additional info: the same happens if I dynamically typeset these myMathjaxElement elements using typesetPromise. Also, as mentioned in the summary, this does not happen if I use \begin{eqnarray} to write my equations.

Technical details:

I am using the following MathJax configuration:

MathJax = {
    startup: {
      elements: document.getElementsByClassName("myMathjaxElement"),
    },
    tex: {
      tags: 'ams'
    }
  };

and loading MathJax via

<script async src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js"></script>

An HTML file to reproduce the problem:

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Testing MathJax v3 Equation Numbering</title>
  <script src="https://polyfill.io/v3/polyfill.min.js?features=es6"></script>
  <script>
  MathJax = {
    startup: {
      elements: document.getElementsByClassName("myMathjaxElement"),
    },
    tex: {
      tags: 'ams'
    }
  };
  </script>
  <script async src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js"></script>
</head>
<body>
  <h1>A test of Equation References</h1>
  <div class="myMathjaxElement">
    \begin{equation}
    \vec{F} = -F \hat{j} \label{eq1}
    \end{equation}

    \begin{equation}
    \vec{F} = -F \hat{j} \label{eq2}
    \end{equation}

    \begin{equation}
    \vec{F} = -F \hat{j} \label{eq3}
    \end{equation}
  </div>

  <div class="myMathjaxElement">
    \begin{equation}
    \vec{F} = -F \hat{j} \label{eq4}
    \end{equation}    
  </div>
</body>
</html>

oanalavinia commented 1 year ago

In case it is helpful, I did a small investigation on my side and it looks like the problem comes from here: https://github.com/mathjax/MathJax-src/blob/master/ts/handlers/html/HTMLDocument.ts#L178 It gets the correct linked list of equations from the first MathJax element (i.e. eq1 -> eq2 -> eq3), but for the second MathJax element, in the merge step, the new equation is merged between eq1 and eq2, so the new list is eq1 -> eq4 -> eq2 -> eq3.
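To make the merge behaviour described above concrete, here is a minimal toy model. It is not MathJax's code: the item shape, the positions, and the isBefore() and merge() helpers are all made up for illustration, on the assumption that each item is ordered by a string index n and an offset i that are local to its own container.

```javascript
// Toy model only: these are not MathJax's real classes or data.
// Each item records where its math was found: `n` is the index of the
// string within its container's list, `i` the offset inside that string.
const firstContainer = [
  { label: 'eq1', n: 0, i: 5 },
  { label: 'eq2', n: 0, i: 70 },
  { label: 'eq3', n: 0, i: 135 },
];
const secondContainer = [{ label: 'eq4', n: 0, i: 5 }];

// Position comparison that knows nothing about which container an item is in.
function isBefore(a, b) {
  return a.n < b.n || (a.n === b.n && a.i < b.i);
}

// Insert each new item in front of the first existing item it is "before".
function merge(list, additions) {
  const merged = [...list];
  for (const item of additions) {
    const at = merged.findIndex((other) => isBefore(item, other));
    merged.splice(at === -1 ? merged.length : at, 0, item);
  }
  return merged;
}

const mergedLabels = merge(firstContainer, secondContainer).map((m) => m.label);
console.log(mergedLabels.join(' -> ')); // eq1 -> eq4 -> eq2 -> eq3
```

Because eq4 has the same local position as eq1 and an earlier one than eq2, it lands between them, which matches the order reported above.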

vmassol commented 1 year ago

Hello. Could someone confirm that the problem is indeed a MathJax bug? Thanks!

dpvc commented 1 year ago

Yes, it does appear to be, but I haven't been able to investigate it thoroughly yet.

vmassol commented 1 year ago

Thanks @dpvc ! I see you've added the v4.0 milestone. Does it mean you're planning to fix it only for 4.x? We're currently on 3.x and are not against upgrading to 4.x but ATM there's only a v4.0.0 alpha 1 version released and I guess it's a bit risky for us to move to it. Is there any chance this could be fixed in 3.x? :) Thanks again.

dpvc commented 1 year ago

The next official release will be 4.0 (there may be a beta before then), but there won't be a 3.x release before that. Updating to 4.0 should be straightforward (compared to moving from v2 to v3, where the API changed). We haven't decided which v4 fixes will be back-ported to v3, if any, as there are significant internal changes in v4 that may make such back-ports more complicated to perform.

vmassol commented 1 year ago

ok, thanks @dpvc for the explanations. We're trying to work around the issue with https://github.com/xwiki-contrib/macro-mathjax/pull/2 but this makes us typeset the whole page and could cause other problems (math formulas outside our mathjax macro) so we're trying to assess what we can do to fix the issue in XWiki as we need to perform a release quite soon. I guess there's no timeframe on a final 4.0 yet? So FTM, I don't see any solution for us except applying https://github.com/xwiki-contrib/macro-mathjax/pull/2 and exchanging one bug for another (hoping that the new one is less frequent for our users ;)). Mentioning all this in case you have an idea... :)

dpvc commented 1 year ago

I'll see if I can take a look at it later today. I may be able to provide a patch. In the meantime, a workaround might be to disable the initial typesetting, and then use the startup pageReady() function to do something like

MathJax = {
  startup: {
    typeset: false,
    pageReady() {
      let promise = MathJax.startup.defaultPageReady();
      for (const container of document.getElementsByClassName("myMathjaxElement")) {
        promise = promise.then(() => MathJax.typesetPromise([container]));
      }
      return promise;
    }
  }
}

This is untested, but is the right idea. That should make sure the equations are processed in order for now.

dpvc commented 1 year ago

An alternative might be to use the ignoreHtmlClass and processHtmlClass options to control the typesetting. You could use

MathJax = {
  options: {
    ignoreHtmlClass: 'mathjax_ignore',
    processHtmlClass: 'myMathjaxElement'
  }
}

and add class="mathjax_ignore" to the <body> element. That should make MathJax only process the containers that are marked class="myMathjaxElement".

dpvc commented 1 year ago

I've just tested both work-arounds, and they both work on your test page. The second might be the easiest and most efficient one.

vmassol commented 1 year ago

@dpvc wow, thanks a lot, that's very useful! We're trying them now.

dpvc commented 1 year ago

I left out the typeset: false in the first example, but have added it in.

oanalavinia commented 1 year ago

Thanks a lot for the workarounds and quick responses! Both work on our side!

dpvc commented 1 year ago

Great. Glad you are able to get a work around for now.

dpvc commented 1 year ago

I have figured out the source of the issue, and it is in the function that determines when one math item is before another. Since there is no JavaScript function for determining whether one DOM node is before another, MathJax uses the index in the array of strings that it has culled from the DOM to tell when one expression is before another. But when there are multiple containers being searched, each gets its own list of strings, and so the indices don't properly represent their positions in the DOM relative to other containers.

I've refactored the code so that the strings from all containers are combined into one list, and so the order is maintained, at least for a given typeset pass. So that resolves your situation (though there are potentially other edge cases where there could be problems).
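As a rough illustration of why one combined list keeps the order, here is a toy sketch. It is not the code from the pull request: the findLabels() and byPosition helpers and the sample strings are hypothetical, and \label markers stand in for whole equations.

```javascript
// Toy model only: not the code from the pull request.
// Each inner array stands for the strings collected from one container.
const containers = [
  ['\\label{eq1} \\label{eq2} \\label{eq3}'],
  ['\\label{eq4}'],
];

// Record each label with the index of its string (n) and its offset (i).
function findLabels(strings) {
  const found = [];
  strings.forEach((text, n) => {
    for (const match of text.matchAll(/\\label\{(\w+)\}/g)) {
      found.push({ label: match[1], n, i: match.index });
    }
  });
  return found;
}

// Order by string index first, then by offset within the string.
const byPosition = (a, b) => a.n - b.n || a.i - b.i;
const labels = (items) => items.map((item) => item.label).join(' -> ');

// One list of strings per container: every container restarts at n = 0.
const separate = containers.flatMap((strings) => findLabels(strings)).sort(byPosition);

// One list of strings for all containers: n keeps counting across them.
const together = findLabels(containers.flat()).sort(byPosition);

console.log(labels(separate)); // eq1 -> eq4 -> eq2 -> eq3
console.log(labels(together)); // eq1 -> eq2 -> eq3 -> eq4
```

With separate lists, eq1 and eq4 share the position (0, 0), so sorting cannot tell them apart from document order. With a single list, eq4 gets n = 1 and sorts after everything in the first container.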

Just for completeness, here is a configuration that makes the changes in the pull request that I've opened for this issue, but I think you are fine to use one of the other simpler solutions instead.

  MathJax = {
    startup: {
      elements: document.getElementsByClassName("myMathjaxElement"),
      ready() {
        const {HTMLDocument} = MathJax._.handlers.html.HTMLDocument;
        const {userOptions} = MathJax._.util.Options;
        Object.assign(HTMLDocument.prototype, {
          findMath(options) {
            if (!this.processed.isSet('findMath')) {
              this.adaptor.document = this.document;
              options = userOptions({elements: this.options.elements || [this.adaptor.body(this.document)]}, options);
              const containers = this.adaptor.getElements(options.elements, this.document);
              for (const jax of this.inputJax) {
                const list = (jax.processStrings ?
                              this.findMathFromStrings(jax, containers) :
                              this.findMathFromDOM(jax, containers));
                this.math.merge(list);
              }
              this.processed.set('findMath');
            }
            return this;
          },
          findMathFromStrings(jax, containers) {
            const strings = [];
            const nodes = [];
            for (const container of containers) {
              const [slist, nlist] = this.domStrings.find(container);
              strings.push(...slist);
              nodes.push(...nlist);
            }
            const list = new this.options.MathList();
            for (const math of jax.findMath(strings)) {
              list.push(this.mathItem(math, jax, nodes));
            }
            return list;
          },
          findMathFromDOM(jax, containers) {
            const items = [];
            for (const container of containers) {
              for (const math of jax.findMath(container)) {
                items.push(new this.options.MathItem(math.math, jax, math.display, math.start, math.end));
              }
            }
            return new this.options.MathList(...items);
          }
        });
        MathJax.startup.defaultReady();
      }
    },
    tex: {
      tags: 'ams'
    }
  };

mflorea commented 8 months ago

"Since there is no javascript function for determining whether one DOM node is before another"

Can't you determine this using https://developer.mozilla.org/en-US/docs/Web/API/Node/compareDocumentPosition ?
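For reference, a minimal sketch of how that DOM method could be used to compare two nodes. The helper name isBeforeInDocument is hypothetical, and the sketch assumes both arguments implement the standard compareDocumentPosition() method, as nodes in a browser DOM do. Whether this would fit MathJax's internals is not settled in this thread.

```javascript
// Value of Node.DOCUMENT_POSITION_FOLLOWING in the DOM standard, written
// out so the function does not depend on a global `Node` being defined.
const DOCUMENT_POSITION_FOLLOWING = 0x04;

// True when `a` comes before `b` in document order.
// Assumes both arguments implement the standard compareDocumentPosition().
function isBeforeInDocument(a, b) {
  return (a.compareDocumentPosition(b) & DOCUMENT_POSITION_FOLLOWING) !== 0;
}

// Possible use in a browser, with the test page from this issue:
//   const [first, second] = document.getElementsByClassName('myMathjaxElement');
//   isBeforeInDocument(first, second);
```

The result of compareDocumentPosition() is a bitmask, so the check tests the "following" bit instead of comparing the whole value.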