mathjax / MathJax

Beautiful and accessible math in all browsers
http://www.mathjax.org/
Apache License 2.0

[v4 alpha] TeX parsing failure with nested underbraces #2979

Open pkra opened 1 year ago

pkra commented 1 year ago

The following works in v3 (e.g., mathjax.org#demo; also in "real" TeX) but fails in v4-alpha:

\underbrace{\int\underbrace{a}_{}\underbrace{b(c,d)e}_{}}_{}

It fails with an error about a wrong number of children for an munder node.

pkra commented 1 year ago

This was the best I could do in terms of a reduced test case. Here's the original:

\begin{equation}
0\le u_0(t)=\lambda \underbrace{\int _0^1\underbrace{\big (M(\rho )\big )^{-1}}_{<0}\underbrace{G(t,s)f\big (s,u_0(s)\big )}_{>0}\ ds}_{<0}<0,
\end{equation}

dpvc commented 1 year ago

Here is a somewhat more reduced version:

\int a\underbrace{(b)c}

dpvc commented 1 year ago

It has something to do with SRE, as it renders properly when not enriched.

pkra commented 1 year ago

Thanks for looking into it @dpvc.

Here's a possibly related one (mover error):

\begin{equation*} \langle F_s, g\rangle _{L^2(\mu )} = \int _{S^{n-1}} \widehat{\mu f}(s\omega ) \overline{\widehat{\mu g}(s\omega )}\,d\omega = s^{1-n} \langle \widehat{\mu f}, \widehat{\mu g}\rangle _{L^2(sS^{n-1})}. \end{equation*}

pkra commented 1 year ago

Reduced: \int\widehat{\mu f}(s)

dpvc commented 1 year ago

I suspect this is the same issue. Volker and I chatted about this in our meeting this morning, and it looks like it is SRE's handling of accents, so he will look into it.

pkra commented 1 year ago

Thanks, Davide.

zorkow commented 1 year ago

I also get a bizarre rendering for \int\widehat{\mu f}(s) without enrichment in the latest version of MathJax. Somehow that \widehat goes through the integral sign.

The last commit in which it seems to work fine is e9933f974d56ecc2179bb206fb8f9876d3501314

dpvc commented 1 year ago

@zorkow, there is an issue (that I'm already aware of) with some combining characters. In the past, fonts handled these by making zero-width characters that overlapped their bounding box on the left so that the character would appear above the preceding one. Modern browsers, however, will position these characters themselves, even when they are not zero width (they handle capital and lower case letters that way, for example). So many new fonts don't make the combining characters zero width any more, but the browser automatically renders them as though they were. MathJax doesn't yet take that new behavior into account, so when a character in the font has non-zero width, MathJax assumes the browser will use that width, which browsers no longer do for combining characters.

I need to take that into account, but didn't have time to work that out before 4.0-alpha. It is on my list of things to fix. There is a complication that not all browsers do this (notably Safari and WebKit), so I need a solution that works for both cases (where the browser makes a character zero width even though the metrics say it isn't, and when the browser uses the given width). I had worked out a scheme for that some months ago, so I hope I still remember how that works. :-)

In any case, you can ignore that for now.

dpvc commented 1 year ago

@zorkow: PS, the commit you mention is the one just before the new default font was merged. The old MathJax TeX fonts used zero-width combining characters, while some of the ones in the MathJax-Modern font are not zero-width, which is why the issue shows up in that case. MathJax assumes these characters are not zero-width, as the font metrics indicate, but the browser renders them as zero-width, so they shift to the left compared to where MathJax thinks they are and end up out of place.
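
The mismatch described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not MathJax code: the function name `glyph_offset`, the `browser_collapses` flag, and the widths are all hypothetical.

```python
def glyph_offset(metric_width: float, browser_collapses: bool) -> float:
    """Leftward shift (same units as metric_width) between where a layout
    engine that trusts the font metrics expects a combining glyph and where
    the browser actually draws it."""
    # A browser that positions combining characters itself ignores the
    # advance width, so the whole metric width becomes the placement error.
    return metric_width if browser_collapses else 0.0


# Hypothetical widths in em, for illustration only.
old_font_width = 0.0  # zero-width combining glyph (older font style)
new_font_width = 0.5  # non-zero-width combining glyph (newer font style)

shift_old = glyph_offset(old_font_width, browser_collapses=True)
shift_new = glyph_offset(new_font_width, browser_collapses=True)
shift_new_literal = glyph_offset(new_font_width, browser_collapses=False)
```

Under these assumptions only the combination of a non-zero-width glyph and a browser that collapses it produces a shift, which is why a fix has to cover both browser behaviors.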

pkra commented 1 year ago

Here's likely another example, this time with overline.

\begin{multline} \bigg |\int \limits _{\mathbb{R}^{d-1}}\Big [f*\mathcal{F}_{\zeta \mapsto z}\big [e^{2\pi i x_dh(\zeta )}\psi (\zeta )\big ]\Big ](z)\cdot \overline{\Big [g*\mathcal{F}_{\zeta \mapsto z}\big [e^{2\pi i y_dh(\zeta )}\psi (\zeta )\big ]\Big ](z)}\cdot |z|^{-2\gamma }\,dz\bigg |\\ \lesssim (1+|x_d|)^{-\gamma }(1+|y_d|)^{-\gamma }\|f\|_{L_{\frac{2d-2}{d-1+2\gamma }}}\|g\|_{L_{\frac{2d-2}{d-1+2\gamma }}}. \cssId{AlmostDone}{\tag{59}} \end{multline}

zorkow commented 1 year ago

As mentioned to @dpvc, it's a single-line fix, strengthening one guard. The problem actually came from improved detection of composed functions. But when the terms were not functions and just elided multiplications, they were still "attacked" by the method that tries to combine those.

pkra commented 1 year ago

Thanks, Volker.

pkra commented 1 year ago

I'm not sure if this is the same problem but since these all seemed related to integrals, it might be:

\int _{b} \operatorname {\mathrm{a}{b}}_{c}

full test case:

\begin{eqnarray*} \frac{1}{2} \operatorname {Vol}(\mathcal{M}_{1,1}) &=& \int _{\mathcal{M}_{1,1}} \sum _\alpha \frac{1}{1+e^{\ell _X(\alpha )}} \operatorname {\mathrm{d}{Vol}}_{WP} \\ &=& \int _{\mathcal{M}_{1,1}^*} \frac{1}{1+e^{\ell _X(\alpha )}} \operatorname {\mathrm{d}{Vol}}_{WP}. \end{eqnarray*}

This gives an msub error.

pkra commented 1 year ago

On second thought, this looks familiar. Sorry if I already filed this somewhere.

dpvc commented 1 year ago

Shorter test case: \int {{a}b}_{c}

I don't recognize this, so I don't think it was reported previously.

pkra commented 1 year ago

Here's what I think is another variation: B \int \underbrace{\Pi ^m \phi }_{{}<0} -- which renders the <0 part as if it followed the underbrace expression (and if I remove the B, I'm back to munder issues).

dpvc commented 1 year ago

Slightly reduced: a \int \underbrace{bc}_{d}

Removing the a gives an invalid munder element; with the a, the d follows the underbraced element.

dpvc commented 1 year ago

@zorkow, I'm not sure if this is the same issue that we discussed last week.

zorkow commented 1 year ago

That's exactly the same issue. It does not happen in the pkra-issue branch. I've made a PR and will merge as soon as I've written some tests.

What happens is that the role of the element bound by the underbrace is propagated (in this case an implicit multiplication). This can make sense, e.g., if you have a function composition, the term could be composed with another function. But since the guard for implicit multiplication was too weak (i.e., it only acted on the role rather than also checking that the whole expression is indeed an operation), any heuristic that tries to unpack elements in elided multiplications can wreak havoc. In this case the heuristic that tries to determine the integral variable illegally unpacks the underbrace. E.g., if you have $\int a b d x r $, SRE would find the dx.
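
A toy model of the weak versus strengthened guard, written in Python rather than in SRE's own code. Everything here is an assumption made for illustration: the `Node` class, the type and role labels, and the `flatten_factors` helper are hypothetical and are not SRE's actual API or the actual fix.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    type: str                      # illustrative labels, e.g. "infixop"
    role: str = ""                 # "implicit" marks an elided multiplication
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def weak_guard(node: Node) -> bool:
    # Looks at the role only, so a propagated role is enough to pass.
    return node.role == "implicit"


def strong_guard(node: Node) -> bool:
    # Also requires the node itself to be an operation.
    return node.type == "infixop" and node.role == "implicit"


def flatten_factors(nodes: List[Node], guard: Callable[[Node], bool]) -> List[Node]:
    """Unpack every node accepted by the guard into its children."""
    out: List[Node] = []
    for n in nodes:
        if guard(n):
            out.extend(flatten_factors(n.children, guard))
        else:
            out.append(n)
    return out


def ident(t: str) -> Node:
    return Node("identifier", text=t)


# \underbrace{bc}_{d}: the underbrace node inherits the "implicit" role of bc.
bc = Node("infixop", "implicit", children=[ident("b"), ident("c")])
brace = Node("underscore", "implicit", children=[bc, ident("d")])

weak = [n.text for n in flatten_factors([brace], weak_guard)]
strong = flatten_factors([brace], strong_guard)
```

In this sketch the weak guard takes the underbrace apart into b, c, d, so the d ends up as a sibling following the braced content, while the stronger guard leaves the underbrace node intact.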