mathjax / MathJax

Beautiful and accessible math in all browsers
http://www.mathjax.org/
Apache License 2.0

MathJax 3 renders unicode combining characters as separate glyphs #3041

Open: benchristel opened this issue 1 year ago

benchristel commented 1 year ago

Issue Summary

Unicode combining characters in \text do not combine with the preceding character as they should; instead they render as separate glyphs with circle placeholders. This is a problem for scripts that rely on combining characters, e.g. Bengali (shown below).

[Screenshot: Screen Shot 2023-04-27 at 8 43 29 AM]
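
To make the report concrete, here is a small sketch that lists the code points of the sample word and flags which ones are combining marks. It assumes the word decomposes to U+0985 U+0999 U+09CD U+0995; the helper name `describeCodePoints` is made up for illustration and is not part of MathJax.

```javascript
// Sketch: list the code points of the sample word and flag combining marks.
// Assumption: the word in the report is U+0985 U+0999 U+09CD U+0995.
const word = "\u0985\u0999\u09CD\u0995";

// Hypothetical helper (not a MathJax API): one entry per code point.
function describeCodePoints(text) {
  return [...text].map((ch) => ({
    codePoint:
      "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"),
    // \p{M} matches the Unicode "Mark" general categories (Mn, Mc, Me).
    isCombiningMark: /\p{M}/u.test(ch),
  }));
}

const described = describeCodePoints(word);
console.log(described);
```

The third entry, U+09CD (the Bengali virama), is a combining mark. If it ends up in a different text node from its neighbours, the browser has nothing to attach it to, which would be consistent with the circle placeholder in the screenshot.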

Steps to Reproduce:

On https://www.mathjax.org/#demo, enter the text $\text{অঙ্ক}$ in the box.

Expected: no circle placeholders; the output text renders the same as the input.

Actual: a circle placeholder appears; the output looks different from the input.

Technical details:

MathJax version: 3.2.2, using the NodeJS APIs in the mathjax-full package.
Client OS: macOS 12.6
Browser: Microsoft Edge

Supporting Information

This bug report looks related: https://github.com/mathjax/MathJax/issues/2672

Workarounds:

To work around this problem, I am running the following post-processing code on the DOM returned by MathJax. It merges each run of adjacent mjx-utext nodes into a single node, so the browser's text-rendering engine gets the base and combining characters together and can shape them as it normally would.

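function mergeUnicodeTextNodes(dom) {
    for (const mtext of dom.querySelectorAll("mjx-mtext")) {
        // First mjx-utext node of the current run of adjacent ones.
        let head = null;
        const toRemove = [];
        for (const child of mtext.childNodes) {
            if (child.tagName === "MJX-UTEXT") {
                if (head === null) {
                    head = child;
                } else {
                    // Append this node's text to the head of the run.
                    head.firstChild.textContent += child.firstChild.textContent;
                    toRemove.push(child);
                }
            } else {
                // Any other node ends the run.
                head = null;
            }
        }
        // Remove the merged nodes only after the loop, so childNodes
        // is not modified while it is being iterated.
        for (const child of toRemove) {
            mtext.removeChild(child);
        }
    }

    return dom;
}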

dpvc commented 1 year ago

This is resolved in version 4, currently in alpha release, with an expected beta release in the next two weeks. If you don't want to try alpha or beta versions, then the current work-around is to have mtext elements inherit the surrounding text font using

MathJax = {
  chtml: {
    mtextInheritFont: true
  },
  svg: {
    mtextInheritFont: true
  }
}

as recommended in #2672. The update to CHTML output for this is in mathjax/MathJax-src#734 in commit 115c54, and for SVG it is mathjax/MathJax-src#903.
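
Since the report uses the NodeJS APIs in mathjax-full rather than the global `MathJax` configuration object, the same option would presumably be passed to the output jax constructor instead. The following is a minimal sketch under that assumption; it also assumes mathjax-full 3.x is installed, and the helper name `createOutputJax` is hypothetical, not part of MathJax.

```javascript
// Sketch only: assumes mathjax-full 3.x is installed. The helper name
// createOutputJax is hypothetical and not part of MathJax.
const outputOptions = { mtextInheritFont: true };

function createOutputJax(kind) {
  // Modules are loaded lazily, so the options object can be inspected
  // even when mathjax-full is not present.
  if (kind === "svg") {
    const { SVG } = require("mathjax-full/js/output/svg.js");
    return new SVG(outputOptions);
  }
  const { CHTML } = require("mathjax-full/js/output/chtml.js");
  return new CHTML(outputOptions);
}
```

The returned object would then be used as the `OutputJax` when creating the MathJax document, in place of an output jax constructed without options.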