mathjax / MathJax

Beautiful and accessible math in all browsers
http://www.mathjax.org/
Apache License 2.0

MathJax does not support Unicode 14 mathematical script variants #3045

Open JeppeKlitgaard opened 1 year ago

JeppeKlitgaard commented 1 year ago

I much prefer writing the majority of my source math using Unicode, which generally works really well with MathJax.

I have noticed, however, that when I want to display calligraphic math, the Unicode characters render in an unexpected font in MathJax.

My suggestion is thus that the Unicode range starting at U+1D49C render like \mathcal{…}

As an example, insert the following on https://www.mathjax.org/#demo:

$$
\mathcal{O} \quad\quad 𝒪
$$

Which renders as:

image

I would expect them to have the same rendering

pkra commented 1 year ago

𝒪 matches \mathscr{O}, not \mathcal{O}

image

(There's a unicode proposal for distinguishing the calligraphic/chancery alphabet from the fancy-script/roundhand alphabet using variation selectors though I'm not sure where that's at.)
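For anyone who wants to check which code point is involved, here is a minimal Python sketch using only the standard `unicodedata` module. It shows that 𝒪 is U+1D4AA and that the block starting at U+1D49C has gaps (script B, for instance, lives at U+212C). The formal character name only says "script", so it does not by itself settle the chancery versus roundhand question.

```python
import unicodedata

# 𝒪 as typed in the example above, plus script B, which sits outside the main block
for ch in ("\U0001D4AA", "\u212C"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# U+1D49D (where script B "should" be) is reserved, so it has no name
print(unicodedata.name("\U0001D49D", "<unassigned>"))
```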

JeppeKlitgaard commented 1 year ago

Ah, now that makes a lot of sense! Thanks a ton @pkra.

I think with that in mind I will close this issue.

From https://github.com/w3c/mathml/issues/61 and https://w3c.github.io/xml-entities/script.html it appears that variation sequences for the two script styles were added as part of Unicode 14!

JeppeKlitgaard commented 1 year ago

It appears that the Unicode variation selectors are not supported by MathJax just yet :(

Here is a demo that can be used on https://www.mathjax.org/#demo:

Calligraphic Variants: 𝒪︀ (text), $𝒪︀$ (MathJax, Unicode), $\mathcal{O}$ (MathJax, LaTeX)
<br>
Script Variants: 𝒪︁ (text), $𝒪︁$ (MathJax, Unicode), $\mathscr{O}$ (MathJax, LaTeX)
<br>
No Variant Selected: 𝒪 (text), $𝒪$ (MathJax, Unicode)

Which renders as:

image

The variants are not rendered correctly when given as Unicode. Note that the browser also does not correctly render the variants, presumably because of a lack of font support. Since MathJax has access to glyphs for both variants, it should be able to render the variants correctly.

Expected:

xworld21 commented 1 year ago

I'd like to add that other symbols have variation sequences too. The one I noticed is the empty set symbol U+2205, which the Unicode spec renders like \varnothing, but which looks like \emptyset when followed by U+FE00. (By the way, MathJax 3 renders U+2205 like \emptyset.)

xworld21 commented 1 year ago

I have jerry-rigged partial support for some variation sequences by adding some filters after initialisation but before typesetting: https://github.com/vlmantova/bookml/blob/694457da6c22b530b24c488a6bca89112bcb0a7f/XSLT/bookml-html5.xsl#L101-L169 The code modifies the MathML nodes before they get loaded so that MathJax uses the desired variant. It understands script variants and makes U+2205 behave as per Unicode 15.1 (so it renders like \varnothing by default and switches to \emptyset if followed by U+FE00). The main limitation is that it only works for identifiers and operators consisting of a single character.

The full list of standardized variation sequences is at https://www.unicode.org/Public/15.1.0/ucd/StandardizedVariants.txt
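As a rough illustration of how that list could be turned into a lookup table, here is a Python sketch. It assumes each data line has the layout `base selector; description; # name`, with exactly two code points in the first field. The two sample lines are written from memory in that layout and should be checked against the real file; `parse_variants` is a hypothetical helper name.

```python
# Sample lines in the assumed layout of StandardizedVariants.txt (verify against the file)
SAMPLE = """\
# comment lines start with a hash
2205 FE00; zero with long diagonal stroke overlay form; # EMPTY SET
1D49C FE00; chancery style; # MATHEMATICAL SCRIPT CAPITAL A
"""

def parse_variants(text):
    """Map (base, selector) code point pairs to their description."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comment
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        base_hex, selector_hex = fields[0].split()  # assumes exactly two code points
        table[(int(base_hex, 16), int(selector_hex, 16))] = fields[1]
    return table

variants = parse_variants(SAMPLE)
print(variants[(0x2205, 0xFE00)])
```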

If I understand correctly, variation sequences should really be implemented in the fonts and enabled via font-feature-settings: "cv01", "cv02" in CSS (not sure if MathJax needs code changes, maybe around font metrics?).

xworld21 commented 12 months ago

I repackaged my code: https://codepen.io/xworld21/pen/oNmbeVR contains a MathJax 3 config that recognises the variation sequences in MathML input (not in TeX input I am afraid).
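For TeX input, one possible stopgap is to rewrite the source text before MathJax ever sees it, for example in a static site build step. The Python sketch below is only that, a sketch: it is not part of MathJax or of the CodePen above, and `expand_variation_sequences` is a hypothetical name. It assumes, as in the demo earlier in this thread, that U+FE00 selects the calligraphic (chancery) form and U+FE01 the script (roundhand) form, and it only handles the 26 capital script letters.

```python
import re
import string

VS1 = "\uFE00"  # assumed: chancery, i.e. \mathcal
VS2 = "\uFE01"  # assumed: roundhand, i.e. \mathscr

# Script capitals that live in Letterlike Symbols rather than at U+1D49C + offset
HOLES = {"B": 0x212C, "E": 0x2130, "F": 0x2131, "H": 0x210B,
         "I": 0x2110, "L": 0x2112, "M": 0x2133, "R": 0x211B}

# Unicode script capital -> plain ASCII letter
SCRIPT_CAPITALS = {
    chr(HOLES.get(letter, 0x1D49C + i)): letter
    for i, letter in enumerate(string.ascii_uppercase)
}

PATTERN = re.compile("([" + "".join(SCRIPT_CAPITALS) + "])([\uFE00\uFE01])")

def expand_variation_sequences(tex):
    """Replace script capital + variation selector with \\mathcal{X} or \\mathscr{X}."""
    def repl(match):
        macro = r"\mathcal" if match.group(2) == VS1 else r"\mathscr"
        return f"{macro}{{{SCRIPT_CAPITALS[match.group(1)]}}}"
    return PATTERN.sub(repl, tex)

print(expand_variation_sequences("$\U0001D4AA\uFE00$ and $\U0001D4AA\uFE01$"))
```

A script letter without a selector is left untouched, so MathJax's current default rendering still applies to it.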