mathjax / MathJax-src

MathJax source code for version 3 and beyond
https://www.mathjax.org/
Apache License 2.0

MathML output renders \varnothing with the same symbol as \emptyset #381

Closed YiNiCoo closed 4 years ago

YiNiCoo commented 5 years ago

When the output is MathML, the symbol produced by \varnothing changes and becomes the same as the one produced by \emptyset.

dpvc commented 5 years ago

The Unicode character set, used by MathML, only has one position for an empty-set symbol, so both \varnothing and \emptyset produce a reference to the same character. When MathJax renders the two, it uses a different font for each, and so the symbol is different for the two. In MathML output, however, MathJax doesn't control the fonts in use (it doesn't provide the fonts at all for native MathML output), and so it doesn't control the look of the empty-set symbol in that case.

If the distinction is important to you, you could potentially use CSS styling to specify a font for the \varnothing case. E.g., use \class{varnothing}{\varnothing}, and then add CSS for the .varnothing class that specifies a font containing the character in the form you want. You could make a macro for this to make it easier, or make a TeX pre-filter that replaces \varnothing with \class{varnothing}{\varnothing}. Or you could insert \let\VNTHNG=\varnothing \def\varnothing{\class{varnothing}{\VNTHNG}} at the top of your page to have all the \varnothing commands insert the class automatically.
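The pre-filter idea can be prototyped as a plain string rewrite applied to the TeX source before it is handed to MathJax. The following is a minimal Python sketch under that assumption; it is not MathJax's own pre-filter mechanism, and the name wrap_varnothing is hypothetical. It skips occurrences that are already wrapped, so running it twice is harmless.

```python
import re

# Match the control sequence \varnothing, but not a longer macro name such as
# \varnothingx, and not an occurrence already wrapped in \class{varnothing}{...}.
_VARNOTHING = re.compile(
    r"(?<!\\class\{varnothing\}\{)\\varnothing(?![A-Za-z])"
)

def wrap_varnothing(tex: str) -> str:
    """Wrap each bare \\varnothing in \\class{varnothing}{...} (hypothetical helper)."""
    # A function is used as the replacement so that the backslashes in the
    # replacement text are inserted literally rather than parsed as escapes.
    return _VARNOTHING.sub(lambda m: r"\class{varnothing}{\varnothing}", tex)

if __name__ == "__main__":
    print(wrap_varnothing(r"A \cap B = \varnothing"))
```

The wrapped output still needs a CSS rule for the .varnothing class on the page; which font that rule names is left to the reader, since it depends on what is installed or served.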

YiNiCoo commented 5 years ago

Thanks for your solution!

So can I conclude that MathJax doesn't guarantee 100% rendering consistency between the MathML output and the original LaTeX? If so, I need to know the main causes of the inconsistencies so that I can fix some of them manually. Could you tell me what they are?

dpvc commented 5 years ago

There are a number of situations where the MathML output may differ. In terms of characters, the following macros produce the same Unicode value as another symbol, but MathJax takes the glyph from an alternate font. Each macro is listed with the Unicode position of the character generated:

\varnothing             U+2205
\triangledown           U+25BD
\centerdot              U+22C5
\thicksim               U+223C
\thickapprox            U+2248
\smallsmile             U+2323
\shortmid               U+2223
\smallfrown             U+2322
\shortparallel          U+2225
\vartriangle            U+25B3
\nleqslant              U+2A87
\ngeqslant              U+2A88
\nleqq                  U+2270
\ngeqq                  U+2271
\lvertneqq              U+2268
\gvertneqq              U+2269
\nshortmid              U+2224
\nshortparallel         U+2226
\nsubseteqq             U+2288
\nsupseteqq             U+2289
\varsubsetneq           U+228A
\varsupsetneq           U+228B
\varsubsetneqq          U+2ACB
\varsupsetneqq          U+2ACC
\setminus               U+2216
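To check a TeX source ahead of time for macros that will lose their distinct glyph in MathML output, the list above can be kept as a lookup table. This is a minimal Python sketch; the table is copied from the list above, while the names VARIANT_FONT_MACROS and find_variant_macros are hypothetical and the macro-matching regex is a simplification that only recognizes alphabetic control sequences.

```python
import re

# Macro name -> Unicode code point, copied from the list above.
VARIANT_FONT_MACROS = {
    "varnothing": 0x2205, "triangledown": 0x25BD, "centerdot": 0x22C5,
    "thicksim": 0x223C, "thickapprox": 0x2248, "smallsmile": 0x2323,
    "shortmid": 0x2223, "smallfrown": 0x2322, "shortparallel": 0x2225,
    "vartriangle": 0x25B3, "nleqslant": 0x2A87, "ngeqslant": 0x2A88,
    "nleqq": 0x2270, "ngeqq": 0x2271, "lvertneqq": 0x2268,
    "gvertneqq": 0x2269, "nshortmid": 0x2224, "nshortparallel": 0x2226,
    "nsubseteqq": 0x2288, "nsupseteqq": 0x2289, "varsubsetneq": 0x228A,
    "varsupsetneq": 0x228B, "varsubsetneqq": 0x2ACB,
    "varsupsetneqq": 0x2ACC, "setminus": 0x2216,
}

# Simplified TeX tokenizer: a backslash followed by letters.
_MACRO = re.compile(r"\\([A-Za-z]+)")

def find_variant_macros(tex: str) -> list[tuple[str, str]]:
    """Return (macro, 'U+XXXX') pairs for listed macros, in order of first use."""
    seen: list[str] = []
    for name in _MACRO.findall(tex):
        if name in VARIANT_FONT_MACROS and name not in seen:
            seen.append(name)
    return [(name, f"U+{VARIANT_FONT_MACROS[name]:04X}") for name in seen]

if __name__ == "__main__":
    for macro, code_point in find_variant_macros(r"A \setminus B = \varnothing"):
        print(f"\\{macro} -> {code_point}")
```

Each macro it reports is a candidate for the \class workaround described earlier in the thread.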

In addition, MathML doesn't have a calligraphic math variant, and so MathJax uses an internal math variant to track that. That is supposed to be converted to mathvariant="script" with a special class so that the browser can style it differently, but that isn't being done in v3 (I have opened an issue for that). It may also be that the browser doesn't have fonts for things like Fraktur or blackboard bold, so those variants may not display as expected.

There are also spacing differences in the output. MathML spacing rules are different from TeX spacing, and MathJax tries to implement the TeX rules rather than the MathML rules. So if you move to MathML output directly, that can change spacing of some things.

MathJax also has some global controls for things like indentation and alignment of displayed equations. Those won't carry over into MathML output, as they are not part of the MathML itself, in general.

Finally, not all browsers implement all the MathML that MathJax uses internally. For example, Firefox doesn't implement <mlabeledtr>, which MathJax uses for equation tags. So numbered equations may not appear with their numbers in Firefox. There have been other differences in the past, but I haven't been following the Firefox MathML developments recently, so I don't know what the current set of differences is.
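One way to find out which serialized equations are affected by the <mlabeledtr> limitation is to scan the MathML for that element. The following is a minimal Python sketch using the standard library XML parser; the function name uses_mlabeledtr is hypothetical, and it assumes the MathML string is well-formed XML that uses numeric character references rather than named entities.

```python
import xml.etree.ElementTree as ET

def uses_mlabeledtr(mathml: str) -> bool:
    """Return True if the MathML string contains an <mlabeledtr> element."""
    root = ET.fromstring(mathml)
    for element in root.iter():
        # With a declared namespace the tag looks like "{namespace}mlabeledtr",
        # so compare only the local part after the closing brace.
        if element.tag.rsplit("}", 1)[-1] == "mlabeledtr":
            return True
    return False

if __name__ == "__main__":
    sample = (
        '<math xmlns="http://www.w3.org/1998/Math/MathML"><mtable>'
        "<mlabeledtr><mtd><mtext>(1)</mtext></mtd><mtd><mi>x</mi></mtd></mlabeledtr>"
        "</mtable></math>"
    )
    print(uses_mlabeledtr(sample))
```

Equations flagged this way are the ones whose numbers may be missing in a browser without <mlabeledtr> support.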

That's what I can think of off the top of my head. There may be other issues that I'm not remembering, but that should give you at least the major issues, I hope.

YiNiCoo commented 5 years ago

Thanks so much for your help!

All of your responses have given me a lot of valuable information, which is very helpful.

dpvc commented 4 years ago

I just made a pull request #393 that will preserve the variant-font information in the serialized MathML output, so the results of MathJax MathML output being read back into MathJax should produce the same result as the original TeX after that PR is merged.

YiNiCoo commented 4 years ago

I found other symbols: \vDash is the same as \models (U+22A8); \varpropto is the same as \propto (U+221D); \hbar is the same as \hslash (U+210F); \Join is the same as \bowtie (U+22C8); \thickapprox is the same as \approx (U+2248); \cdotp is the same as \cdot (U+22C5), and the dot position is wrong; \lVert and \rVert are the same as \Arrowvert (U+2016).

Also check \ncong: its Unicode value may be wrong too, although the rendering is correct.

dpvc commented 4 years ago

I've added some more commits to #393 in order to address some of these, and additional changes to #395 for others. The MathJax fonts don't include separate characters for \Join and \bowtie. I'm not sure what the complaint is for \lVert, \rVert and \Arrowvert.

You are right that there are some issues with the Unicode positions for some macros. These will be straightened out when we rework the fonts in a future release.

YiNiCoo commented 4 years ago

"I'm not sure what the complaint is for \lVert, \rVert and \Arrowvert."

On the page TeXSyntax.htm, the Unicode value given for \lVert and \rVert is U+2225, so I included them in the list.

dpvc commented 4 years ago

The page you link to is useful, but not authoritative. The mapping of \lVert and \rVert used to be to U+2225 (parallel to), but that was semantically incorrect and was changed to U+2016 in version 2.7.0. The page author did not update the unicode reference (she was probably unaware of the change). If you use the MathJax contextual menu to view the source as MathML, you will see that the actual character used is U+2016, as it has been in MathJax since 2016 (coincidentally).

YiNiCoo commented 4 years ago

Oh, I see.