mathjax / MathJax

Beautiful and accessible math in all browsers
http://www.mathjax.org/
Apache License 2.0
10.16k stars 1.16k forks

Chinese should be converted to <mi>, not <mo> #2501

Closed adamma1024 closed 3 years ago

adamma1024 commented 4 years ago

The <mo> tag is only for operators; Chinese characters should use the <mi> tag.

const result = Mathjax.tex2mmlSync('中文');
console.log(result) 
// <math xmlns="http://www.w3.org/1998/Math/MathML"><mo>&#x4E2D;&#x6587;</mo></math>
// it should be 
// <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>&#x4E2D;&#x6587;</mi></math>

dpvc commented 4 years ago

You should probably use \text{中文} for this, in any case.

MathJax's TeX input jax puts any character that doesn't have another definition, and isn't a letter (a to z, upper or lower case) or a digit (0 to 9), into an <mo>. It is true that this is not correct for some languages. The pattern that is used for letters (which are placed into <mi> elements) could be modified to include other characters.

For example, you could use

<script>
MathJax = {
  startup: {
    ready() {
      MathJax.startup.defaultReady();
      MathJax._.input.tex.MapHandler.MapHandler.getMap('letter')._regExp = /[a-z\u4E00-\u9FFF]/i;
    }
  }
};
</script>

to make the CJK Unified Ideographs block also map to <mi> elements. Note, however, that each character will be put into a separate <mi>, not all into one. As I said, you are better off using \text{} for multi-character words.
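
To make the effect of widening the letter pattern concrete, here is a minimal standalone sketch of the rule described above. It is not MathJax's actual code: the names DEFAULT_LETTER, EXTENDED_LETTER, tokenFor, and tagsFor are hypothetical, and the assumption that digits go into <mn> is mine rather than something stated in this thread. It only models which element a single character lands in, not how MathJax groups adjacent characters.

```javascript
// Hypothetical simplification of the rule described in this thread;
// this is NOT MathJax's implementation.
const DEFAULT_LETTER = /[a-z]/i;                 // letters a to z, either case
const EXTENDED_LETTER = /[a-z\u4E00-\u9FFF]/i;   // also the CJK Unified Ideographs block

// Return the MathML token element name for one character.
function tokenFor(ch, letterPattern) {
  if (letterPattern.test(ch)) return 'mi';  // letters become identifiers
  if (/[0-9]/.test(ch)) return 'mn';        // assumption: digits become numbers
  return 'mo';                              // anything else falls through to an operator
}

// Classify every character of the input separately.
function tagsFor(input, letterPattern) {
  return Array.from(input, (ch) => tokenFor(ch, letterPattern));
}

console.log(tagsFor('x2中', DEFAULT_LETTER));   // [ 'mi', 'mn', 'mo' ]
console.log(tagsFor('x2中', EXTENDED_LETTER));  // [ 'mi', 'mn', 'mi' ]
```

With the default pattern the Chinese character falls through to 'mo'; with the extended pattern it is treated as a letter, one character at a time.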

dpvc commented 4 years ago

I'm moving this to the main MathJax repository.

adamma1024 commented 3 years ago

Thank you very much! By the way, what should I do to put Chinese or other characters into tag?

dpvc commented 3 years ago

I'm not sure I understand the question. First, are you using LaTeX or MathML for your input format? Your original question uses TeX, so I assume that is what you are using. But in that case, you simply enter the characters as you did in the example you gave. Use Unicode characters for the ones that you want. I would recommend using \text{中文} if you are entering words rather than variables in your formulas. You can also use \unicode{x4E2D} to enter a character by its Unicode code point. You can also use an HTML entity like &#x4E2D; if you are authoring in HTML (if you are using Markdown or something similar, this may not come through properly).
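
If you are unsure of a character's code point, a small script can work it out. The following is a minimal sketch using only standard JavaScript string methods; the helper names hexCodePoint, texUnicode, and htmlEntity are hypothetical and not part of MathJax.

```javascript
// Hypothetical helpers: format a character's code point for use in
// \unicode{...} or as an HTML numeric character reference.
function hexCodePoint(ch) {
  // codePointAt(0) returns the full code point of the first character.
  return ch.codePointAt(0).toString(16).toUpperCase();
}

function texUnicode(ch) {
  return `\\unicode{x${hexCodePoint(ch)}}`;
}

function htmlEntity(ch) {
  return `&#x${hexCodePoint(ch)};`;
}

console.log(texUnicode('中'));  // \unicode{x4E2D}
console.log(htmlEntity('中'));  // &#x4E2D;
console.log(htmlEntity('文'));  // &#x6587;
```

The two entity values match the ones in the MathML output quoted at the top of this issue.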

dpvc commented 3 years ago

Also note that version 3.2 is more nuanced about the MathML tag that it uses for different ranges of Unicode characters, so Chinese characters are now placed in <mi> automatically.