mathjax / MathJax

Beautiful and accessible math in all browsers
http://www.mathjax.org/
Apache License 2.0
Two-Character Unicode Support #2672

Open queejie opened 3 years ago

queejie commented 3 years ago

Is your feature request related to a problem? Please describe. I'm almost certain that two-character Unicode symbols cannot be rendered using any tags supported in MathJax. Some sample codes are those found for national flags, here. Whenever such a double character is rendered, it comes out as two separate elements. E.g., image

Describe the solution you'd like It would be great if \verb, or \unicode or some other TeX tag could render the double character. E.g., 🇦🇩, or U+1F1E6 U+1F1E9.
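For readers unfamiliar with why the flag counts as a "double character", here is a small sketch that inspects it in plain JavaScript. It uses only standard string methods; the variable names are illustrative and nothing here is MathJax-specific.

```javascript
// The flag from the request, built from the two code points quoted above.
const flag = String.fromCodePoint(0x1F1E6, 0x1F1E9);

// Array.from walks a string by code point rather than by UTF-16 code unit.
const codePoints = Array.from(flag, (ch) =>
  'U+' + ch.codePointAt(0).toString(16).toUpperCase()
);

console.log(codePoints);  // one entry per regional-indicator character
console.log(flag.length); // counts UTF-16 code units, so 4 rather than 2
```

The browser only draws a single flag when both code points reach the font together, which is why it matters how the two characters are split up during rendering.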

Describe alternatives you've considered I am building a browser based formula editor based on MathJax, and it uses TeX as a basis for constructing elements. I had hoped there was a TeX command that might work, but after two days of reading and experimentation, I am unable to find anything.

Additional context The output format used is SVG, and the TeX is converted to MathML using MathJax.

Thank you!

pkra commented 3 years ago

I think https://github.com/mathjax/MathJax/issues/2595 and https://github.com/mathjax/MathJax-src/pull/676 should help.

dqjauthentrics commented 3 years ago

Thank you very much! Most of the discussion was over my head, but if I got the gist I can now use \symbol{🇦🇩, }, and that this capability is in a new version that is not part of a CDN distribution. Do I have that right? If so, would I need to follow the instructions in the manual for "Hosting Your Own Copy", using the version in the pull request?

dpvc commented 3 years ago

if I got the gist I can now use \symbol{🇦🇩, }, and that this capability is in a new version that is not part of a CDN distribution.

Not quite. There is no \symbol macro, but there are ones for the different variant forms, like \symbf for bold, \symsfit for sans-serif-italic, and so on. The reason Peter pointed to this is that these macros also will group multiple characters into a single MathML element, whereas each character normally is put into its own MathML element. So ab is usually translated as <mi>a</mi><mi>b</mi>, but \symbf{ab} would produce <mi mathvariant="bold">ab</mi> with both letters in the same <mi>.
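To make the difference concrete, here is a toy sketch of the two translations described above. Both helper functions are hypothetical and written only for illustration; they are not MathJax functions and do not reflect how MathJax builds its MathML internally.

```javascript
// Hypothetical helper: one <mi> per character, mirroring the default
// translation of "ab" described above.
function perCharacter(text) {
  return Array.from(text, (ch) => `<mi>${ch}</mi>`).join('');
}

// Hypothetical helper: every character inside a single <mi>, mirroring
// what \symbf{ab} is described as producing.
function grouped(text, mathvariant) {
  return `<mi mathvariant="${mathvariant}">${text}</mi>`;
}

console.log(perCharacter('ab'));    // <mi>a</mi><mi>b</mi>
console.log(grouped('ab', 'bold')); // <mi mathvariant="bold">ab</mi>
```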

Unfortunately, that does not solve the problem. The multi-character flags seem to be handled as ligatures, which means the two characters have to be next to each other in the same DOM node in the final output. When MathJax encounters characters that are not in its math fonts (like these), it places each one in a separate <mjx-utext> element (for CHTML output) or <text> element (for SVG output), so they are not next to each other in the DOM and are not combined.

Fortunately, there is a way to get what you want. That is to use \text{🇦🇩} and to configure MathJax to use either the surrounding font for text elements (via the mtextInheritFont: true property), or to use a specific font for text elements (via mtextFont: 'Times' for example). When one of these properties is set, MathJax will render <mtext> MathML elements as a single DOM node, and that will allow the two characters to be combined by the browser. So

MathJax = {
  svg: {
    mtextInheritFont: true
  }
}

together with \text{🇦🇩} in the TeX input should provide the result you are looking for in SVG output (change svg to chtml if you are using CommonHTML output).
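If you switch between output formats or between the two font options, a small helper can build the configuration for you. This is only a sketch: `textFontConfig` is a hypothetical name, not part of MathJax, and the only MathJax options it relies on are the `mtextInheritFont` and `mtextFont` properties mentioned above.

```javascript
// Hypothetical helper (not part of MathJax): returns a configuration object
// that sets the text-font option for one output renderer.
// `output` is 'svg' or 'chtml'. Pass a font name to use mtextFont,
// or omit it to inherit the surrounding font via mtextInheritFont.
function textFontConfig(output, fontName) {
  if (output !== 'svg' && output !== 'chtml') {
    throw new Error(`unsupported output: ${output}`);
  }
  const options = fontName
    ? { mtextFont: fontName }
    : { mtextInheritFont: true };
  return { [output]: options };
}

// In a page, assign the result to window.MathJax before the MathJax script loads.
const config = textFontConfig('svg');
console.log(JSON.stringify(config));
```

Calling `textFontConfig('chtml', 'Times')` would give the `mtextFont` variant for CommonHTML output instead.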

The current behavior of putting each unknown character into its own DOM element could probably be improved to combine them into a single one, in which case Peter's suggestion would have worked. But that's not the case right now.

As for v3.1.3, it will be released either later today or tomorrow, so no need to install your own.

queejie commented 3 years ago

Thanks to both of you!

(Sent by email on Apr 22, 2021, in reply to Davide P. Cervone.)

dqjauthentrics commented 3 years ago

That worked great for MathML, but unfortunately not for saving as SVG. (See screenshot, below).

This is my MathJax setup:

MathJax = {
  chtml: {
    mtextFont: 'arial unicode ms',
    // mtextInheritFont: true,
  },
  svg: {
    mtextFont: 'arial unicode ms',
    // mtextInheritFont: true,
  },
  loader: {
    load: [
      'input/tex-full', 'input/mml', 'output/svg',
      'ui/menu', 'ui/safe',
      'a11y/semantic-enrich', 'a11y/complexity', 'a11y/explorer', 'a11y/assistive-mml',
      '[tex]/ams', '[tex]/braket', '[tex]/boldsymbol', '[tex]/require',
      '[tex]/html', '[tex]/unicode', '[tex]/verb'
    ]
  },
};

dqjauthentrics commented 3 years ago

Oops! My bad. Setting mtextInheritFont: true worked for SVG as well. Please consider this closed.

dpvc commented 3 years ago

Your original configuration with mtextFont also works for me.

BTW, there was no attached screen image. Also, you don't need to load the [tex]/... packages explicitly because they are already included in input/tex-full.
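As a rough sketch of that clean-up, the load list from the posted configuration can be filtered so the explicit [tex]/... entries are dropped whenever input/tex-full is present. The variable names are illustrative, and the filter relies only on the statement above that input/tex-full already includes those packages.

```javascript
// Load list copied from the configuration posted earlier in the thread.
const posted = [
  'input/tex-full', 'input/mml', 'output/svg',
  'ui/menu', 'ui/safe',
  'a11y/semantic-enrich', 'a11y/complexity', 'a11y/explorer', 'a11y/assistive-mml',
  '[tex]/ams', '[tex]/braket', '[tex]/boldsymbol', '[tex]/require',
  '[tex]/html', '[tex]/unicode', '[tex]/verb'
];

// Drop the explicit TeX extension entries only when input/tex-full is loaded.
const trimmed = posted.includes('input/tex-full')
  ? posted.filter((name) => !name.startsWith('[tex]/'))
  : posted;

console.log(trimmed);
```

The trimmed array can then be used as the `load` value in the `loader` block.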

I'm going to leave this open for now, to remind me to look into whether combining adjacent unknown characters into the same container element would be feasible.

dqjauthentrics commented 3 years ago

Thank you, @dpvc. My image was in an email reply, so it probably didn't make it. Here it is on GitHub: image