michal-h21 / make4ht

Build system for tex4ht

Outputting weird \mathcal characters #95

Closed yalguzaq closed 1 year ago

yalguzaq commented 1 year ago

Every LaTeX engine (pdflatex) and other processors (MathJax TeX -> SVG) compile the following code

\begin{align*}
  \mathcal{A} \ \mathcal{B} \ \mathcal{C}\\
  \mathcal{D} \ \mathcal{E} \ \mathcal{F}\\
  \mathcal{G} \ \mathcal{H} \ \mathcal{I}
\end{align*}

into something like this: [image]. make4ht, however, outputs these: [image]. Interestingly, adding the 'mathml,mathjax' option results in something completely different: [image].

My question is: would it be possible to stick with the classic \mathcal output by default? Is there a .cfg solution to this problem? My priority is the 'mathml' version of the output.

michal-h21 commented 1 year ago

TeX4ht outputs Unicode values for \mathcal characters, but the problem is that Unicode doesn't define all possible characters; see the table on this page. It seems that the browser displays them wrongly.

Regarding the MathML output, I get this output with MathJax rendering; with the regular Firefox rendering it displays correctly, even B.

Anyway, the following version of \mathcal seems to work in both MathJax and default rendering:

\Preamble{xhtml}
\catcode`\:=11
\renewcommand\mathcal[1]{\HCode{<\a:mathml mi\Hnewline mathvariant="script">}#1\HCode{</\a:mathml mi>}}
\catcode`\:=12

\begin{document}
\EndPreamble

If it works, I can add something similar to TeX4ht sources.
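
For anyone who wants to try this, a minimal test document might look like the sketch below. The file names `sample.tex` and `mathcal.cfg` are only assumptions for illustration; with the configuration above saved as `mathcal.cfg`, the build command would be something like `make4ht -c mathcal.cfg sample.tex "mathml,mathjax"`. The document covers both the plain letters from the original report and a subscripted `\mathcal`, since subscripts are an easy case to miss.

```latex
% sample.tex (hypothetical file name): minimal test for the \mathcal configuration
\documentclass{article}
\usepackage{amsmath}  % provides align* and gather*
\begin{document}

% Plain calligraphic letters, as in the original report
\begin{align*}
  \mathcal{A} \ \mathcal{B} \ \mathcal{C}\\
  \mathcal{D} \ \mathcal{E} \ \mathcal{F}
\end{align*}

% \mathcal followed by a subscript, inline and in display math
Inline: $\mathcal{T}_{1}$ and $\mathcal{T}_{2}$.
\begin{gather*}
  \mathcal{T}_{1} = \left\{a, b\right\}
\end{gather*}

\end{document}
```

Opening the resulting HTML both with and without MathJax (for example in Firefox) should show whether the two renderers agree.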

yalguzaq commented 1 year ago

Unfortunately, this did not solve the problem. Furthermore, the preamble above caused some conflicts. For instance, I got an error with the following code (otherwise it works, though):

If $X = \left\{a,b,c\right\}$, let
\begin{gather*}
\mathcal{T}_{1} = \left\{\varnothing, X, \left\{a\right\}, \left\{a,b\right\}\right\} \hspace{1cm} \text{and} \hspace{1cm} \mathcal{T}_{2} = \left\{\varnothing, X, \left\{a\right\}, \left\{b,c\right\}\right\} \,.
\end{gather*}
Find the smallest topology containing $\mathcal{T}_{1}$ and $\mathcal{T}_{2}$, and the largest topology contained in $\mathcal{T}_{1}$ and $\mathcal{T}_{2}$.  

It shows: [image]

michal-h21 commented 1 year ago

It seems that we need to add an extra group around it:

\Preamble{xhtml}
\catcode`\:=11
\renewcommand\mathcal[1]{\bgroup\HCode{<\a:mathml mi\Hnewline mathvariant="script">}#1\HCode{</\a:mathml mi>}\egroup}
\catcode`\:=12

\begin{document}
\EndPreamble

yalguzaq commented 1 year ago

The fix resolved the 'Math input error', but I am still getting outputs like [image] with the 'mathml,mathjax' option.

This did not happen when I was using the tex-to-chtml converter by MathJax, so are you sure that this is a browser issue? I tried this on several machines, and the characters look weird on all of them.

michal-h21 commented 1 year ago

I think this is a MathJax issue: it uses a strange font for mathvariant="script". If you display it just in Firefox without MathJax, it looks how it should look according to MDN.

I took a look at the MathML code generated by MathJax, and it seems that it uses mathvariant="script" too, but in addition it also uses data-mjx-variant="-tex-calligraphic". So you can try this:

\Preamble{xhtml}
\catcode`\:=11
\renewcommand\mathcal[1]{\bgroup\HCode{<\a:mathml mi\Hnewline  data-mjx-variant="-tex-calligraphic" mathvariant="script">}#1\HCode{</\a:mathml mi>}\egroup}
\catcode`\:=12
\begin{document}
\EndPreamble

yalguzaq commented 1 year ago

Yes, I can now confirm that this solution works.

I think it makes sense to make this behaviour default in the next make4ht version.

michal-h21 commented 1 year ago

The problem is that it is a proprietary MathJax attribute, and it could cause issues in other outputs that use MathML too (ODT and JATS, for example). It seems weird that MathJax doesn't use the correct font by default.
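
One possible way to keep the attribute out of non-MathJax outputs is to emit it only when the `mathjax` option is requested. The following is an untested sketch, not something confirmed in this thread: it assumes that the `\ifOption{<option>}{<true>}{<false>}` test from tex4ht is usable at this point in the .cfg file, and `\mathcalMjxAttr` is a hypothetical helper name introduced only for this example.

```latex
\Preamble{xhtml}
\catcode`\:=11
% Hypothetical helper macro: holds the MathJax-only attribute (with a leading
% space), or nothing. Assumes tex4ht's \ifOption test is available here.
\ifOption{mathjax}
  {\def\mathcalMjxAttr{ data-mjx-variant="-tex-calligraphic"}}
  {\def\mathcalMjxAttr{}}
% Same \mathcal redefinition as above; the helper expands inside \HCode,
% so the extra attribute appears only in builds that use the mathjax option.
\renewcommand\mathcal[1]{\bgroup\HCode{<\a:mathml mi\Hnewline mathvariant="script"\mathcalMjxAttr>}#1\HCode{</\a:mathml mi>}\egroup}
\catcode`\:=12
\begin{document}
\EndPreamble
```

If this works as assumed, outputs built without the `mathjax` option would contain only the standard mathvariant="script" attribute, while MathJax builds would additionally get the calligraphic variant hint.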