latex3 / unicode-math

XeLaTeX/LuaLaTeX package for using unicode/OpenType maths fonts
http://ctan.org/pkg/unicode-math
LaTeX Project Public License v1.3c

Loading the unicode-math package breaks mhchem math #459

Open krishnakumarg1984 opened 6 years ago

krishnakumarg1984 commented 6 years ago

Description

Loading unicode-math breaks the math-mode typesetting of chemical formulae provided by the mhchem package. Link to the stackexchange discussion (provided below) gives a more detailed description.

Minimal example demonstrating the issue

%! TEX program = lualatex
    \documentclass[varwidth=true, border=10pt, convert={size=640x}]{standalone}
    \usepackage[version=4,arrows=pgf]{mhchem}
    \usepackage{unicode-math}
    \setmathfont{Libertinus Math} % doesn't matter. All unicode math font behaves the same way
    %\setmathfont{texgyrepagella-math.otf} % doesn't matter. All unicode math font behaves the same way

    \begin{document}

    \begin{center}
    \ce{A_x <=> B + y C} \\
    \end{center}

    \begin{align}
        \ce{A_x &<=> B + y C} \\
        \ce{X + Y &-> Z}
    \end{align}

    \begin{align}
        \ce{A_$x$ &<=> B + $y$ C} \\
        \ce{X + Y &-> Z}
    \end{align}

    \end{document}

u-fischer commented 6 years ago

The problem is that in math mode mhchem surrounds everything with \mathrm, and this changes the \Umathcode.

\documentclass{article}
\usepackage[version=4,arrows=pgf]{mhchem}
\usepackage{unicode-math}

\begin{document}

$\ce{A_{\symit{x}}}$

\makeatletter
\def\mhchem@hook@beforeItalicMath{\symit}

$ \ce{A_x} \ce{A_{xxxx}} $

 %not a solution but only to demonstrate that removing the \mathrm avoids the problem:
\def\mhchem@option@mathFont{}%{\mathrm}

${\ce{A_x}}$

\end{document} 

Imho it would be very useful to have an official command to "break out" of a \mathrm, as this is used internally by packages (see also https://github.com/wspr/unicode-math/issues/438).

krishnakumarg1984 commented 6 years ago

So, is this a bug with mhchem and not with unicode-math?

u-fischer commented 6 years ago

Well difficult to say. The problem is that this here gives a different output if unicode-math is loaded:

\documentclass{article}
\usepackage{amsmath}
%\usepackage{unicode-math}
\begin{document}
$\mathrm{\text{\ensuremath{x}}}$
\end{document}

without unicode math: image

with unicode-math: image

So the question is if mhchem can rely on the first output or not, and if unicode-math should ensure it or not.

krishnakumarg1984 commented 6 years ago

Unlike mhchem, this issue does not affect the chemformula package (when it is used along with unicode-math). So, I am unsure who is to blame.

mhchem commented 6 years ago

@krishnakumarg1984 What is your reasoning why $\mathrm{\text{\ensuremath{x}}}$ should render differently when unicode-math is loaded?

davidcarlisle commented 6 years ago

@mhchem, a variant of @u-fischer 's example removes amsmath showing it's really a unicode-math issue.

\documentclass{article}

\ifx\Uchar\undefined\else
\usepackage{unicode-math}
\fi
\begin{document}

$\mathrm{\mbox{$abc$}}$

\end{document}

this gives roman in unicode-math and italic with pdftex

u-fischer commented 6 years ago

@davidcarlisle But I wonder if unicode-math can really avoid it. I mean it would have to reset the Umathchar codes at every math to make sure to nullify some \mathXX command.

davidcarlisle commented 6 years ago

@u-fischer not sure it seems to me that setting \fam should be enough.

u-fischer commented 6 years ago

@davidcarlisle I don't think that \fam1 as in the tex.sx answer works. It actually uses the cmmi-fonts:

\documentclass{article}

\usepackage{unicode-math}

\setmathfont{TeX Gyre Termes Math}
\begin{document}
$A_x$

$\mathrm{\mbox{\ensuremath{\fam1 A_x}}}$

$\mathrm{\mbox{\ensuremath{\csname__um_switchto_normal:\endcsname A_x}}}$

\end{document}

image

davidcarlisle commented 6 years ago

@u-fischer I guess you are right, perhaps

$\mathrm{\mbox{\ensuremath{\mathnormal{A_x}}}}$

or higher level way to get unicode-math to put things back in nested contexts

wspr commented 6 years ago

There is so much setting and reseting of mathcodes that I get a bit lost sometimes :)

The reason using \fam doesn’t work is that the mathcodes for ascii need to be set to plane 1 for regular maths to work. It looks here, though, like I should perhaps patch \mbox to call \mathnormal if it’s already inside maths mode?

davidcarlisle commented 6 years ago

@wspr patching \mbox wouldn't work in most cases, e.g. amsmath's \text (which is actually what gets used here) uses \hbox in \mathchoice. You could patch \everymath, but that seems a bit costly as well. As nested math in a \mathxxx is probably rare, an alternative plan would be to document that, unlike the classic setup, the setting is inherited by a nested math construct (which, to be honest, is I would guess what most people expect from \mathrm{\hbox{$x$}}), and then @mhchem could use \ensuremath{\mathnormal{...}} to normalize the nested math constructs back to math italic.

perhaps..

davidcarlisle commented 6 years ago

which means that my \fam=-1 answer at stackexchange is wrong so I'll delete (mentioning here for those who can't see deleted tex.sx posts)

u-fischer commented 6 years ago

I also think that resetting everything all the time would be too costly and \mathnormal looks like the "official way to break out of \mathrm" I asked about above.

But beside this: the use of \mathrm feels wrong in this context. I'm not quite sure what the code really wants to achieve. But if it wants actually the upright math style, a local version of normal-style=upright, then \symrm would be the better command. And if it wants some parts in \mathrm then it should not put \mathrm around the whole argument but only around these parts.

Btw: perhaps it would be useful to have "switch" variants of the sym-commands. E.g.

\documentclass{article}
\usepackage{unicode-math}
\begin{document}
\ExplSyntaxOn
\newcommand\symupswitch{\tl_set:Nn \l__um_mathstyle_tl {up}\__um_switchto_up: \__um_mathgroup_set:n {-1}}
\ExplSyntaxOff
$\symup{Abc} Abc {\symupswitch Abc} Abc $
\end{document}

But as I don't understand what exactly \__um_mathgroup_set:n {-1} is doing here, I'm not sure if it doesn't have side effects.

krishnakumarg1984 commented 6 years ago

Can the Unicode-math team officially announce using chemformula as a viable workaround, until @mhchem and unicode-math team are able to figure out a solution? I am a bit surprised this hasn't been reported so far. Aren't there any chemistry folks using Unicode fonts?

josephwright commented 6 years ago

@krishnakumarg1984 I'm a chemist but (1) use chemformula and (2) don't use unicode-math (the latter isn't really needed for most synthetic chemistry)

u-fischer commented 6 years ago

@krishnakumarg1984 I don't understand your problem. The work-around I suggested at the tex.sx question should work fine.

\makeatletter
\def\mhchem@hook@beforeItalicMath{\symit}
\makeatother

krishnakumarg1984 commented 6 years ago

@u-fischer I am sorry if I misunderstood, but I thought that the solution you proposed worked only in certain cases, i.e. your answer said "But it doesn't work if the argument doesn't consist of a single char".

Secondly, such kind of low-level kernel-hacks by inspecting document stack-trace defeats the purpose of the mhchem package. I did a bit of reading on these packages. The developer of the chemformula package once wrote a tugboat article, in which Martin Hensel, the author of mhchem package summarised these two packages. I am paraphrasing them as "mhchem is more hands-off. There is very little customisability, but the focus is on ease-of-use. chemformula on the other hand gives the user a lot more control and customisability. "

For an electrical engineer like me, looking to describe a few chemical equations in my thesis and move on, mhchem does not work straightaway when used with modern font technologies which is a let-down. Sure, you could hack up a workaround or two. I am not even questioning whether your code works (since I am well aware of your TeX-related prowess), but I am certainly questioning whether some other new user looking to use the package can immediately find this bug report or that Stack-exchange post, and apply the required workaround.

On the other hand, chemformula does work as advertised. As a non-expert, you read the package documentation, try the examples and voila, things work. Until things work the way the mhchem package documentation says, without such low-level hacks, it might help those using unicode-math if the unicode-math team added a footnote declaring the incompatibility with mhchem in the unicode-math PDF documentation. This could help a lot of people.

u-fischer commented 6 years ago

My solution doesn't work for more than one char because mhchem actively sets another font in such cases; this is independent of unicode-math. It is not the task of unicode-math to maintain long lists of problematic packages -- it is the documentation of mhchem which should mention this.

Beside this: your bug report is one day old and you already have a work-around, people analyzed the problem and also the maintainer of mhchem got involved in the discussion, so be a bit more patient.

krishnakumarg1984 commented 6 years ago

@u-fischer Sorry about that. I have anyway moved on using the workaround in my thesis. Sorry for the noise.

My heartfelt thanks to you, David Carlisle, Will Robertson, Joseph Wright and mhchem for actually taking a deep look and analysing the issue. Never did I expect the bug to get such attention from TeX heavyweights .... :)

mhchem commented 6 years ago

@u-fischer I tried to update mhchem in a way so that it works with unicode-math and without. At first, \mathnormal seemed very promising, but then I found the reason why I did not use it. The numbers render as old-style figures. So, \mathnormal{...} does not really escape \mathrm. \text{\ensuremath{...}} does, but it does not work with unicode-math. Do you have any more ideas?
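
The old-style figures mentioned here can be reproduced with a small test file. The following is only a sketch, assuming pdflatex with the default Computer Modern math fonts and no unicode-math; the formula is the same one used elsewhere in this thread.

```latex
% Compile with pdflatex (classic 8-bit math fonts, unicode-math not loaded).
\documentclass{article}
\usepackage{amsmath} % provides \text
\begin{document}

% \mathnormal selects the math italic family. Digits are "variable family"
% symbols, so they follow it, and the digit slots of Computer Modern math
% italic hold old-style figures.
$\mathrm{A\mathnormal{x-1}B}$

% The nested math inside \text starts again from the default setup:
% italic x, and a lining digit from the upright operators font.
$\mathrm{A\text{\ensuremath{x-1}}B}$

\end{document}
```

Comparing the digit 1 in the two lines shows why \mathnormal alone is not a full escape from \mathrm in the classic setup.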

u-fischer commented 6 years ago

@mhchem I would need an example document to play around. And could you say which fonts \ce{A_x} should use when used inside math? I do find it a bit odd that you switch to math mode, isn't a chemical notation done in the text font?

mhchem commented 6 years ago

@u-fischer mhchem uses the text font when called from a text environment. And it uses the (upright) math font when called from a math environment. For layout (superscript and subscript) it uses math mode (and switches to \mathrm and \text respectively).

However, not all things are upright in a chemical formula. Some things are mathematics, like a subscripted variable x. For this, I need a way to switch from \mathrm to "normal" math. \mathnormal does not work properly. I found that \text{\ensuremath{...}} works well, but this is exactly what breaks with unicode-math.

So, I am looking for an alternative for $\mathrm{upright\text{\ensuremath{x-1}}upright}$. I thought, there would be a generic solution and already asked at https://tex.stackexchange.com/q/428999/73371. (The first answers indicate that there is no generic solution that works without unicode-math-specific code.)

u-fischer commented 6 years ago

I did understand what mhchem is doing now. But does it switch to \mathrm for a logical reason, or only "to get the correct output"? As an example, let's assume that the text font, \mathrm and the math font are really different-looking fonts (lmodern, Arial and Cambria in the example below). Which output do you want to get in such a case?

\documentclass{article}
\usepackage{mhchem}
\usepackage{unicode-math}
\setmathfont{Cambria Math}
\setmathfontface\mathrm{Arial}
\begin{document}
some Text \ce{A_x} 

\( \symup{A}_x \quad \mathrm{A}_x \quad \text{\ce{A_x}} \) 

\sffamily some sans serif Text \ce{A_x} 

\( \symup{A}_x \quad \mathrm{A}_x \quad \text{\ce{A_x}} \)

\end{document}

image

mhchem commented 6 years ago

Good question.

When used in running text, \ce will take the current text font to blend it as best as it can. This is what your examples show. But this has nothing to do with this thread.

When used in a math environment, like $m(\ce{NO_x})$, \ce will take the math font. Only in that case, \mathrm is used and the problem of this thread occurs.

u-fischer commented 6 years ago

"the math font" doesn't make sense. \mathrm is not "the" math font, it is one of a variety of math alphabets. And you didn't answer the question why do you use in math \mathrm instead of \symrm or switching to text mode with \text. But imho this is no longer an issue for unicode math and you should perhaps move the discussion to a better place.

mhchem commented 6 years ago

"the math font" means: the set of mathematical alphabets, as you call them.

Scientific typography follows certain rules. Variables are printed in italics; operators etc. are not. This is the reason why the sine operator is not $sin$, but printed in an upright font. It is the same for chemical elements: they are printed using an upright font. For the same reason that \sin uses the upright math font and not \text, mhchem uses the upright math font and not \text. For that, I use \mathrm because that was – and to my understanding still is – the universal way to get the upright math font. \symrm is unicode-math-specific, isn't it?

I still think, it is a unicode-math-related issue, because it is this package that introduces this inconsistency and breaks the behavior of existing documents and packages that now are forced to adapt.

davidcarlisle commented 6 years ago

> I still think, it is a unicode-math-related issue, because it is this package that introduces this inconsistency and breaks the behavior of existing documents and packages that now are forced to adapt.

Well not really, as I mentioned on tex.sx the underlying mechanisms are completely different and unicode-math hides the differences as far as possible but in this case not enough for you not to notice.

in classic tex math usually uses 16 different fonts and \mathrm switches to one of them and (at the primitive tex level) that switch is not seen by nested math contexts so nested math starts off again in italic.

when using a Unicode Math font then (normally) all the symbols come from a single font and math italic is not just the normal ascii slots in a math italic font; the characters are translated up to the plane 1 1Dxxx Unicode Math range, so at the primitive level this is inherited in nested math contexts.

So the difference is not due to the unicode-math package it is because you are using a Unicode math font and Unicode math layout engine.
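
The translation to plane 1 described above can be seen in a small test file. This is a sketch only, assuming lualatex or xelatex with unicode-math and its default math font; the third formula contains the literal character U+1D465 (MATHEMATICAL ITALIC SMALL X) typed directly into the UTF-8 source.

```latex
% Compile with lualatex or xelatex.
\documentclass{article}
\usepackage{unicode-math} % default math font assumed (Latin Modern Math)
\begin{document}

% ASCII x in math is mapped to the plane 1 italic x, so these three
% formulas are expected to show the same glyph.
$x$ \quad $\symit{x}$ \quad $𝑥$

% For comparison: the upright math letter, which is a different code point.
$\symup{x}$

\end{document}
```

Since the italic shape comes from a character mapping rather than from a font family, selecting a different family inside nested math does not by itself bring it back.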

u-fischer commented 6 years ago

@mhchem I know that \sin is typeset in an upright font. But the font used is \operator@font, which is often identical to \mathrm but doesn't have to be. E.g. with beamer it isn't, and so here mhchem gives (with pdflatex and without unicode-math!) a probably unwanted serif upright char:

\documentclass{beamer}
\usepackage[version=4]{mhchem}
\begin{document}
\makeatletter
\begin{frame}
$\ce{A_x}\quad \sin \quad \mathrm{sin}$
\end{frame}
\end{document}

image

Beside this: I don't know much about chemical typesetting but I got the impression that in the tex.sx question align wasn't used to get a math font but to get math alignment.

krishnakumarg1984 commented 6 years ago

Yes. I used align environment to align the equations at their reaction arrows (as per the mhchem manual)

ArchangeGabriel commented 6 years ago

I find it strange to have a different font for chemistry depending whether you are in a math environment or not…

mhchem commented 6 years ago

@davidcarlisle, I think we got lost in the fine details of language. Sorry, English is a foreign language for me. I did not mean to say that unicode-math does anything wrong. I just wanted to say the issue we discuss here is related to unicode-math, because, to a user, it looks like "when I load unicode-math, my documents look different than before".

@krishnakumarg1984 Currently, there is no option to use \begin{align} and the text font. Hmm. I always thought short chemical names (like \ce{H2O}) should appear in the running text with the respective text font. And displayed, aligned equations should use the math font. If a document should look very consistent, then the setup should use a math font and a text font that match very well.

Anyway, I will update mhchem to detect the presence of unicode-math.

@u-fischer thanks for pointing me to the issues with beamer. It really does a strange setup.

@ArchangeGabriel The possibility to use \ce{H2O} in text while it adapts to the current font, is one of the main features of mhchem (i.e. main body, section heading, page header, toc, image caption, etc.).
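
Detecting unicode-math, as mentioned above, could look roughly like the sketch below. It is not taken from mhchem: the macro name \normalmath is hypothetical, and the choice of \mathnormal for the unicode-math branch simply follows the suggestion made earlier in this thread.

```latex
% A sketch only: \normalmath is a hypothetical name, not part of mhchem.
\documentclass{article}
\usepackage{amsmath} % provides \text
\ifdefined\Umathcode % XeTeX or LuaTeX: load unicode-math
  \usepackage{unicode-math}
\fi
\makeatletter
\AtBeginDocument{%
  \@ifpackageloaded{unicode-math}
    % With unicode-math: switch back to the normal math style explicitly.
    {\newcommand*\normalmath[1]{\mathnormal{#1}}}%
    % Classic setup: nested math inside \text starts from the defaults.
    {\newcommand*\normalmath[1]{\text{\ensuremath{#1}}}}%
}
\makeatother
\begin{document}
$\mathrm{upright\normalmath{x-1}upright}$
\end{document}
```

The test is deferred to \AtBeginDocument so that it also sees unicode-math when that package is loaded later in the preamble than the code doing the check.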

mhchem commented 6 years ago

Fixed with mhchem v4.08 (2018-06-22), uploaded to CTAN a few minutes ago.