latex3 / unicode-math

XeLaTeX/LuaLaTeX package for using unicode/OpenType maths fonts
http://ctan.org/pkg/unicode-math
LaTeX Project Public License v1.3c

Grouping in sub/superscripts required after last update #448

Closed projekter closed 5 years ago

projekter commented 6 years ago

Description

After updating today to the most recent unicode-math version (MiKTeX experimental), the \sym... commands suddenly require a brace group around a single-letter sub/superscript. An MWE that worked before:

\documentclass{article}

\usepackage{unicode-math}

\begin{document}
   $p_\symup x$
\end{document}

The compiler now complains: ! Missing { inserted. \__um_group_begin: l.6 $p_\symup x $. This is easily circumvented by writing p_{\symup x}, but it would be great if the braceless version worked again.
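For reference, the workaround applied to the MWE might look as follows. This is only the braced form mentioned above, written out as a full document; it is meant for XeLaTeX or LuaLaTeX, like the original.

```latex
\documentclass{article}

\usepackage{unicode-math}

\begin{document}
   % Braces around the whole subscript, and around the argument of \symup
   $p_{\symup{x}}$
\end{document}
```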

wspr commented 6 years ago

Ah; apologies, I missed that some changes I made would cause this to happen.

However, I doubt this will be fixed; this syntax isn’t formally part of LaTeX and other packages like breqn also don’t allow omitting braces here — over time I imagine there will be yet more reason to require the strict syntax.

(Well, I can see one way to overcome the problem, which would involve making _ and ^ active and scanning ahead for font-changing commands, but I don’t think explicitly supporting this syntax would be seen as a good thing more broadly.)

projekter commented 6 years ago

At least in standard LaTeX, I can always say $p_\mathrm x$; if you argue that the \sym... commands should behave syntactically like their non-unicode-math counterparts, then the braceless syntax should be supported. But of course, if it makes moving forward difficult, I will run some greps and change all my documents.

davidcarlisle commented 6 years ago

At least in standard LaTeX, I can always say $p_\mathrm x$;

You can, but it was an unfortunate accident of the implementation that was kept, basically undocumented, for compatibility reasons. Most similar constructs, such as p_\mbox x, would fail. I don't think new commands should follow this quirk. One notable issue with it is that it makes determining the subscript virtually impossible without a full TeX parser, so most LaTeX-to-xxx converters fail on such expressions.
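A minimal sketch of the contrast described above, assuming plain pdfLaTeX without unicode-math. The braceless forms are left commented out, with their status taken from the comments in this thread rather than verified here.

```latex
\documentclass{article}
\begin{document}
% Fully braced: the extent of the subscript is explicit and does not
% depend on how \mathrm or \mbox happen to be implemented.
$p_{\mathrm{x}}$
$p_{\mbox{x}}$

% Braceless forms discussed in this thread:
% $p_\mathrm x$   % accepted by standard LaTeX, as an implementation accident
% $p_\mbox x$     % reported above to fail
\end{document}
```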

eg9 commented 6 years ago

@projekter You can also say \mbox\bgroup x\egroup and the result, by pure chance, is much the same as \mbox{x}, but only for implementation reasons. \mbox x also works, but is obscure.

The correct syntax is, and has always been,

p_{\mathrm{x}}

The fact that p_\mathrm x works does not justify using it.

There are several good reasons for always using \mathrm{x} (with braces) and no good one for \mathrm x (fewer keystrokes is not a good reason).

I usually mention $X_\notin$, at this point.
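The $X_\notin$ case can be sketched like this, assuming the standard LaTeX kernel without unicode-math, where (to my understanding) \notin is a macro built around \mathrel{...} rather than a single math character. The failing line is left commented out.

```latex
\documentclass{article}
\begin{document}
% With braces, the whole relation symbol becomes the subscript.
$X_{\notin}$

% Without braces, "_" is handed the expansion of \notin, which does not
% start with a brace or a single math character, so an error such as
% "Missing { inserted" is the expected outcome:
% $X_\notin$
\end{document}
```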

bgvoisin commented 6 years ago

Same here: input that typeset fine two days ago no longer does, after applying the February 2 update. I narrowed it down to input like $a_\mathrm{b}$ that I have been using happily for 24 years or so, since the first release of LaTeX 2e.

What's strange is the way it happens, with this minimal example:

\documentclass{article}
\usepackage{unicode-math}
\begin{document}
$a_\mathrm{b}$
$a_\mathrm{b}$
\end{document}

With it, the console complains

./unicode-math-bug.tex:7: Missing { inserted.
<to be read again> 
                   \__um_group_begin: 
l.7 $a_\mathrm
              {b}$

Comment out either of the two $a_\mathrm{b}$ and everything is just fine. It seems something is not reinitialized properly after the first, in a way that affects the second. Of course, using $a_{\mathrm{b}}$ solves the matter, but it will be hard to break a two-decade-long habit, especially for code that works just fine with pdfLaTeX.

eg9 commented 6 years ago

@bgvoisin You're starting from a false premise: the code seems to work with pdflatex, but it does so only by chance.

wspr commented 6 years ago

@bgvoisin: the interesting (or I guess amusing) part is that this subscript behaviour change came about because of your report that \symbf{\cdot} had "ord" spacing instead of "bin".

Internally, \mathbf and so on use \bgroup and \egroup, and I'd copied this approach for \symbf etc. And it's \bgroup/\egroup which allow a_\mathbf{b} to work. However, they also have the side effect of changing the spacing: essentially you end up with something like {\mathbf{...}} in terms of how TeX sees the mathematical "piece" of the equation.

Having said this, the side-effect was unintentional and I know a LOT of LaTeX users seem to use this syntax.

It's a tough one. I wonder if I should add an option to control this behaviour, or work harder to preserve some level of compatibility by using math-active characters and scanning ahead.
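The two effects of \bgroup/\egroup described in this comment can be seen in a small sketch, assuming plain pdfLaTeX without unicode-math. The spacing remarks in the comments are what TeX's atom rules lead one to expect, not measured output.

```latex
\documentclass{article}
\begin{document}
% "_" accepts an implicit group, which is why a command whose expansion
% reaches \bgroup can follow it without explicit braces.
$a_\bgroup xy\egroup$ \quad $a_{xy}$

% The same grouping wraps a binary operator in a subformula,
% which TeX then spaces as an ordinary atom.
$a \cdot b$ \quad          % \cdot on its own: "bin" spacing
$a {\cdot} b$ \quad        % explicit braces: "ord" spacing
$a \mathbf{\cdot} b$       % expected to be spaced like the braced line
\end{document}
```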

davidcarlisle commented 6 years ago

You may (possibly) want to consider some kind of compatibility option for \mathbf and friends (though I probably wouldn't), but I would strongly urge you not to do this for new commands introduced in this package, like \symbf. There is no reason that x_\symbf{y} shouldn't work like x_{\symbf}{y} and give an error. Making it not do that would complicate lots of other processing you might want to do on the input, as it makes determining the subscripts without actually typesetting that much harder. Apart from that, \mathxx being a mathord is reasonable, as it's intended for a run of characters from a text font, so it is reasonable that those commands, and not \symxx, have the outer brace group.

JackGin commented 6 years ago

@wspr By making this change you are essentially breaking the compilation of a huge corpus of texts, which millions of users have been creating for decades, because of your conceptual vision. This is an unprecedented move, as compatibility has always been the most important value of LaTeX. Please bring back the established behaviour, at least by adding some kind of compatibility option to your package. Thank you.

jcsalomon commented 6 years ago

@JackGin, those texts will continue to work, so long as unicode-math isn’t added to them.

davidcarlisle commented 6 years ago

@JackGin The \symxx commands have only been in unicode-math for a couple of years, so you are exaggerating the issue somewhat by claiming that they have been used by millions of users for dozens of years.

JackGin commented 6 years ago

@jcsalomon I think it is common for everyone to borrow some formulas from old texts, and even to transfer large pieces to a newer file when preparing e.g. lectures or conference presentations with LuaLaTeX. Now this becomes quite troublesome, as compatibility is broken.

@davidcarlisle I am talking about the \mathrm command, which has been around like forever. And all these niceties like $V_\mathrm{out}$ are now dead and gone. :(

wspr commented 6 years ago

@davidcarlisle / @bgvoisin / @JackGin: I had half a solution already in place in the code; it's easy enough to revert the change for \mathrm and friends only.

I agree with others that the \symXX commands shouldn't support this dubious syntax.

netw0rkf10w commented 6 years ago

Hello,

This problem also occurs for other commands such as \mathcal or \mathbf.

% !TeX program = xelatex
\documentclass[12pt,a4paper]{article}
\usepackage[math-style=ISO]{unicode-math}

\begin{document}
$a_\mathcal{S}$ % not working

$a^\mathcal{S}$ % not working

$a_{\mathcal{S}}$ % with braces, working

$a_\mathbf{S}$ % boldface is working

$a^\mathbf{s}(x)$ % not working

$a_\mathbf{s}^x$ % not working

$a^{\mathbf{s}}(x)$ % working

$a_{\mathbf{s}}^x$ % working
\end{document} 

Posted here: https://tex.stackexchange.com/questions/434600/tex-live-2018-unicode-math-does-not-allow-using-mathcal-or-mathbf-as-subscrip

bgvoisin commented 6 years ago

Originally it was _{\bf x} in Plain TeX and LaTeX 2.09, so the braces were there necessarily. Then came LaTeX 2e, and either I typed _\mathbf{x} accidentally and realized it worked, or I saw it somewhere in an example and decided to use it, or I guessed that, being a command rather than a declaration, \mathbf{} should enclose its argument between matching {} or \bgroup\egroup or \begingroup\endgroup. The fact is that for years now all my documents have included input like _\mathbf{x}, not _{\mathbf{x}}, to save keystrokes, on the understanding that \mathbf{} and the like made their argument a group.

Changing everything isn't that difficult, but it takes time. For example last week I used a report from two years ago as the basis to start writing a paper, and the first thing was to parse each and every _ and ^ to insert braces wherever needed.

I do agree that the proper LaTeX 2e way has always been _{\mathbf{x}}. But people did not always use it in existing documents (most often they didn't). Or maybe I'm just old-fashioned: having learnt and used plain TeX for a couple of years before switching to LaTeX, I'm still typing my accents (when the input needs to be 7-bit ASCII) the plain TeX way, like \'e and \c c, not the LaTeX way, \'{e} and \c{c}.

I thought this was just a matter of making sure the output of \mathbf{} and the like was enclosed between \bgroup and \egroup, which seemed natural and easy. But Will said in an earlier comment this would change spacing. Here I must say that I don't see how, but given the complexity of Unicode LaTeX (compared to the 8-bit LaTeX I'm familiar with) I take Will's word for it.

All this to say, if it has to be _{\mathbf{x}}, OK, it's not that bad, but _\mathbf{x} had emerged as a de facto standard, so it won't be easy.

davidcarlisle commented 6 years ago

@bgvoisin I accept some people use that form but _\mathrm{x} is simply horrible markup that relies on low level parsing differences between _ and a macro argument that you can define at the macro layer.

It's not supported in breqn or most TeX-to-HTML converters, for the simple reason that it is really hard to define a macro that reads arguments that way. For any macro you define as \def\foo#1{...#1...}, you need \foo{\mathrm{x}}; the fact that \mathrm internally adds a brace group does not mean that you can write \foo\mathrm{x}. So, practically speaking, supporting _\mathrm means that you cannot have anything other than the TeX primitive _; you can't redefine it to do anything extra.
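The argument-grabbing point can be made visible with a small sketch. \showarg is a hypothetical helper, not part of any package; it prints the tokens it received as its argument, so the difference between the two calls shows up in the output.

```latex
\documentclass{article}
\begin{document}
% \showarg is a hypothetical helper for this illustration only:
% it prints its single argument verbatim, in brackets.
\newcommand*\showarg[1]{\texttt{[\detokenize{#1}]}}

% Braced: the macro receives \mathrm{x} as one argument.
\showarg{\mathrm{x}}

% Unbraced: the macro receives only the token \mathrm;
% the group {x} is left behind in the input and typeset afterwards.
\showarg\mathrm{x}
\end{document}
```

A macro standing in for _ would be in the same position as \showarg here, which is why the braceless form only works with the primitive.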

If Will can make it work in unicode-math for compatibility's sake then I wouldn't object, but no user should be using this form in any document.

bgvoisin commented 6 years ago

Thanks David for taking the time to answer. I didn't think about the use of parsers like tex 2 html. I also thought the fact \mathrm added a brace group meant \foo\mathrm{x} was OK; but thinking more about it, this assumes \mathrm is expanded before \foo in all circumstances, and there are probably other side effects I'm not seeing.