reutenauer / polyglossia

An alternative to Babel for XeLaTeX and LuaLaTeX
http://www.ctan.org/pkg/polyglossia
MIT License
190 stars 51 forks source link

Bug in polyglossia when used with pdfcomment #600

Closed pauloney closed 1 year ago

pauloney commented 1 year ago

The following snippet shows a bug in polyglossia when used together with pdfcomment, under xelatex. It works fine with lualatex.

\documentclass{report}

\usepackage{polyglossia}
%  \setdefaultlanguage{english}
  \setdefaultlanguage{portuguese}

\usepackage{xcolor}
\usepackage{pdfcomment}

\begin{document}

This should show in yellow: \pdfmarkupcomment[markup=Highlight,color=yellow]{Mittag-Leffler}{Highlight} after running \LaTeX{} three times.

\end{document}
jspitz commented 1 year ago

The splithyphens code and soul (used by pdfcomment) do not play well, see https://tex.stackexchange.com/q/660758/2388.

You can work around it via splithyphens=false globally or locally, as in:

% !TeX TS-program = xelatex
\documentclass{report}

\usepackage{polyglossia}
%  \setdefaultlanguage{english}
\setdefaultlanguage{portuguese}

\usepackage{xcolor}
\usepackage{pdfcomment}

\begin{document}

This should show in yellow:
\textportuguese[splithyphens=false]{\pdfmarkupcomment[markup=Highlight,color=yellow]{Mittag-Leffler}{Highlight}}
after running \LaTeX{} three times.

\end{document}

Not sure we can do anything here at polyglossia side, or whether this is something that needed to be addressed in soul.

The same bug applies to other languages with splithyphens, and XeTeX.

jspitz commented 1 year ago

Actually, the fix was astonishingly trivial. Please test and report back, if possible.

Udi-Fogiel commented 1 year ago

@jspitz It should be \let\xpg@hyphen@char- instead of \edef\xpg@hyphen@char{-}, otherwise the test \ifx\portuguese@sh@next\xpg@hyphen@charx is always false, as \portuguese@sh@next is being \let to -, so it is considered of different character code than a macro.

For a simple example, consider the output of

\def\a{-}
\let\b-

\ifx\a\b True\else False\fi

\ifx\a- True\else False\fi

\ifx\b- True\ese False\fi

\bye

Maybe a simpler fix would be to get rid of \xpg@hyphen@char and replace the test \ifx\portuguese@sh@next\xpg@hyphen@charx with \ifx\portuguese@sh@next-.

The current commit breaks en-dash and em-dash ligatures:

\documentclass{article}
\usepackage{polyglossia}
\setdefaultlanguage{portuguese}

\begin{document}
    a--b

    a---b
\end{document}

Another problem is that it could potentially break backwards compatibility (even with the suggested fix). Since \if expands the tokens, while \ifx is not, the following would produce different output before and after the commit

\documentclass{article}
\usepackage{polyglossia}
\setdefaultlanguage{portuguese}

\begin{document}
    \def\a{-}

    a-\a b
\end{document}

You can add an option to splithyphens to use \ifx instead of \if to keep backwards compatibility.

jspitz commented 1 year ago

@jspitz It should be \let\xpg@hyphen@char- instead of \edef\xpg@hyphen@char{-}, otherwise the test \ifx\portuguese@sh@next\xpg@hyphen@charx is always false, as \portuguese@sh@next is being \let to -, so it is considered of different character code than a macro.

I see. Thanks for the instruction.

Maybe a simpler fix would be to get rid of \xpg@hyphen@char and replace the test \ifx\portuguese@sh@next\xpg@hyphen@charx with \ifx\portuguese@sh@next-.

No, this is what interferes with soul.

Another problem is that it could potentially break backwards compatibility (even with the suggested fix). Since \if expands the tokens, while \ifx is not, the following would produce different output before and after the commit

\documentclass{article}
\usepackage{polyglossia}
\setdefaultlanguage{portuguese}

\begin{document}
  \def\a{-}

  a-\a b
\end{document}

You can add an option to splithyphens to use \ifx instead of \if to keep backwards compatibility.

I think this is overkill.

Udi-Fogiel commented 1 year ago

No, this is what interferes with soul.

I don't think so. I believe the origin of the problem is the fact that \if expand tokens. Simply replacing \if with \ifx in the original code fix the conflict with soul.

I think this is overkill.

Time will tell best. If users will notice a change in the output it should be considered.

jspitz commented 1 year ago

No, this is what interferes with soul.

I don't think so. I believe the origin of the problem is the fact that \if expand tokens. Simply replacing \if with \ifx in the original code fix the conflict with soul.

Yes, you're right. I will leave things as they are now, though.

I think this is overkill.

Time will tell best. If users will notice a change in the output it should be considered.

Sure. For now, I consider the problem to be rather theoretical.

jspitz commented 1 year ago

V. 1.64 with this fix (only) has been submitted to CTAN.