Closed by hpreusse 1 year ago
Thanks for the report!
I'd ask @brucemiller to take a look here, as he has a nice setup for the most recent texlive snapshots (which I admit is just a convenient excuse on my part, since I could make arrangements for that myself...)
In trying to avoid opening a second babel-related issue, let me also report something I am seeing on texlive 2022 with the latest LaTeXML of today.
A minimal load of english babel, as in:
\documentclass{article}
\usepackage[english]{babel}
\begin{document}
\end{document}
leads to ten repeated errors of the kind:
Warning:uninitialized:$ch Use of uninitialized value $ch in hash element at at /home/deyan/perl5/lib/perl5/LaTeXML/Core/Mouth.pm line 163, <$IN> line 9
Warning:uninitialized:$_[1] Use of uninitialized value $_[1] in hash element at at /home/deyan/perl5/lib/perl5/LaTeXML/Core/Mouth.pm line 279, <$IN> line 9
Warning:uninitialized:$_[1] Use of uninitialized value $_[1] in hash element at at /home/deyan/perl5/lib/perl5/LaTeXML/Core/Mouth.pm line 279, <$IN> line 9
Warning:uninitialized:value Use of uninitialized value value in string eq at at /home/deyan/perl5/lib/perl5/LaTeXML/Core/Token.pm line 323, <$IN> line 9
Warning:uninitialized:value Use of uninitialized value value in string eq at at /home/deyan/perl5/lib/perl5/LaTeXML/Core/Token.pm line 323, <$IN> line 9
Warning:uninitialized:value Use of uninitialized value value in string eq at at /home/deyan/perl5/lib/perl5/LaTeXML/Core/Token.pm line 323, <$IN> line 9
Error:expected:Until:= Missing argument Until:= for Core::Definition::Expandable[\bbl@inistore@min Until:=Until:\@@] at babel-english.tex; line 13 col 0
for a total summary of
Conversion complete: 51 warnings; 10 errors
The 51 warnings and 10 errors come from acaa51d9bdf4ae582c57f527fe34580290efac34
But even rolling that back, I don't get the correct output from
\documentclass{article}
\usepackage[polutonikogreek,english]{babel}
\begin{document}
english
\selectlanguage{polutonikogreek}
greek
\selectlanguage{english}
english
\end{document}
Prior to the commit in question, I don't get any errors, but the final "english" is rendered in Greek.
I now have texlive 2023 installed in parallel on my machine, and double-checked whether my claim that #2215 resolves the regression on texlive 2022 extends to the current release.
Sadly it doesn't: while the simple english babel load from my previous comment now succeeds, the greek.tex test still fails with the message reported in the original issue description.
There is a separate regression in t/structure/glossary:
# Difference at line 92 for t/structure/glossary
# got : ' <p>Or, more loudly: Chop the <glossaryref inlist="main" key="cabbage">cabbage</glossaryref>, <glossaryref inlist="main" key="potato">potatoes</glossaryref> and <glossaryref inlist="main" key="carrot">carrots</glossaryref>.</p>'
# expected : ' <p>Or, more loudly: Chop the <glossaryref inlist="main" key="cabbage">CABBAGE</glossaryref>, <glossaryref inlist="main" key="potato">POTATOES</glossaryref> and <glossaryref inlist="main" key="carrot">CARROTS</glossaryref>.</p>'
There is also the cautionary observation that make test took 21 minutes on my high-end desktop machine. Maybe we want to caution latexml users to stick to texlive 2022 for a little bit longer?
I will reopen here and check whether I can find a patch for the two failing tests.
Maybe we want to caution latexml users to stick to texlive 2022 for a little bit longer?
That might not be enough: apparently master also fails on my TeX Live 2022 (nixpkgs), where I get
# Difference at line 41 for t/babel/greek
# got : '<p>Ηερε´ς'
# expected : '<p>Here’s'
It took way too long to troubleshoot the exact cause here, apologies, but it appears to come down to a single difference in the way we implement \cf@encoding in latexml, compared to latex.ltx.
Namely, there is a comparison implied by \bbl@switch in texlive 2023, which in the case of transitioning from Greek back to English will execute an \ifx \cf@encoding \BabelGreekPreviousFontEncoding. There is exactly one place in the expansion flow of our test where the pdflatex run has \cf@encoding defined as macro:->LGR, while \BabelGreekPreviousFontEncoding is defined as macro:->OT1, and we proceed to take the \else case of the conditional, which sets the language back to English.
Meanwhile, latexml implements \cf@encoding as a Perl sub{}, which checks the current font directly. That means that a \let\foo\cf@encoding will always bind \foo to the same definition (pointing to that sub{}), whether the font has changed or not. Hence our test failing.
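For reference, here is a minimal sketch of the pdflatex side of that behaviour. It is not taken from babel: \savedenc is a hypothetical name, and T1 stands in for LGR so that the file does not depend on Greek font support being installed. The assumption is that \cf@encoding is an ordinary macro holding the encoding name, so a \let snapshot keeps the old value and a later \ifx sees the difference. If \cf@encoding is bound to a single Perl sub{} instead, both sides of the \ifx would presumably share one definition and the first branch would be taken.

```latex
\documentclass{article}
\usepackage[T1,OT1]{fontenc} % last option (OT1) becomes the default encoding
\makeatletter
\begin{document}
% Snapshot the current encoding; under pdflatex this is expected
% to be an ordinary macro whose replacement text is OT1.
\let\savedenc\cf@encoding   % \savedenc is a hypothetical name
% Switch encodings; \selectfont updates \cf@encoding.
\fontencoding{T1}\selectfont
% \ifx compares the two macro meanings, not the names.
\ifx\cf@encoding\savedenc
  \typeout{same encoding}
\else
  \typeout{encoding changed: \savedenc\space -> \cf@encoding}
\fi
\end{document}
```

Running the same file through pdflatex and latexml and comparing which \typeout message appears should be a quick way to check any candidate fix.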
So it appears one approach to a solution is more precisely implementing \cf@encoding and its related internal latex.ltx macros (or waiting until we can load all of latex.ltx natively).
Another approach is to implement our own version of \bbl@switch or, even more specifically, \BabelGreekRestoreFontEncoding.
Recording my current notes here, and I can follow with a PR once we choose a strategy.
P.S. Since the definition is dynamically constructed as babel executes, let me include it in the comment (logged via \tracingmacros=1):
\BabelGreekRestoreFontEncoding ->\ifx \cf@encoding \BabelGreekPreviousFontEncoding \else
\let \encodingdefault \BabelGreekPreviousFontEncoding \fontencoding {\encodingdefault}
\selectfont \fi
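In case anyone wants to reproduce that trace, a sketch along these lines should work, assuming a texlive with babel's Greek support installed. It is the test document from earlier in this thread with tracing switched on only around the switch back to English, to keep the log manageable. The exact placement of the tracing commands is my assumption, not something babel prescribes.

```latex
\documentclass{article}
\usepackage[polutonikogreek,english]{babel}
\begin{document}
english
\selectlanguage{polutonikogreek}
greek
\tracingmacros=1 % write each macro expansion to the .log file
\selectlanguage{english}
\tracingmacros=0 % stop tracing again
english
\end{document}
```

After a pdflatex run, searching the .log file for \BabelGreekRestoreFontEncoding should locate the expansion quoted above.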
I checked out TeX Live again at the end of June and noticed that LaTeXML once more fails to run a test using that TL snapshot.
This is reproducible with a git clone of LaTeXML I made this morning. The full build log can be seen here.
Sorry for the bad news! Thanks for the help.