plk / biblatex

biblatex is a sophisticated bibliography system for LaTeX users. It has considerably more features than traditional bibtex and supports UTF-8

Polyglossia messages in the log if variant is used #1393

Open u-fischer opened 3 weeks ago

u-fischer commented 3 weeks ago

When I compile this document with lualatex:

\documentclass{book}
\usepackage{polyglossia}
\setmainlanguage[variant=british]{english}
\usepackage[english=british]{csquotes}
\usepackage[]{biblatex} 
\addbibresource{biblatex-examples.bib}
\ExplSyntaxOn
%\xpg_set_hyphenation_patterns:n {british}
\ExplSyntaxOff

\begin{document}
\cite{doody} \cite{herrmann} \cite{aksin}
\printbibliography
\end{document}

I get lots of info messages in the log. In a large bibliography they can easily lead to thousands of log lines.

Module polyglossia Info: Language british was not yet loaded; created with id 4 
on input line 95
Module polyglossia Info: Language british already loaded; id is 4 on input line 
95
Module polyglossia Info: Language british already loaded; id is 4 on input line 
95
Module polyglossia Info: Language british already loaded; id is 4 on input line 
95
Module polyglossia Info: Language british already loaded; id is 4 on input line 
95
Module polyglossia Info: Language british already loaded; id is 4 on input line 
95
Module polyglossia Info: Language british already loaded; id is 4 on input line 
95
Module polyglossia Info: Language british already loaded; id is 4 on input line 
95

The messages disappear if I uncomment the \xpg_set_hyphenation_patterns:n {british} line.

I haven't fully traced the code, but imho the problem is that biblatex selects the patterns in the bibliography in a group and so polyglossia constantly redoes a check. So perhaps biblatex should try to load the pattern at least once outside a group.
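For reference, a minimal sketch of the workaround described above: the same test document with the pattern selection performed once at top level, before any grouped bibliography code runs. Note that \xpg_set_hyphenation_patterns:n is a polyglossia internal, so relying on it in user documents is an assumption that may not hold across polyglossia releases.

\documentclass{book}
\usepackage{polyglossia}
\setmainlanguage[variant=british]{english}
\usepackage[english=british]{csquotes}
\usepackage{biblatex}
\addbibresource{biblatex-examples.bib}
% Select the british patterns once, outside any group.
% This calls a polyglossia internal; treat it as a stopgap only.
\ExplSyntaxOn
\xpg_set_hyphenation_patterns:n {british}
\ExplSyntaxOff

\begin{document}
\cite{doody} \cite{herrmann} \cite{aksin}
\printbibliography
\end{document}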

moewew commented 3 weeks ago

We also have https://github.com/plk/biblatex/issues/1381. Unfortunately, I haven't had the time to look at all of this.

It seems, however, pretty inconvenient to have to issue certain commands at top-level outside any groups to keep polyglossia happy. Most biblatex typesetting is heavily grouped and having to collect stuff to perform outside a group and then executing it there sounds like a major headache.

moewew commented 3 weeks ago

And there is https://github.com/plk/biblatex/issues/1376 which issues a lot of warnings...

\begin{filecontents}{vietnamese.lbx}
\ProvidesFile{vietnamese.lbx}
[\abx@lbxid]

\DeclareBibliographyExtras{%
  \protected\def\bibrangedash{%
    \textendash\penalty\hyphenpenalty}% breakable dash
  \let\finalandcomma=\empty
  \let\finalandsemicolon=\empty
  \def\mkbibordinal{\mkbibmascord}%
  \protected\def\mkbibmascord#1{%
    \stripzeros{#1}\textordmasculine}% \textordmasculine -> textcomp.sty
  \protected\def\mkbibfemord#1{%
    \stripzeros{#1}\textordfeminine}%  \textordfeminine  -> textcomp.sty
  \protected\def\mkbibneutord{\mkbibmascord}%
  \protected\def\mkbibdatelong#1#2#3{%
    \iffieldundef{#3}
      {}
      {\stripzeros{\thefield{#3}}%
       \iffieldundef{#2}{}{\nobreakspace de\space}}%
    \iffieldundef{#2}
      {}
      {\mkbibmonth{\thefield{#2}}%
       \iffieldundef{#1}{}{\nobreakspace de\space}}%
    \iffieldbibstring{#1}
      {\bibstring{\thefield{#1}}}
      {\dateeraprintpre{#1}\stripzeros{\thefield{#1}}}}%
  \protected\def\mkbibdateshort#1#2#3{%
    \iffieldundef{#3}
      {}
      {\mkdayzeros{\thefield{#3}}%
       \iffieldundef{#2}{}{/}}%
    \iffieldundef{#2}
      {}
      {\mkmonthzeros{\thefield{#2}}%
       \iffieldundef{#1}{}{/}}%
    \iffieldbibstring{#1}
      {\bibstring{\thefield{#1}}}
      {\dateeraprintpre{#1}\mkyearzeros{\thefield{#1}}}}%
  \savecommand\mkbibordedition
  \savecommand\mkbibordseries
  \def\mkbibordedition{\mkbibfemord}%
  \def\mkbibordseries{\mkbibfemord}%
  \expandafter\protected\expandafter\def\csname mkbibtime24h\endcsname#1#2#3#4{%
      \iffieldundef{#1}
        {}
        {\mktimezeros{\thefield{#1}}%
         \iffieldundef{#2}{}{\bibtimesep}}%
      \iffieldundef{#2}
        {}
        {\mktimezeros{\thefield{#2}}%
         \iffieldundef{#3}{}{\bibtimesep}}%
      \iffieldundef{#3}
        {}
        {\mktimezeros{\thefield{#3}}}%
      \iffieldundef{#4}{}
        {\bibtimezonesep
         \mkbibtimezone{\thefield{#4}}}}%
  \expandafter\protected\expandafter\def\csname mkbibtime12h\endcsname#1#2#3#4{%
      \stripzeros{\mktimehh{\thefield{#1}}}\bibtimesep
      \forcezerosmdt{\thefield{#2}}%
      \iffieldundef{#3}{}
        {\bibtimesep
         \forcezerosmdt{\thefield{#3}}}%
       \space
       \ifnumless{\thefield{#1}}{12}
         {\bibstring{am}}
         {\bibstring{pm}}%
      \iffieldundef{#4}{}
       {\space\bibtimezonesep
        \parentext{\mkbibtimezone{\thefield{#4}}}}}%
  \protected\def\mkbibseasondateshort#1#2{%
    \mkbibseason{\thefield{#2}}%
    \iffieldundef{#1}{}{\space}%
    \dateeraprintpre{#1}\mkyearzeros{\thefield{#1}}}%
  \protected\def\mkbibseasondatelong#1#2{%
    \mkbibseason{\thefield{#2}}%
    \iffieldundef{#1}{}{\space}%
    \dateeraprintpre{#1}\mkyearzeros{\thefield{#1}}}%
}

\UndeclareBibliographyExtras{%
  \restorecommand\mkbibordedition
  \restorecommand\mkbibordseries
}

\DeclareBibliographyStrings{%
  page             = {{trang}{tr\adddotspace}},
  pages            = {{trang}{tr\adddotspace}},
  urlfrom          = {{Khai thác từ }{Khai thác từ }},
  urlseen          = {{Khai thác từ }{Khai thác từ }},
  backrefpage      = {{trích dẫn ở trang}{trích dẫn ở tr\adddotspace}},
  backrefpages     = {{trích dẫn ở trang}{trích dẫn ở tr\adddotspace}},
}

\endinput
\end{filecontents}

\documentclass{article}

\usepackage{polyglossia}
\usepackage{csquotes}
\setmainlanguage{english}
\setotherlanguage{vietnamese}

\usepackage[style=authoryear,
            backref=true,
            language=auto,
            autolang=other,
            backend=biber]{biblatex}

\begin{filecontents}[overwrite]{\jobname.bib}
@book{Hirsch-Smale:vi,
     author = {Hirsch, M. W. and Smale, S.},
      title = {Phương trình vi phân. Hệ động lực và đại số tuyến tính},
       year = {1979},
      pages = {442},
  publisher = {Đại học và Trung học chuyên nghiệp},
    address = {Hà Nội},
     langid = {vietnamese},
}

@book{Birkhoff-MacLane:vi,
     author = {Birkhoff, Garett and Mac Lane, Saunders},
      title = {Tông quan về Đại số hiện đại},
  publisher = {Nhà xuất bản Đại học và Trung học chuyên nghiệp},
    address = {Hà Nội},
       year = {1979},
      pages = {217},
     langid = {vietnamese},
}
\end{filecontents}
\addbibresource{\jobname.bib}

\begin{document}

Citing \textcite{Hirsch-Smale:vi} and \textcite{Birkhoff-MacLane:vi}.

\nocite{*}
\printbibliography

\end{document}

u-fischer commented 3 weeks ago

@moewew I have now also opened an issue at polyglossia, https://github.com/reutenauer/polyglossia/issues/669. Perhaps a solution can be found.

Udi-Fogiel commented 2 weeks ago

@moewew It should be fixed, but why do you repeatedly test for the existence of hyphenation patterns? Either they exist in a TeX installation or they don't; testing a second time wouldn't change that.

moewew commented 2 weeks ago

@Udi-Fogiel Those tests come from higher-level commands that need to know whether hyphenation patterns are available. Those commands can potentially be issued in different orders and at different times, so there is a priori no way of knowing whether we've tested before. Keeping a list of the languages we've tested and the answer polyglossia gave would avoid this, but it seems like a lot of extra work (but what do we do if some other package needs to test this? Apparently, at least when the commands were initially implemented, there was no indication that the hyphenation test would be single-use only).
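To make the "list of tested languages" idea concrete, here is a rough sketch using etoolbox. It is not biblatex's code: all \demo@... names and \DemoTest are hypothetical, and \demo@rawtest is only a stand-in for the real pattern test (here it merely checks whether \l@<language> is defined). The answer is stored globally so that it survives the groups in which most bibliography typesetting happens.

\documentclass{article}
\usepackage{etoolbox}
\makeatletter
% Hypothetical stand-in for the real (possibly noisy) pattern test.
\newcommand*{\demo@rawtest}[1]{\ifcsundef{l@#1}}
% Cached wrapper: run the raw test at most once per language,
% store the outcome globally, and reuse it on every later call.
\newcommand*{\demo@ifhyphenationundef}[1]{%
  \ifcsundef{demo@cache@#1}
    {\demo@rawtest{#1}
       {\global\cslet{demo@cache@#1}\@firstoftwo}
       {\global\cslet{demo@cache@#1}\@secondoftwo}}
    {}%
  \csuse{demo@cache@#1}}
% User-level helper for the demonstration below.
\newcommand*{\DemoTest}[1]{%
  \demo@ifhyphenationundef{#1}{patterns undefined}{patterns defined}}
\makeatother

\begin{document}
% The first call runs the raw test inside a group; the second call
% only looks up the cached answer.
{\DemoTest{british}} \DemoTest{british}
\end{document}

This only avoids repeated tests issued through the wrapper; it does nothing for other packages that call the underlying test directly, which is the concern raised above.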

Udi-Fogiel commented 2 weeks ago

Keeping a list of the languages we've tested and the answer polyglossia gave would avoid this, but it seems like a lot of extra work

There might be another solution; it depends on the code. I don't really use biblatex much, so it is hard for me to say. Can you point me to these commands in the source code?

(but what do we do if some other package needs to test this? Apparently at least when the commands were initially implemented there was no indication that the hyphenation test would be single-use only.)

The test is not single-use, and the repeated messages were definitely a bug in polyglossia. It is just that @u-fischer mentioned that she gets these lines thousands of times, so I'm wondering whether that many tests are needed and how that impacts performance.

moewew commented 2 weeks ago

@Udi-Fogiel It's probably enough to search for \blx@ifhyphenationundef in https://github.com/plk/biblatex/blob/dev/tex/latex/biblatex/biblatex.sty to get a first idea. You might have to unpack/unravel quite some macro chains for the full picture, though.

I haven't had too close a look, but I think the language switching commands indirectly issue the test, which means that whenever biblatex does a \begin{<lang>}\end{<lang>} it issues the test. Throw in a couple of tests for some .lbx files and the hyphenation exceptions and you end up with quite a lot of these messages.

u-fischer commented 2 weeks ago

@moewew, @Udi-Fogiel I do not really have the time currently to research whether the tests (and the resulting messages) are needed and why they are triggered. But this is a single-language document, the single language is properly declared in the preamble, and it feels very odd that this triggers lots of language switches and/or language tests. Note also that the problem is the variant. If one simply uses english (or babel + british), there is no problem.
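For comparison, a sketch of the babel + british variant of the test document, which is one of the two setups reported above as free of the repeated messages. Everything except the language loading is taken from the original example.

\documentclass{book}
% babel instead of polyglossia with variant=british
\usepackage[british]{babel}
\usepackage[english=british]{csquotes}
\usepackage{biblatex}
\addbibresource{biblatex-examples.bib}

\begin{document}
\cite{doody} \cite{herrmann} \cite{aksin}
\printbibliography
\end{document}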

Udi-Fogiel commented 2 weeks ago

@moewew, @Udi-Fogiel I do not really have the time currently to research whether the tests (and the resulting messages) are needed and why they are triggered. But this is a single-language document, the single language is properly declared in the preamble, and it feels very odd that this triggers lots of language switches and/or language tests.

As I mentioned, the messages from polyglossia should be gone (well, not the first one) with the master branch (but not the ones mentioned in #1381, as these are messages from biblatex, but there is a PR which fixes it).

As for the switches and tests: other than writing language switches to the standard auxiliary files (toc, lof, lot) when a language switch has been triggered from another source, polyglossia does not switch languages, nor, I believe, does it test anything.

Note also that the problem is the variant. If one simply uses english (or babel + british), there is no problem.

What problem? The messages, or the tests and switches? I don't have an example to test with, but I believe the tests are performed regardless of whether polyglossia or babel is used, and polyglossia is not in control of these tests.

moewew commented 2 weeks ago

Seems that the main perpetrator of these particular messages is

https://github.com/plk/biblatex/blob/4c9a2a83167347970a8fdf4fdf0a86af2641fec4/tex/latex/biblatex/biblatex.sty#L6834-L6844

which is called whenever \bibstring and friends are used.

I don't really understand why it is needed, but it's been with us since PL.