plk / biber

Backend processor for BibLaTeX
Artistic License 2.0

New biber+biblatex breaks with multiple braces #297

Closed: jakobmoss closed this issue 4 years ago

jakobmoss commented 4 years ago

With the new versions of biber (v2.14) and biblatex (v3.14) my bibliography is broken. I experience an error of the form:

...

(./jlr_thesis.bbl
Runaway argument?
{{{hash=73bc21ba4f8e096ab6eff8d0e9cd0f77}{family={{Huber}}, familyi={\ETC.
! Paragraph ended before \name was complete.
<to be read again>
\par
l.30481
)

...

This then breaks the rest of the document. I have not updated the .bib-file, only my TeXLive installation.

It is identical to the problem described in this older question (https://tex.stackexchange.com/questions/414685/paragraph-ended-before-name-was-complete). I too use sources from ADS (and the BibDesk referencing software), which unfortunately produces an excessive number of braces.

The issue must have resurfaced in the new release(s), as it used to work.

Clarification: I have just verified that it does indeed work with the older releases (biber v2.13 and biblatex v3.13) by manually downgrading and downloading the old binary. Same input, no errors.

plk commented 4 years ago

Hmm, this must be a regression due to some changes in the recode module. Can you give me an example .bib entry that breaks things? I'll see if I can fix this in the new recode module and put in a regression test.

moewew commented 4 years ago

Take the following simplification of the linked MWE:

\documentclass{article}

\usepackage[backend=biber, style=numeric]{biblatex}

\begin{filecontents}[force]{\jobname.bib}
@ARTICLE{2013APh....50...26A,
       author = {{{\'A}lvarez}, J.~D.},
        title = "{Sensitivity of the high altitude water Cherenkov detector to sources of multi-TeV gamma rays}",
      journal = {Astroparticle Physics},
     keywords = {TeV gamma-ray astronomy, Water cherenkov, Cosmic ray, Astrophysics - High Energy Astrophysical Phenomena},
         year = "2013",
        month = "Dec",
       volume = {50},
        pages = {26-32},
          doi = {10.1016/j.astropartphys.2013.08.002},
archivePrefix = {arXiv},
       eprint = {1306.5800},
 primaryClass = {astro-ph.HE},
       adsurl = {https://ui.adsabs.harvard.edu/abs/2013APh....50...26A},
      adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}
\end{filecontents}
\addbibresource{\jobname.bib}

\begin{document}
Lorem\autocite{2013APh....50...26A}
\end{document}

the .bbl has

      \name{author}{1}{}{%
        {{hash=b62f5d285a4258b6563b6df970d4caa5}{%
           family={{{Á}lvarez}},
           familyi={{\bibinitperiod},
           given={J.\bibnamedelimi D.},
           giveni={J\bibinitperiod\bibinitdelim D\bibinitperiod}}}%
      }

So the initial is generated incorrectly. I guess this has to do with not dropping braces any more (I liked it that Biber 2.13 dropped these braces, but I know that the fact that braces are so overloaded with meaning makes it hard to know what should be done): https://github.com/plk/biber/commit/7ffd6f070b31e05a25f62d10fd2d8a38a4eb5639.

jakobmoss commented 4 years ago

Thanks for the quick response!

As far as I know, I encode all of the 'special' characters in Unicode instead of using TeX commands. I made this change while trying to fix the issue before downgrading.

I have just checked, and the entries do indeed seem to be Unicode, e.g.:

@article{vogler03a,
        Author = {{Vögler}, A. and {Schüssler}, M.},
        Doi = {10.1002/asna.200310146},
        Journal = {Astronomische Nachrichten},
        Keywords = {Sun, stars: magnetic fields, stars: activity, methods: numerical},
        Month = jan,
        Number = {4},
        Pages = {399-404},
        Title = {{Studying magneto-convection by numerical simulation}},
        Volume = {324},
        Year = {2003}
}

I don't know exactly which entry breaks things (and I don't know if the quoted entry works or not). I can provide you with my entire .bib-file if that helps, but that contains around 200 entries, and might not be super helpful in that respect...

I will try to track down the problematic entry once I have handed in my thesis (currently I just need things to work, so I keep the downgraded version in my production environment). Hopefully next week.

moewew commented 4 years ago

The best way to write this entry would be to drop all unnecessary braces and use Unicode as you said:

@article{vogler03a,
        Author = {Vögler, A. and Schüssler, M.},
        Doi = {10.1002/asna.200310146},
        Journal = {Astronomische Nachrichten},
        Keywords = {Sun, stars: magnetic fields, stars: activity, methods: numerical},
        Month = jan,
        Number = {4},
        Pages = {399-404},
        Title = {Studying magneto-convection by numerical simulation},
        Volume = {324},
        Year = {2003}
}

jakobmoss commented 4 years ago

Oh, I know. And I fully agree that the entries in my bibliography might not be optimal. The problem is that I obtain my references from ADS (The SAO/NASA Astrophysics Data System), which is the de facto standard in astronomy. Literally everyone I know in astronomy/astrophysics relies on ADS to obtain their references.

The issue is that the BibTeX export from ADS is somewhat iffy (see below), and manually fixing close to 200 entries is quite an endeavour.

The raw entry from ADS (https://ui.adsabs.harvard.edu/abs/2003AN....324..399V/exportcitation) looks like this:

@ARTICLE{2003AN....324..399V,
       author = {{V{\"o}gler}, A. and {Sch{\"u}ssler}, M.},
        title = "{Studying magneto-convection by numerical simulation}",
      journal = {Astronomische Nachrichten},
     keywords = {Sun, stars: magnetic fields, stars: activity, methods: numerical},
         year = "2003",
        month = "Jan",
       volume = {324},
       number = {4},
        pages = {399-404},
          doi = {10.1002/asna.200310146},
       adsurl = {https://ui.adsabs.harvard.edu/abs/2003AN....324..399V},
      adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}
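
Since the ADS pattern is regular (each family name wrapped in one extra pair of braces and followed by a comma), the clean-up could in principle be scripted rather than done by hand. The following is only a minimal Python sketch and is not part of Biber or ADS. The helper names `matching_brace` and `unwrap_family_names` are hypothetical. The function works on the value of a single author field rather than on a whole .bib file, and it assumes balanced braces and no escaped braces such as `\{`.

```python
import re


def matching_brace(text, start):
    """Return the index of the brace that closes the one at text[start]."""
    depth = 0
    for pos in range(start, len(text)):
        if text[pos] == "{":
            depth += 1
        elif text[pos] == "}":
            depth -= 1
            if depth == 0:
                return pos
    raise ValueError("unbalanced braces in author field")


def unwrap_family_names(author):
    """Drop the outer braces around family names, ADS style.

    A top-level brace group is unwrapped only when it is directly
    followed by a comma ("{Family}, Given"). Groups that contain a
    comma or the word "and" are kept, because their braces change how
    the name list is split.
    """
    out = []
    i = 0
    while i < len(author):
        if author[i] != "{":
            out.append(author[i])
            i += 1
            continue
        end = matching_brace(author, i)
        inner = author[i + 1:end].strip()
        followed_by_comma = author[end + 1:].lstrip().startswith(",")
        risky = "," in inner or re.search(r"\sand\s", inner) is not None
        if followed_by_comma and not risky:
            out.append(inner)                 # braces dropped
        else:
            out.append(author[i:end + 1])     # group kept as is
        i = end + 1
    return "".join(out)
```

Inner groups such as `{\"o}` are left alone, so the raw entry above would keep its accent commands and lose only the outer pair of braces around each family name. A group that is not followed by a comma, for instance a braced corporate author, is not touched.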

plk commented 4 years ago

Should be resolved on the DEV branch along with a regression test.

moewew commented 4 years ago

The MWE I posted above compiles fine with Biber 2.15 dev.

lujiajing1126 commented 4 years ago

I compiled with the latest dev branch, and it still does not work for one special case, which is also from NASA/ADS: https://ui.adsabs.harvard.edu/abs/1995ZPhyA.352..355R

@ARTICLE{1995ZPhyA.352..355R,
       author = {{Ritman}, J.~L. and {Herrmann}, N. and {Best}, D. and {Alard}, J.~P. and
         {Amouroux}, V. and {Bastid}, N. and {Belyaev}, I. and {Berger}, L. and
         {Biegansky}, J. and {Buta}, A. and {{\v{C}}aplar}, R. and {Cindro}, N. and
         {Coffin}, J.~P. and {Crochet}, P. and {Dona}, R. and {Dupieux}, P. and
         {Dzelalija}, M. and {Fintz}, P. and {Fodor}, Z. and
         {Genoux-Lubain}, A. and {Gobbi}, A. and {Goebels}, G. and
         {Guillaume}, G. and {Grigorian}, Y. and {H{\"a}fele}, E. and {Hildenbrand
        }, K.~D. and {H{\"o}lbling}, S. and {Jundt}, F. and {Kecskemeti}, J. and
         {Kirejczyk}, M. and {Korchagin}, Y. and {Kotte}, R. and {Kuhn}, C. and
         {Lambrecht}, D. and {Lebedev}, A. and {Legrand}, I. and {Leifels}, Y. and
         {Maazouzi}, C. and {Manko}, V. and {Matulewicz}, T. and
         {M{\"o}sner}, J. and {Mohren}, S. and {Moisa}, D. and {Neubert}, W. and
         {Pelte}, D. and {Petrovici}, M. and {Pinkenburg}, C. and {Rami}, F. and
         {Ramillien}, V. and {Reisdorf}, W. and {Roy}, C. and {Sch{\"u}ll}, D. and
         {Seres}, Z. and {Sikora}, B. and {Simion}, V. and
         {Siwek-Wilczy{\'n}ska}, K. and {Smolyankin}, V. and {Sodan}, U. and
         {Tizniti}, L. and {Trzaska}, M. and {Vasiliev}, M.~A. and {Wagner}, P. and
         {Wang}, G.~S. and {Wienold}, T. and {Wohlfarth}, D. and {Zhilin}, A.},
        title = "{On the transverse momentum distribution of strange hadrons produced in relativistic heavy ion collisions}",
      journal = {Zeitschrift fur Physik A Hadrons and Nuclei},
     keywords = {Nuclear Experiment},
         year = "1995",
        month = "Dec",
       volume = {352},
       number = {4},
        pages = {355-357},
          doi = {10.1007/BF01299750},
archivePrefix = {arXiv},
       eprint = {nucl-ex/9506002},
 primaryClass = {nucl-ex},
       adsurl = {https://ui.adsabs.harvard.edu/abs/1995ZPhyA.352..355R},
      adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}

The error is caused by {{\v{C}}aplar}, for which an incorrect .bbl file is generated:

        {{hash=38f5b170a907e76a7590993b792c843b}{%
           family={{{Č}aplar}},
           familyi={{\bibinitperiod},   <----- error
           given={R.},
           giveni={R\bibinitperiod}}}%

moewew commented 4 years ago

MWE

\documentclass[british]{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{babel}
\usepackage{csquotes}

\usepackage[style=authoryear, backend=biber]{biblatex}

\begin{filecontents}[force]{\jobname.bib}
@ARTICLE{lorem,
       author = {{{\v{C}}aplar}, R.},
        title = {On the transverse momentum distribution of strange hadrons produced in relativistic heavy ion collisions},
      journal = {Zeitschrift fur Physik A Hadrons and Nuclei},
     keywords = {Nuclear Experiment},
         date = {1995-12},
       volume = {352},
       number = {4},
        pages = {355-357},
          doi = {10.1007/BF01299750},
archivePrefix = {arXiv},
       eprint = {nucl-ex/9506002},
 primaryClass = {nucl-ex},
}
\end{filecontents}
\addbibresource{\jobname.bib}
\addbibresource{biblatex-examples.bib}

\begin{document}
\cite{sigfridsson,lorem}
\printbibliography
\end{document}

I can reproduce the issue with Biber 2.15 dev from 2019-01-08 (SourceForge).

plk commented 4 years ago

Should be fixed in DEV.

moewew commented 4 years ago

Works fine now. I notice that from

\documentclass[british]{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{babel}
\usepackage{csquotes}

\usepackage[style=authoryear, backend=biber]{biblatex}

\begin{filecontents}[force]{\jobname.bib}
@ARTICLE{lorem,
       author = {{{\v{C}}aplar}, R. and {\v{C}}aplar, S. and {\v C}aplar, T.},
        title = {Title},
      journal = {Journal},
         date = {1995-12},
       volume = {352},
       number = {4},
}
\end{filecontents}
\addbibresource{\jobname.bib}
\addbibresource{biblatex-examples.bib}

\begin{document}
\cite{sigfridsson,lorem}
\printbibliography
\end{document}

we get

      \name{author}{3}{}{%
        {{un=1,uniquepart=given,hash=38f5b170a907e76a7590993b792c843b}{%
           family={{{Č}aplar}},
           familyi={Č\bibinitperiod},
           given={R.},
           giveni={R\bibinitperiod},
           givenun=1}}%
        {{un=1,uniquepart=given,hash=ffe886b9b55aaa5efea1670c7598160b}{%
           family={{Č}aplar},
           familyi={Č\bibinitperiod},
           given={S.},
           giveni={S\bibinitperiod},
           givenun=1}}%
        {{un=0,uniquepart=base,hash=72a2e7e98918e2a9fea15b939c046c5c}{%
           family={Čaplar},
           familyi={Č\bibinitperiod},
           given={T.},
           giveni={T\bibinitperiod},
           givenun=0}}%
      }

I.e. there is a pair of braces around the Č in the first two but not in the third case.

Would it be possible to strip these unwanted braces so that we get Č from {\v{C}} as well, as in the following?

      \name{author}{3}{}{%
        {{un=1,uniquepart=given,hash=38f5b170a907e76a7590993b792c843b}{%
           family={{Čaplar}},
           familyi={Č\bibinitperiod},
           given={R.},
           giveni={R\bibinitperiod},
           givenun=1}}%
        {{un=1,uniquepart=given,hash=ffe886b9b55aaa5efea1670c7598160b}{%
           family={Čaplar},
           familyi={Č\bibinitperiod},
           given={S.},
           giveni={S\bibinitperiod},
           givenun=1}}%
        {{un=0,uniquepart=base,hash=72a2e7e98918e2a9fea15b939c046c5c}{%
           family={Čaplar},
           familyi={Č\bibinitperiod},
           given={T.},
           giveni={T\bibinitperiod},
           givenun=0}}%
      }

plk commented 4 years ago

We can't do that because brace protection around single accented chars is sometimes there explicitly in order to protect capitalisation ... I would consider the first two cases as examples of this and there were some examples of this from auto-generated files which gave rise to this behaviour requirement in biber. We preserve extra braces just in case they are there on purpose ...

moewew commented 4 years ago

I don't think we need to preserve the braces. BibTeX does not treat them as protecting braces as the following MWE shows

\documentclass[british]{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{babel}
\usepackage{csquotes}

\begin{filecontents}[force]{\jobname.bib}
@article{lorem,
  title   = {Lorem and {{\v C}aplar}, P. and {{\v{C}}aplar}, R.
             and {\v{C}}aplar, S. and {\v C}aplar, T.
             and {{\v C}aplar}, U. and {{\v{C}}aplar}, V.},
  author  = {Anne Uthor},
  journal = {Journal},
  year    = {1995},
  volume  = {352},
  number  = {4},
}
\end{filecontents}

\begin{document}
\cite{lorem}
\bibliographystyle{plain}
\bibliography{\jobname}
\end{document}

The resulting .bbl is

\begin{thebibliography}{1}

\bibitem{lorem}
Anne Uthor.
\newblock Lorem and {{\v C}aplar}, p. and {{\v{C}}aplar}, r. and
  {\v{c}}aplar, s. and {\v c}aplar, t. and {{\v C}aplar}, u. and
  {{\v{C}}aplar}, v.
\newblock {\em Journal}, 352(4), 1995.

\end{thebibliography}

So I don't think we need to keep the braces around as protecting braces.
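
The .bbl above is consistent with BibTeX's rule for "special characters": a brace group at depth 0 whose first character is a backslash is treated as a single letter and is case-changed, while any other brace group is protected. A simplified Python sketch of that rule follows. It is an illustration only and not BibTeX's actual `change.case$` implementation. The names `bibtex_title_lower`, `_matching_brace` and `_lower_special` are hypothetical, and the sketch leaves out the rule that keeps a capital after a colon as well as the special handling of commands such as `\AE`.

```python
import re


def _matching_brace(text, start):
    """Return the index of the brace that closes the one at text[start]."""
    depth = 0
    for pos in range(start, len(text)):
        if text[pos] == "{":
            depth += 1
        elif text[pos] == "}":
            depth -= 1
            if depth == 0:
                return pos
    raise ValueError("unbalanced braces in title")


def _lower_special(group):
    """Lowercase a special character but keep control sequences intact."""
    return re.sub(
        r"\\[A-Za-z]+|\\.|[^\\]+",
        lambda m: m.group(0) if m.group(0).startswith("\\") else m.group(0).lower(),
        group,
    )


def bibtex_title_lower(title):
    """Approximate the lowercasing that plain.bst applies to article titles."""
    out = []
    depth = 0
    i = 0
    while i < len(title):
        ch = title[i]
        # Special character: "{\" at brace depth 0. It is case-changed.
        if ch == "{" and depth == 0 and title.startswith("\\", i + 1):
            end = _matching_brace(title, i)
            group = title[i:end + 1]
            out.append(group if i == 0 else _lower_special(group))
            i = end + 1
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        # Only text at depth 0 is lowercased; the first character is kept.
        out.append(ch.lower() if depth == 0 and i > 0 else ch)
        i += 1
    return "".join(out)
```

Under these assumptions, `{\v{C}}aplar` and `{\v C}aplar` both lose their capital, whereas `{{\v{C}}aplar}` keeps it because the outer group does not start with a backslash. So it is the extra outer pair of braces that protects the case, not the pair around the accent command.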

plk commented 4 years ago

In names, perhaps, but in titles, for example, shouldn't we preserve braces, as in:

TITLE = {A Title with {\v{S}}omething {\'{I}}mportant}

decoding this to:

TITLE = {A Title with {Š}omething {Í}mportant}

rather than

TITLE = {A Title with Šomething Ímportant}

I know that it's better to protect the whole word anyway but there are lots of auto-generated .bibs out there that protect only the first character of some words.

Currently, biber doesn't decode differently depending on the field but this could be done.

moewew commented 4 years ago

Oh, I definitely don't want to have different encodings for different fields.

The example above was for titles and demonstrates that for BibTeX {\v{C}}aplar and {\v C}aplar both do not apply case protection. (I used the title field because plain applies case changing to the title field of @articles).

lujiajing1126 commented 4 years ago

"Should be fixed in DEV."

It works perfectly for me! Thx.

But I think these examples should be added as regression tests, so that this can be avoided in the future.

docum3nt commented 4 years ago

I somehow missed this issue and posted a MWE to c.t.t about the identical problem. I encountered it with a public .bib file (Nelson Beebe's sgml.bib), which I believe has been used and tested sufficiently over the years that it can be taken as correct (but you never know). I have read through this issue, but I still can't see what is causing the error. The MWE is below.

Peter

bug.tex
=============================================================
\documentclass{article}
\usepackage[backend=biber,style=authoryear]{biblatex}
\addbibresource{bug.bib}
\begin{document}
This \cite{FernandezRequejo:1997:APH}
\printbibliography
\end{document}

bug.bib
=============================================================
@Book{FernandezRequejo:1997:APH,
  author =          "Antonio {{Fernandez Requejo}, tr}",
  title =           "Aprende y practica {HTML} 3.2",
  publisher =       pub-ANAYA-MULTIMEDIA,
  address =         pub-ANAYA-MULTIMEDIA:adr,
  pages =           289,
  year =            1997,
  ISBN =            "84-415-0179-3",
  LCCN =            "????",
  bibdate =         "Fri Sep 11 08:29:11 MDT 1998",
  series =          "Libro PC Magazine",
  acknowledgement = ack-nhfb,
  annote =          "Titulo original: QuickStart HTML 3.2 for the Internet and Intranets.",
  keywords =        "HTML (Lenguaje de programacion); Programacion (Computadoras electronicas) -- Lenguajes; Web, pagina -- Diseno",
}

bug.bbl
==============================================================
% $ biblatex auxiliary file $
% $ biblatex bbl format version 3.1 $
% Do not modify the above lines!
%
% This is an auxiliary file used by the 'biblatex' package.
% This file may safely be deleted. It will be recreated by
% biber as required.
%
\begingroup
\makeatletter
\@ifundefined{ver@biblatex.sty}
  {\@latex@error
     {Missing 'biblatex' package}
     {The bibliography requires the 'biblatex' package.}
      \aftergroup\endinput}
  {}
\endgroup

\refsection{0}
  \datalist[entry]{nyt/global//global/global}
    \entry{FernandezRequejo:1997:APH}{book}{}
      \name{author}{1}{}{%
        {{un=0,uniquepart=base,hash=6269be744c84205c4caa2fe38c268f85}{%
           family={{{Fernandez Requejo}, tr}},
           familyi={{\bibinitperiod},
           given={Antonio},
           giveni={A\bibinitperiod},
           givenun=0}}%
      } % <-- there should be an additional closing curly brace here
      \strng{namehash}{6269be744c84205c4caa2fe38c268f85}
      \strng{fullhash}{6269be744c84205c4caa2fe38c268f85}
      \strng{bibnamehash}{6269be744c84205c4caa2fe38c268f85}
      \strng{authorbibnamehash}{6269be744c84205c4caa2fe38c268f85}
      \strng{authornamehash}{6269be744c84205c4caa2fe38c268f85}
      \strng{authorfullhash}{6269be744c84205c4caa2fe38c268f85}
      \field{sortinit}{F}
      \field{sortinithash}{fb0c0faa89eb6abae8213bf60e6799ea}
      \field{extradatescope}{labelyear}
      \field{labeldatesource}{year}
      \field{labelnamesource}{author}
      \field{labeltitlesource}{title}
      \field{annotation}{Titulo original: QuickStart HTML 3.2 for the Internet and Intranets.}
      \field{isbn}{84-415-0179-3}
      \field{series}{Libro PC Magazine}
      \field{title}{Aprende y practica {HTML} 3.2}
      \field{year}{1997}
      \field{pages}{289}
      \range{pages}{1}
      \keyw{HTML (Lenguaje de programacion); Programacion (Computadoras electronicas) -- Lenguajes; Web,pagina -- Diseno}
    \endentry
  \enddatalist
\endrefsection
\endinput

moewew commented 4 years ago

For easier testing here is a self-contained MWE

\documentclass[british]{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{babel}
\usepackage{csquotes}

\usepackage[style=authoryear, backend=biber]{biblatex}

\begin{filecontents}{\jobname.bib}
@book{FernandezRequejo:1997:APH,
  author    = {Antonio {{Fernandez Requejo}, tr}},
  title     = {Aprende y practica {HTML} 3.2},
  publisher = {Anaya Multimedia},
  address   = {Madrid},
  pages     = 289,
  year      = 1997,
  ISBN      = {84-415-0179-3},
  series    = {Libro PC Magazine},
}
\end{filecontents}
\addbibresource{\jobname.bib}

\begin{document}
\cite{FernandezRequejo:1997:APH}
\printbibliography
\end{document}

With Biber 2.14 the MWE indeed produces the error you describe. With Biber 2.15 (dev) there is no such error any more. So the immediate error will be resolved in the next Biber release.

But I don't think

  author    = {Antonio {{Fernandez Requejo}, tr}},

is good input. Indeed with Biber 2.15 (dev) the MWE produces

Fernandez Requejo, tr, Antonio (1997).

Fernandez Requejo, tr

which is probably not what one would want to see.

I'm guessing the tr was added to show that Antonio Fernandez Requejo is the translator. In that case the following would be a better biblatex entry: it does not need excessive braces and compiles with Biber 2.14. The option usetranslator=true allows the translator to appear in the author position before the title.

\documentclass[british]{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{babel}
\usepackage{csquotes}

\usepackage[style=authoryear, backend=biber]{biblatex}

\begin{filecontents}{\jobname.bib}
@book{FernandezRequejo:1997:APH,
  translator = {Fernandez Requejo, Antonio},
  options    = {usetranslator=true},
  title      = {Aprende y practica {HTML} 3.2},
  publisher  = {Anaya Multimedia},
  address    = {Madrid},
  pages      = 289,
  year       = 1997,
  ISBN       = {84-415-0179-3},
  series     = {Libro PC Magazine},
}
\end{filecontents}
\addbibresource{\jobname.bib}

\begin{document}
\cite{FernandezRequejo:1997:APH}
\printbibliography
\end{document}

Fernandez Requejo, Antonio, trans. (1997)

Fernandez Requejo

docum3nt commented 4 years ago

On 19/05/2020 16:41, moewew wrote:

> For easier testing here is a self-contained MWE [...] With Biber 2.14 the MWE indeed produces the error you describe. With Biber 2.15 (dev) there is no such error any more. So the immediate error will

Thank you!

> But I don't think
>
> author = {Antonio {{Fernandez Requejo}, tr}},
>
> is good input.

No, it's not, but it's from Nelson Beebe's sgml.bib, compiled in the days of old BibTeX. I was just using it as an example in a new class package called bookshelf, which will be out soon.

Thanks again for the help.

Peter

plk commented 4 years ago

I understand the problem, but it gets increasingly hard to maintain backwards compatibility with edge-case, semi-broken input ...