zotero / translators

Zotero Translators
http://www.zotero.org/support/dev/translators

BibTeX: All entries skipped after unclosed `{` in field #3121

Open dstillman opened 1 year ago

dstillman commented 1 year ago

https://forums.zotero.org/discussion/107237/440-papers-imported-out-of-expected-2144-during-migration

The { after "Bayesian" is breaking this. I don't know enough about BibTeX escaping to know what's going on here, though Biber seems to be able to validate it without an error.

@ARTICLE{Efron1986-bo,
  title    = "{Why isn't everyone a {\textbackslashBayesian{\}}? With
              discussion and a reply by the author}",
  author   = "Efron, B",
  journal  = "The American statistician",
  volume   =  40,
  number   =  1,
  pages    = "1--11",
  year     =  1986,
  annote   = "Efron's attempt to answer why not everyone is a Bayesian, as of
              1985. Main reasons * non-Bayesian approaches easier to use * more
              computationally feasible * don't have to rely on controversial
              subjective
              priors\textbackslashtextlessm:linebreak\textbackslashtextgreater\textbackslashtextless/m:linebreak\textbackslashtextgreater
              \textbackslashtextlessm:linebreak\textbackslashtextgreater\textbackslashtextless/m:linebreak\textbackslashtextgreater",
  keywords = "bayesian,philosophy",
  issn     = "0003-1305"
}

yaroslavvb commented 1 year ago

This issue is blocking migration from Paperpile to Zotero

adam3smith commented 1 year ago

We can try to fix this, but these are unbalanced curly brackets (as one of the closing ones is escaped), which doesn't seem right to me.

I would rather fix what looks like a pretty clearly broken BibTeX file using search & replace (e.g. \} --> \textbackslash}), though I don't understand what most of the markup is doing in the title at all.
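To make the imbalance concrete, here is a minimal Python sketch (not Zotero's translator code) that counts net brace depth in the title above. The `brace_depth` helper and its `honor_escapes` flag are hypothetical names used only for illustration. The assumption is that a parser treating `\}` as an escaped literal skips it when matching braces, as the comment above describes.

```python
def brace_depth(text: str, honor_escapes: bool = True) -> int:
    """Return the net brace depth of text; 0 means the braces balance."""
    depth = 0
    i = 0
    while i < len(text):
        ch = text[i]
        if honor_escapes and ch == "\\" and i + 1 < len(text):
            i += 2  # skip the backslash and the character it escapes
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        i += 1
    return depth


# The title value from the entry above, without the surrounding double quotes.
title = r"{Why isn't everyone a {\textbackslashBayesian{\}}? With discussion and a reply by the author}"

# The search & replace suggested above.
fixed = title.replace(r"\}", r"\textbackslash}")

print(brace_depth(title))                       # 1: one brace is never closed
print(brace_depth(title, honor_escapes=False))  # 0: balanced if every brace counts
print(brace_depth(fixed))                       # 0: balanced after the replacement
```

Under that assumption the title is left one brace short, and the suggested replacement balances it. Counting every brace literally gives zero, which might be why Biber accepts the entry; that is a guess, not something tested in this thread. A blanket replace would also rewrite any `\}` that is intentional elsewhere in the file, so reviewing the matches first seems safer.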

dstillman commented 1 year ago

I've adjusted the title to better reflect what's happening.

I agree that the BibTeX is wrong (and confusing), but we could probably be a bit more defensive here against bad input. And I don't know if there's something in the BibTeX spec that should cause us to force-close the field rather than treating thousands of additional lines as being part of the field. (I didn't test various BibTeX processors to see how they handle it.)
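One possible way to be more defensive, sketched in Python rather than in the translator's own code, is to treat a line that starts with `@type{` as the start of a new entry even if the previous field never closed. This is a heuristic assumed for illustration, not something taken from a BibTeX specification or from Zotero; `split_entries` and `ENTRY_START` are hypothetical names.

```python
import re

# Assumption: a new entry starts on a line beginning with "@", a type name,
# and an opening "{" or "(".
ENTRY_START = re.compile(r"^@[A-Za-z]+\s*[{(]", re.MULTILINE)


def split_entries(text: str) -> list[str]:
    """Split raw BibTeX into entries at line-initial "@type{" markers."""
    starts = [m.start() for m in ENTRY_START.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[s:e].strip() for s, e in zip(starts, ends)]


sample = r'''@ARTICLE{broken,
  title = "{Unclosed {\} brace}",
}
@BOOK{ok,
  title = "{Fine}",
}
'''

entries = split_entries(sample)
print(len(entries))  # 2: the broken entry does not swallow the next one
```

The trade-off is that a field value containing a line that itself begins with `@word{` would be split incorrectly, so a real parser would probably apply this only as a fallback once a field has run suspiciously long.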

yaroslavvb commented 1 year ago

I get this BibTeX when clicking "Export" from my Paperpile library, so the file is automatically generated. Maybe it's due to some formatting bugs left over from when I transitioned to Paperpile from (now defunct) CiteULike.

I could report this issue there as well, but I imagine Zotero would be more incentivized to make this work than Paperpile, since I'm stuck giving money to them until I can transition.

dstillman commented 1 year ago

To be clear, you shouldn't be waiting for us for anything — as I said in the forums, you just need to fix the few entries with these unclosed braces in a text editor. It will take 30 seconds. This ticket is just about making Zotero more forgiving of incorrect input in the future.

This is a bug in Paperpile, and you should report it to them if you care about their fixing it (and you don't even need to mention Zotero — the bug is the unescaped/unclosed brace), but that has nothing to do with your being able to import now.

yaroslavvb commented 1 year ago

In my library, the title appears as "Why isn't everyone as \Bayesian}? With discussion...", so the second curly brace } is interpreted as user text, not as a control character.

dstillman commented 1 year ago

I mean you can also just remove the characters from the affected items and re-export — the title in your library should obviously just be "Why isn't everyone a Bayesian?". But you have to look in the BibTeX file to find the affected items anyway, so you might as well just fix them there. In terms of the bug, the point is just that, no matter what's in the field, Paperpile shouldn't be outputting BibTeX with an unescaped/unclosed brace.
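Finding the affected items in the BibTeX file can also be scripted. The following Python sketch reports entries whose braces do not net to zero when `\{` and `\}` are skipped as escaped. `find_unbalanced` is a hypothetical helper, and the idea that every entry begins at the start of a line is an assumption, not something the thread states.

```python
import re

# Assumption: every entry begins at the start of a line, e.g. "@ARTICLE{key,".
ENTRY_START = re.compile(r"^@\w+\s*\{\s*([^,\s]+)")
OPEN = re.compile(r"(?<!\\)\{")   # "{" not preceded by a backslash
CLOSE = re.compile(r"(?<!\\)\}")  # "}" not preceded by a backslash


def find_unbalanced(lines):
    """Return (line_number, citation_key) for entries with unbalanced braces."""
    flagged = []
    key, start, depth = None, 0, 0
    for number, line in enumerate(lines, start=1):
        match = ENTRY_START.match(line)
        if match:
            # A new entry begins: judge the previous one, then reset the count.
            if key is not None and depth != 0:
                flagged.append((start, key))
            key, start, depth = match.group(1), number, 0
        if key is not None:
            depth += len(OPEN.findall(line)) - len(CLOSE.findall(line))
    if key is not None and depth != 0:
        flagged.append((start, key))
    return flagged


sample = r'''@ARTICLE{bad-key,
  title = "{A {\textbackslashB{\}} title}",
  year = 1986
}
@BOOK{good-key,
  title = "{Fine}"
}
'''

print(find_unbalanced(sample.splitlines()))  # [(1, 'bad-key')]
```

To check a real export, pass the lines of the file instead of `sample`. The lookbehind is a simplification: it would misread a brace that follows an escaped backslash (`\\{`), so the reported entries are candidates to inspect by hand rather than a definitive list.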