Closed aw-bib closed 9 years ago
Well, this is what it looks like on my Mid-2011 MacBook Air, with a 1683-item biblatex file:
$ time -p pandoc -s -F pandoc-citeproc -o test.html << EOT
---
bibliography: test.bib
nocite: '@*'
...
EOT
real 58.08
user 56.97
sys 0.86
$
So this doesn’t look quite as bad as your report suggests.
Can you process your bib(la)tex files with latex/pdflatex/xelatex/… and bibtex/biber?
Any error messages with pandoc-citeproc -y yourfile.bib?
Any error messages with biber --tool -V yourfile.bib?
This sounds interesting indeed.
Can you process your bib(la)tex files with latex/pdflatex/xelatex/… and bibtex/biber?
Yes, in LaTeX everything compiles nicely and I get a bibliography as well.
Are there any known cases where pandoc-citeproc is a bit more picky than e.g. bibtex?
I'll check the suggested tools tonight.
Any error messages with biber --tool -V yourfile.bib?
This did indeed turn up an error, an invalid key, which I have fixed.
As for pandoc-citeproc -y yourfile.bib, I see no error message as such. However, Debian's version of pandoc (1.12) throws:
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.
That build does not accept the RTS options, so I tried the latest and greatest deb from pandoc (1.15.1). This one starts running, eats some 9 GB of RAM, and then just sits there. Any ideas what might be eating up the RAM? To me it sounds a bit like a parsing error, but since I don't know what pandoc-citeproc is trying to accomplish, I have no clue what to look for.
+++ Alexander Wagner [Nov 09 15 10:25 ]:
Any error messages with biber --tool -V yourfile.bib?
This did indeed turn up an error, an invalid key, which I have fixed.
As for pandoc-citeproc -y yourfile.bib, I see no error message as such. However, Debian's version of pandoc (1.12) throws: Stack space overflow: current size 8388608 bytes. Use `+RTS -Ksize -RTS' to increase it.
That build does not accept the RTS options, so I tried the latest and greatest deb from pandoc (1.15.1). This one starts running, eats some 9 GB of RAM, and then just sits there. Any ideas what might be eating up the RAM? To me it sounds a bit like a parsing error, but since I don't know what pandoc-citeproc is trying to accomplish, I have no clue what to look for.
Can you upload your bibtex file somewhere so we can test?
Sure. Feel free to fetch it from http://www.desy.de/~arwagner/pandoc-citeproc.bib
Delete the line CROSSREF = {Walden-2008}, from
@BOOK{Walden-2008,
CROSSREF = {Walden-2008},
EDITION = {1. publ.},
EDITOR = {Scott Walden},
ISBN = {9781405139243},
LOCATION = {Malden, MA},
PAGETOTAL = {XII, 325},
PPN_gvk = {566382393},
PUBLISHER = {Blackwell},
SERIES = {New directions in aesthetics},
SUBTITLE = {essays on the pencil of nature},
TITLE = {{P}hotography and philosophy},
YEAR = {2008},
}
… and try again.
Quite clearly this is something you should not have in your data. Not sure whether it’s possible (or worth trying) for pandoc-citeproc to catch this.
Ah! A loop, indeed. And of course you're right. How did you find it? I have some dealings with other people's bibliographies, and knowledge about "how to detect errors" comes in handy.
@nickbart1980 you made my day. :)
300 pages later I can report a working conversion including all bibliographic references. There is no performance issue after all; it was just the looping crossref.
Maybe you can comment here on how to find such errors or how you did it.
No special tools, I’m afraid, just vgrep :-)
We should probably fix pandoc-citeproc so it doesn't go into an infinite loop even with a loopy bibtex file. So I'll reopen this as a reminder to do that.
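Until then, a loop like the Walden-2008 one can be found mechanically rather than by eye. Below is a minimal sketch in Python, not anything pandoc-citeproc or biber ships. It assumes entries look like the one quoted above (`@TYPE{key,` followed by fields, with the crossref target in braces or double quotes) and that keys are compared case-sensitively. The names `crossref_map` and `find_crossref_cycles` are made up for this illustration, and the regexes are a rough approximation of the bib(la)tex syntax, not a real parser.

```python
import re

# Start of an entry, e.g. "@BOOK{Walden-2008,"
ENTRY_RE = re.compile(r'@(\w+)\s*\{\s*([^,\s]+)\s*,')
# A crossref field, e.g. "CROSSREF = {Walden-2008}" (field name in any case)
CROSSREF_RE = re.compile(r'\bcrossref\s*=\s*[{"]([^}"]+)[}"]', re.IGNORECASE)


def crossref_map(bib_text):
    """Map each entry key to the key named in its crossref field, if any."""
    starts = list(ENTRY_RE.finditer(bib_text))
    refs = {}
    for i, match in enumerate(starts):
        # An entry's body runs up to the start of the next entry (or EOF).
        end = starts[i + 1].start() if i + 1 < len(starts) else len(bib_text)
        found = CROSSREF_RE.search(bib_text[match.end():end])
        if found:
            refs[match.group(2)] = found.group(1).strip()
    return refs


def find_crossref_cycles(bib_text):
    """Return the keys whose crossref chain leads back to themselves."""
    refs = crossref_map(bib_text)
    looping = []
    for key in refs:
        seen = set()
        current = key
        # Follow the chain until it ends or revisits a key.
        while current in refs and current not in seen:
            seen.add(current)
            current = refs[current]
        if current == key:
            looping.append(key)
    return looping
```

Reading the .bib file into a string and passing it to `find_crossref_cycles` should list every key that sits on a crossref loop, including the simplest case of an entry that cross-references itself. Entries that merely point into a loop without being part of it are not reported.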
I recently tried to convert a book typeset in LaTeX to docx using pandoc. Everything worked out nicely except the references from BibTeX. I was able to cut the original bibtex input down from some 1500 references to the ones actually used, but even with ~400 references pandoc-citeproc never seems to finish processing. Since the number of bibtex entries cannot be reduced any further, it would be nice if there were some other way to handle such bibliographies.
I could have gotten by with converting chapter by chapter, but with an input of ~400 entries it did not finish even for a chapter with only 25 references. (I stopped it after some 15 min at 100% CPU.)
Besides theses and other scientific books, pandoc would also come in handy for producing bibliographies in a number of formats, e.g. something along the lines of \nocite{*} with a given bibtex input. However, annual reporting schemes easily reach several hundred publications. #71 does not seem to gain enough here.
I tried pandoc 1.15.1 on Linux.