pkp / ots

PKP XML Parsing Service
GNU General Public License v3.0

BibTeX output is malformed #22

Closed: crism closed this issue 8 years ago

crism commented 9 years ago

Running the EEG Comic Sans document through the unit tests, progressively, produces BibTeX of this form:

@Book{RNiedermeyer,Silva,1993,
author="Niedermeyer, E.
and Silva, da Lopes F. H.",
title="Electroencephalography: Basic principles, clinical applications and related fields, 3rd edition",
year="1993",
publisher="Decker, Hamilton, Canada.",
address="Lippincott, Williams"
}

The ID must be a single token (see http://www.bibtex.org/Format/); Pandoc later chokes on this.
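
To illustrate why the commas matter, here is a minimal Python sketch. It is not OTS code, and `split_header` is a hypothetical helper. It only assumes what the BibTeX format states: the citation key ends at the first comma after the opening brace. Anything after that comma is read as the start of the field list.

```python
import re

# Hypothetical helper, not part of OTS: split an entry header the way a
# BibTeX reader does, ending the citation key at the first comma.
HEADER = re.compile(r"@(\w+)\s*\{\s*([^,]*),(.*)", re.DOTALL)

def split_header(entry: str):
    """Return (entry_type, key, remainder) for a single BibTeX entry."""
    match = HEADER.match(entry.strip())
    if match is None:
        raise ValueError("not a BibTeX entry header")
    entry_type, key, rest = match.groups()
    return entry_type, key.strip(), rest

entry = '@Book{RNiedermeyer,Silva,1993,\nauthor="Niedermeyer, E."\n}'
entry_type, key, rest = split_header(entry)
print(key)        # the key stops at the first comma
print(rest[:11])  # the leftover text a reader would try to parse as fields
```

With the entry above, the key comes out as `RNiedermeyer`, and `Silva,1993,` is left where a `name="value"` field is expected, which is the kind of input a downstream parser can reject.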

axfelix commented 9 years ago

Ah, this is what you meant about IDs blowing up. You're right -- that shouldn't be happening. We're definitely getting valid Pandoc output on most of the documents I've tested, but it's quite possible that some of them have been failing in this way.

kaschioudi commented 8 years ago

@axfelix: Could you please point me to a document that we know is failing so that I can use it to test this issue? Thanks.

axfelix commented 8 years ago

eeg_comicsans.docx

yup, sorry for the delay -- that output was reported from this document in our old smaller corpus.

kaschioudi commented 8 years ago

@crism: What do you mean by "Pandoc later chokes on this"? My patch transforms "RNiedermeyer,Silva,1993" to "RNiedermeyerSilva1993", and everything else, including Pandoc, seems to work fine.
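
The patch itself is not shown in this thread, so the following is only a rough Python sketch of the transformation described. The function name `sanitize_key` and the whitelist of allowed characters are assumptions, not the project's code.

```python
import re

def sanitize_key(raw_key: str) -> str:
    """Drop every character except ASCII letters, digits, '_', ':' and '-'.

    The whitelist is an assumption chosen to be conservative; the actual
    patch may simply remove commas.
    """
    return re.sub(r"[^A-Za-z0-9_:\-]", "", raw_key)

print(sanitize_key("RNiedermeyer,Silva,1993"))  # RNiedermeyerSilva1993
```

One caveat with this approach: stripping characters can make two different raw IDs collapse into the same key, so a real implementation would probably also need to check for duplicates.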

axfelix commented 8 years ago

Resolved along with #42