sciunto-org / python-bibtexparser

Bibtex parser for Python 3
https://bibtexparser.readthedocs.io
MIT License

Year showing in bibtexparser 0.5.5 but not 0.6 #45

Closed bamos closed 8 years ago

bamos commented 10 years ago

Hi, in this BibTeX file, bibtexparser 0.5.5 correctly reads the year from amos2013applying, but bibtexparser 0.6.0 doesn't show a year.

I'm using the following short example to show the difference between the two versions. Can you take a look at this when you get a chance?

Regards, Brandon.


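#!/usr/bin/env python3

# Import only the customization that is used (it splits the author field into a list).
from bibtexparser.customization import author
from bibtexparser.bparser import BibTexParser

with open('publications.bib', 'r') as f:
    # Parse the whole file, applying the author customization to every entry.
    entries = BibTexParser(f.read(), author).get_entry_list()

for entry in entries:
    print(entry)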

0.5.5 Output

{'booktitle': "IWCMC'13 Security, Trust and Privacy Symposium", 'title': 'Applying machine learning classifiers to dynamic Android\nmalware detection at scale', 'id': 'amos2013applying', 'type': 'inproceedings', 'year': '2013', 'author': ['Amos, Brandon', 'Turner, Hamilton', 'White, Jules']}

0.6 Output

{'type': 'inproceedings', 'author': ['Amos, Brandon', 'Turner, Hamilton', 'White, Jules'], 'id': 'amos2013applying', 'title': 'Applying machine learning classifiers to dynamic Android\nmalware detection at scale', 'booktitle': "IWCMC'13 Security, Trust and Privacy Symposium"}

gpoo commented 10 years ago

It is not only the year; it can be any last field without a comma at the end. That comma is optional in BibTeX, but somehow version 0.6 is not treating it as optional.

gpoo commented 10 years ago

The regression was introduced in commit https://github.com/sciunto-org/python-bibtexparser/commit/19bca81c1ca5c13eda70c96247918777fc93077a

According to the BibTeX summary:

two fields must be separated by a coma, but the coma after the last field of an entry is optional;

This is easier to notice if year is the last field and is written without curly brackets (which are not mandatory for this field).
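
To illustrate the rule quoted above, here is a minimal sketch of splitting an entry body into fields while treating the comma after the last field as optional. It is not bibtexparser's code; split_fields is a hypothetical helper, and it assumes field values are delimited by balanced curly brackets (quoted values and escaped brackets are not handled).

```python
def split_fields(body: str) -> list[str]:
    """Split the body of a BibTeX entry into 'name=value' pieces.

    Commas inside curly brackets belong to a value and are kept.
    A comma after the last field is accepted but not required.
    """
    fields = []
    current = []
    depth = 0  # nesting level of curly brackets
    for ch in body:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "," and depth == 0:
            # Top-level comma: the current field is complete.
            piece = "".join(current).strip()
            if piece:
                fields.append(piece)
            current = []
        else:
            current.append(ch)
    # Whatever is left is the last field when no trailing comma was written.
    last = "".join(current).strip()
    if last:
        fields.append(last)
    return fields
```

With this approach, a body ending in year=2013 and a body ending in year=2013, yield the same list of fields.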

gpoo commented 10 years ago

Meh... it seems I confused two different issues. Sorry about the noise. For this issue, the regression seems to have been introduced in commit b2d022b0.

gpoo commented 10 years ago

The problem is on line 46.

@inproceedings{amos2013applying,
title={Applying machine learning classifiers to dynamic Android
malware detection at scale},
author={Amos, Brandon and Turner, Hamilton and White, Jules},
booktitle={IWCMC'13 Security, Trust and Privacy Symposium},
year={2013}
}
% Articles.
@article{amos2014QNSTOP,
title={{QNSTOP-QuasiNewton Algorithm for Stochastic Optimization}},
author={Brandon Amos and David Easterling and Layne Watson and
William Thacker and Brent Castle and Michael Trosset},
journal={},
year={submitted},
keywords={journal}
}

If you remove the line % Articles., the year is printed correctly. Comments should be written as:

@comment{Articles.}
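
As a rough illustration of that rewrite, the sketch below converts percent-style lines that sit between entries into @comment{...} entries before the text is handed to a parser. It is not part of bibtexparser; convert_stray_comments is a hypothetical helper, and it assumes that curly brackets in the file are balanced and that the comment text itself contains no curly brackets.

```python
def convert_stray_comments(bibtex: str) -> str:
    """Rewrite '% text' lines found outside entries as '@comment{text}'."""
    depth = 0  # curly-bracket nesting; 0 means we are between entries
    out = []
    for line in bibtex.splitlines():
        stripped = line.strip()
        if depth == 0 and stripped.startswith("%"):
            # A percent line between entries: wrap its text in @comment{}.
            text = stripped.lstrip("%").strip()
            out.append("@comment{" + text + "}")
            continue
        # Track nesting so that lines inside an entry are left untouched.
        depth += line.count("{") - line.count("}")
        out.append(line)
    return "\n".join(out)
```

Applied to the excerpt above, this would turn % Articles. into @comment{Articles.} and leave both entries unchanged.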

sciunto commented 10 years ago

Many thanks to both of you. I'll try to have a look asap.

bamos commented 10 years ago

Hi, thanks @gpoo for noticing that I had incorrect comments in my BibTeX file. I've corrected these and bibtexparser 0.6 is working well now.

gpoo commented 10 years ago

Nevertheless, the parser could ignore them or emit a warning. If BibTeX compiles the file, then a parser could honor that.

sciunto commented 10 years ago

I agree with gpoo. For reference, this is the commit that fixed bamos' BibTeX file: https://github.com/bamos/cv/commit/b0bd6b5852e585f631ad972bad5dcff08fb9997c#diff-6a584f12d8a9d2773171142f50537bb3

gpoo commented 10 years ago

Still, the parser could ignore it, or complain that something is wrong with the formatting, instead of skipping one entry. The entries themselves are syntactically correct; it is just garbage in between.
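
A minimal sketch of the "complain" option, using only the Python standard library: scan the text, collect any non-empty line that sits between entries and does not start an entry, and emit a warning for each one. This is an illustration under stated assumptions, not what bibtexparser does; find_stray_text and warn_on_stray_text are hypothetical names, and curly brackets are assumed to be balanced.

```python
import warnings


def find_stray_text(bibtex: str) -> list[tuple[int, str]]:
    """Return (line number, text) for every stray line between entries."""
    depth = 0  # curly-bracket nesting; 0 means we are between entries
    stray = []
    for lineno, line in enumerate(bibtex.splitlines(), start=1):
        stripped = line.strip()
        if depth == 0 and stripped and not stripped.startswith("@"):
            # Not inside an entry and not the start of one: report it.
            stray.append((lineno, stripped))
            continue
        depth += line.count("{") - line.count("}")
    return stray


def warn_on_stray_text(bibtex: str) -> None:
    """Warn about stray lines instead of silently dropping an entry."""
    for lineno, text in find_stray_text(bibtex):
        warnings.warn(f"line {lineno}: text outside any entry: {text!r}")
```

A caller could run warn_on_stray_text on the file contents before parsing, so the user learns about the stray line rather than finding a missing field later.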

sciunto commented 8 years ago

TODO: check whether this is solved by #64

sciunto commented 8 years ago

Fixed by #64