ORCID / bibtexParseJs

A JavaScript library that parses BibTeX.
MIT License

Bibtex key parser error using toJSON method #33

Open diogomatheus opened 3 years ago

diogomatheus commented 3 years ago

Hi,

I found a BibTeX entry exported from scopus.com that cannot be parsed with the toJSON method; the call results in an error.

@ARTICLE{EuropeanCommission,T2019,
  author={European Commission, T},
  title={The new SME definition: User guide and model declaration},
  journal={Enterprise and Industry Publications},
  year={2019},
  note={cited By 20},
  source={Scopus},
}

I believe the problem is the comma in the BibTeX key "EuropeanCommission,T2019". Do you think it would make sense to handle this case in the parser, especially since it comes from a reliable source (Scopus)?

https://bibtex.online/ parses this entry successfully. It might be worth accepting any character in the key, or simply cutting the key at the first invalid character.
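
As a rough sketch of that idea, the input could be cleaned up before it is handed to the parser. This variant drops the invalid characters from the key rather than cutting it. The helper name sanitizeCitekeys is hypothetical and not part of bibtexParseJs, and the set of valid key characters (alphanumerics plus "-", "_" and ":") is an assumption. The regular expression is a heuristic that treats everything between the opening brace and the first "name =" field as the key, so it will not cover every BibTeX edge case.

```javascript
// Assumption: valid citekey characters are alphanumerics plus "-", "_" and ":".
const INVALID_KEY_CHARS = /[^A-Za-z0-9\-_:]/g;

// Hypothetical pre-processing helper (not part of bibtexParseJs).
// It strips invalid characters from each entry key and leaves the rest untouched.
function sanitizeCitekeys(bibtex) {
  // Heuristic: the key is the shortest text after "@type{" that is followed
  // by a comma and then the first "fieldname =" of the entry.
  const entryHead = /@([A-Za-z]+)\s*\{([^=]*?),(\s*[A-Za-z][\w-]*\s*=)/g;
  return bibtex.replace(entryHead, (match, type, key, firstField) => {
    const cleanKey = key.replace(INVALID_KEY_CHARS, "");
    return `@${type}{${cleanKey},${firstField}`;
  });
}

const scopusEntry =
  "@ARTICLE{EuropeanCommission,T2019, author={European Commission, T}, year={2019}, }";

// The key becomes "EuropeanCommissionT2019"; the fields are unchanged.
console.log(sanitizeCitekeys(scopusEntry));
```

The cleaned string can then be passed to toJSON as usual. Note that dropping characters could in principle make two different keys collide, so this is only suitable as a best-effort workaround.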

TomDemeranville commented 3 years ago

This is really a problem with the Scopus data rather than the parser. The BibTeX 'format' is not very well documented, but in this case it's very clear, according to https://www.bibtex.com/g/bibtex-format/:

The citekey can be any combination of alphanumeric characters including the characters "-", "_", and ":"

In fact I'm amazed https://bibtex.online/ works! This is kind of like including an unclosed quote or extra comma in a CSV file.
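
To illustrate, the quoted rule can be written as a simple check. This is only a sketch of that one sentence; the function names are hypothetical and this is not how bibtexParseJs itself validates keys.

```javascript
// Citekey rule as quoted above: alphanumeric characters plus "-", "_" and ":".
const CITEKEY_PATTERN = /^[A-Za-z0-9\-_:]+$/;

// Hypothetical helper: true when the key uses only the allowed characters.
function isValidCitekey(key) {
  return CITEKEY_PATTERN.test(key);
}

// Hypothetical helper: lists each distinct character that breaks the rule.
function invalidCitekeyChars(key) {
  return [...new Set(key.match(/[^A-Za-z0-9\-_:]/g) || [])];
}

console.log(isValidCitekey("EuropeanCommission,T2019")); // false
console.log(isValidCitekey("european2005new")); // true
// The comma is the only offending character in the Scopus key.
console.log(invalidCitekeyChars("EuropeanCommission,T2019"));
```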

TomDemeranville commented 3 years ago

For info, Google Books generates this BibTeX for the previous version of the document:

@book{european2005new,
  title={The New SME Definition: User Guide and Model Declaration},
  author={European Commission and Europ{\"a}ische Kommission and European Commission. Directorate-General for Employment, Social Affairs and Equal Opportunities. Unit E.3},
  isbn={9789289479097},
  lccn={2005534119},
  series={EUi collection},
  url={https://books.google.co.uk/books?id=-T2QAAAAMAAJ},
  year={2005},
  publisher={Office for Official Publications of the European Communities}
}

Which is very different!

diogomatheus commented 3 years ago

@TomDemeranville you're right, the root cause is the Scopus data. Handling this case in the parser would prevent this kind of error, but I agree with you that it's not a parser problem.

Thanks for the answer.