celsobarreto / java-bibtex

Automatically exported from code.google.com/p/java-bibtex
BSD 3-Clause "New" or "Revised" License

Encoding user text strings to LaTeX strings #3

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
There should be an API for encoding user text strings to LaTeX strings. The 
encoding consists of two tasks:
1) Escaping BibTeX and LaTeX special characters such as '{', '#' and '&'. For 
example, the publisher name "Taylor & Francis" should become "Taylor \& 
Francis".
2) Replacing characters outside the US-ASCII character set with the 
corresponding LaTeX commands. For example, the Greek small letter alpha should 
be replaced with "\alpha".

Note that the character replacement step is mandatory when the API user wants 
the output in the US-ASCII character encoding. It is optional when the output 
is in a Unicode character encoding.
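
A minimal sketch of what such an encoder could look like is given below. It is not part of java-bibtex: the class name LaTeXEncoderSketch, the encode method and the asciiOnly flag are all hypothetical, and the two lookup tables are a small, deliberately incomplete sample. The flag models the point above: replacement of non-ASCII characters happens only when US-ASCII output is requested, while escaping of special characters always happens.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; this class does not exist in java-bibtex.
final class LaTeXEncoderSketch {

    // Task 1: special characters and their escaped forms (sample, not exhaustive).
    private static final Map<Character, String> SPECIAL = new HashMap<>();
    // Task 2: non-ASCII characters and their LaTeX commands (sample, not exhaustive).
    private static final Map<Character, String> NON_ASCII = new HashMap<>();

    static {
        SPECIAL.put('{', "\\{");
        SPECIAL.put('}', "\\}");
        SPECIAL.put('#', "\\#");
        SPECIAL.put('&', "\\&");
        SPECIAL.put('%', "\\%");
        SPECIAL.put('$', "\\$");
        SPECIAL.put('_', "\\_");

        // Greek small letter alpha. A real encoder would also have to deal with
        // math mode and with terminating the command name before a letter.
        NON_ASCII.put('\u03B1', "\\alpha");
        // Latin small letter e with acute.
        NON_ASCII.put('\u00E9', "\\'{e}");
    }

    /**
     * Escapes special characters always; replaces non-ASCII characters
     * only when asciiOnly is true.
     */
    static String encode(String text, boolean asciiOnly) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            String escaped = SPECIAL.get(c);
            if (escaped != null) {
                sb.append(escaped);
            } else if (asciiOnly && c > 127) {
                String command = NON_ASCII.get(c);
                if (command == null) {
                    // ASCII output was requested but no replacement is known.
                    throw new IllegalArgumentException(
                        "No LaTeX replacement for U+" + Integer.toHexString(c));
                }
                sb.append(command);
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("Taylor & Francis", true));
        System.out.println(encode("\u03B1", true));
        System.out.println(encode("\u03B1", false));
    }
}
```

Throwing on an unknown character in ASCII mode is one possible design choice; silently dropping the character or passing it through would defeat the purpose of requesting ASCII output.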

The inverse operation (i.e. decoding LaTeX strings to user text strings) is 
currently implemented in the LaTeXPrinter class. It would probably be wise to 
rename this class, so that there is a matching pair of LaTeXEncoder and 
LaTeXDecoder classes.

Original issue reported on code.google.com by villu.ru...@gmail.com on 15 May 2012 at 9:38

GoogleCodeExporter commented 9 years ago
I am trying to parse the following BibTeX with java-bibtex 1.0.3 on Java 7.

@article { boverhof2008,
title = {Synthesis and characterization of some diorganotin(IV) complexes of 
Schiff bases derived from a non-protein amino acid. Crystal structures of (HO 
2CC6H4[N=C(H)KC(CH3)CH(CH 3)-3-OH]-p) and its di-n-butyltin(IV) complex (nBu 
2Sn\{O2CC6H4[N=C(H)\}\{C(CH 3)CH(CH3)...},
journal = {Applied Organometallic Chemistry},
year = {2008},
volume = {22},
number = {2},
pages = {114-121},
author = {Basu Baul, T.S. and Masharing, C. and Basu, S. and Pettinari, C. and 
Rivarola, E. and Chantrapromma, S. and Fun, H.-K.}
}

However, I get the following error.

Exception in thread "main" org.jbibtex.ParseException: Encountered "<EOF>" at 
line 9, column 3.
Was expecting one of:
    "," ...
    "#" ...
    "}" ...
    "," ...

    at org.jbibtex.BibTeXParser.generateParseException(BibTeXParser.java:926)
    at org.jbibtex.BibTeXParser.jj_consume_token(BibTeXParser.java:811)
    at org.jbibtex.BibTeXParser.Entry(BibTeXParser.java:366)
    at org.jbibtex.BibTeXParser.Object(BibTeXParser.java:201)
    at org.jbibtex.BibTeXParser.Database(BibTeXParser.java:176)
    at org.jbibtex.BibTeXParser.parse(BibTeXParser.java:21)
    at org.orcid.utils.BibtexUtils.getBibTeXDatabase(BibtexUtils.java:167)
    at org.orcid.utils.BibtexUtils.getBibTeXEntries(BibtexUtils.java:66)
    at org.orcid.utils.BibtexUtils.toCitation(BibtexUtils.java:108)
    at org.orcid.core.cli.ValidateBibTex.execute(ValidateBibTex.java:82)
    at org.orcid.core.cli.ValidateBibTex.main(ValidateBibTex.java:64)

It looks like this is because java-bibtex does not recognize \{ and \} as 
escaped curly braces. It therefore sees unbalanced curly braces, reads to the 
end of the input looking for the closing brace, and throws the exception.
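
To illustrate that hypothesis, here is a small standalone sketch that counts brace depth in a shortened version of the title value, once counting every brace and once skipping any character that follows a backslash. The class BraceBalanceSketch and its netDepth method are made up for this comment; this is not how BibTeXParser is implemented, only a way to show why the two readings disagree.

```java
// Hypothetical illustration only; this class does not exist in java-bibtex.
final class BraceBalanceSketch {

    /**
     * Returns the net depth of curly braces in text, or -1 if a closing
     * brace appears with nothing open. When honourEscapes is true, the
     * character after a backslash is skipped, so "\{" and "\}" do not count.
     */
    static int netDepth(String text, boolean honourEscapes) {
        int depth = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (honourEscapes && c == '\\' && i + 1 < text.length()) {
                i++; // skip the escaped character
                continue;
            }
            if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth < 0) {
                    return -1;
                }
            }
        }
        return depth;
    }

    public static void main(String[] args) {
        // Shortened from the title field above: one "\}" and two "\{".
        String value = "{nBu2Sn\\{O2C\\}\\{C(CH3)...}";
        System.out.println(netDepth(value, false)); // 1: one brace left open
        System.out.println(netDepth(value, true));  // 0: balanced
    }
}
```

When every brace is counted, the value never closes, which is consistent with the parser running into "<EOF>" while still expecting "}".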

Original comment by wjrsimp...@gmail.com on 15 Apr 2013 at 5:37

GoogleCodeExporter commented 9 years ago
Comment #1 presents an interesting edge case (i.e. unbalanced curly braces 
that are escaped using LaTeX syntax). It has been split off into a separate 
issue (issue 10).

Original comment by villu.ru...@gmail.com on 15 Apr 2013 at 6:34