ambs / Text-BibTeX

Text::BibTeX

Is there any way to make the parentheses and space be contained in bibtex entrykey? #30

Open hushidong opened 5 years ago

hushidong commented 5 years ago

Hello,

I am a biblatex/biber user and ran into an error while biber was parsing a bib file that contains entries whose entry keys include () or ' '. The biber/biblatex author told me the error is caused by btparse (which biber uses), so I am here to ask whether btparse could be improved or whether there is a way to work around the problem.

For this entry:

@misc{Euclidean_geometry(hi),
howpublished = {https://zh.wikipedia.org/wiki/abc},
title = {Euclidean geometry},
}

The biber output is:

INFO - This is Biber 2.12
INFO - Logfile is 'egtest.blg'
INFO - Reading 'egtest.bcf'
INFO - Using all citekeys in bib section 0
INFO - Processing section 0
INFO - Looking for bibtex format file 'egtest.bib' for section 0
INFO - LaTeX decoding ...
INFO - Found BibTeX data source 'egtest.bib'
WARN - BibTeX subsystem: C:\Users\ADMINI~1\AppData\Local\Temp\CQM6OZvJS8\egtest.bib_2912.utf8, line 6, warning: "(" in strange place -- should get a syntax error
ERROR - BibTeX subsystem: C:\Users\ADMINI~1\AppData\Local\Temp\CQM6OZvJS8\egtest.bib_2912.utf8, line 6, syntax error: found "(", expected ","
INFO - WARNINGS: 1
INFO - ERRORS: 1

And for this entry:

@misc{how are you,
howpublished = {{www.baidu.com}},
title = {how are you},
}

The biber output is:

INFO - This is Biber 2.12
INFO - Logfile is 'egtest4.blg'
INFO - Reading 'egtest4.bcf'
INFO - Using all citekeys in bib section 0
INFO - Processing section 0
INFO - Looking for bibtex format file 'egtest4.bib' for section 0
INFO - LaTeX decoding ...
INFO - Found BibTeX data source 'egtest4.bib'
ERROR - BibTeX subsystem: C:\Users\ADMINI~1\AppData\Local\Temp\Y796C4jmpi\egtest4.bib_6436.utf8, line 11, syntax error: found "are", expected ","
INFO - ERRORS: 1

ambs commented 5 years ago

Hi. I would love to be able to support that kind of thing. Unfortunately, btparse was written with an ancient version of Antlr (pccts at the time), and its code generation is no longer supported. We have been changing some details of the parser by hand, but that does not let us make changes to the code quickly.

So, while I will keep this ticket open, I do not have the time to dig into the parser source and try to change its behaviour. My suggestion is therefore to change the kind of keys used in your files. I do not see a strong reason to have parentheses or spaces in citation keys :smile:

plk commented 5 years ago

The problem, as I remember it, is that the btparse parser allows normal parentheses to play the same role as curly braces, so there is no way parentheses can appear in keys (just as a key can't contain curly braces). The only option is to change the parser token codes for parentheses, but that breaks the ability to use parentheses as braces, which some people might rely on. I agree that there is very little reason to use spaces or parentheses in keys: it is very rare indeed, and you will have trouble with many parsing libraries and tools anyway.

hushidong commented 5 years ago

OK, thanks. I will change the bib file, which is not hard to do with regular expressions.
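A minimal sketch of such a regex-based cleanup, in Python. The names sanitize_key and sanitize_keys are hypothetical, and replacing spaces and parentheses with an underscore is just one possible convention. It assumes each entry header (@type{key,) sits on its own line, as in the examples above.

```python
import re

# Matches "@type{key," at the start of a line; the key runs up to the first comma.
ENTRY_HEAD = re.compile(r"^(\s*@\s*(\w+)\s*\{)([^,\n]*),", re.MULTILINE)

# These @-commands carry no citation key and are left untouched.
NON_ENTRIES = {"comment", "string", "preamble"}


def sanitize_key(key: str) -> str:
    """Replace each run of whitespace or parentheses with a single underscore."""
    return re.sub(r"[\s()]+", "_", key.strip()).strip("_")


def sanitize_keys(bib_text: str) -> str:
    """Rewrite every entry key in bib_text; fields and values are not modified."""
    def fix(match: re.Match) -> str:
        head, entry_type, key = match.group(1), match.group(2), match.group(3)
        if entry_type.lower() in NON_ENTRIES:
            return match.group(0)
        return f"{head}{sanitize_key(key)},"

    return ENTRY_HEAD.sub(fix, bib_text)
```

The same mapping has to be applied to the citation commands in the .tex sources, and two different keys can collapse to the same sanitized key, so the result should be checked for duplicates.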

zepinglee commented 1 year ago

https://github.com/ambs/Text-BibTeX/blob/ff35b673b730c8a12efa1e5dec35c15187dfdca5/btparse/doc/bt_language.pod#L141-L144

https://github.com/ambs/Text-BibTeX/blob/ff35b673b730c8a12efa1e5dec35c15187dfdca5/btparse/doc/bt_language.pod#L175-L177

Actually, the NAME pattern is not used for entry keys. The regex pattern for entry keys should be [^ ,\t\n]* for parenthesis-style entries (@entrytype(...)) or [^ ,}\t\n]* for brace-style entries (@entrytype{...}).

The relevant code is located at bibtex.web#L6152-L6175. This procedure calls scan1_white(comma) or scan2_white(comma,right_brace), but neither of them involves id_class (defined in L877-L896), which contains the characters currently allowed in NAME.
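To make the effect of the two key patterns concrete, here is a small Python sketch. The helper scan_key is hypothetical and only applies the regexes quoted above; it is not a port of the bibtex.web procedure.

```python
import re

# Key patterns as quoted above; the entry delimiter decides which one applies.
KEY_IN_PARENS = re.compile(r"[^ ,\t\n]*")   # @entrytype(key, ...)
KEY_IN_BRACES = re.compile(r"[^ ,}\t\n]*")  # @entrytype{key, ...}


def scan_key(text: str, brace_style: bool = True) -> str:
    """Return the key at the start of text, where text begins right after the
    opening delimiter of the entry."""
    pattern = KEY_IN_BRACES if brace_style else KEY_IN_PARENS
    return pattern.match(text).group(0)


# Parentheses are ordinary key characters under both patterns.
print(scan_key("Euclidean_geometry(hi),"))
# A space ends the key, so only the first word is scanned.
print(scan_key("how are you,"))
```

Under these patterns the first key from this issue would be accepted whole, while the key containing spaces would still stop at the first space.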

Also note that the PEG for BibTeX provided by https://github.com/aclements/biblib is worth consulting as a reference.