ropensci / bib2df

Parse a BibTeX file to a tibble
https://docs.ropensci.org/bib2df

Problem parsing double quoted tokens #30

Closed SimonGreenhill closed 5 years ago

SimonGreenhill commented 5 years ago

In the following BibTeX file:

@phdthesis{Yang2011Lalo,
    author = {Yang, Cathryn},
    address = {Bundoora},
    language = {English},
    school = {La Trobe University},
    shorttitle = {Lalo regional varieties},
    title = {Lalo regional varieties: {Phylogeny}, dialectometry and sociolinguistics},
    type = {{PhD} dissertation},
    year = {2011}
}

... the field type gets parsed to PhD} dissertation (i.e. the first curly brace protecting the casing in 'PhD' gets eaten).

The culprit is this statement in bib_gather. I'm not quite sure what this regex is doing, so I don't want to fiddle with it to fix it.
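
To illustrate the kind of behaviour that could produce this symptom, here is a minimal base-R sketch. It is not the actual code or regex from bib_gather; the function names strip_greedy and strip_outer and both patterns are hypothetical. It assumes each field value is wrapped in exactly one pair of outer delimiters, optionally followed by a comma.

```r
# Hypothetical illustration only; NOT the regex used inside bib2df.
# Assumption: the raw value is wrapped in one outer pair of { } or " ",
# optionally followed by a trailing comma.

# Too greedy: strips every delimiter at the start and at the end,
# so braces that protect casing inside the value are lost as well.
strip_greedy <- function(x) {
  x <- sub("^[{\"]+", "", x)
  sub("[}\"]+,?[[:space:]]*$", "", x)
}

# Narrower: strips exactly one delimiter on each side.
strip_outer <- function(x) {
  x <- sub("^[{\"]", "", x)
  sub("[}\"],?[[:space:]]*$", "", x)
}

type_raw  <- "{{PhD} dissertation},"
title_raw <- "{Annotating {M}ulti-media/{M}ulti-modal {R}esources with {ELAN}},"

strip_greedy(type_raw)   # "PhD} dissertation"
strip_outer(type_raw)    # "{PhD} dissertation"
strip_greedy(title_raw)  # ends in "{ELAN"
strip_outer(title_raw)   # ends in "{ELAN}"
```

Stripping a single delimiter per side keeps inner braces intact whether they sit at the start or at the end of the value.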

ottlngr commented 5 years ago

Hi, thanks for your message.

I'll look into this, but it seems like this can be solved easily!

ottlngr commented 5 years ago

Solved by e64df0e8af04ea2b5c8e4efd1eb2f8734e932e8e

SimonGreenhill commented 5 years ago

great! thanks!

ottlngr commented 5 years ago

But be aware that the fix is currently only available in branch v1.1.2. To install this version of bib2df, run remotes::install_github("ropensci/bib2df@v1.1.2"). This version may not be stable yet, though.

agricolamz commented 4 years ago

Hi,

I think the problem still remains when the bracket is at the end of a field:

Here is a .bib file:

@Inproceedings{brugman04,
  Author = {Brugman, H. and Russel, A. and Nijmegen, X.},
  Booktitle = {LREC},
  Title = {Annotating {M}ulti-media/{M}ulti-modal {R}esources with {ELAN}},
  Year = {2004}
}

Here is the code:

bib2df("test.bib") %>% 
   unlist() %>% 
   na.omit() %>% 
   View()

Here is the result:

CATEGORY    INPROCEEDINGS
BIBTEXKEY   brugman04
AUTHOR1     Brugman, H.
AUTHOR2     Russel, A.
AUTHOR3     Nijmegen, X.
BOOKTITLE   LREC
TITLE       Annotating {M}ulti-media/{M}ulti-modal {R}esources with {ELAN
YEAR        2004

As you can see, the title should have ended with {ELAN}.

I'm using bib2df v. 1.1.1
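
A quick way to spot values affected like this is to compare the number of opening and closing braces. The sketch below uses base R only; the helpers count_char and braces_balanced are hypothetical and not part of bib2df.

```r
# Hypothetical helpers (not part of bib2df).
# Count occurrences of a fixed character in each element of x.
count_char <- function(x, ch) {
  lengths(regmatches(x, gregexpr(ch, x, fixed = TRUE)))
}

# TRUE where a string has as many "{" as "}".
braces_balanced <- function(x) {
  count_char(x, "{") == count_char(x, "}")
}

titles <- c(
  "Annotating {M}ulti-media/{M}ulti-modal {R}esources with {ELAN",
  "Annotating {M}ulti-media/{M}ulti-modal {R}esources with {ELAN}"
)

braces_balanced(titles)  # FALSE TRUE
```

This only compares counts, so it flags a dropped brace but does not check nesting order.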