Problems parsing .bib from ORCID

ropensci / bib2df

Parse a BibTeX file to a tibble

https://docs.ropensci.org/bib2df


Problems parsing .bib from ORCID #21

Open gorkang opened 6 years ago

gorkang commented 6 years ago

When reading a .bib file exported from an ORCID profile (Export works), bib2df() has some problems parsing it.

The same file can be imported into Zotero without problems.

See attached bib file: works_G.zip

gorkang commented 6 years ago

OK, still some issues, but I think I found a relatively simple solution:

This has the weird parsing issues (new columns are created for some of the works):

  bib2df("DEV/Bibtex/works_G.bib")

screenshot from 2018-02-17 14-24-29

With this we correct the main issue

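  library(readr)     # read_file(), write_file()
  library(magrittr)  # %>%
  library(bib2df)

  # Put each "@type{key," header and each "...}," field on its own line
  temp = read_file("DEV/Bibtex/works_G.bib") %>%
    gsub("@(.+?),", "\n@\\1,\n", .) %>%
    gsub("},", "},\n", .)

  write_file(temp, "DEV/Bibtex/works_G_corrected.bib")

  bib2df("DEV/Bibtex/works_G_corrected.bib")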

screenshot from 2018-02-17 14-35-14

Maybe it would be nice to integrate a clean-up function that takes care of the most common issues?
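As a rough illustration of what such a clean-up step might look like, here is a minimal sketch that wraps the two gsub() calls above in a base-R function operating on a string. The name clean_bib_text is hypothetical and not part of bib2df, and the example entry is made up. It assumes that "@" only appears at the start of an entry and that "}," only appears at the end of a field, so values containing e-mail addresses or nested braces followed by a comma would be split incorrectly.

```r
# Hypothetical helper, not part of bib2df: put each entry header and each
# field on its own line so that a line-based parser can read the text.
clean_bib_text <- function(txt) {
  # Put "@type{key," on a line of its own (lazy match up to the first comma).
  txt <- gsub("@(.+?),", "\n@\\1,\n", txt, perl = TRUE)
  # Break the line after every field that ends in "},".
  txt <- gsub("},", "},\n", txt, fixed = TRUE)
  txt
}

# Made-up single-line entry, similar in shape to a one-line-per-entry export.
one_line <- "@article{key1,title={A title},author={Doe, J.},year={2018}}"
cat(clean_bib_text(one_line))
```

The cleaned text could then be written to a new file and passed to bib2df(), as in the workaround above.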

HedvigS commented 5 years ago

I also had this problem with a file and am now going to use the gsub fix as well. If there is any news on changes to the package, I hope it will be posted in this thread. Thanks @xrotwang for noticing this thread.

ottlngr commented 5 years ago

Hi, thanks for your messages, @gorkang and @HedvigS.

Indeed, bib2df currently has problems parsing a file if the key-value pairs are not separated by line breaks. I'm currently working on another package but will soon be working on bib2df again. Then I will change the whole mechanism to be independent of line breaks.

Until then, the workaround described by @gorkang is a good way to go.

HedvigS commented 5 years ago

Yep, I'm doing the gsub thing in the meantime and it works.

Thanks again for a great package!