Open Robinlovelace opened 4 years ago
As a follow-on point, I've just tested parsing files with the bib2df
package and it seems fast.
Timings below on a .bib file with 2,000+ entries, FYI.
> system.time({b = bib2df::bib2df("allrefs.bib")})
Some BibTeX entries may have been dropped.
The result could be malformed.
Review the .bib file and make sure every single entry starts
with a '@'.
Column `YEAR` contains character strings.
No coercion to numeric applied.
user system elapsed
2.098 0.003 2.112
Warning message:
In bib2df_tidy(bib, separate_names) : NAs introduced by coercion
> nrow(b)
[1] 2755
> system.time({b2 = citr:::read_bib_catch_error("allrefs.bib")})
<simpleError in RefManageR::ReadBib(x, check = FALSE, .Encoding = encoding): argument "encoding" is missing, with no default>
user system elapsed
0.108 0.000 0.108
> system.time({b2 = citr:::read_bib_catch_error("~/uaf/allrefs.bib", )})
x= encoding=
> system.time({b2 = citr:::read_bib_catch_error("~/uaf/allrefs.bib", "UTF-8")})
user system elapsed
7.179 0.093 7.272
Update: FYI, I think the output from that package is not production-ready yet. Just food for thought...
Hi Robin, thanks for sharing your results. This is actually one of the top two issues I want to tackle next. This looks promising.
Here are some of my thoughts on this. I think there are two major options here to speed up reading from Zotero:
1. bib2df, and using future or promises to enable loading the database in the background. Have you, by chance, looked at promises? They seem to be an alternative to future, but I haven't fully understood the strengths of each approach to decide which way to go on this. bib2df also looks like a promising alternative to RefManageR and bibtex!
2. The pandoc-zotxt Lua filter with the R Markdown document format (e.g., using rmdfiltr). However, if I understand correctly, this would require installation of zotxt, another Zotero plugin. I haven't tried zotxt and pandoc-zotxt, but if the bibliography export is fast(er than BBT), this could be the easiest and fastest way to address slow loading of the Zotero database.
Hence, I'm leaning towards the second option. This would require some testing and some user interface considerations (would this be a separate addin or could it be integrated with the existing one?).
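For what it's worth, the background-loading idea from the first option could look roughly like the sketch below. This is only an illustration, not how citr does it: it assumes the future package is installed, and slow_read_bib() is a hypothetical stand-in for whatever slow parser ends up being used (e.g., a call to RefManageR::ReadBib()). The file name is just the one from the timings above; the stand-in never actually opens it.

```r
library(future)

# Evaluate futures in separate background R sessions so that the
# main session stays responsive while the bibliography is parsed.
plan(multisession)

# Hypothetical stand-in for a slow bibliography parser; it only sleeps
# and returns a small data frame, it does not read the file.
slow_read_bib <- function(path) {
  Sys.sleep(1)  # simulate a slow parse
  data.frame(key = c("ref1", "ref2"), path = path, stringsAsFactors = FALSE)
}

# Start loading in the background; this call returns immediately.
bib_future <- future(slow_read_bib("allrefs.bib"))

# The session is free to do other work in the meantime.
message("Session is still responsive while the file loads...")

# resolved() checks for completion without blocking;
# value() blocks until the result is available and returns it.
while (!resolved(bib_future)) Sys.sleep(0.1)
bib <- value(bib_future)

# Shut the background workers down again.
plan(sequential)
```

In an addin, the polling loop would presumably be replaced by checking resolved() whenever the user opens the dialog, so nothing ever blocks. As far as I understand, promises would be a way to attach a callback to the same future instead of polling.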
Just to link to the previous issue on background loading of the Zotero database: https://github.com/crsh/citr/issues/36
I haven't tried promises; in my experience bib2df is buggy. All approaches sound good, I'm excited for this new behaviour and happy to test anything you come up with. Many thanks.
After playing around with pandoc-zotxt a little, I've come to understand that it requires the global pandoc variable PANDOC_STATE, which was introduced in pandoc 2.4. Currently, RStudio is shipping version 2.3.1, so I'll wait until they ship a newer version before starting to implement and test this.
It's frustrating when citr freezes your session, so I thought I'd have a play with the future package. Results seem promising so far, so I thought I'd report back, having alluded several months ago to the potential utility of having the initial bib read run in the background. Basic concept demonstrated in reprex below. Thoughts welcome!
Created on 2019-10-16 by the reprex package (v0.3.0)