Closed yonicd closed 4 years ago
can you give the file mrg.bib or at least some subset that makes it reproducible?
same thing happened for me with this example:
z <- system.file('extdata/crossref.bib', package = "handlr")
bibtex_reader(x = z)
but i updated to latest RefManageR on github (remotes::install_github("ropensci/RefManageR")) and now it works. let me know if that works.
it looks like ultimate cause may be in https://github.com/romainfrancois/bibtex/issues/16
new error after installing, which is what i got with subsets of the original bib file.
Error in do_read_bib(file, encoding = .Encoding, srcfile) :
lex fatal error:
fatal flex scanner internal error--end of buffer missed
> packageVersion('RefManageR')
[1] ‘1.2.8’
i'm pretty sure that's a bibtex pkg problem
if you can share a reproducible example that will help narrow this down
as it is written in the RefManageR in reprex that problem disappears :)
here is a snippet of the bib file that is causing problems
@ARTICLE{Jia1996-yu,
title = "Errors in time in pharmacokinetic studies",
author = "Jia, X and Nedelman, J R",
journal = "J. Biopharm. Stat.",
volume = 6,
number = 3,
pages = "303--318",
year = 1996
}
@MISC{noauthor_2018-bk,
title = "{APO-HYDROmorphone} {CR}",
number = "Control No: 210830",
institution = "Apotex Inc",
month = may,
year = 2018,
howpublished = "Product Monograph"
}
@ARTICLE{Langley2014-ra,
title = "Secukinumab in plaque psoriasis--results of two phase 3 trials",
author = "Langley, Richard G and Elewski, Boni E and Lebwohl, Mark and
Reich, Kristian and Griffiths, Christopher E M and Papp, Kim and
Puig, Llu{\'\i}s and Nakagawa, Hidemi and Spelman, Lynda and
Sigurgeirsson, B{\'a}r{\dh}ur and Rivas, Enrique and Tsai,
Tsen-Fang and Wasel, Norman and Tyring, Stephen and Salko, Thomas
and Hampele, Isabelle and Notter, Marianne and Karpov, Alexander
and Helou, Silvia and Papavassilis, Charis and {ERASURE Study
Group} and {FIXTURE Study Group}",
abstract = "BACKGROUND: Interleukin-17A is considered to be central to the
pathogenesis of psoriasis. We evaluated secukinumab, a fully
human anti-interleukin-17A monoclonal antibody, in patients with
moderate-to-severe plaque psoriasis. METHODS: In two phase 3,
double-blind, 52-week trials, ERASURE (Efficacy of Response and
Safety of Two Fixed Secukinumab Regimens in Psoriasis) and
FIXTURE (Full Year Investigative Examination of Secukinumab vs.
Etanercept Using Two Dosing Regimens to Determine Efficacy in
Psoriasis), we randomly assigned 738 patients (in the ERASURE
study) and 1306 patients (in the FIXTURE study) to subcutaneous
secukinumab at a dose of 300 mg or 150 mg (administered once
weekly for 5 weeks, then every 4 weeks), placebo, or (in the
FIXTURE study only) etanercept at a dose of 50 mg (administered
twice weekly for 12 weeks, then once weekly). The objective of
each study was to show the superiority of secukinumab over
placebo at week 12 with respect to the proportion of patients who
had a reduction of 75\% or more from baseline in the psoriasis
area-and-severity index score (PASI 75) and a score of 0 (clear)
or 1 (almost clear) on a 5-point modified investigator's global
assessment (coprimary end points). RESULTS: The proportion of
patients who met the criterion for PASI 75 at week 12 was higher
with each secukinumab dose than with placebo or etanercept: in
the ERASURE study, the rates were 81.6\% with 300 mg of
secukinumab, 71.6\% with 150 mg of secukinumab, and 4.5\% with
placebo; in the FIXTURE study, the rates were 77.1\% with 300 mg
of secukinumab, 67.0\% with 150 mg of secukinumab, 44.0\% with
etanercept, and 4.9\% with placebo (P<0.001 for each secukinumab
dose vs. comparators). The proportion of patients with a response
of 0 or 1 on the modified investigator's global assessment at
week 12 was higher with each secukinumab dose than with placebo
or etanercept: in the ERASURE study, the rates were 65.3\% with
300 mg of secukinumab, 51.2\% with 150 mg of secukinumab, and
2.4\% with placebo; in the FIXTURE study, the rates were 62.5\%
with 300 mg of secukinumab, 51.1\% with 150 mg of secukinumab,
27.2\% with etanercept, and 2.8\% with placebo (P<0.001 for each
secukinumab dose vs. comparators). The rates of infection were
higher with secukinumab than with placebo in both studies and
were similar to those with etanercept. CONCLUSIONS: Secukinumab
was effective for psoriasis in two randomized trials, validating
interleukin-17A as a therapeutic target. (Funded by Novartis
Pharmaceuticals; ERASURE and FIXTURE ClinicalTrials.gov numbers,
NCT01365455 and NCT01358578, respectively.).",
journal = "N. Engl. J. Med.",
volume = 371,
number = 4,
pages = "326--338",
month = jul,
year = 2014,
language = "en"
}
thanks.
that example works for me as well
now it works for me... when i do x$write('citeproc')
i get only the first element back. how do i get the nth one?
x$write('citeproc')
{
"type": "article-journal",
"id": {},
"categories": [],
"language": {},
"author": [
{
"type": "Person",
"family": "Jia",
"given": "X",
"literal": "Jia"
},
{
"type": "Person",
"family": "Nedelman",
"given": "J R",
"literal": "Nedelman"
}
],
"editor": [],
"issued": {
"date-parts": {}
},
"submitted": {
"date-parts": {}
},
"abstract": {},
"container-title": {},
"DOI": {},
"issue": {},
"page": "303318",
"publisher": {},
"title": "Errors in time in pharmacokinetic studies",
"URL": {},
"version": {},
"volume": "6"
}
btw the page doesn't look like it was parsed right
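The bib entry has pages = "303--318" but the citeproc output shows "303318", so the range separator seems to get dropped somewhere. A minimal sketch of how a BibTeX page range could be normalized instead; `normalize_pages` is a hypothetical helper for illustration only, not handlr's code:

```r
# Hypothetical helper (not part of handlr): collapse the BibTeX "--" range
# separator, and any surrounding whitespace, to a single hyphen.
normalize_pages <- function(pages) {
  gsub("\\s*-+\\s*", "-", pages)
}

normalize_pages(c("303--318", "326--338"))
# "303-318" "326-338"
```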
just back from vacation, will look at this tomorrow
@yonicd install from the pluralize branch with remotes::install_github("ropensci/handlr@pluralize"), restart, then load the pkg again, then try that example again. That branch should make all formats handle 1 or many. Some formats don't really have a plural form that I know of (e.g., RIS), so they are written out to separate files if you write to file.
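Once the citeproc output is a JSON array, one way to get the nth entry is to parse it with jsonlite. A minimal sketch, assuming jsonlite is installed; the short JSON string below is a stand-in for what x$write('citeproc') returns, trimmed to two fields per entry:

```r
library(jsonlite)

# Stand-in for the JSON text returned by x$write('citeproc');
# only two fields per entry are kept to keep the example short.
citeproc_json <- '[
  {"type": "article-journal", "title": "Errors in time in pharmacokinetic studies"},
  {"type": "misc", "title": "{APO-HYDROmorphone} {CR}"}
]'

# simplifyVector = FALSE keeps each entry as its own list element
# instead of collapsing the array into a data frame.
entries <- fromJSON(citeproc_json, simplifyVector = FALSE)

n <- 2
entries[[n]]$title
```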
that works. thanks. i'm seeing another problem. may be worth another issue. if i load a bigger bib file (initial comment in this issue) then something is cached and clogs the reader until i refresh the session and then all seems to work again.
> x <- handlr::HandlrClient$new(x = '~/Desktop/mrg.bib') #big bib
> x$read("bibtex")
Error in do_read_bib(file, encoding = .Encoding, srcfile) :
lex fatal error:
input buffer overflow, can't enlarge buffer because scanner uses REJECT
> x <- handlr::HandlrClient$new(x = '~/Desktop/test1.bib') #snippet of bib
> x$read("bibtex")
Error in do_read_bib(file, encoding = .Encoding, srcfile) :
lex fatal error:
fatal flex scanner internal error--end of buffer missed
Restarting R session...
> x <- handlr::HandlrClient$new(x = '~/Desktop/test1.bib') #snippet of bib
> x$read("bibtex")
> x$write('citeproc')
[
{
"type": "article-journal",
"id": {},
"categories": [],
"language": {},
"author": [
{
"type": "Person",
"family": "Jia",
"given": "X",
"literal": "Jia"
},
{
"type": "Person",
"family": "Nedelman",
"given": "J R",
"literal": "Nedelman"
}
],
"editor": [],
"issued": {
"date-parts": {}
},
"submitted": {
"date-parts": {}
},
"abstract": {},
"container-title": {},
"DOI": {},
"issue": {},
"page": "303318",
"publisher": {},
"title": "Errors in time in pharmacokinetic studies",
"URL": {},
"version": {},
"volume": "6"
},
{
"type": "misc",
"id": {},
"categories": [],
"language": {},
"author": [],
"editor": [],
"issued": {
"date-parts": {}
},
"submitted": {
"date-parts": {}
},
"abstract": {},
"container-title": {},
"DOI": {},
"issue": {},
"page": "",
"publisher": {},
"title": "{APO-HYDROmorphone} {CR}",
"URL": {},
"version": {},
"volume": {}
},
{
"type": "article-journal",
"id": {},
"categories": [],
"language": {},
"author": [
{
"type": "Person",
"family": "Langley",
"given": "Richard G",
"literal": "Langley"
},
{
"type": "Person",
"family": "Elewski",
"given": "Boni E",
"literal": "Elewski"
},
{
"type": "Person",
"family": "Lebwohl",
"given": "Mark",
"literal": "Lebwohl"
},
{
"type": "Person",
"family": "Reich",
"given": "Kristian",
"literal": "Reich"
},
{
"type": "Person",
"family": "Griffiths",
"given": "Christopher E M",
"literal": "Griffiths"
},
{
"type": "Person",
"family": "Papp",
"given": "Kim",
"literal": "Papp"
},
{
"type": "Person",
"family": "Puig",
"given": "Lluís",
"literal": "Puig"
},
{
"type": "Person",
"family": "Nakagawa",
"given": "Hidemi",
"literal": "Nakagawa"
},
{
"type": "Person",
"family": "Spelman",
"given": "Lynda",
"literal": "Spelman"
},
{
"type": "Person",
"family": "Sigurgeirsson",
"given": "Bárður",
"literal": "Sigurgeirsson"
},
{
"type": "Person",
"family": "Rivas",
"given": "Enrique",
"literal": "Rivas"
},
{
"type": "Person",
"family": "Tsai",
"given": "Tsen-Fang",
"literal": "Tsai"
},
{
"type": "Person",
"family": "Wasel",
"given": "Norman",
"literal": "Wasel"
},
{
"type": "Person",
"family": "Tyring",
"given": "Stephen",
"literal": "Tyring"
},
{
"type": "Person",
"family": "Salko",
"given": "Thomas",
"literal": "Salko"
},
{
"type": "Person",
"family": "Hampele",
"given": "Isabelle",
"literal": "Hampele"
},
{
"type": "Person",
"family": "Notter",
"given": "Marianne",
"literal": "Notter"
},
{
"type": "Person",
"family": "Karpov",
"given": "Alexander",
"literal": "Karpov"
},
{
"type": "Person",
"family": "Helou",
"given": "Silvia",
"literal": "Helou"
},
{
"type": "Person",
"family": "Papavassilis",
"given": "Charis",
"literal": "Papavassilis"
},
{
"type": "Person",
"family": "ERASURE Study Group",
"given": "",
"literal": "ERASURE Study Group"
},
{
"type": "Person",
"family": "FIXTURE Study Group",
"given": "",
"literal": "FIXTURE Study Group"
}
],
"editor": [],
"issued": {
"date-parts": {}
},
"submitted": {
"date-parts": {}
},
"abstract": "BACKGROUND: Interleukin-17A is considered to be central to the\n\tpathogenesis of psoriasis. We evaluated secukinumab, a fully\n\thuman anti-interleukin-17A monoclonal antibody, in patients with\n\tmoderate-to-severe plaque psoriasis. METHODS: In two phase 3,\n\tdouble-blind, 52-week trials, ERASURE (Efficacy of Response and\n\tSafety of Two Fixed Secukinumab Regimens in Psoriasis) and\n\tFIXTURE (Full Year Investigative Examination of Secukinumab vs.\n\tEtanercept Using Two Dosing Regimens to Determine Efficacy in\n\tPsoriasis), we randomly assigned 738 patients (in the ERASURE\n\tstudy) and 1306 patients (in the FIXTURE study) to subcutaneous\n\tsecukinumab at a dose of 300 mg or 150 mg (administered once\n\tweekly for 5 weeks, then every 4 weeks), placebo, or (in the\n\tFIXTURE study only) etanercept at a dose of 50 mg (administered\n\ttwice weekly for 12 weeks, then once weekly). The objective of\n\teach study was to show the superiority of secukinumab over\n\tplacebo at week 12 with respect to the proportion of patients who\n\thad a reduction of 75\\% or more from baseline in the psoriasis\n\tarea-and-severity index score (PASI 75) and a score of 0 (clear)\n\tor 1 (almost clear) on a 5-point modified investigator's global\n\tassessment (coprimary end points). RESULTS: The proportion of\n\tpatients who met the criterion for PASI 75 at week 12 was higher\n\twith each secukinumab dose than with placebo or etanercept: in\n\tthe ERASURE study, the rates were 81.6\\% with 300 mg of\n\tsecukinumab, 71.6\\% with 150 mg of secukinumab, and 4.5\\% with\n\tplacebo; in the FIXTURE study, the rates were 77.1\\% with 300 mg\n\tof secukinumab, 67.0\\% with 150 mg of secukinumab, 44.0\\% with\n\tetanercept, and 4.9\\% with placebo (P<0.001 for each secukinumab\n\tdose vs. comparators). 
The proportion of patients with a response\n\tof 0 or 1 on the modified investigator's global assessment at\n\tweek 12 was higher with each secukinumab dose than with placebo\n\tor etanercept: in the ERASURE study, the rates were 65.3\\% with\n\t300 mg of secukinumab, 51.2\\% with 150 mg of secukinumab, and\n\t2.4\\% with placebo; in the FIXTURE study, the rates were 62.5\\%\n\twith 300 mg of secukinumab, 51.1\\% with 150 mg of secukinumab,\n\t27.2\\% with etanercept, and 2.8\\% with placebo (P<0.001 for each\n\tsecukinumab dose vs. comparators). The rates of infection were\n\thigher with secukinumab than with placebo in both studies and\n\twere similar to those with etanercept. CONCLUSIONS: Secukinumab\n\twas effective for psoriasis in two randomized trials, validating\n\tinterleukin-17A as a therapeutic target. (Funded by Novartis\n\tPharmaceuticals; ERASURE and FIXTURE ClinicalTrials.gov numbers,\n\tNCT01365455 and NCT01358578, respectively.).",
"container-title": {},
"DOI": {},
"issue": {},
"page": "326338",
"publisher": {},
"title": "Secukinumab in plaque psoriasis--results of two phase 3 trials",
"URL": {},
"version": {},
"volume": "371"
}
]
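A possible workaround for the session getting stuck after a failed read is to do the parsing in a throwaway R process, so a lexer error cannot leave anything behind in the interactive session. A minimal sketch, assuming the callr and handlr packages are installed; `read_bib_isolated` is a hypothetical helper, not something handlr provides, and it has not been checked against the large file from this issue:

```r
# Hypothetical helper: read a bib file and convert it inside a fresh R
# process via callr::r(), returning the converted text to the caller.
read_bib_isolated <- function(path, to = "citeproc") {
  callr::r(
    function(path, to) {
      x <- handlr::HandlrClient$new(x = path)
      x$read("bibtex")
      # Return plain text rather than the client object itself.
      as.character(x$write(to))
    },
    args = list(path = normalizePath(path), to = to)
  )
}

# Usage (path taken from the session above):
# json <- read_bib_isolated("~/Desktop/test1.bib")
```

If the child process hits the lex fatal error, callr reports it as an ordinary R error and the calling session is left as it was.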
thanks - i'll see if i can replicate the problem, if you can't share the bib file, can you at least say how many lines or how many citations are in the file
6430 citations 142751 lines
thanks
contacted bibtex author, hopefully he'll get back soon if there's a fix for bibtex error
reprex with a large file (~11K citations, ~149K rows):
x <- handlr::HandlrClient$new(x = '/Users/sckott/github/rosadmin/citations/citations.txt')
x$read("bibtex")
length(x$parsed)
#> [1] "11960"
z <- x$write('citeproc')
class(z)
#> [1] "json"
This works fine for me.
Cool. I’ll try to see where my bib file fails.
It was created by paperpile, so it could be on their end too...
Thanks for the follow up!
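One way to narrow down where a large bib file fails is to split it into individual entries and then try reading them in smaller batches. A minimal base R sketch; `split_bib_entries` is a hypothetical helper, and it assumes every entry starts on a line beginning with "@" (a field value that starts a line with "@" would be split incorrectly):

```r
# Hypothetical helper: group the lines of a bib file by entry, treating
# every line that starts with "@" as the beginning of a new entry.
split_bib_entries <- function(lines) {
  starts <- grepl("^\\s*@", lines)
  id <- cumsum(starts)
  keep <- id > 0            # drop any preamble before the first entry
  unname(split(lines[keep], id[keep]))
}

bib <- c(
  "@ARTICLE{Jia1996-yu,",
  "  year = 1996",
  "}",
  "@MISC{noauthor_2018-bk,",
  "  year = 2018",
  "}"
)
entries <- split_bib_entries(bib)
length(entries)
# 2
```

Each element of `entries` can be written to a temporary file with writeLines() and read on its own, in a fresh session if needed, to find the entry that triggers the error.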
pulling dependency on dev RefManageR for now - falling back to CRAN version for the push of the first version of this pkg to CRAN - will bring back dev version of RefManageR after the push to cran
I'm getting a similar error message with a large BibTeX file. The file is 16.9 MB. The code I used is
x <- handlr::HandlrClient$new(x = 'data-raw/consumption_14_19.bib')
x$read("bibtex")
The error message is
Error in do_read_bib(file, encoding = .Encoding, srcfile) :
lex fatal error:
fatal flex scanner internal error--end of buffer missed
consumption_14_19.bib was originally a .ris file that I converted using BibDesk. When I tried to read the .ris file in handlr with the following code, I got the following error message
> x <- handlr::HandlrClient$new(x = 'data-raw/consumption_14_19.ris')
Error in private$guess_format(x) :
could not guess format for string; specify format
I'm using the GitHub version of handlr.
thx for the report @GeraldCNelson, i was going to ask if you have the dev version of RefManageR, but from the other issue it appears you do have it. that fatal flex scanner error comes from bibtex, and it seems there's no fix in sight unfortunately. I'm still looking into ways to fix the issues with large bib files.
for the could not guess format error, can you share at least a subset of that file so I can see what the issue is
both RefManageR and bibtex pkgs have been updated. hopefully that sorts out the issue here.
Session Info
```r
Session info --------------------------------------------------------------
 setting  value
 version  R version 3.5.1 (2018-07-02)
 system   x86_64, darwin15.6.0
 ui       RStudio (1.2.1162)
 language (EN)
 collate  en_US.UTF-8
 tz       America/New_York
 date     2018-12-18

Packages ------------------------------------------------------------------
 package    * version    date       source
 base       * 3.5.1      2018-07-05 local
 bibtex       0.4.2      2017-06-30 CRAN (R 3.5.0)
 compiler     3.5.1      2018-07-05 local
 crul         0.6.0      2018-07-10 CRAN (R 3.5.0)
 curl         3.2        2018-03-28 CRAN (R 3.5.0)
 datasets   * 3.5.1      2018-07-05 local
 devtools     1.13.6     2018-06-27 CRAN (R 3.5.0)
 digest       0.6.18     2018-10-10 CRAN (R 3.5.0)
 graphics   * 3.5.1      2018-07-05 local
 grDevices  * 3.5.1      2018-07-05 local
 handlr     * 0.0.4.9210 2018-12-19 Github (ropensci/handlr@0252efc)
 httpcode     0.2.0      2016-11-14 CRAN (R 3.5.0)
 httr         1.4.0      2018-12-11 CRAN (R 3.5.0)
 jsonlite     1.6        2018-12-07 CRAN (R 3.5.0)
 lubridate    1.7.4      2018-04-11 CRAN (R 3.5.0)
 magrittr     1.5        2014-11-22 CRAN (R 3.5.0)
 memoise      1.1.0      2017-04-21 CRAN (R 3.5.0)
 methods    * 3.5.1      2018-07-05 local
 packrat      0.4.9-3    2018-06-01 CRAN (R 3.5.0)
 plyr         1.8.4      2016-06-08 CRAN (R 3.5.0)
 R6           2.3.0      2018-10-04 CRAN (R 3.5.0)
 Rcpp         1.0.0      2018-11-07 CRAN (R 3.5.0)
 RefManageR   1.2.0      2018-04-25 CRAN (R 3.5.0)
 stats      * 3.5.1      2018-07-05 local
 stringi      1.2.4      2018-07-20 CRAN (R 3.5.0)
 stringr      1.3.1      2018-05-10 CRAN (R 3.5.0)
 tools        3.5.1      2018-07-05 local
 triebeard    0.3.0      2016-08-04 CRAN (R 3.5.0)
 urltools     1.7.1      2018-08-03 CRAN (R 3.5.0)
 utils      * 3.5.1      2018-07-05 local
 withr        2.1.2      2018-03-15 CRAN (R 3.5.0)
 xml2         1.2.0      2018-01-24 CRAN (R 3.5.0)
```