ropensci-archive / rplos

:warning: ARCHIVED :warning: R client for the PLoS Journals API

searchplos() returns different results when using internal pagination (limit>999) #121

Closed Bubblbu closed 6 years ago

Bubblbu commented 6 years ago

The same search returns different id values when limit > 999, i.e. when searchplos() paginates internally. Specifically, the returned DOIs come back with references to sections of the paper appended, e.g. 10.1371/journal.pone.0030394/introduction and 10.1371/journal.pone.0030394/results_and_discussion, as well as the actual 10.1371/journal.pone.0030394.

Problem: The number of returned items still equals the number of results found without these URL variations, so removing the variations leaves articles missing from the final dataset.
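To make the consequence concrete, here is a minimal base-R sketch of how section-suffixed ids could be detected and collapsed. The helper `strip_section()` and the sample vector `ids` are hypothetical and only for illustration; they are not part of rplos, and the pattern assumes PLOS DOIs of the form 10.1371/<article> with at most one extra path segment for the section.

```r
# Hypothetical helper (not part of rplos): drop a trailing "/section" part
# from ids such as "10.1371/journal.pone.0030394/introduction".
strip_section <- function(id) {
  sub("^(10\\.1371/[^/]+)/.*$", "\\1", id)
}

# Sample ids taken from the report above
ids <- c(
  "10.1371/journal.pone.0030394/introduction",
  "10.1371/journal.pone.0030394/results_and_discussion",
  "10.1371/journal.pone.0030394",
  "10.1371/journal.pone.0002157/materials_and_methods"
)

# TRUE where the id carries a section suffix after the article DOI
has_section <- grepl("^10\\.1371/[^/]+/", ids)

# Distinct article DOIs once the suffixes are removed
article_ids <- unique(strip_section(ids))

length(ids)          # number of rows returned
length(article_ids)  # number of distinct articles behind those rows
```

If the second count is smaller than the first, several rows belong to the same article, so the result set covers fewer articles than the requested limit suggests.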

Examples:

No pagination:

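library(rplos)

# Filter queries: PLOS ONE, published in 2016, doc_type "full"
pub_dates <- 'publication_date:[2016-01-01T00:00:00Z TO 2016-12-31T23:59:59Z]'
journal <- 'journal_key:PLoSONE'
doc_type <- 'doc_type:full'

# Fields to return and the combined list of filter queries
fl <- 'id,publication_date,title,author'
fq <- list(journal, pub_dates, doc_type)

# limit = 999 stays below the point where internal pagination starts
searchplos(q="*:*", fl=fl, fq=fq, limit=999)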
# A tibble: 999 x 4
   id                           publication_date     author                                       title                                   
   <chr>                        <chr>                <chr>                                        <chr>                                   
 1 10.1371/journal.pone.0155491 2016-05-13T00:00:00Z Qian Yang,Yan Gu,Xuan Zhang,Jian-Mei Wang,Y… Uterine Expression of NDRG4 Is Induced …
 2 10.1371/journal.pone.0168631 2016-12-19T00:00:00Z Lei Deng,Wei Li,Xingming Yu,Chao Gong,Xueha… Correction: First Report of the Human-P…
 3 10.1371/journal.pone.0168627 2016-12-21T00:00:00Z Wen-Chan Huang,Hung-Lin Chen,Huan-Yuan Chen… Galectin-3 and Its Genetic Variation rs…
 4 10.1371/journal.pone.0155489 2016-05-12T00:00:00Z Yu-Mi Lee,Dong Eun Song,Tae Yong Kim,Tae-Yo… Risk Factors for Distant Metastasis in …
 5 10.1371/journal.pone.0168605 2016-12-21T00:00:00Z Zahirah Dhurmeea,Iker Zudaire,Emmanuel Chas… Reproductive Biology of Albacore Tuna (…
 6 10.1371/journal.pone.0168602 2016-12-16T00:00:00Z Elmarie Myburgh,Ryan Ritchie,Amy Goundry,Ke… Attempts to Image the Early Inflammator…
 7 10.1371/journal.pone.0168604 2016-12-15T00:00:00Z Ying Zhang,Tom Thomas,M J G Brussel,M F A M… Expanding Bicycle-Sharing Systems: Less…
 8 10.1371/journal.pone.0155469 2016-05-12T00:00:00Z Chia-Yi Tseng,Chin-Hung Lin,Lung-Yuan Wu,Jh… Potential Combinational Anti-Cancer The…
 9 10.1371/journal.pone.0168612 2016-12-16T00:00:00Z Igor Marchetti,Tom Loeys,Lauren B Alloy,Ern… Unveiling the Structure of Cognitive Vu…
10 10.1371/journal.pone.0155472 2016-05-12T00:00:00Z Chuanyu Yang,Charles A Powell,Yongping Duan… Deciphering the Bacterial Microbiome in…
# ... with 989 more rows

With pagination:

searchplos(q="*:*", fl=fl, fq=fq, limit=1000)
# A tibble: 1,000 x 4
   id                                                  publication_date     author                                                   title
   <chr>                                               <chr>                <chr>                                                    <chr>
 1 10.1371/journal.pone.0030394/introduction           2012-01-23T00:00:00Z Wei-Yao Wang,Tzong-Shi Chiueh,Jun-Ren Sun,Shin-Ming Tsa… NA   
 2 10.1371/journal.pone.0030394/results_and_discussion 2012-01-23T00:00:00Z Wei-Yao Wang,Tzong-Shi Chiueh,Jun-Ren Sun,Shin-Ming Tsa… NA   
 3 10.1371/journal.pone.0002157/materials_and_methods  2008-05-14T00:00:00Z Markus Pfenninger,Carsten Nowak                          NA   
 4 10.1371/journal.pone.0030394/supporting_information 2012-01-23T00:00:00Z Wei-Yao Wang,Tzong-Shi Chiueh,Jun-Ren Sun,Shin-Ming Tsa… NA   
 5 10.1371/journal.pone.0044137/materials_and_methods  2012-09-19T00:00:00Z Esmeralda Morillo,María Antonia Sánchez-Trujillo,José R… NA   
 6 10.1371/journal.pone.0113465/materials_and_methods  2014-12-17T00:00:00Z Patrick Durez,Pierre Vandepapeliere,Pedro Miranda,Antoa… NA   
 7 10.1371/journal.pone.0099112/introduction           2014-06-10T00:00:00Z Li Qi,Felix G Meinel,Chang Sheng Zhou,Yan E Zhao,U Jose… NA   
 8 10.1371/journal.pone.0099112/results_and_discussion 2014-06-10T00:00:00Z Li Qi,Felix G Meinel,Chang Sheng Zhou,Yan E Zhao,U Jose… NA   
 9 10.1371/journal.pone.0099112/materials_and_methods  2014-06-10T00:00:00Z Li Qi,Felix G Meinel,Chang Sheng Zhou,Yan E Zhao,U Jose… NA   
10 10.1371/journal.pone.0155488/title                  2016-05-20T00:00:00Z Jirayu Tanprasertsuk,Binxing Li,Paul S Bernstein,Rohini… NA   
# ... with 990 more rows

Session Info

```r
Session info -------------------------------------------------------------------
 setting  value
 version  R version 3.4.4 (2018-03-15)
 system   x86_64, linux-gnu
 ui       RStudio (1.1.453)
 language en
 collate  en_US.UTF-8
 tz       America/Vancouver
 date     2018-07-17

Packages -----------------------------------------------------------------------
 package    * version date       source
 assertthat   0.1     2013-12-06 CRAN (R 3.3.1)
 cli          1.0.0   2017-11-05 CRAN (R 3.4.4)
 colorspace   1.3-2   2016-12-14 CRAN (R 3.3.2)
 crayon       1.3.4   2017-09-16 CRAN (R 3.4.2)
 crul         0.5.2   2018-02-24 CRAN (R 3.4.4)
 curl         3.2     2018-03-28 CRAN (R 3.4.4)
 DBI          0.5-1   2016-09-10 CRAN (R 3.3.1)
 devtools     1.12.0  2016-12-05 CRAN (R 3.3.2)
 digest       0.6.12  2017-01-27 CRAN (R 3.3.2)
 dplyr        0.5.0   2016-06-24 CRAN (R 3.3.1)
 ggplot2      2.2.1   2016-12-30 CRAN (R 3.3.2)
 gtable       0.2.0   2016-02-26 CRAN (R 3.3.1)
 jsonlite     1.5     2017-06-01 cran (@1.5)
 lazyeval     0.2.0   2016-06-12 CRAN (R 3.3.1)
 lubridate    1.6.0   2016-09-13 CRAN (R 3.3.1)
 magrittr     1.5     2014-11-22 CRAN (R 3.3.1)
 memoise      1.0.0   2016-01-29 CRAN (R 3.3.2)
 munsell      0.4.3   2016-02-13 CRAN (R 3.3.1)
 pillar       1.2.2   2018-04-26 CRAN (R 3.4.4)
 plyr         1.8.4   2016-06-08 CRAN (R 3.3.1)
 R6           2.2.2   2017-06-17 cran (@2.2.2)
 Rcpp         0.12.16 2018-03-13 CRAN (R 3.4.4)
 reshape2     1.4.2   2016-10-22 CRAN (R 3.3.2)
 rlang        0.2.0   2018-02-20 CRAN (R 3.4.4)
 rplos      * 0.8.0   2017-11-03 CRAN (R 3.4.4)
 rstudioapi   0.6     2016-06-27 CRAN (R 3.3.2)
 scales       0.4.1   2016-11-09 CRAN (R 3.3.2)
 solrium      1.0.0   2017-11-02 CRAN (R 3.4.4)
 stringi      1.1.2   2016-10-01 CRAN (R 3.3.2)
 stringr      1.2.0   2017-02-18 CRAN (R 3.3.2)
 tibble       1.4.2   2018-01-22 CRAN (R 3.4.4)
 triebeard    0.3.0   2016-08-04 cran (@0.3.0)
 urltools     1.6.0   2016-10-17 cran (@1.6.0)
 utf8         1.1.3   2018-01-03 CRAN (R 3.4.4)
 whisker      0.3-2   2013-04-28 CRAN (R 3.3.1)
 withr        2.1.2   2018-03-15 CRAN (R 3.4.4)
 xml2         1.1.1   2017-01-24 CRAN (R 3.3.2)
 yaml         2.1.19  2018-05-01 CRAN (R 3.4.4)
```

sckott commented 6 years ago

Try again after reinstalling from GitHub: remotes::install_github("ropensci/rplos")

Bubblbu commented 6 years ago

Thanks! DOIs look fine now :)