Closed ataiprojects closed 6 years ago
can you please give more details. what is your sessionInfo()
, and what function(s) are you talking about
Sorry, I thought next-cursor is only ever used with deep paging, so it wouldn't be ambiguous. Details:
res1 = cr_works(query="ecology")
res1$meta
  total_results search_terms start_index items_per_page
1        320673      ecology           0             20
This tells me there are 320+ thousand works on ecology. Let's say I would like to gather metadata on the first 12 thousand of those and analyse it.
res2 = cr_works(query="ecology", cursor = "*", cursor_max = 1000)
would get me the first 1 thousand. To look at the 2nd thousand I would need the next-cursor value to substitute for the *, right? Where do I get it?
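For context, at the HTTP level the Crossref REST API returns the continuation token in the `message$next-cursor` field of each JSON response, and the first request always uses `cursor=*`. Here is a minimal sketch of how such a request URL is built (`next_url` is a hypothetical helper of mine, not an rcrossref function):

```r
# Sketch of the raw Crossref request that a client makes under the hood.
# The first call uses cursor = "*"; each JSON response then carries
# message$`next-cursor`, which is passed as `cursor` in the next request.
# `next_url` is a hypothetical helper, not part of rcrossref.
next_url <- function(query, cursor = "*", rows = 1000) {
  sprintf("https://api.crossref.org/works?query=%s&rows=%d&cursor=%s",
          URLencode(query), rows, URLencode(cursor, reserved = TRUE))
}

next_url("ecology")
```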
Also, I have tried larger cursor_max values: 1100 works, but with 2000 I get:
Error in curl::curl_fetch_memory(x$url$url, handle = x$url$handle) :
  Timeout was reached: Connection timed out after 10000 milliseconds
sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.3 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
[1] C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] rcrossref_0.8.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.14 bindr_0.1 xml2_1.1.1 magrittr_1.5 xtable_1.8-2 R6_2.2.1 rlang_0.2.0
[8] bibtex_0.4.2 stringr_1.2.0 plyr_1.8.4 dplyr_0.7.4 tools_3.4.3 miniUI_0.1.1 htmltools_0.3.6
[15] assertthat_0.2.0 digest_0.6.12 tibble_1.3.3 bindrcpp_0.2 shiny_1.0.4 triebeard_0.3.0 curl_3.1
[22] crul_0.5.2 glue_1.1.1 mime_0.5 stringi_1.1.5 compiler_3.4.3 urltools_1.6.0 jsonlite_1.5
[29] httpuv_1.3.5 pkgconfig_2.0.1
Thank you.
The next-cursor value is only returned by the Crossref API if you use the cursor parameter. So, as you showed above, cursor = "*" turns on deep paging through cursors. We then do the paging automatically, so you don't need to do it yourself.
Here's an example to get the first 12K:
res3 <- cr_works(query="ecology", cursor = "*", cursor_max = 12000L, limit = 1000L)
NROW(res3$data)
head(res3$data)
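To make the automatic paging concrete, here is a rough sketch of the loop such a client runs internally. This is purely illustrative, not rcrossref's actual code; `fetch_page` stands in for one HTTP request and is assumed to return a page of items plus the next-cursor token:

```r
# Illustrative sketch of cursor-based deep paging (not rcrossref's real code).
# `fetch_page(cursor, rows)` stands in for one API request and is assumed to
# return list(items = <records>, next_cursor = <token for the next request>).
page_all <- function(fetch_page, cursor_max = 12000, rows = 1000) {
  cursor <- "*"                         # the first request always uses "*"
  items <- list()
  while (length(items) < cursor_max) {
    page <- fetch_page(cursor, rows)
    if (length(page$items) == 0) break  # server has no more results
    items <- c(items, page$items)
    cursor <- page$next_cursor          # feed the token back in
  }
  items[seq_len(min(length(items), cursor_max))]
}
```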
The max value for limit is 1000, so the example above makes 12 requests of 1000 records each. See the rcrossref-package manual page.

Thanks! I'll continue testing later, and if I hit timeout errors again, I'll post another issue.
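In the meantime, one way to soften transient timeouts is to wrap the call in a small retry helper. A sketch, assuming nothing beyond base R (`with_retry` is my own helper, not part of rcrossref):

```r
# Generic retry wrapper (a sketch, not part of rcrossref): re-runs `fn` on
# error, waiting `wait` seconds between attempts, e.g. for curl timeouts.
with_retry <- function(fn, tries = 3, wait = 5) {
  for (i in seq_len(tries)) {
    result <- tryCatch(fn(), error = identity)
    if (!inherits(result, "error")) return(result)
    if (i < tries) Sys.sleep(wait)
  }
  stop(result)
}

# e.g. res <- with_retry(function() cr_works(query = "ecology",
#                                            cursor = "*", cursor_max = 2000))
```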
I expected the next-cursor value to be in res$meta, but it's not there. I think it would be helpful to include more details on this in the documentation. Thank you!