ropensci / openalexR

Getting bibliographic records from OpenAlex
https://docs.ropensci.org/openalexR/

Timing Out #142

Closed: JeffreySmithA closed this issue 1 year ago

JeffreySmithA commented 1 year ago

I'm trying to get more than 10 years of citations for all of the papers in my dataset (as I asked about before). I have 62,085 papers in my dataset, but approximately a third of the way through I receive the following error:

Error in curl::curl_fetch_memory(url, handle = handle) : Timeout was reached: [api.openalex.org] Connection timed out after 10006 milliseconds

I've tried the following things:

  1. Removing the specific paper that triggers this error (unsuccessful)
  2. Subsetting the larger dataframe into smaller ones (also unsuccessful)

Are there any other solutions?
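One generic mitigation for transient timeouts, independent of which paper triggers them, is to retry each request a few times with a pause in between. A minimal sketch (the helper name and retry parameters are illustrative, not part of openalexR):

library(openalexR)

# Hypothetical helper: retry one oa_fetch() call on transient network errors
fetch_with_retry <- function(ids, tries = 3, wait = 10) {
  for (attempt in seq_len(tries)) {
    result <- tryCatch(
      oa_fetch("works", cites = ids),
      error = function(e) e
    )
    if (!inherits(result, "error")) return(result)
    message("Attempt ", attempt, " failed: ", conditionMessage(result))
    Sys.sleep(wait)  # back off before retrying
  }
  stop("All ", tries, " attempts failed")
}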

trangdata commented 1 year ago

Hi @JeffreySmithA, could you elaborate on your attempts to resolve the issue?

  1. Have you identified the problematic paper?
  2. How are you chunking your core paper set? Let's say we separate the core set into chunks of 1,000 papers: do you still have this problem with all of the chunks, or only some of them? (See the chunking sketch after the example below.)

And how are you getting the citations for over 10 years? Here is how I would do it:

library(openalexR)

# Core set: OpenAlex IDs of the papers whose citations we want
my_papers <- paste0(
  "https://openalex.org/",
  c("W1519117689", "W2017292130", "W2941875476", "W4229010617", "W854896339")
)

# Fetch every work that cites at least one paper in the core set,
# selecting only the fields we need to keep each response small
citing_papers <- oa_fetch(
  "works",
  cites = my_papers,
  options = list(select = c("id", "publication_year", "referenced_works"))
)

# Keep only the citing papers that actually reference a core paper
filtered <- citing_papers |>
  tidyr::unnest(referenced_works) |>
  dplyr::filter(referenced_works %in% my_papers)

Created on 2023-07-31 with reprex v2.0.2
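
On the chunking point (item 2 above), a minimal sketch that splits the core set into pieces of 1,000 IDs and fetches the citing works for each piece separately; the chunk size is an assumption to be tuned, not an OpenAlex requirement:

chunk_size <- 1000  # assumption: small enough that each request returns quickly
chunks <- split(my_papers, ceiling(seq_along(my_papers) / chunk_size))

results <- lapply(chunks, function(ids) {
  oa_fetch(
    "works",
    cites = ids,
    options = list(select = c("id", "publication_year", "referenced_works"))
  )
})
citing_papers <- dplyr::bind_rows(results)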

JeffreySmithA commented 1 year ago

Sorry, the issue was neither of the things I flagged: the free API allows 100,000 queries per day, and I had reached the limit. Thanks for your help :)
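
For anyone who hits the same wall: one way to stay under the daily quota is to checkpoint each chunk to disk and cap how many chunks run per day. A sketch building on the chunking example above (the file names and daily cap are hypothetical; openalexR.mailto is the package's documented option for joining the OpenAlex polite pool):

# Identify yourself to OpenAlex for the polite pool
options(openalexR.mailto = "you@example.com")

# Hypothetical bookkeeping: `chunks` comes from the chunking sketch above;
# a chunk counts as done once its result file exists on disk
chunk_files <- sprintf("citations_chunk_%03d.rds", seq_along(chunks))
max_today <- 50  # assumption: tune so total requests stay below 100,000 per day

for (i in head(which(!file.exists(chunk_files)), max_today)) {
  res <- oa_fetch(
    "works",
    cites = chunks[[i]],
    options = list(select = c("id", "publication_year", "referenced_works"))
  )
  saveRDS(res, chunk_files[i])  # checkpoint so a rerun picks up where it left off
}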