cjbarrie / academictwitteR

Repo for academictwitteR package to query the Twitter Academic Research Product Track v2 API endpoint.

Sleep only for as long as needed when reaching rate limit #167

Closed fschaffner closed 3 years ago

fschaffner commented 3 years ago

Hello everyone, thanks for creating and maintaining this package!

I noticed that the make_query() function sleeps for 15 minutes when the rate limit is reached (429 error). However, it is often unnecessary to sleep the full 15 minutes, because some time has already passed between the start of the API requests and the moment the rate limit is reached. For example, if we start a query and it takes 8 minutes until the rate limit is reached, then we only need to sleep 7 minutes until the rate limit resets, not a full 15 minutes. This can make a huge difference when collecting large amounts of data.

So I propose some improvements to this part of the function definition of make_query():

    if (status_code == 429) {
      .vcat(verbose, "Rate limit reached, sleeping... \n")
      count <- count + 1
      Sys.sleep(900)
    }

Instead of Sys.sleep(900), we could check when the rate limit resets and calculate the exact sleep time. The rtweet package has already implemented a set of functions to check when the rate limit resets; perhaps we could use that code as inspiration (see: https://docs.ropensci.org/rtweet/reference/rate_limit.html and https://github.com/ropensci/rtweet/blob/master/R/rate_limit.R).

That would also allow for a more informative message, such as: "Rate limit reached. Rate limit will reset at 15:05:21. Sleeping for 8.5 minutes...".
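To make the idea concrete, here is a minimal sketch of the calculation and the message. It is only an illustration, not the package's code: compute_sleep_period() and rate_limit_message() are hypothetical helper names, and the reset time is assumed to be available as a Unix timestamp in seconds.

```r
# Hypothetical helper (not part of academictwitteR): seconds to sleep until
# the rate limit resets. `reset_epoch` is assumed to be a Unix timestamp
# (seconds since 1970-01-01 UTC).
compute_sleep_period <- function(reset_epoch, now = Sys.time(), buffer = 1) {
  remaining <- as.numeric(reset_epoch) - as.numeric(now)
  # Never return a negative sleep time; add a small safety buffer on top.
  max(0, ceiling(remaining)) + buffer
}

# Hypothetical helper: build the more informative message proposed above.
rate_limit_message <- function(reset_epoch, now = Sys.time(), buffer = 1) {
  sleep_period <- compute_sleep_period(reset_epoch, now, buffer)
  reset_time <- as.POSIXct(as.numeric(reset_epoch),
                           origin = "1970-01-01", tz = "UTC")
  paste0(
    "Rate limit reached. Rate limit will reset at ",
    format(reset_time, "%H:%M:%S", tz = "UTC"),
    " UTC. Sleeping for ",
    round(sleep_period / 60, digits = 1),
    " minutes..."
  )
}
```

Inside the 429 branch, Sys.sleep(900) could then become something like Sys.sleep(compute_sleep_period(reset_epoch)), with the message passed to .vcat(). The one-second buffer is an arbitrary choice to avoid waking up just before the reset.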

chainsawriot commented 3 years ago

It is possible to implement this by checking the following response header:

    httr::headers(r)$`x-rate-limit-reset`
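
As a rough sketch of how this header could be turned into a sleep time: the helper names below are hypothetical, the header value is assumed to be a Unix timestamp in seconds delivered as a string, and the 900-second fallback simply mirrors the current behaviour when the header is missing. The functions take the list returned by httr::headers(r), so they can be tried without making a request.

```r
# Assumption: `resp_headers` is the named list returned by httr::headers(r)
# for a 429 response, with header values stored as character strings.
get_reset_epoch <- function(resp_headers) {
  reset <- resp_headers[["x-rate-limit-reset"]]
  if (is.null(reset)) {
    return(NA_real_)
  }
  # Non-numeric values become NA instead of raising a warning.
  suppressWarnings(as.numeric(reset))
}

# Hypothetical helper: seconds to sleep, falling back to the fixed
# 15 minutes (900 seconds) when the header is absent or unparsable.
sleep_period_from_headers <- function(resp_headers, now = Sys.time(),
                                      fallback = 900) {
  reset_epoch <- get_reset_epoch(resp_headers)
  if (is.na(reset_epoch)) {
    return(fallback)
  }
  # Clamp at zero and add one second so we do not wake up too early.
  max(0, ceiling(reset_epoch - as.numeric(now))) + 1
}
```

In make_query(), this could be called as sleep_period_from_headers(httr::headers(r)) in place of the hard-coded 900.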

fschaffner commented 3 years ago

That would be great!

chainsawriot commented 3 years ago

@fschaffner The feature is now added. Thank you for your suggestion.

fschaffner commented 3 years ago

Thanks a lot for the quick implementation, it works great!

Just one small suggestion: change the message to display the sleep time in minutes instead of seconds. So instead of

    "\nSleeping for", sleep_period, "seconds. \n"

use

    "\nSleeping for", round(sleep_period / 60, digits = 2), "minutes. \n"