Sleep only for as long as needed when reaching rate limit

fschaffner commented 3 years ago

Hello everyone, thanks for creating and maintaining this package!

I noticed that the make_query() function sleeps for 15 minutes when the rate limit is reached (429 error). However, a full 15 minutes is often unnecessary, because some time has already passed between the start of the API requests and the moment the rate limit is reached. For example, if we start a query and it takes 8 minutes until the rate limit is reached, then we only need to sleep 7 minutes until the rate limit resets, not a full 15 minutes. This can make a huge difference when collecting large amounts of data.

So I propose some improvements to this part of the function definition of make_query():

    if (status_code == 429) {
      .vcat(verbose, "Rate limit reached, sleeping... \n")
      count <- count + 1
      Sys.sleep(900)
    }

Instead of Sys.sleep(900), we could check when the rate limit resets and calculate the exact sleep time. The rtweet package has already implemented a set of functions to check when the rate limit resets; perhaps we could use that code as inspiration (see: https://docs.ropensci.org/rtweet/reference/rate_limit.html and https://github.com/ropensci/rtweet/blob/master/R/rate_limit.R).

That would also allow for a more informative message, such as: "Rate limit reached. Rate limit will reset at 15:05:21. Sleeping for 8.5 minutes...".

chainsawriot commented 3 years ago

It is possible to implement this by checking the x-rate-limit-reset response header:

    httr::headers(r)$`x-rate-limit-reset`
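For illustration, here is a minimal sketch of how the sleep time could be derived from that header. It assumes the header holds the reset time as Unix epoch seconds (UTC), as the Twitter API reports it. The helper name compute_sleep_period() and the one-second buffer are hypothetical and are not taken from the package's code.

```r
# Hypothetical helper, not the package's actual implementation.
# reset_epoch: value of the x-rate-limit-reset header (epoch seconds,
#              usually delivered as a character string).
# now:         current time; exposed as an argument so it can be tested.
# buffer:      extra seconds added as a safety margin (an assumption).
compute_sleep_period <- function(reset_epoch, now = Sys.time(), buffer = 1) {
  reset_time <- as.POSIXct(as.numeric(reset_epoch),
                           origin = "1970-01-01", tz = "UTC")
  wait <- as.numeric(difftime(reset_time, now, units = "secs")) + buffer
  # Never return a negative sleep time if the reset is already in the past.
  max(0, ceiling(wait))
}

# Possible use inside the 429 branch, where `r` is the httr response:
#   sleep_period <- compute_sleep_period(httr::headers(r)$`x-rate-limit-reset`)
#   Sys.sleep(sleep_period)
```

With this approach, a query that hits the limit 8 minutes into the window would sleep roughly 7 minutes instead of the fixed 900 seconds.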

fschaffner commented 3 years ago

That would be great!

chainsawriot commented 3 years ago

@fschaffner The feature is now added. Thank you for your suggestion.

fschaffner commented 3 years ago

Thanks a lot for the quick implementation, it works great!

Just one small suggestion: change the message to display the sleep time in minutes instead of seconds. So instead of "\nSleeping for", sleep_period, "seconds. \n", use "\nSleeping for", round(sleep_period / 60, digits = 2), "minutes. \n".
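As a rough sketch of what such a message could look like, the hypothetical helper below builds the text from a sleep period in seconds and a reset time. The function name sleep_message() and its arguments are assumptions for illustration only; the package's actual message code may differ.

```r
# Hypothetical helper, not the package's actual code.
# sleep_period: seconds to sleep.
# reset_time:   POSIXct time at which the rate limit resets.
sleep_message <- function(sleep_period, reset_time) {
  paste0(
    "Rate limit reached. Rate limit will reset at ",
    format(reset_time, "%H:%M:%S"),
    ". Sleeping for ",
    round(sleep_period / 60, digits = 2),
    " minutes..."
  )
}

# Example call with a fixed reset time:
#   sleep_message(510, as.POSIXct("2021-06-01 15:05:21", tz = "UTC"))
```

Rounding to two digits keeps short waits readable (for example, 421 seconds is shown as 7.02 minutes).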

cjbarrie / academictwitteR

Sleep only for as long as needed when reaching rate limit #167