MaelKubli / RTwitterV2

R functions for Twitter's v2 API
MIT License

Error 429 when getting tweets w/ full_archive_search #6

Open edramagor opened 2 years ago

edramagor commented 2 years ago

Hi, I found your R library a few days ago and I'm trying to collect what can be considered a large amount of tweets through the academic v2 API, but I noticed a problem which I'd like to report here. I know this is still an experimental library, so I'm taking that into account.

I'm trying to get tweets for the 2017-09-01 to 2017-10-12 period. I tried both full_archive_search with the minimum parameters and the function for extracting tweets by day and then saving them in a CSV file, but I keep getting the same Error 429:

```
2017-09-20T00:00:01Z to 2017-09-21T00:00:01Z has been colleted! Something went wrong! Error in full_archive_search(token = Bearer_Token, search_query = query, :
Error: 429
```

I have also had this kind of message:

```
2017-09-19T00:00:01Z to 2017-09-20T00:00:01Z has been colleted! Something went wrong! Error in full_archive_search(token = Bearer_Token, search_query = query, :
Error: 429 In addition: Warning message: In max(data$created_at) : no non-missing arguments to max; returning -Inf
```

Looking for a possible answer in several forums, I noticed that some Python libraries include a "wait on rate limit" function. I wonder if it would be possible to include something like this in the code (I am not a programmer myself, and I apologize for only suggesting instead of offering a proper solution) in order to avoid the Error: 429.
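In the meantime, the "wait on rate limit" idea can be approximated in plain R with a small retry wrapper around any call; this is a sketch of my own, not part of RTwitterV2, and the function name `with_retries` and its defaults are made up for illustration:

```r
# Retry a call a few times, sleeping between attempts, whenever it
# errors (e.g. on a 429). Generic base R, no extra packages needed.
with_retries <- function(fn, max_tries = 5, wait_seconds = 60) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fn(), error = function(e) e)
    if (!inherits(result, "error")) return(result)  # success: hand back the data
    message("Attempt ", attempt, " failed: ", conditionMessage(result),
            " - waiting ", wait_seconds, "s before retrying")
    Sys.sleep(wait_seconds)
  }
  stop("All ", max_tries, " attempts failed")
}
```

You would then wrap the API call, e.g. `with_retries(function() full_archive_search(token = Bearer_Token, search_query = query))`. Note that if a specific tweet reliably triggers the error (as the maintainer suggests below for some cases), retrying alone will not fix it; it only helps with transient failures.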

Thanks for doing great work. Greetings!

MaelKubli commented 2 years ago

Hi

Thanks for using this package. I am aware of this particular problem; I have seen it a few times as well whenever I try to collect bigger chunks of data. Unfortunately, it is not the rate limit: the remaining limit is checked at every call to the API, and if it hits zero (for the next call) the package pauses until the limit resets.

I am still investigating the cause. All I know at the moment is that this error occurs occasionally. To narrow it down, I have divided the data collection into smaller chunks and looped through them. This led me to the conclusion that sometimes a call to the API simply fails and then returns this error over and over again for certain tweets that are somehow causing it. I am still investigating why this happens.

I will give an update as soon as I know more.

MaelKubli commented 2 years ago

PS: Try dividing the data collection into smaller time chunks and loop (for / while / apply) through them with a tryCatch(), which can handle the error. This might also help you figure out at which date the error occurs. If it occurs every single time, then something else is not right, but probably not with the package...
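That loop could look roughly like this. It is only a sketch: the `start_time`/`end_time` parameter names are assumptions about full_archive_search's signature (check the package documentation), and `Bearer_Token` and `query` are placeholders you supply yourself, as in the error messages above:

```r
library(RTwitterV2)

# Daily boundaries for the period in question.
days <- seq(as.Date("2017-09-01"), as.Date("2017-10-12"), by = "day")

for (i in seq_len(length(days) - 1)) {
  start <- format(days[i],     "%Y-%m-%dT00:00:01Z")
  end   <- format(days[i + 1], "%Y-%m-%dT00:00:01Z")

  chunk <- tryCatch(
    full_archive_search(token = Bearer_Token,
                        search_query = query,
                        start_time = start,   # assumed parameter name
                        end_time = end),      # assumed parameter name
    error = function(e) {
      # Log the failing window instead of aborting the whole run.
      message("Failed for ", start, " to ", end, ": ", conditionMessage(e))
      NULL
    }
  )

  if (!is.null(chunk)) {
    write.csv(chunk, paste0("tweets_", days[i], ".csv"), row.names = FALSE)
  }
}
```

Any day that keeps failing then stands out in the log, which narrows down which tweets (or which time window) trigger the 429.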

All the best, Maël

edramagor commented 2 years ago

@MaelKubli : Thanks for your answer, I will try with a loop. I was trying with daily periods and a limit of 20K tweets and, so far, it's doing great; maybe the issue is due to some kind of problem with the amount of tweets? Idk, but thanks again. I'm looking forward to any future improvements in the package, it is really great (above all, the possibility to save all possible columns in a CSV!). Thanks a lot and keep it up!