ropensci / rtweet

🐦 R client for interacting with Twitter's [stream and REST] APIs
https://docs.ropensci.org/rtweet

Error in search_users #37

Closed emillykkejensen closed 7 years ago

emillykkejensen commented 8 years ago

I think there is an error in parsing the results from search_users().

Here's what I'm trying:

usersearch <- search_users(q = "obama", n = 500, token = Token.Twitter)

And here is the error I'm getting:

Error in `[[<-.data.frame`(`*tmp*`, "screen_name", value = c("obama1_obama",  : 
  replacement has 20 rows, data has 19

The thing is, if I run it with parse set to FALSE, there is no problem getting the data, which leads me to think that something is going on in the parser function that I can't quite pin down!

emillykkejensen commented 8 years ago

Playing around, I found that the error only occurs when return_tweets is set to TRUE. If I run:

usersearch <- search_users(q = "obama", n = 500, token = Token.Twitter, parse = FALSE)
usr <- parser(usersearch, n = 500, return_tweets = FALSE, clean_tweets = FALSE, as_double = FALSE)

it works fine. So perhaps it is the parse_tweets() function?
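The error message above suggests a length mismatch while the parser assembles its data frame: one page of results yields 20 screen names, but the accumulating data frame only has 19 rows. A minimal base-R sketch (purely illustrative, not rtweet's actual internals) that reproduces this class of error:

```r
# Hypothetical illustration: assigning a column whose length differs from
# the data frame's row count raises the same error class reported above.
df <- data.frame(user_id = 1:19)         # 19 rows, like the parsed users
screen_names <- paste0("user", 1:20)     # 20 values, like the raw API page

res <- tryCatch(
  {
    df[["screen_name"]] <- screen_names  # lengths disagree: 20 vs 19
    "ok"
  },
  error = function(e) conditionMessage(e)
)
res  # error message of the form "replacement has 20 rows, data has 19"
```

Any off-by-one between the user fields and the tweet fields being merged during parsing would surface exactly this way.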

mkearney commented 8 years ago

That is def not supposed to happen, so good catch! I'm going to look into this today. Can I ask what version of rtweet you were using?

emillykkejensen commented 8 years ago

I'm using rtweet 0.3.6

emillykkejensen commented 7 years ago

The same problem occurs in lookup_users(), by the way. (And I'm now using 0.3.7.)

mkearney commented 7 years ago

Case in point for why I shouldn't rush to get the next CRAN update off my plate. I totally forgot about this problem until your comment this morning. It should be fixed now, but Travis is down, so I'll have to confirm the fix later today (hopefully).

b-rodrigues commented 7 years ago

Hi,

I'm having the same issue:

lux <- search_users(q = "luxembourg", n = 500)

here is the error message:

Searching for users...
Error in `[[<-.data.frame`(`*tmp*`, "screen_name", value = c("DKinBelgium",  : 
  replacement has 20 rows, data has 19

Like emillykkejensen, I get no error with parse = FALSE, but the error then appears when using parser().

Here's my sessionInfo():

R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

 locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=de_LU.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=de_LU.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=de_LU.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
 [11] LC_MEASUREMENT=de_LU.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
[1] rtweet_0.3.7

loaded via a namespace (and not attached):
[1] httr_1.2.1    R6_2.2.0      tools_3.3.2   yaml_2.1.14  
[5] curl_2.3      Rcpp_0.12.8   knitr_1.15.1  jsonlite_1.2 
[9] httpuv_1.3.3  openssl_0.9.5

mkearney commented 7 years ago

Thank you for reporting this. This should be fixed in the most recent version, which I'll submit to CRAN hopefully today. Can you install the newest version from GitHub and try? I'd love to make sure it's working before submitting to CRAN :).

devtools::install_github("mkearney/rtweet")
library(rtweet)
lux <- search_users(q = "luxembourg", n = 500)

b-rodrigues commented 7 years ago

Great! It works now, I could download 1000 tweets without any issue!

mkearney commented 7 years ago

Huzzah! There's a new feature for search_tweets as well. If you check the documentation for the "retryonratelimit" argument, users can now specify much larger numbers (e.g., 50000 or even 2000000000) and the function will take care of rate limits and iterating through results. Though it takes 15 mins for every 18,000 tweets requested, so it may take a few hours for big searches :).
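The feature described above can be sketched like this (assumes a token is already configured; the query and n are placeholders). At roughly 18,000 tweets per 15-minute rate-limit window, 250,000 tweets would take about 250000 / 18000 * 15 ≈ 208 minutes:

```r
library(rtweet)

# Large search: retryonratelimit = TRUE tells search_tweets to sleep
# through each rate-limit reset and keep paging until n is reached.
big <- search_tweets(
  q = "obama",
  n = 250000,
  retryonratelimit = TRUE
)
```

Without retryonratelimit, requests beyond the current window's quota would simply fail, so for anything above ~18,000 tweets this flag is the practical default.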

DonBunk commented 5 years ago

This issue seems to persist in some form:

search_users(q = 'cherd', n = 400)

yields:

Error in `[[<-.data.frame`(`*tmp*`, i, value = c(905650262, 2735261081,  : 
  replacement has 400 rows, data has 20

Note that with parse = FALSE there is no issue. So apparently the issue is with the tweets_with_users() function that combines the resulting data.frames from the individual searches.

Note that the following causes NO issue:

search_users(q = 'cherd', n = 100)

'cherd' is nothing special or profane, it came up as I was searching for all the terms I found in a group of users' profiles.
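Since n = 100 (a single page) parses cleanly while n = 400 does not, one hedged way to narrow this down is to fetch the unparsed results and inspect their structure page by page before the combining step runs:

```r
library(rtweet)

# Diagnostic sketch: parse = FALSE returns the raw API responses,
# bypassing the combining step where the error occurs.
raw <- search_users(q = "cherd", n = 400, parse = FALSE)

# Inspect the top-level structure to compare what each page returned;
# a short final page is a likely culprit for the row-count mismatch.
str(raw, max.level = 1)
```

If the pages show different numbers of user records, that points at the combining logic rather than the individual requests.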