ropensci / rtweet

🐦 R client for interacting with Twitter's [stream and REST] APIs
https://docs.ropensci.org/rtweet

explanation on search_users #29

Closed etancoigne closed 8 years ago

etancoigne commented 8 years ago

Hi, thank you for this package! I have played with it a little and am having some difficulty understanding the search_users function.

When I try users <- search_users("citizenscience", n=1000, verbose = T) I get 132 results, whereas the Twitter website returns only 60 accounts (it usually displays up to 90 accounts). I was therefore wondering whether the search is performed on the users' biographies or on their tweets.

Another problem relates to combined searches:

users <- search_users("citsci", n=1000, verbose = T) returns 36 results (same number as in the Twitter website search interface).

users <- search_users("\"citizen sciences\"", n=1000, verbose = T) returns 2 results (same number as in the Twitter website search interface).

But the combined search search_users("\"citizen sciences\" OR citsci", n=1000, verbose = T) returns the error

Error in attr(d, "tweets") <- x[["tweets"]] : attempt to set an attribute on NULL

Thank you very much for your insights on this.

mkearney commented 8 years ago

Fixed a typo that caused the loop over results to break early and made a few other minor adjustments. You should get more consistent results now. I won't push updates to CRAN for another couple of weeks, but you can use the GitHub version for now (and please let me know if you have any more issues!).

As for how search_users functions, Twitter's API doesn't support exact matches when searching users. According to Twitter's API documentation, the users/search endpoint:

Provides a simple, relevance-based search interface to public user accounts on Twitter. 
Try querying by topical interest, full name, company name, location, or other criteria. 
Exact match searches are not supported.

Instead of "OR", then, I'd recommend running independent searches for each term.
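
A minimal sketch of that approach, assuming each search returns a data frame with the same columns. The helper name search_users_each and its search_fun argument are hypothetical (not part of rtweet), and the "user_id" column name is an assumption that may differ between rtweet versions:

```r
# Sketch: run one user search per term, then merge and de-duplicate.
# `search_users_each` and `search_fun` are hypothetical names, not rtweet API.
# `id_col` is an ASSUMED name for the column that identifies an account.
search_users_each <- function(terms, search_fun, id_col = "user_id", ...) {
  # One independent search per term; extra arguments go to the search function.
  results <- lapply(terms, function(term) search_fun(term, ...))

  # Drop empty results so rbind() only sees data frames with rows.
  results <- Filter(function(x) NROW(x) > 0, results)
  if (length(results) == 0) return(data.frame())

  # Stack the results and keep the first occurrence of each account.
  combined <- do.call(rbind, results)
  combined[!duplicated(combined[[id_col]]), , drop = FALSE]
}

# Hypothetical usage (needs the rtweet package and a valid Twitter token):
# users <- search_users_each(c("citsci", "citizen sciences"),
#                            rtweet::search_users, n = 1000)
```

Passing the search function as an argument keeps the helper independent of the API, so it can be tried offline with a stub. Each term still counts against the rate limit separately.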

etancoigne commented 8 years ago

Thank you very much for your quick reply and explanations. I'll try to use citizen%20science as suggested in their documentation.

Another problem nevertheless seems to have appeared since I installed the GitHub version: I cannot get more than 20 users now, whatever the query. users <- search_users("citizenscience", n=1000, verbose = T) now returns 20 users instead of the previous 130.

mkearney commented 8 years ago

Sorry, forgot to push commits. It should be updated now!

etancoigne commented 8 years ago

Thank you! The first try after the download failed (I don't know why), but now it seems to work.

> dim(search_users(q = "citizenscience", n=1000, verbose = T))[1]
Searching for users...
Error: rate limit exceeded.
> dim(search_users(q = "citizenscience", n=10000, verbose = T))[1]
Searching for users...
Finished collecting users!
[1] 132

I tried "citizen%20sciences", but it seems to return the same results as "citizen sciences".

Thanks again for your availability and your work!

mkearney commented 8 years ago

I'm glad you got it working! And, yeah, the search query is URL-encoded internally if you haven't already encoded it manually. From what I can tell, the search users API really doesn't tolerate much search complexity at all.
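
To illustrate why the two queries end up identical, here is a small base R sketch. It uses utils::URLencode only as a stand-in; whether rtweet calls this exact function internally is an assumption:

```r
# Base R only. Using utils::URLencode here is an ASSUMPTION about how the
# encoding might be done; rtweet's internals may differ.
raw_query     <- "citizen sciences"
encoded_query <- "citizen%20sciences"

# Encoding the raw query yields the manually encoded form.
from_raw <- utils::URLencode(raw_query)

# With the default repeated = FALSE, an already-encoded string is left as-is,
# so it is not encoded a second time.
from_encoded <- utils::URLencode(encoded_query)

# Both inputs therefore produce the same string to send to the API.
identical(from_raw, from_encoded)
```

Under that assumption, the API receives the same request either way, which would explain the matching results.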

And I appreciate you using and reporting issues with rtweet!