ropensci / rtweet

🐦 R client for interacting with Twitter's [stream and REST] APIs
https://docs.ropensci.org/rtweet

lookup_statuses get 400 malformed error #35

Closed peeyooshc closed 7 years ago

peeyooshc commented 7 years ago

Running this demo script does not return a data frame:

statuses <- c("potus", "hillaryclinton", "realdonaldtrump", "fivethirtyeight", "cnn", "espn", "twitter")

twt_df <- lookup_statuses(statuses)
twt_df

Went looking at the internals and it seems to think that TWIT is making a call but nothing is being returned. What could be happening?

I tried the same query in https://apigee.com/console/twitter and it also gets a 200 OK response but no data.

mkearney commented 7 years ago

Oh, this was a silly mistake on my part. The lookup_statuses() function expects a vector of status IDs (tweet IDs), not screen names. To look up users, use the lookup_users() function instead.

> users <- c("potus", "hillaryclinton", "realdonaldtrump",
+   "fivethirtyeight", "cnn", "espn", "twitter")
> twt_df <- lookup_users(users)
> twt_df
     user_id            name     screen_name          location
1 1536791610 President Obama           POTUS  Washington, D.C.
2 1339835893 Hillary Clinton  HillaryClinton      New York, NY
3   25073877 Donald J. Trump realDonaldTrump      New York, NY
4 2303751216 FiveThirtyEight FiveThirtyEight      New York, NY
5     759251             CNN             CNN              <NA>
6    2557521            ESPN            espn       Bristol, CT
7     783214         Twitter         twitter San Francisco, CA
...
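
Since the two functions take different kinds of input, a quick check before calling either one can catch this mix-up early. The sketch below is only a heuristic, and `looks_like_status_ids()` is a hypothetical helper, not part of rtweet. It assumes status IDs are passed as all-digit character strings (they are too large for R doubles to hold exactly), and it would misfire on the rare screen name made up entirely of digits.

```r
# Hypothetical helper (NOT part of rtweet): TRUE when every element
# consists only of digits, which is what tweet IDs look like.
looks_like_status_ids <- function(x) {
  x <- as.character(x)
  length(x) > 0 && all(grepl("^[0-9]+$", x))
}

screen_names <- c("potus", "hillaryclinton", "cnn")
status_ids <- c("567053242429734913", "266031293945503744")

# Screen names fail the check, so they belong in lookup_users().
looks_like_status_ids(screen_names)

# All-digit strings pass, so they belong in lookup_statuses().
looks_like_status_ids(status_ids)
```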

mkearney commented 7 years ago

Thanks for pointing this out! I've updated the documentation (incoming commit), so I'll close this issue. The example provided for lookup_statuses should now look something like this:

> statuses <- c("567053242429734913", "266031293945503744", 
+   "440322224407314432")
> statuses <- lookup_statuses(statuses)
> statuses
           created_at    status_id
1 2015-02-15 20:10:02 5.670532e+17
2 2012-11-07 04:16:18 2.660313e+17
3 2014-03-03 03:06:13 4.403222e+17
                                                                                                                                      text
1 For every retweet this gets, Pedigree will donate one bowl of dog food to dogs in need! \U0001f60a #tweetforbowls http://t.co/z4rmc2HsGT
2                                                                                                    Four more years. http://t.co/bAJE6Vom
3                                                        If only Bradley's arm was longer. Best photo ever. #oscars http://t.co/C9U5NOtGap
               source in_reply_to_status_id in_reply_to_user_id in_reply_to_screen_name is_quote_status retweet_count favorite_count lang
1             Twuffer                    NA                  NA                      NA           FALSE        684247         135915   en
2  Twitter Web Client                    NA                  NA                      NA           FALSE        837868         466187   en
3 Twitter for Android                    NA                  NA                      NA           FALSE       3304788        2291832   en
   user_id  screen_name quoted_status_id mentions_user_id mentions_screen_name      hashtags urls is_retweet retweet_status_id place_name
1 14465607    AHMalcolm               NA               NA                   NA tweetforbowls   NA      FALSE                NA       <NA>
2   813286  BarackObama               NA               NA                   NA            NA   NA      FALSE                NA       <NA>
3 15846407 TheEllenShow               NA               NA                   NA        oscars   NA      FALSE                NA       <NA>

peeyooshc commented 7 years ago

Thanks Michael,

I apologise, I wasn't clear in my request. I want to get the latest tweet from each of the users in the statuses vector.

Do I have to cycle through each user's timeline?

peeyooshc commented 7 years ago

Perfect, I'll keep an eye out for the commit, and thanks for making a great successor to twitteR.

mkearney commented 7 years ago

@peeyooshc Once you run the lookup_users() function, use tweets_data() to extract the most recent tweet for each user.

> users <- c("potus", "hillaryclinton", "realdonaldtrump",
+   "fivethirtyeight", "cnn", "espn", "twitter")
> usr_df <- lookup_users(users)
> tweets_data(usr_df)

Alternatively, yes, you can cycle through each user's timeline. The benefit of this approach is that you can easily get a lot more than the single most recent tweet. In a lot of cases, you can get around 3,000 tweets for each user.

> users <- c("potus", "hillaryclinton", "realdonaldtrump",
+   "fivethirtyeight", "cnn", "espn", "twitter")
> twt_df <- lapply(users, get_timeline, n = 100)
> twt_df <- do.call("rbind", twt_df)
> head(twt_df)
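
If only the latest tweet per user is wanted, the combined data frame can be reduced afterwards with base R. This is a minimal sketch on mock data, so it runs without calling the API. It assumes the real data frame carries `screen_name` and `created_at` columns like the lookup_statuses() output shown earlier; the rows below are invented purely for illustration.

```r
# Mock stand-in for the combined timeline data frame (invented rows).
twt_df <- data.frame(
  screen_name = c("cnn", "cnn", "espn", "espn", "twitter"),
  created_at = as.POSIXct(
    c("2016-10-27 12:00:00", "2016-10-28 09:30:00",
      "2016-10-28 08:00:00", "2016-10-26 17:45:00",
      "2016-10-25 10:15:00"),
    tz = "UTC"
  ),
  text = c("older cnn", "newest cnn", "newest espn",
           "older espn", "only twitter"),
  stringsAsFactors = FALSE
)

# Sort newest first, then keep the first row seen for each user.
ordered <- twt_df[order(twt_df$created_at, decreasing = TRUE), ]
latest <- ordered[!duplicated(ordered$screen_name), ]
latest
```

Sorting once and then dropping duplicates avoids a per-user loop; `duplicated()` marks every row after the first occurrence of a screen name, so negating it keeps exactly one row per user.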

And thank you! I hope you enjoy the package and please let me know if you run into any more issues!

peeyooshc commented 7 years ago

Sorry Michael,

I think I'm being thick. I've run through the script and it works fine. When I then call tweets_data(), the tweets are returned with status_id, but not with user_id or screen_name.

Am I doing something wrong? I looked at extractors.R but couldn't see why screen_name wouldn't be populated.

mkearney commented 7 years ago

You should find user_id and screen_name, but they might be around columns 11-13. The output I posted above from the lookup_statuses() function has the same column layout that you should be seeing. Perhaps some columns were cut off by the print method?
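
One way to rule out the print method is to inspect the column names directly instead of reading the printed table. The sketch below uses a mock data frame in place of the real tweets_data() result, so it only illustrates the check and says nothing about what tweets_data() actually returns.

```r
# Mock data frame standing in for the tweets_data() result (assumption).
tw <- data.frame(
  created_at = as.POSIXct("2015-02-15 20:10:02", tz = "UTC"),
  status_id = "567053242429734913",
  text = "example text",
  user_id = "14465607",
  screen_name = "AHMalcolm",
  stringsAsFactors = FALSE
)

# List every column, regardless of how wide the printed output is.
names(tw)

# Check for the columns of interest and report any that are absent.
wanted <- c("user_id", "screen_name")
missing_cols <- setdiff(wanted, names(tw))
missing_cols

# Positions of the wanted columns, if present.
which(names(tw) %in% wanted)
```

An empty `missing_cols` means the columns exist and were only hidden by printing; a non-empty result means they really are absent from the data frame.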

mkearney commented 7 years ago

Okay, ran some tests. You're totally right! I'm looking into it!

peeyooshc commented 7 years ago

Thanks!
