cjbarrie / academictwitteR

Repo for academictwitteR package to query the Twitter Academic Research Product Track v2 API endpoint.

[BUG] get_retweeted_by produces 400 error #287

Closed t-davidson closed 2 years ago

t-davidson commented 2 years ago

Describe the bug

I've been trying to use the get_retweeted_by function but keep getting 400 errors. The function runs without error for tweets that have no retweets, but for any tweet with one or more retweets it returns a 400 error right after processing the first page of results.

This is what the output looks like for the example below:

retweeters <- get_retweeted_by(test$id[1], bearer_token = get_bearer(), verbose = TRUE)
Processing 1476155918597373952
Total pages queried: 1 (tweets captured this page: 93).
Error in make_query(url = endpoint_url, params = params, bearer_token = bearer_token,  : 
  something went wrong. Status code: 400

I have also been running the get_liking_users function with identical syntax and have had no problems, so I'm confident the error is not in my input. I have tried this on several different Twitter accounts, restarted R multiple times, and installed the latest version of academictwitteR from GitHub. I get the same error every time.

I looked at the source code for this function but nothing immediately jumps out at me. I would be interested to see if anyone else can reproduce the error.

Expected Behavior

The function should return a data frame containing a list of users who retweeted tweets in the supplied vector of tweet IDs.

Steps To Reproduce

The following code reproduces the error:

library(academictwitteR)
library(jsonlite)

creds <- read_json("../creds.json")  # Loading credentials

# Getting example tweets
test <- get_all_tweets(user = c("uklabour"), 
                       start_tweets = "2021-01-01T00:00:00Z", 
                       end_tweets = "2022-01-01T00:00:00Z",
                       is_retweet = FALSE,
                       bearer_token = get_bearer(), 
                       n=100)

# We can get the liking users without any issues. Returns 422 rows
likers <- get_liking_users(test$id[1], bearer_token = get_bearer(), verbose = TRUE)

# This query runs but fails after the first page of results
retweeters <- get_retweeted_by(test$id[1], bearer_token = get_bearer(), verbose = TRUE)

Environment

R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.6

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] jsonlite_1.7.3        academictwitteR_0.3.0

loaded via a namespace (and not attached):
 [1] magrittr_2.0.2   usethis_2.1.5    tidyselect_1.1.1 R6_2.5.1        
 [5] rlang_1.0.1      fansi_1.0.2      httr_1.4.2       dplyr_1.0.8     
 [9] tools_4.1.2      utf8_1.2.2       cli_3.1.1        DBI_1.1.1       
[13] ellipsis_0.3.2   assertthat_0.2.1 tibble_3.1.6     lifecycle_1.0.1 
[17] crayon_1.4.2     purrr_0.3.4      vctrs_0.3.8      fs_1.5.2        
[21] curl_4.3.2       glue_1.6.1       compiler_4.1.2   pillar_1.7.0    
[25] generics_0.1.2   pkgconfig_2.0.3 

Anything else?

No response

t-davidson commented 2 years ago

It seems like Twitter made some changes to the likes and retweets endpoints recently. This might be the reason for the issue: https://twittercommunity.com/t/updates-to-retweets-lookup-and-likes-lookup-endpoints/165327
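
One way to check whether the endpoint change is involved is to call the retweets lookup endpoint directly and read the response body, since the API typically explains a 400 there. The following is only a minimal sketch, assuming the httr package is installed; retweeted_by_url and inspect_retweeted_by are hypothetical helper names for illustration and are not part of academictwitteR, and this is not how the package's make_query is implemented.

```r
# Hypothetical helper: build the v2 "retweeted by" lookup URL for one tweet ID.
retweeted_by_url <- function(tweet_id) {
  sprintf("https://api.twitter.com/2/tweets/%s/retweeted_by", tweet_id)
}

# Hypothetical helper: send a single request and return the status code
# together with the parsed body, so the API's own error message is visible.
# `params` is a named list of query parameters to send (empty by default).
inspect_retweeted_by <- function(tweet_id, bearer_token, params = list()) {
  res <- httr::GET(
    retweeted_by_url(tweet_id),
    httr::add_headers(Authorization = paste("Bearer", bearer_token)),
    query = params
  )
  list(
    status = httr::status_code(res),
    body = httr::content(res, as = "parsed", type = "application/json")
  )
}

# Usage (needs network access and a valid bearer token):
# out <- inspect_retweeted_by("1476155918597373952", academictwitteR::get_bearer())
# out$status
# str(out$body)
```

Passing the same query parameters the package sends on its second request through `params` would show which of them, if any, the endpoint rejects.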

chainsawriot commented 2 years ago

@t-davidson Thanks for reporting this. I can confirm that I can reproduce it, and I will look into your PR (thanks for that as well!).

require(academictwitteR)
#> Loading required package: academictwitteR
test <- get_all_tweets(user = c("uklabour"), 
                       start_tweets = "2021-01-01T00:00:00Z", 
                       end_tweets = "2022-01-01T00:00:00Z",
                       is_retweet = FALSE,
                       n = 100)
#> Warning: Recommended to specify a data path in order to mitigate data loss when
#> ingesting large amounts of data.
#> Warning: Tweets will not be stored as JSONs or as a .rds file and will only be
#> available in local memory if assigned to an object.
#> query:   (from:uklabour) -is:retweet 
#> Total pages queried: 1 (tweets captured this page: 499).
#> Total tweets captured now reach 100 : finishing collection.
likers <- get_liking_users(test$id[1], verbose = TRUE)
#> Processing 1476155918597373952
#> Total data points:  91 
#> Total data points:  180 
#> Total data points:  268 
#> Total data points:  351 
#> Total data points:  421 
#> Total data points:  422 
#> This is the last page for  1476155918597373952 : finishing collection.
retweeters <- get_retweeted_by(test$id[1], verbose = TRUE)
#> Processing 1476155918597373952
#> Total pages queried: 1 (tweets captured this page: 93).
#> Error in make_query(url = endpoint_url, params = params, bearer_token = bearer_token, : something went wrong. Status code: 400

Created on 2022-02-09 by the reprex package (v2.0.1)

chainsawriot commented 2 years ago

Fixed by 5245710

cjbarrie commented 2 years ago

Late to the party, but thank you for this PR @t-davidson! And thanks for adding unit tests @chainsawriot.

t-davidson commented 2 years ago

Thanks @chainsawriot @cjbarrie for the quick response and integration of the PR - and for making and maintaining such a great package!