Closed: schliebs closed this issue 2 years ago
@schliebs Thanks for reporting this. I can give a plausible explanation.
I ran this query:

    xx <-
      get_all_tweets(
        query = "RussianEmbassy",
        start_tweets = "2022-03-01T00:00:00Z",
        end_tweets = "2022-03-09T23:59:59Z",
        n = 100)

But I ran it with debug(academictwitteR:::make_query) and looked at the response code of each individual request. On this Sunday afternoon it fails around 60% of the time, and in those failed cases the response code is 503 (over capacity). Because 503 is an accepted code, academictwitteR retries the request. Once the failures reach max_error (4 tries by default), it gives "Too many errors".
Queries related to the invasion are more likely to fail. For relatively harmless queries such as #ichbinhanna, the response is always 200. This is just a hypothesis: Twitter may be restricting its API capacity depending on the query.
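The retry behaviour described above can be illustrated with a minimal sketch. This is not academictwitteR's actual implementation: `request_with_retry()`, `make_flaky()` and the `fetch` argument are hypothetical names, the endpoint is simulated rather than a real HTTP call, and the exact way the package counts errors against max_error is an assumption.

```r
# Hypothetical sketch of a "retry on 503, give up after max_error" loop.
# `fetch` is any function returning a list with a numeric `status` field;
# it stands in for a single API request.
request_with_retry <- function(fetch, max_error = 4, wait = 0) {
  errors <- 0
  repeat {
    res <- fetch()
    if (res$status == 200) {
      return(res)
    }
    if (res$status != 503) {
      # Anything other than 200 or 503 is treated as fatal in this sketch.
      stop("something went wrong. Status code: ", res$status)
    }
    errors <- errors + 1
    if (errors >= max_error) {
      stop("Too many errors.")
    }
    Sys.sleep(wait)  # pause before retrying
  }
}

# Simulated endpoint: over capacity for the first `n_fail` calls, then 200.
make_flaky <- function(n_fail) {
  calls <- 0
  function() {
    calls <<- calls + 1
    list(status = if (calls <= n_fail) 503 else 200, calls = calls)
  }
}

# Two 503s are absorbed by the retries.
ok <- request_with_retry(make_flaky(2))

# A persistently overloaded endpoint exhausts the retries.
failed <- tryCatch(
  request_with_retry(make_flaky(10)),
  error = function(e) conditionMessage(e)
)
```

With a failure rate around 60% per request, a run of four consecutive 503s is not unlikely over a long query, which would be consistent with the "Too many errors" message seen here.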
Whether or not this counts as "a package-level issue" is of course debatable (one can blame the four-strikes rule, but that's not the root cause). But please try to break the query down into smaller pieces. Right now, anything on the internet can easily be over capacity.
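One way to break the query into smaller pieces is to split the time range into daily windows and query each window separately. The sketch below only builds the windows; `split_window()` is a hypothetical helper, not part of academictwitteR, and the daily granularity is an assumption that may need tuning.

```r
# Hypothetical helper: split [start, end] into consecutive windows.
# Timestamps use the same ISO 8601 UTC format as start_tweets/end_tweets.
split_window <- function(start, end, by = "1 day") {
  fmt <- "%Y-%m-%dT%H:%M:%SZ"
  start <- as.POSIXct(start, format = fmt, tz = "UTC")
  end <- as.POSIXct(end, format = fmt, tz = "UTC")
  starts <- seq(start, end, by = by)
  # Each window ends one second before the next one starts;
  # the last window ends at the overall end time.
  ends <- c(starts[-1] - 1, end)
  data.frame(
    start_tweets = format(starts, fmt, tz = "UTC"),
    end_tweets = format(ends, fmt, tz = "UTC"),
    stringsAsFactors = FALSE
  )
}

windows <- split_window("2022-03-01T00:00:00Z", "2022-03-09T23:59:59Z")

# Not run (needs a bearer token and network access): one call per window.
# results <- lapply(seq_len(nrow(windows)), function(i) {
#   get_all_tweets(
#     query = "RussianEmbassy",
#     start_tweets = windows$start_tweets[i],
#     end_tweets = windows$end_tweets[i],
#     n = Inf)
# })
```

Each window fails or succeeds on its own, so a burst of 503 responses costs one day's worth of requests rather than the whole query.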
@schliebs One way to increase the success rate is to reduce page_n from the default 500 to 100.

    xx <-
      get_all_tweets(
        reply_to = "RussianEmbassy",
        start_tweets = "2022-03-01T00:00:00Z",
        end_tweets = "2022-03-09T23:59:59Z",
        n = Inf,
        page_n = 100)

It will be > 5x slower, though.
Please confirm the following
something went wrong. Status code: 400.
Describe the bug
When querying replies to or mentions of certain accounts that receive a very large number of replies (e.g. @mfa_russia), I get the error message "too many errors". This does not happen for other accounts (e.g. "to:JoeBiden" or reply_to = "JoeBiden") or for shorter time periods. The error also does not occur when I run the same API requests manually, so it must be a package-level issue. Below the reproducible examples I have included a manual implementation of the same queries in which the error does not occur (examples 1b and 2b).
Expected Behavior
Return the replies/mentions as usual. I ran the exact same queries through the API manually and did not get any error status codes (only 200 responses), so this seems to be something happening within the academictwitteR package.
Steps To Reproduce
Example 1 (Reply to):
Created on 2022-03-18 by the reprex package (v2.0.1)
Example 2 (mentions):
Example 1b (working manual implementation):
Example 2b (working manual implementation):
Environment
R version 4.1.3 (2022-03-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base

other attached packages:
[1] lubridate_1.8.0 academictwitteR_0.3.1
[3] forcats_0.5.1 stringr_1.4.0
[5] dplyr_1.0.8 purrr_0.3.4
[7] readr_2.1.2 tidyr_1.2.0
[9] tibble_3.1.6 ggplot2_3.3.5
[11] tidyverse_1.3.1

loaded via a namespace (and not attached):
[1] Rcpp_1.0.8 cellranger_1.1.0 pillar_1.7.0
[4] compiler_4.1.3 dbplyr_2.1.1 tools_4.1.3
[7] bit_4.0.4 jsonlite_1.8.0 lifecycle_1.0.1
[10] gtable_0.3.0 pkgconfig_2.0.3 rlang_1.0.2
[13] reprex_2.0.1 rstudioapi_0.13 DBI_1.1.2
[16] cli_3.2.0 curl_4.3.2 parallel_4.1.3
[19] haven_2.4.3 xml2_1.3.3 withr_2.5.0
[22] httr_1.4.2 fs_1.5.2 generics_0.1.2
[25] vctrs_0.3.8 hms_1.1.1 bit64_4.0.5
[28] grid_4.1.3 tidyselect_1.1.2 glue_1.6.2
[31] R6_2.5.1 fansi_1.0.2 readxl_1.3.1
[34] vroom_1.5.7 tzdb_0.2.0 modelr_0.1.8
[37] magrittr_2.0.2 usethis_2.1.5 backports_1.4.1
[40] scales_1.1.1 ellipsis_0.3.2 rvest_1.0.2
[43] assertthat_0.2.1 colorspace_2.0-2 utf8_1.2.2
[46] stringi_1.7.6 munsell_0.5.0 broom_0.7.12
[49] crayon_1.5.0
Anything else?
No response