cjbarrie / academictwitteR

Repo for academictwitteR package to query the Twitter Academic Research Product Track v2 API endpoint.

not very long query (less than 1024 characters) returns 400 error #314

Closed polisci-quant-nerd closed 2 years ago

polisci-quant-nerd commented 2 years ago

Describe the bug

Hi, I built a query that used to work perfectly. Since this month, however, running the same query to collect more tweets returns a 400 error, and I don't know why it stopped working. The query is only 518 characters long and consists of terms joined by AND and OR operators; I don't think it contains any unrecognisable terms. The value of n makes no difference: it always returns the same error. I can't figure out what is wrong with my old query.

query <- build_query(query = "((#EU OR #EuropeanUnion OR \"EU\" OR \"European Union\" OR \"EuropeanUnion\" OR \"European Comission\" OR #EuropeanComission OR #EUCommission OR \"European Central Bank\" OR \"ECB\" OR #ECB OR @ECB OR @EU_Comission OR \"European Parliament\" OR \"European Council\" OR #EuropeanParliament OR #EuroParl OR #EUParl OR #EuropeanCouncil OR #EUCouncil OR #EUCO OR @EUROParl OR @EUCouncil) (#corona OR #covid OR \"corona\" OR \"covid\" OR \"pandemic\" OR \"coronavirus\" OR #pandemic OR #coronavirus or \"public health\"))", is_retweet = FALSE, lang = "en")

tweet_mar_aug_2020<-get_all_tweets(bearer_token=bearer_token_granted,
                                   query = query,
                                   start_tweets="2020-03-01T00:00:00Z", 
                                   end_tweets="2020-08-01T00:00:00Z", 
                                   bind_tweets = FALSE,
                                   file = "mar_aug_2020_eng",
                                   data_path = "~/covid/1_Data/tweet_json/",
                                   n = Inf)

Expected Behavior

It should work as before.

Steps To Reproduce

> query <- build_query(query = "((#EU OR #EuropeanUnion OR \"EU\" OR \"European Union\" OR \"EuropeanUnion\" OR \"European Comission\" OR #EuropeanComission OR #EUCommission OR \"European Central Bank\" OR \"ECB\" OR #ECB OR @ECB OR @EU_Comission OR \"European Parliament\" OR \"European Council\" OR #EuropeanParliament OR #EuroParl OR #EUParl OR #EuropeanCouncil OR #EUCouncil OR #EUCO OR @EUROParl OR @EUCouncil) (#corona OR #covid OR \"corona\" OR \"covid\" OR \"pandemic\" OR \"coronavirus\" OR #pandemic OR #coronavirus or \"public health\"))",  is_retweet = FALSE, lang = "en")

tweet_mar_aug_2020<-get_all_tweets(bearer_token=bearer_token_granted,
                                    query = query,
                                    start_tweets="2020-03-01T00:00:00Z", 
                                    end_tweets="2020-08-01T00:00:00Z", 
                                    bind_tweets = FALSE,
                                    file = "EU_mar_aug_2020_eng",
                                    data_path = "~/Desktop/1_Data/tweet_json/",
                                    n = Inf)
query:  ((#EU OR #EuropeanUnion OR "EU" OR "European Union" OR "EuropeanUnion" OR "European Comission" OR #EuropeanComission OR #EUCommission OR "European Central Bank" OR "ECB" OR #ECB OR @ECB OR @EU_Comission OR "European Parliament" OR "European Council" OR #EuropeanParliament OR #EuroParl OR #EUParl OR #EuropeanCouncil OR #EUCouncil OR #EUCO OR @EUROParl OR @EUCouncil) (#corona OR #covid OR "corona" OR "covid" OR "pandemic" OR "coronavirus" OR #pandemic OR #coronavirus or "public health")) -is:retweet lang:en 
Error in make_query(url = endpoint_url, params = params, bearer_token = bearer_token,  : 
  something went wrong. Status code: 400
In addition: Warning messages:
1: Tweets will still be bound in local memory to generate .rds file. Argument (bind_tweets = FALSE) only valid when just a data path has been specified. 
2: Directory already exists. Existing JSON files may be parsed and returned, choose a new path if this is not intended. 

Environment

R version 4.1.1 (2021-08-10) Platform: x86_64-apple-darwin17.0 (64-bit) Running under: macOS Monterey 12.3

Matrix products: default LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale: [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8


cjbarrie commented 2 years ago

I believe the error is in the last or (the one just before "public health"), which should be capitalized as OR. The following works:

query <- build_query(query = "((#EU OR #EuropeanUnion OR \"EU\" OR \"European Union\" OR \"EuropeanUnion\" OR \"European Comission\" OR #EuropeanComission OR #EUCommission OR \"European Central Bank\" OR \"ECB\" OR #ECB OR @ECB OR @EU_Comission OR \"European Parliament\" OR \"European Council\" OR #EuropeanParliament OR #EuroParl OR #EUParl OR #EuropeanCouncil OR #EUCouncil OR #EUCO OR @EUROParl OR @EUCouncil) (#corona OR #covid OR \"corona\" OR \"covid\" OR \"pandemic\" OR \"coronavirus\" OR #pandemic OR #coronavirus OR \"public health\"))", is_retweet = FALSE, lang = "en")

tweet_mar_aug_2020 <- get_all_tweets(query = query,
                                     start_tweets = "2020-03-01T00:00:00Z",
                                     end_tweets = "2020-08-01T00:00:00Z",
                                     n = Inf)
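
A lowercase or is easy to miss in a query this long. The sketch below is one possible way to check a query string before passing it to build_query(). Note that find_lowercase_or() is a hypothetical helper written for this thread, not part of academictwitteR, and it assumes that only text inside double quotes should be exempt from the check.

```r
# Hypothetical helper (not part of academictwitteR): return any "or" operator
# that is not fully uppercase, ignoring text inside double-quoted phrases.
find_lowercase_or <- function(query) {
  # Drop quoted phrases so a literal "or" inside a phrase is not flagged.
  unquoted <- gsub('"[^"]*"', " ", query)
  # Split on whitespace and parentheses to get bare tokens.
  tokens <- strsplit(unquoted, "[[:space:]()]+")[[1]]
  tokens <- tokens[nzchar(tokens)]
  # Keep tokens that spell OR in any case other than all uppercase.
  unique(tokens[toupper(tokens) == "OR" & tokens != "OR"])
}

bad  <- '(#pandemic OR #coronavirus or "public health")'
good <- '(#pandemic OR #coronavirus OR "public health")'

find_lowercase_or(bad)   # flags the lowercase operator
find_lowercase_or(good)  # nothing to flag
```

If the helper returns anything, fix the operator before calling get_all_tweets().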