cjbarrie / academictwitteR

Repo for academictwitteR package to query the Twitter Academic Research Product Track v2 API endpoint.

is_retweet=FALSE not working #155

Closed benji700 closed 3 years ago

benji700 commented 3 years ago

is_retweet=FALSE within build_query doesn't seem to be working for me using either the developer version of the package or the standard version.

Here's the code I'm using:

mh_query <- build_query(query="#mentalhealth OR mental health", is_retweet=F)

tweets_mh <- get_all_tweets(query = mh_query, "2021-06-07T09:55:00Z", "2021-06-07T10:00:00Z", bearer_token, data_path = 'data/', n = 50)

which produces a dataset that includes retweets, along with the following output:

query: <#mentalhealth OR mental health -is:retweet>: (tweets captured this page: 150).
Total pages queried: 1. Total tweets ingested: 150.
Amount of tweets exceeds 50 : finishing collection.
Warning messages:
1: Tweets will be bound in local memory as well as stored as JSONs.
2: Directory already exists. Existing JSON files may be parsed and returned, choose a new path if this is not intended.

Thanks in advance! Benji

chainsawriot commented 3 years ago

Short answer: please try this:

mh_query <- build_query(query="(#mentalhealth OR \"mental health\")",
is_retweet=F)

The long answer: the OR clause must be wrapped in parentheses, and the phrase "mental health" must be quoted. The way you did it generates a query like this: #mentalhealth OR mental health -is:retweet. Twitter applies AND (a space between terms) before OR, so it interprets this as 1) tweets with "#mentalhealth", retweets included, OR 2) tweets with both "mental" and "health" that are not retweets.
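To make the difference concrete, here is a minimal base R sketch of the two query strings. The helper `append_no_retweets()` is hypothetical and not part of academictwitteR; it only mimics the way the `-is:retweet` operator appears at the end of the query echoed in the output above.

```r
# Hypothetical helper (not part of academictwitteR): appends the
# "-is:retweet" operator to the end of a query string, mirroring the
# query echoed in the output reported in this issue.
append_no_retweets <- function(query) {
  paste(query, "-is:retweet")
}

# Ungrouped: the operator ends up attached only to the "mental health" branch,
# so retweets containing #mentalhealth are still matched.
ungrouped <- append_no_retweets("#mentalhealth OR mental health")

# Grouped: the parentheses make the operator apply to the whole OR clause,
# and the escaped quotes match "mental health" as an exact phrase.
grouped <- append_no_retweets("(#mentalhealth OR \"mental health\")")

cat(ungrouped, "\n")
cat(grouped, "\n")
```

With the package installed, printing the value returned by build_query() before passing it to get_all_tweets() is a quick way to check that the operator sits outside the parentheses.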

Please consult the API doc on how to build a query.

chainsawriot commented 3 years ago

ref #136

cjbarrie commented 3 years ago

Updated with vignette in #158