Closed Reina-Orikasa closed 2 years ago
Twitter's documentation states that:
Bounding boxes do not act as filters for other filter parameters. For example track=twitter&locations=-122.75,36.8,-121.75,37.8 would match any Tweets containing the term Twitter (even non-geo Tweets) OR coming from the San Francisco area.
So, if I am understanding it correctly, you will get results that either contain the words in the track parameter OR fall within the United States/your coordinates. This leads to a mass influx of tweets that are irrelevant to the track parameter.
If I remove the locations parameter, it starts to return ONLY tweets containing the track=[''] keyword. However, this creates the problem that tweets are pulled globally instead of from the United States only.
Here is the code block I used:
stream_listener = StreamListener(time_limit=200, file=output_file)
stream = tweepy.Stream(auth=myauth, listener=stream_listener)
stream.filter(locations=LOCATIONS, languages=['en'], encoding="utf-8", track=['Russia'])
When searching the results, only 2 of the 623 tweets contain the word 'Russia'. The rest are irrelevant to Russia. The results are similar if I use track=['Ukraine'].
I noticed similar issues here: https://github.com/jakobzhao/geog458/issues/13 and here: https://github.com/jakobzhao/geog458/issues/10 but there was no follow-up in either.
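One possible workaround, given the OR behavior quoted above, is to stream by locations only and then apply the keyword check on the client side. The following is a minimal sketch, not a tested fix: it assumes each tweet is available as a plain dict in the v1.1 payload shape (a `text` field, plus `extended_tweet.full_text` for long tweets), and the helper names `tweet_text` and `matches_keywords` as well as the sample tweets are made up for illustration.

```python
import re


def tweet_text(status: dict) -> str:
    """Return the longest available text of a tweet payload (assumed v1.1 dict shape)."""
    extended = status.get("extended_tweet") or {}
    return extended.get("full_text") or status.get("text") or ""


def matches_keywords(status: dict, keywords) -> bool:
    """True if any keyword appears as a whole word, case-insensitively."""
    text = tweet_text(status)
    return any(
        re.search(r"\b" + re.escape(keyword) + r"\b", text, re.IGNORECASE)
        for keyword in keywords
    )


# Hypothetical payloads standing in for tweets returned by a locations-only stream.
sample = [
    {"text": "News from Russia today"},
    {"text": "Lovely weather in Seattle"},
    {"text": "Truncated", "extended_tweet": {"full_text": "Long thread about #russia and sanctions"}},
    {"text": "Russian dressing recipe"},
]

# Keep only the geo-matched tweets that also mention the keyword (AND instead of OR).
kept = [status for status in sample if matches_keywords(status, ["Russia"])]
print(len(kept), "of", len(sample), "tweets kept")
```

In the listener, the same check could presumably run inside `on_status` before writing to the output file, for example on `status._json`, with `track` dropped from `stream.filter(...)` so that only `locations` drives the stream. Whether whole-word matching is the right rule (it skips "Russian") depends on what should count as relevant.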