Open aliiabbasi opened 4 years ago
hey @aliiabbasi
I am currently scraping tweets containing the word "coronavirus" for sentiment analysis, so I can explain my workaround; maybe it will give you some ideas :)
I tried to download every tweet since January, but it didn't work well... 24 hours wasn't enough. Splitting the request by day is probably the right way to do it, but I don't want to wait approximately 2 days (the full request for a single day took 1 hour!). So what I am doing now is scraping 1000 "top tweets" for each day.
What I'm observing is :
example :
I'm still trying to find ways to analyze covid-19 related tweets, though. If you find better ways to scrape a lot of tweets, don't hesitate to comment.
How is the topTweets criteria calculated? Is it a direct sort of retweets or a combination of retweets, favourites and replies? @aliiabbasi
@anshumanchak The topTweets criterion is determined by Twitter. You can have Twitter show you the tweets it considers most relevant; these are normally the ones with high engagement on the network.
Thanks @Victorpc98
I tried to do something similar. For 1000 tweets, it took me 2 min 35 seconds. So imagine 1 million tweets... One way that I can think of:
Hi @julienbeisel
I'm currently trying to scrape tweets for sentiment analysis. I'm also using setNear("County in Ireland") and setQuerySearch("Covid19" or "covid" or "corona" or ......), but I'm not getting many tweets. Is there a way to scrape more tweets? Thanks
Hi @SRaina11
When I did this project I had the same issue. If I remember correctly, you can query tweets over shorter time periods and get more tweets that way. For example, instead of requesting the tweets from the last 6 months, you can split the request into 6x4 queries, one per week, or into 6x4x7 queries, one per day. Sometimes it didn't work well and stopped after a few queries; you could try waiting before each query so the API does not block your requests.
I hope it'll help you!
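The splitting idea above could be sketched roughly as follows. This is only an illustration, not the code used in the project: `daily_windows`, `scrape_in_windows`, the `fetch` callable and the 5-second default pause are all hypothetical names and values. `fetch` stands in for whatever call actually runs one query for a given date range.

```python
from datetime import date, timedelta
import time


def daily_windows(start, end):
    """Yield (since, until) ISO date strings, one pair per day in [start, end)."""
    day = start
    while day < end:
        next_day = day + timedelta(days=1)
        yield day.isoformat(), next_day.isoformat()
        day = next_day


def scrape_in_windows(fetch, start, end, pause_seconds=5.0):
    """Run one small query per day and pause between queries.

    `fetch(since, until)` is a hypothetical callable that returns a list of
    tweets for that date range; plug in your own scraping call here.
    The pause length is an assumption, not a documented rate limit.
    """
    results = []
    for since, until in daily_windows(start, end):
        results.extend(fetch(since, until))
        time.sleep(pause_seconds)  # wait so the requests are less likely to be blocked
    return results
```

For weekly windows, the same loop works with `timedelta(weeks=1)` instead of one day; smaller windows mean more queries but less to lose if one of them stalls.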
Thank you for your response @julienbeisel. Would you know if we can use the OR operator to find tweets matching multiple keywords? For example: tweetCriteria.setQuerySearch("Covid19" or "covid" or "corona")?
Cheers
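One thing worth noting about the call as written: in Python, `"Covid19" or "covid" or "corona"` is evaluated before the function is called and yields just the first non-empty string, so only "Covid19" would be searched. A possible alternative is to put the OR inside a single query string. The sketch below assumes the search backend accepts Twitter's uppercase `OR` operator in the query text; `build_or_query` is a hypothetical helper, not part of any library.

```python
# Python's `or` returns the first truthy operand, so this is just "Covid19".
python_or_result = "Covid19" or "covid" or "corona"


def build_or_query(terms):
    """Join keywords into one search string using Twitter's OR operator.

    Multi-word terms are wrapped in double quotes so they are treated as phrases.
    """
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return " OR ".join(quoted)


query = build_or_query(["Covid19", "covid", "corona"])
```

The resulting `query` string would then be passed as the single argument to setQuerySearch.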
I wanted to know what exactly the criteria are for the extracted top tweets. Is it the number of likes or the number of retweets? Also, is it possible to extract a large body of tweets (e.g. a million tweets) using this library? I tried it for almost 800K tweets, but it neither responded nor raised an error!