Open patrickcmbooth opened 5 years ago
I am also getting the same result. I hope that is not a Twitter restriction.
I am also facing a similar issue. The number of tweets returned with the --user flag is far lower than the user's actual number of tweets.
I scrolled through the realDonaldTrump Twitter page for a long time. Twitter only loads ~800 tweets (back to Mar 17) and stops loading older tweets after that. It looks like a restriction on Twitter's side.
I am having a similar issue, or worse. I ran 'twitterscraper realDonaldTrump --user --output=tweets15.json' on the command line and no tweets were scraped. Changing the user name does not seem to help either.
It would be best to use the GetOldTweets3 library (https://github.com/Mottl/GetOldTweets3).
It works perfectly for me. I had issues initialising this taspinar library.
@marquisvictor Hi there, I tried to load getoldtweets3. It says that it is not recognized. Would you kindly help me out?
Yeah, I ran into the same issue at first. Here's what worked for me.
I navigated to the GetOldTweets3 directory.
Inside the GetOldTweets3 folder, look for another folder called "Bin". Open it and you should see a pycache folder and a GetOldTweets3 file.
Copy just the GetOldTweets3 file, go back up one level to the GetOldTweets3 folder, paste it there, and rename it to "GetOldTweets3.py" so that the name matches the command below.
Then open a command prompt in that same folder. If you're not sure how to do that, just let me know and I'll be glad to show you.
After opening the command prompt, paste the following command:
py -3.6 GetOldTweets3.py --username "barackobama" --since 2015-09-10 --until 2015-09-12 --maxtweets 10
Note the "py -3.6" part: it is the Windows py launcher selecting Python 3.6. If you have added Python to your PATH environment variable so that you can call it from cmd, just run
python GetOldTweets3.py --username "barackobama" --since 2015-09-10 --until 2015-09-12 --maxtweets 10
and that should work just fine. I've been piling up backdated tweets for the past two weeks now, trying to gather enough data for my analysis. Please let me know if you have any issues going forward.
@RishengP
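For anyone who would rather script the two commands above than type them, here is a minimal Python sketch that assembles the same argument list. The function name build_got3_command and its parameters are made up for illustration. Only the flags that appear in the commands above are used, and the actual run is left commented out because it assumes GetOldTweets3.py sits in the current folder.

```python
import subprocess
import sys


def build_got3_command(username, since, until, max_tweets, launcher=None):
    """Return the argument list for the GetOldTweets3.py call shown above.

    `launcher` is the interpreter prefix, e.g. ["py", "-3.6"] on Windows.
    When it is omitted, the interpreter running this script is used.
    (Hypothetical helper, not part of GetOldTweets3 itself.)
    """
    prefix = list(launcher) if launcher else [sys.executable]
    return prefix + [
        "GetOldTweets3.py",
        "--username", username,
        "--since", since,
        "--until", until,
        "--maxtweets", str(max_tweets),
    ]


cmd = build_got3_command(
    "barackobama", "2015-09-10", "2015-09-12", 10, launcher=["py", "-3.6"]
)
print(" ".join(cmd))

# Uncomment to actually run it (assumes GetOldTweets3.py is in this folder):
# subprocess.run(cmd, check=True)
```

Passing launcher=["python"] instead gives the second form of the command, for setups where Python is already on the PATH.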
@marquisvictor Yay, it works! Thanks man, I really appreciate it. Hey, real quick: do you by any chance know of a package that can help us pull out info on retweets, such as retweeter IDs or retweeter usernames?
@RishengP Nah, I don't. You would have to code that yourself in the TweetManager file.
I think the reason GetOldTweets3 returns more in this case is because it's searching the timeline using a from: query instead of going to the user's page. See https://github.com/Mottl/GetOldTweets3/blob/master/GetOldTweets3/manager/TweetManager.py#L142. The equivalent with this repo would be to do something like this instead: twitterscraper from:realDonaldTrump --output=tweets15.json
With this query I got 15000+ tweets and it is still querying... It could be that twitter is indeed blocking some ip addresses or user agents.
Can you try using different user agents HEADERS to see if this allows you to query?
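To make the user-agent suggestion concrete, here is a rough Python sketch of cycling through a few User-Agent headers. The strings in USER_AGENTS are illustrative placeholders and not the list this scraper ships with, and pick_headers and headers_to_try are hypothetical helper names. The sketch only builds the header dicts; how they get passed to the HTTP client is left open.

```python
import random

# Illustrative User-Agent strings (assumption: any realistic browser UA will do).
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0",
    "Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:70.0) Gecko/20100101 Firefox/70.0",
]


def pick_headers(rng=random):
    """Return a headers dict with one randomly chosen User-Agent."""
    return {"User-Agent": rng.choice(USER_AGENTS)}


def headers_to_try():
    """Yield one headers dict per known User-Agent, in list order."""
    for user_agent in USER_AGENTS:
        yield {"User-Agent": user_agent}


for headers in headers_to_try():
    print(headers["User-Agent"])
```

If the requests are made with the requests library, each dict could be passed as requests.get(url, headers=headers). Comparing how many tweets come back per header would show whether a particular user agent is being blocked.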
Hey, could you please explain in detail how to deal with this issue? Thanks!
Regarding the earlier question about pulling out retweet info such as retweeter IDs or usernames: hey, were you able to get that?
The profile page only shows the last ~800 tweets. That's why the scraper does not scrape more :-/
@mawic try my suggestion above
@neon-ninja: Thanks. It partially helps me. However, it doesn't return the retweets from the user, which I'm looking for. But I assume that there won't be any other solution besides the official API.
@mawic you can use the include:nativeretweets operator to get retweets in search results. For example:
twitterscraper "from:realDonaldTrump include:nativeretweets" --output=tweets15.json -p 1
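The two search-based commands in this thread differ only in the query string, so a small Python sketch can build either one. The helper names build_search_query and build_twitterscraper_command are hypothetical. Only the from: and include:nativeretweets operators and the --output and -p options already shown in this thread are used.

```python
def build_search_query(username, include_retweets=False):
    """Build a search query for one user's tweets.

    Uses the from: operator instead of the profile page, and optionally
    adds include:nativeretweets so retweets show up in the results.
    """
    query = "from:" + username.lstrip("@")
    if include_retweets:
        query += " include:nativeretweets"
    return query


def build_twitterscraper_command(query, output_file, pool_size=None):
    """Return the argument list for a twitterscraper search call."""
    command = ["twitterscraper", query, "--output=" + output_file]
    if pool_size is not None:
        command += ["-p", str(pool_size)]
    return command


query = build_search_query("realDonaldTrump", include_retweets=True)
print(build_twitterscraper_command(query, "tweets15.json", pool_size=1))
```

Keeping the query as a single list element means it does not need the extra quoting that the shell command above requires when the list is handed to something like subprocess.run.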
I'm using this command
twitterscraper realDonaldTrump --user --output=tweets15.json
but I can only get about 800 tweets every time, even though he definitely has more than 800 tweets. I tried another Twitter user and it also returned only 800 or so tweets.