jingwenshi-dev closed 1 year ago
I'm not seeing this; retweets appear correctly under retweetedTweet here. Please provide a complete reproducible example.
import pandas as pd
from snscrape.modules.twitter import TwitterUserScraper

scraper = TwitterUserScraper('UofT')
result = []
# Collect every item the scraper yields
for item in scraper.get_items():
    result.append(item)
# Write all collected tweets to a CSV file
pd.DataFrame(result).to_csv('UofT.csv', index=False)
Almost all the tweets on UofT's Twitter account are retweets, not quote tweets. But if you use this code to scrape the tweets, the retweetedTweet column in the CSV file is empty and the retweets appear in the quotedTweet column instead.
The TwitterUserScraper uses the search and never returns retweets, only original and quote tweets. There used to be a filter to enable seeing retweets from the past 7 days, but this is no longer available since the API switch: #887
If you use the TwitterProfileScraper, you get retweets, correctly populated.
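For anyone wanting to check this on their own data, here is a rough sketch of how the two fields could be told apart. It assumes only what this thread states: retweets carry a non-empty retweetedTweet and quote tweets a non-empty quotedTweet. The helper names classify_tweet and count_kinds are made up for illustration and are not part of snscrape, and count_kinds needs snscrape installed plus network access.

```python
from itertools import islice
from types import SimpleNamespace


def classify_tweet(tweet):
    """Label an item 'retweet', 'quote', or 'original' based on which field is set."""
    # getattr with a default keeps this safe for items lacking these attributes.
    if getattr(tweet, 'retweetedTweet', None) is not None:
        return 'retweet'
    if getattr(tweet, 'quotedTweet', None) is not None:
        return 'quote'
    return 'original'


def count_kinds(username, limit=50):
    """Count tweet kinds among the first `limit` items of a profile.

    Hypothetical helper: requires snscrape and network access.
    """
    # Imported here so classify_tweet can be used without snscrape installed.
    from snscrape.modules.twitter import TwitterProfileScraper

    counts = {'retweet': 0, 'quote': 0, 'original': 0}
    for item in islice(TwitterProfileScraper(username).get_items(), limit):
        counts[classify_tweet(item)] += 1
    return counts


# Offline check with stand-in objects (illustrative only, not real tweets).
retweet = SimpleNamespace(retweetedTweet=SimpleNamespace(id=1), quotedTweet=None)
quote = SimpleNamespace(retweetedTweet=None, quotedTweet=SimpleNamespace(id=2))
print(classify_tweet(retweet), classify_tweet(quote))
```

Running count_kinds('UofT') against the profile scraper should then show a non-zero retweet count if retweets are being populated as described; with the user scraper the retweet count would be expected to stay at zero.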
As you said, "it never returns retweets". But on my end it still returns retweets and puts them in the quotedTweet column.
Any examples of tweets returned like that? The first result with a non-empty retweetedTweet or quotedTweet I get is https://twitter.com/UofT/status/1648688150867243009, which is indeed a quote tweet.
Oh nvm, I was looking at the wrong dataset, sorry about that.
Describe the bug
While testing the library on my own Twitter account, I found that all retweets are categorized under quotedTweet (i.e. retweetedTweet is always empty).
How to reproduce
Just scrape a small account
Expected behaviour
Retweets should appear under the retweetedTweet column or attribute.
Screenshots and recordings
No response
Operating system
Windows 10
Python version: output of python3 --version
3.11
snscrape version: output of snscrape --version
0.7.0.20230622
Scraper
TwitterUserScraper
How are you using snscrape?
Module (import snscrape.modules.something in Python code)
Backtrace
No response
Log output
No response
Dump of locals
No response
Additional context
No response