twintproject / twint

An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
MIT License

[QUESTION] How can I just scrape a user's tweets without all the replies and without all the retweets? '-u username' gives me all the tweets and all the replies. Thank you! #1387

Open Nesu1313 opened 2 years ago

Nesu1313 commented 2 years ago

I am trying to pull Elon Musk's Tweets without any replies and without any retweets, just the Tweets.

I have the latest Python version and an up-to-date Twint running on macOS.

I used the following code, but it gives me all of Elon Musk's tweets and all of his replies, which I am trying to leave out.

import twint

c = twint.Config()

c.Username = "elonmusk"
c.Store_csv = True
c.Limit = 99999999
c.Output = "./elon_musk_tweets.csv"
c.Hide_output = True

twint.run.Search(c)

I hope someone can help me out. Thank you very much.

BashairA commented 2 years ago

Are you using Jupyter Notebook? Is Twint working for you? I get zero results.

Nesu1313 commented 2 years ago

Hey BashairA. Thank you for your reply :). I am not using Jupyter Notebook, and I have not used Jupyter before. Do you have any idea how I could filter out the replies? When scraping all tweets of a user like Elon Musk, Twint gives me all of his tweets, but it also scrapes all of his replies to other people, which I am trying to leave out.

Twint itself is working just fine. I used it with Python.

Lsx-8621 commented 2 years ago

You could use the 'reply_to' column. If the tweet is not a reply, it will be an empty list '[]'. If it is a reply, it will look like '[{'screen_name':'XXX','name':'XXX','id':ddddddd}]'. Based on this, you can easily filter out the replies.
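
Here is a minimal sketch of that filter. It assumes the CSV written by Twint has a 'reply_to' column whose cells look like a Python list literal, and that a non-empty list means the tweet is a reply. The sample rows and the helper names is_reply and keep_original_tweets are made up for illustration; they are not part of Twint.

```python
import ast
import csv
import io


def is_reply(reply_to_cell):
    """Return True when the 'reply_to' cell holds a non-empty list."""
    cell = (reply_to_cell or "").strip()
    if not cell:
        return False
    try:
        parsed = ast.literal_eval(cell)
    except (ValueError, SyntaxError):
        # Cell is not a valid Python literal: treat anything except "[]" as a reply.
        return cell != "[]"
    return bool(parsed)


def keep_original_tweets(csv_file):
    """Yield rows whose 'reply_to' column is empty, i.e. tweets that are not replies."""
    for row in csv.DictReader(csv_file):
        if not is_reply(row.get("reply_to")):
            yield row


# Tiny in-memory sample standing in for Twint's CSV output (column layout assumed).
sample = io.StringIO(
    "id,tweet,reply_to\n"
    "1,Standalone tweet,[]\n"
    "2,A reply,\"[{'screen_name': 'XXX', 'name': 'XXX', 'id': '123'}]\"\n"
)
originals = list(keep_original_tweets(sample))
print([row["id"] for row in originals])
```

To run this on a real export, you would replace the in-memory sample with something like open("./elon_musk_tweets.csv", newline="", encoding="utf-8") and write the kept rows to a new file with csv.DictWriter.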