twintproject / twint

An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
MIT License
15.77k stars 2.73k forks

[ISSUE] Tweet stats for any retweet always same as that of its original tweet. #667

Closed rjsu26 closed 4 years ago

rjsu26 commented 4 years ago

### Command Ran

    import twint
    import time
    import datetime
    import json
    from operator import itemgetter
    import os

    FILENAME = "testing.json"
    SEARCH_QUERY = "Trump visits India"
    SINCE_DATE = datetime.datetime(2020, 2, 14)
    END_DATE = datetime.datetime(2020, 2, 20)
    SCRAPE_RETWEETS = True

    print("ALERT!! Filename is {} and search query is {}".format(FILENAME, SEARCH_QUERY))
    print("Starting in 2 sec...")
    time.sleep(2)

    current_end_date = SINCE_DATE + datetime.timedelta(days=1)

    while SINCE_DATE != END_DATE:
        c = twint.Config()
        c.Output = FILENAME
        c.Limit = 100
        c.Native_retweets = SCRAPE_RETWEETS
        c.Search = SEARCH_QUERY
        c.Custom["tweet"] = [
            "id",
            "created_at",
            "date",
            "time",
            "user_id",
            "username",
            "tweet",
            "replies_count",
            "retweets_count",
            "likes_count",
            "hashtags",
            "retweet",
            "user_rt_id",
            "user_rt",
            "retweet_date",
        ]
        c.Resume = os.path.join(os.getcwd(), FILENAME.split(".")[0] + "_resume.raw")
        c.Count = True
        c.Lang = "en"
        c.Store_json = True
        c.Hide_output = True
        c.Since = SINCE_DATE.strftime("%Y-%m-%d")
        c.Until = current_end_date.strftime("%Y-%m-%d")
        try:
            twint.run.Search(c)
        except AttributeError:
            print("\n[!] found a removed tweet probably\n")
        SINCE_DATE = current_end_date
        current_end_date += datetime.timedelta(days=1)



### Description of Issue
When I run the script once with **SCRAPE_RETWEETS = True** and once with **False** and then compare the scraped tweets, the likes/replies/retweets counts on any tweet with **is_retweet=True** are the same as those of the original tweet that was retweeted.
Example:

![image](https://user-images.githubusercontent.com/32229344/74919684-f66fcd00-53f0-11ea-896d-c0a0fa809d98.png)
The image above is a snapshot showing the number of likes, comments, and retweets on an original tweet by user 'X'.

![image](https://user-images.githubusercontent.com/32229344/74919719-07b8d980-53f1-11ea-978c-8edfdad1db58.png)
The image above shows the number of likes, comments, and retweets reported for a retweet of user 'X's tweet. One would expect these stats to be the counts on my own post (the retweet), not the stats of the original post that was retweeted.

### Environment Details
> Using VS Code on Ubuntu 18

pielco11 commented 4 years ago

As of now, the information that you are looking for is not available (read: Twitter does not provide it).

rjsu26 commented 4 years ago

I doubt that this limitation really exists, and I am not sure my problem came across clearly. Since the HTML attributes are the same in both cases, the stats returned for a tweet or a retweet should be only those of that particular item. If anything, fetching the stats of the post I retweeted would be going the extra mile.

pielco11 commented 4 years ago

The likes/replies/retweets counts that you see on a retweet are those of the original tweet, not of the retweet itself.
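
Given that, one way to avoid misreading the numbers is to post-process the scraped output. The following is only a rough sketch, not part of twint: it assumes the file written with `c.Store_json = True` holds one JSON object per line, and that the `retweet` field from the `c.Custom["tweet"]` list above is truthy for retweets. The helper names `load_json_lines` and `split_retweets` are hypothetical.

```python
import json

# Counter fields that, per the discussion above, mirror the original tweet.
STAT_FIELDS = ("likes_count", "replies_count", "retweets_count")


def load_json_lines(path):
    """Read a file with one JSON object per line (assumed output layout)."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def split_retweets(records):
    """Separate records into originals and retweets.

    The counters are removed from retweets so they are not mistaken
    for stats of the retweet itself.
    """
    originals, retweets = [], []
    for record in records:
        if record.get("retweet"):  # assumption: truthy means "is a retweet"
            cleaned = {k: v for k, v in record.items() if k not in STAT_FIELDS}
            retweets.append(cleaned)
        else:
            originals.append(record)
    return originals, retweets


# Made-up sample records, for illustration only.
sample = [
    {"id": 1, "username": "user_x", "retweet": False,
     "likes_count": 10, "replies_count": 2, "retweets_count": 3},
    {"id": 2, "username": "some_retweeter", "retweet": True,
     "likes_count": 10, "replies_count": 2, "retweets_count": 3},
]

originals, retweets = split_retweets(sample)
print(len(originals), "original(s),", len(retweets), "retweet(s)")
```

With a real scrape, `split_retweets(load_json_lines("testing.json"))` would be the equivalent call, provided the file layout matches the assumption above.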