twintproject / twint

An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
MIT License

[QUESTION] Only bio returned in .csv or .json? #328

Closed rodolfo-viana closed 5 years ago

rodolfo-viana commented 5 years ago

Issue Template

Please use this template!

Initial Check

If the issue is a request please specify that it is a request in the title (Example: [REQUEST] more features). If this is a question regarding 'twint' please specify that it's a question in the title (Example: [QUESTION] What is x?). Please only submit issues related to 'twint'. Thanks.

Command Ran

Please provide the exact command ran including the username/search/code so I may reproduce the issue.

import twint as tw
i = tw.Config()
i.Username = "Comprova"
i.User_full = True
i.Store_json = True
i.Custom["id"] = ["id"]
i.Custom["name"] = ["name"]
i.Custom["username"] = ["username"]
i.Custom["bio"] = ["bio"]
i.Custom["location"] = ["location"]
i.Custom["url"] = ["url"]
i.Custom["join_date"] = ["join_date"]
i.Custom["join_time"] = ["join_time"]
i.Custom["tweets"] = ["tweets"]
i.Custom["following"] = ["following"]
i.Custom["followers"] = ["followers"]
i.Custom["likes"] = ["likes"]
i.Custom["media"] = ["media"]
i.Custom["private"] = ["private"]
i.Custom["verified"] = ["verified"]
i.Custom["avatar"] = ["avatar"]
i.Output = "data.json"
tw.run.Following(i)

Description of Issue

Please use as much detail as possible.

I am trying to get information on the accounts a specific user follows, such as bio, date of joining Twitter, number of tweets, etc. I ran the code above and it showed the results I expected on screen (image 1), but the data were not saved to .csv or .json as they were supposed to be. Instead, only the bio field was saved (image 2).

Environment Details

Using Windows, Linux? What OS version? Running this in Anaconda? Jupyter Notebook? Terminal?

Win10, Jupyter Lab, no Anaconda.

img1 img2

pielco11 commented 5 years ago

Hi @rodolfo-viana

config.Custom is a dict with only tweet, user and username as keys: https://github.com/twintproject/twint/blob/653de504f7a7d65ec1dabf0cc67b141555780009/twint/config.py#L20

Those three are the entities returned. tweet stands for the tweet (the content, the date, the user who tweeted, etc.), user stands for the user you are scraping (the bio, the number of following/followers, the join date, etc.), and username stands for the username of a follower/following scraped from a target user.
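To make the shape of that dict concrete, here is a minimal sketch in plain Python. It does not import twint: `custom` is a stand-in for config.Custom, the `None` defaults are an assumption, and `narrow` is a hypothetical helper that only illustrates the idea of a per-entity field list, not twint's actual storage code.

```python
# Stand-in for config.Custom: one key per entity, not one key per field.
# The None defaults (meaning "no selection") are an assumption.
custom = {"tweet": None, "user": None, "username": None}

# Fields of a user object are selected with a single list under "user".
custom["user"] = ["id", "username", "bio", "join_date"]


def narrow(record, fields):
    """Hypothetical helper: keep only the listed fields.

    An empty or missing field list means "keep everything".
    """
    if not fields:
        return dict(record)
    return {name: record[name] for name in fields}


# Made-up user record, for illustration only.
record = {
    "id": "1",
    "username": "example",
    "bio": "hello",
    "join_date": "2018-01-01",
    "likes": 3,
}

selected = narrow(record, custom["user"])
print(selected)
```

Under these assumptions, an assignment such as `i.Custom["bio"] = ["bio"]` adds a key that does not correspond to any of the three entities, so it is not the way to ask for the bio field.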

It seems that you are interested in almost every field of a user object, so you do not need to list them: returning every field is the default.

Just remove these lines:

i.Custom["id"] = ["id"]
i.Custom["name"] = ["name"]
i.Custom["username"] = ["username"]
i.Custom["bio"] = ["bio"]
i.Custom["location"] = ["location"]
i.Custom["url"] = ["url"]
i.Custom["join_date"] = ["join_date"]
i.Custom["join_time"] = ["join_time"]
i.Custom["tweets"] = ["tweets"]
i.Custom["following"] = ["following"]
i.Custom["followers"] = ["followers"]
i.Custom["likes"] = ["likes"]
i.Custom["media"] = ["media"]
i.Custom["private"] = ["private"]
i.Custom["verified"] = ["verified"]
i.Custom["avatar"] = ["avatar"]
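With those lines removed, the script from the original post reduces to roughly the following. This is a sketch that uses only the settings already shown in the thread; the wrapper function `scrape_following` and its parameters are hypothetical and added for illustration, and the import sits inside the function so the definition can be loaded without twint installed.

```python
def scrape_following(username, output="data.json"):
    """Hypothetical wrapper around the script from the original post."""
    # Imported here so that defining the function does not require twint.
    import twint as tw

    i = tw.Config()
    i.Username = username   # account whose "following" list is scraped
    i.User_full = True      # collect full user objects, not just usernames
    i.Store_json = True     # write results as JSON
    i.Output = output       # destination file
    # No i.Custom[...] lines: every user field is returned by default.
    tw.run.Following(i)
```

Calling `scrape_following("Comprova")` would then run the same scrape as the original post. It needs twint installed and network access, and the output depends on the installed twint version.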

rodolfo-viana commented 5 years ago

Thank you, @pielco11, for taking the time to explain it. :) The problem is that I did not specify any fields on my first try and got just the bio. So I decided to specify each and every field, and again only the bio was returned. I will check what I did wrong and let you know soon. Once again, thanks.

rodolfo-viana commented 5 years ago

Found it! I had cloned the repo some time ago and forgot to upgrade it. My bad, @pielco11. And thanks for your help.