twintproject / twint

An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
MIT License
15.68k stars 2.71k forks

Cannot save into sql #984

Open ch4c4l opened 3 years ago

ch4c4l commented 3 years ago

Description of Issue

Cannot save into SQL in sqlite3

Environment Details

Ubuntu 20.04, terminal

[+] Inserting into Database: aitor.sql
Traceback (most recent call last):
  File "/home/xxxxxx/.local/bin/twint", line 8, in <module>
    sys.exit(run_as_command())
  File "/home/xxxxxx/.local/lib/python3.8/site-packages/twint/cli.py", line 313, in run_as_command
    main()
  File "/home/xxxxxx/.local/lib/python3.8/site-packages/twint/cli.py", line 305, in main
    run.Search(c)
  File "/home/xxxxxx/.local/lib/python3.8/site-packages/twint/run.py", line 427, in Search
    run(config, callback)
  File "/home/xxxxxx/.local/lib/python3.8/site-packages/twint/run.py", line 319, in run
    get_event_loop().run_until_complete(Twint(config).main(callback))
  File "/usr/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/home/xxxxxx/.local/lib/python3.8/site-packages/twint/run.py", line 239, in main
    await task
  File "/home/xxxxxx/.local/lib/python3.8/site-packages/twint/run.py", line 290, in run
    await self.tweets()
  File "/home/xxxxxx/.local/lib/python3.8/site-packages/twint/run.py", line 230, in tweets
    await output.Tweets(tweet, self.config, self.conn)
  File "/home/xxxxxx/.local/lib/python3.8/site-packages/twint/output.py", line 175, in Tweets
    await checkData(tweets, config, conn)
  File "/home/xxxxxx/.local/lib/python3.8/site-packages/twint/output.py", line 144, in checkData
    db.tweets(conn, tweet, config)
  File "/home/xxxxxx/.local/lib/python3.8/site-packages/twint/storage/db.py", line 285, in tweets
    if Tweet.retweet:
AttributeError: 'tweet' object has no attribute 'retweet'

twomoreweeks commented 3 years ago

There are a few issues with the database code at the moment. In tweet.py, the code that collects retweet information for the retweets table is commented out, which is why your program crashed: the tweet object never gets a retweet attribute. The code that collects reply information is also bugged. We fetch only the immediate user_id of whoever the tweet is in reply to, but db.py treats the reply information as an array even though it's a single object. Namely, the loop for reply in Tweet.reply_to: will just iterate over the keys and values of the reply_to object, and the program will crash. For the replies bug, you'll need to wait for the author to decide whether or not to save the list of all users a tweet is in reply to.
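To make the two failure modes concrete, here is a minimal sketch. It is not twint's actual code: the stand-in tweet object, the helper names is_retweet and reply_targets, and the user_id/username fields are all assumptions made for illustration. It shows how a missing retweet attribute and a single-object reply_to could be handled defensively before writing to the database.

```python
from types import SimpleNamespace


def is_retweet(tweet):
    """Return the retweet flag, or False when the attribute was never set."""
    # getattr with a default avoids the AttributeError seen in the traceback
    return bool(getattr(tweet, "retweet", False))


def reply_targets(tweet):
    """Return reply targets as a list, whether reply_to is a dict, a list, or absent."""
    reply_to = getattr(tweet, "reply_to", None)
    if reply_to is None:
        return []
    if isinstance(reply_to, dict):
        # A single reply object is wrapped so callers can always loop over records
        return [reply_to]
    return list(reply_to)


# Hypothetical stand-in for a parsed tweet that has no `retweet` attribute
tweet = SimpleNamespace(id=1, reply_to={"user_id": "42", "username": "someone"})

# Iterating the dict directly yields its keys, not reply records
keys_seen = [key for key in tweet.reply_to]
print(keys_seen)

print(is_retweet(tweet))
for reply in reply_targets(tweet):
    print(reply["user_id"], reply["username"])
```

Whether the right fix is to normalize reply_to like this or to restore the full list of replied-to users is the maintainer's call; the sketch only shows why looping over the single object directly does not produce reply records.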

Check https://github.com/Uclidean/twint/blob/master/twint/tweet.py for a quick fix(?), though I am brand new to the repo, so don't put too much faith in it. From what I've tested, it works OK.

himanshudabas commented 3 years ago

@Uclidean Reason for commenting out the code: when Twitter deprecated the older v1.1 endpoints, I put up a quick fix to revive twint (yes, the API deprecation broke twint). At that time I was new to the project myself, so I did whatever I could and commented out the code that would cause problems with the newer endpoints or that I wasn't sure how to handle.

I have explored the library since then and do know how to fix these issues, but the owner of the project isn't active, so there's no point in providing fixes, as my PR doesn't get any response at all.

twomoreweeks commented 3 years ago

@himanshudabas

the owner of the project isn't active, so there's no point in providing fixes, as my PR doesn't get any response at all.

Are there any decent forks that you know of?

himanshudabas commented 3 years ago

@Uclidean There is another project called snscrape, which is quite decent. It's not a fork of this project, but it works really well and is much faster than twint. It doesn't have a lot of functionality, although you can get the basic things done.