Closed afterdelight closed 5 years ago
Hi afterdelight, I am currently at work and don't have a Windows PC available to try to reproduce your problem, but I found this: https://github.com/pmbarrett314/curses-menu/issues/18 Apparently Python for Windows doesn't ship with the curses library. I am getting rid of the curses output in the twint-utils version and replacing it with a logger, since the curses output is currently very buggy.
Okay, thanks. After installing curses, I got this: ModuleNotFoundError: No module named 'fcntl'
Maybe try the windows-curses module, if you haven't already. You can install it with "pip install windows-curses". If that doesn't work, I have another version that uses a logger instead of curses, but the logger is still not completely finished.
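One way to let a script fall back to plain logging on Windows is to check for the compiled `_curses` module before importing anything that depends on it (such as blessings). This is only a minimal sketch: the helper name `pick_output_backend` is hypothetical and not part of the project.

```python
import importlib.util
import logging


def pick_output_backend() -> str:
    """Return "curses" if the compiled _curses module can be found, else "logger".

    Hypothetical helper for illustration; it is not taken from the project.
    """
    # The curses package in a stock Windows install of CPython has no _curses
    # extension behind it, so "import curses" fails there unless a
    # replacement such as windows-curses provides that module.
    if importlib.util.find_spec("_curses") is not None:
        return "curses"
    return "logger"


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    logging.info("output backend: %s", pick_output_backend())
```

Checking with `find_spec` avoids importing curses at all, so the check itself cannot raise the ModuleNotFoundError.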
I already tried both windows-curses and menu-curses, and neither worked. I'll wait for the logger version. By the way, how is the twint implementation progressing?
I just fixed up the logging version for you and did a quick test run on my Windows box with Python 3.7 and the modules from requirements.txt. It should work now. I know the logger prints every message twice. That's a bug and I am working on it; if you know how to fix it, help is appreciated. If you get too many requests errors, try increasing the sleep timer at line 151. If you are downloading a large number of accounts in bulk and you lose your internet connection or something similar, delete the resume files of the accounts that were not downloaded or only partly downloaded. Otherwise twint will start where it exited the previous time, and you don't want that because you would skip a large part of the media you want to download. As for the Twint implementation, I still have to fix the logging and remove the twint function from the twitter class, and then it should be done. But I don't have a lot of free time because of work. Let me know if it works and I'll close the issue.
Cheers, Philip
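A common cause of doubled log output in Python is a handler on a named logger combined with a handler on the root logger: the record is emitted once by each. A second copy in the form `LEVEL:name:message` is the default format of `logging.basicConfig`. Whether that is what happens in this project is an assumption; the sketch below only shows the general pattern and one way to stop it, using a hypothetical `build_logger` helper.

```python
import io
import logging


def build_logger(name: str, stream) -> logging.Logger:
    """Create a named logger with its own handler (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
    )
    logger.addHandler(handler)
    # Without this line the record would also travel up to the root logger
    # and be printed a second time by any handler attached there.
    logger.propagate = False
    return logger


# In-memory streams stand in for the console so the effect is easy to inspect.
root_stream = io.StringIO()
own_stream = io.StringIO()
logging.basicConfig(stream=root_stream, level=logging.INFO)  # root handler

log = build_logger("twitter_media_downloader.module", own_stream)
log.info("Done crawling!")
```

The alternative is to keep propagation on and attach a handler in only one place, either on the root logger or on the named logger, but not both.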
I forgot to mention that it's on the logger branch
Well, I tried python3 core.py list.txt but only got a bunch of log files and no media. Is it a bug?
I think I have fixed it. Can you give it another try?
Got this error now
CRITICAL:root:twint.get:User:'NoneType' object is not subscriptable
Traceback (most recent call last):
File "core.py", line 104, in <module>
My list.txt contains one https://twitter.com/twitter/** link per line.
OK, so I just reverted the changes I made. I think it was working correctly after all. Can you give it another try and check whether you see "Crawling: 'username'" in the output or in the logs?
Yes, it's crawling but I only got the text logs not the pictures and videos.
Ah OK, then it's working as intended. First it gets all the tweets of the first user. Once the tweets are collected, it appends them to a download queue and starts downloading. After the first account is crawled, crawling and downloading run simultaneously, with the crawler getting media tweets and appending them to the download queue and the downloader emptying the queue. Just be patient: after it has collected the first batch of tweets, the downloader spins up. The program is done after it reports "Done downloading!"
Once the program is done crawling its first batch and starts downloading, you'll see a message that says "account: downloading blabla photos and blabla videos", followed by "crawling nextaccount".
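The crawler and downloader arrangement described above is a producer/consumer queue. The sketch below shows the pattern with the standard library only. It is not the project's code: `crawl`, `download` and `run` are hypothetical names, and the crawl step is replaced by a stand-in that invents three media items per account.

```python
import queue
import threading

SENTINEL = None  # marks the end of the work for the downloader


def crawl(accounts, download_queue):
    """Producer: put each account's media items on the queue."""
    for account in accounts:
        # Stand-in for the real crawl, which would collect media tweets.
        for i in range(3):
            download_queue.put(f"{account}/media{i}")
    download_queue.put(SENTINEL)


def download(download_queue, finished):
    """Consumer: take items off the queue until the sentinel arrives."""
    while True:
        item = download_queue.get()
        if item is SENTINEL:
            break
        finished.append(item)  # stand-in for fetching the file


def run(accounts):
    """Run crawler and downloader in parallel and return what was handled."""
    download_queue = queue.Queue()
    finished = []
    crawler = threading.Thread(target=crawl, args=(accounts, download_queue))
    downloader = threading.Thread(target=download, args=(download_queue, finished))
    crawler.start()
    downloader.start()
    crawler.join()
    downloader.join()
    return finished


if __name__ == "__main__":
    print(run(["twitter", "nextaccount"]))
```

In this sketch the downloader can start as soon as the first item is queued. The program in this thread, as described, queues an account's media only after that account has been fully crawled, which is why nothing is downloaded during the first crawl.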
Before it's fixed there is no "account: downloading blabla photos and blabla videos" message. It is still crawling the new one.
Does the program end before downloading? Crawling can sometimes take hours, depending on the number of tweets and your internet connection.
I got this error and no media just now:
INFO:twitter_media_downloader.module:crawling ***
CRITICAL:root:twint.get:User:'NoneType' object is not subscriptable
2019-10-03 21:38:15,556 - twitter_media_downloader.module - INFO - 0 items left in crawler queue
INFO:twitter_media_downloader.module:0 items left in crawler queue
2019-10-03 21:38:15,556 - twitter_media_downloader.module - INFO - Done crawling!
INFO:twitter_media_downloader.module:Done crawling!
There are 3 folders created: logs, resume and twitter. The twitter one is empty.
The log shows: 'twitter_media_downloader.module - INFO - crawling usernamestatusidstatus'
The logged name is missing the https:// and the / between username, status and idstatus.
I think that's why it resulted in the 'CRITICAL:root:twint.get:User:'NoneType' object is not subscriptable' error.
The twitter folder is where the downloaded media ends up. The resume folder is where twint keeps the resume files. You should remove the resume files if you want to restart after an unsuccessful run. Sometimes twint throws the 'CRITICAL:root:twint.get:User:'NoneType' object is not subscriptable' error. I believe this only affects a single tweet. I just did another test run on my Windows PC using Python 3.7. I created a text file with the URL of Twitter's own account page ("https://twitter.com/twitter"). (This program only accepts URLs of account pages.) This is the output.
PS V:\twitter_media_downloader> python.exe .\core.py .\list.test
2019-10-03 20:17:09,420 - twitter_media_downloader.module - INFO - Starting twitter module with a queue size of 1
2019-10-03 20:17:09,422 - twitter_media_downloader.module - INFO - crawling twitter
2019-10-03 20:18:38,088 - twitter_media_downloader.module - INFO - 0 items left in crawler queue
INFO:twitter_media_downloader.module:0 items left in crawler queue
2019-10-03 20:18:38,088 - twitter_media_downloader.module - INFO - twitter: downloading 311 photos and 1579 videos
INFO:twitter_media_downloader.module:twitter: downloading 311 photos and 1579 videos
2019-10-03 20:18:38,089 - twitter_media_downloader.module - INFO - Done crawling!
INFO:twitter_media_downloader.module:Done crawling!
Within minutes it started downloading media.
Well, it's kind of useless for me, because I used twint to scrape and the output consists of links to specific statuses: https://twitter.com/twitteruser/status/xxxxxxxx
By the way, does the script support downloading retweets?
No, it just downloads all the media of an account.
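Since the program only accepts account pages, a list of status links like the one mentioned above could be reduced to account URLs before it is passed to core.py. The following is a sketch under that assumption. The function names `account_url` and `unique_account_urls` are hypothetical, and note that the program would then download all media of each account, not just the media of the listed statuses.

```python
from urllib.parse import urlparse


def account_url(link: str) -> str:
    """Reduce an account or status link to the account page URL."""
    parsed = urlparse(link.strip())
    parts = [p for p in parsed.path.split("/") if p]
    if parsed.netloc.lower() not in ("twitter.com", "www.twitter.com") or not parts:
        raise ValueError(f"not a twitter.com account or status link: {link!r}")
    # The first path segment is the username; anything after it
    # (for example "status/<id>") is dropped.
    return f"https://twitter.com/{parts[0]}"


def unique_account_urls(links):
    """Map links to account URLs, dropping duplicates but keeping order."""
    seen = {}
    for link in links:
        if link.strip():
            seen.setdefault(account_url(link), None)
    return list(seen)


if __name__ == "__main__":
    sample = [
        "https://twitter.com/twitteruser/status/1",
        "https://twitter.com/twitteruser/status/2",
        "https://twitter.com/twitter",
    ]
    for url in unique_account_urls(sample):
        print(url)
```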
I'm closing the issue now because the program is working as intended.
When I run core.py with a list, I get:
Traceback (most recent call last):
  File "core.py", line 31, in <module>
    from twitter import Twitter
  File "*\twitter.py", line 42, in <module>
    from utils import *
  File "*\utils.py", line 25, in <module>
    from blessings import Terminal
  File "C:\Python37\lib\site-packages\blessings\__init__.py", line 5, in <module>
    import curses
  File "C:\Python37\lib\curses\__init__.py", line 13, in <module>
    from _curses import *
ModuleNotFoundError: No module named '_curses'
I'm using Windows 10, by the way.