11philip22 / TwitterMediaDownloader

downloads photos and videos from twitter
MIT License

Error: No module named '_curses' #2

Closed afterdelight closed 5 years ago

afterdelight commented 5 years ago

When I run core.py with a list, I get:

Traceback (most recent call last):
  File "core.py", line 31, in <module>
    from twitter import Twitter
  File "*\twitter.py", line 42, in <module>
    from utils import
  File "\utils.py", line 25, in <module>
    from blessings import Terminal
  File "C:\Python37\lib\site-packages\blessings\__init__.py", line 5, in <module>
    import curses
  File "C:\Python37\lib\curses\__init__.py", line 13, in <module>
    from _curses import *
ModuleNotFoundError: No module named '_curses'

I'm using Windows 10, btw.

11philip22 commented 5 years ago

Hi afterdelight, I am currently at work and don't have a Windows PC available to reproduce your problem. But I found this: https://github.com/pmbarrett314/curses-menu/issues/18 Apparently Python for Windows doesn't ship with the curses library. I am getting rid of the curses output in the twint-utils version and replacing it with a logger, since the curses output is currently very buggy.

afterdelight commented 5 years ago

Okay, thanks. After installing curses, I got this: ModuleNotFoundError: No module named 'fcntl'

11philip22 commented 5 years ago

Maybe try the windows-curses module, if you haven't already. You can install it with "pip install windows-curses". If that doesn't work, I have another version that uses a logger instead of curses, but the logger is not completely finished yet.
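A quick way to see which of the modules named in the errors above are importable on a given machine is sketched below. It is only a diagnostic, not part of the downloader; the function name `missing_modules` is made up for this example, and the module list is taken from the two error messages in this thread.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the names from `names` that cannot be found on this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules named in the two error messages in this thread.
UNIX_ORIENTED = ["_curses", "fcntl"]

if __name__ == "__main__":
    missing = missing_modules(UNIX_ORIENTED)
    if missing:
        print(f"{sys.platform}: missing {', '.join(missing)}; plain logging output is the safer choice")
    else:
        print(f"{sys.platform}: terminal UI modules are available")
```

If the list comes back non-empty, a version that logs instead of drawing with curses avoids the import entirely.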

afterdelight commented 5 years ago

I already tried both windows-curses and menu-curses, and neither worked. I'll wait for the logger version. Btw, how is the twint implementation coming along?

11philip22 commented 5 years ago

I just fixed up the logging version for you and did a quick test run on my Windows box using Python 3.7 and the modules from requirements.txt. It should work now.

I know the logger prints every message twice. That's a bug and I am working on it. If you know how to fix it, help is appreciated.

If you get too many requests errors, try increasing the sleep timer at line 151.

If you are downloading a large number of accounts in bulk and you lose your internet connection or something similar, delete the resume files of the accounts that were not downloaded or only partly downloaded. Otherwise twint will start where it exited the previous time, and you will skip a large part of the media you want to download.

As for the Twint implementation, I have to fix the logging and remove the twint function from the twitter class, and then it should be done. But I don't have a lot of free time because of work. Let me know if it works and I'll close the issue.

Cheers, Philip
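For the request errors mentioned above, an alternative to raising one fixed sleep value is to wait longer after each failed attempt. The sketch below is not the code at line 151; `call_with_backoff` and its parameters are hypothetical names, and it catches every exception only to keep the example short.

```python
import time


def call_with_backoff(fetch, retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(); after a failure wait base_delay * 2**attempt seconds, then retry."""
    for attempt in range(retries):
        try:
            return fetch()
        except Exception:
            # Give up and re-raise once the last allowed attempt has failed.
            if attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

In real use the `except` clause would be narrowed to whatever error the HTTP layer raises for a rate limit.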

11philip22 commented 5 years ago

I forgot to mention that it's on the logger branch

afterdelight commented 5 years ago

Well, I tried python3 core.py list.txt but only got a bunch of log files and no media. Is it a bug?

11philip22 commented 5 years ago

I think I have fixed it. Can you give it another try?

afterdelight commented 5 years ago

Got this error now:

CRITICAL:root:twint.get:User:'NoneType' object is not subscriptable
Traceback (most recent call last):
  File "core.py", line 104, in <module>
    t.start()
  File "D:\twitter.py", line 224, in start
    self.crawler()
  File "D:\twitter.py", line 210, in crawler
    tweets = self.get_tweets(username)
  File "D:\twitter.py", line 75, in get_tweets
    twint.run.Search(c)
  File "C:\Python37\lib\site-packages\twint\run.py", line 292, in Search
    run(config, callback)
  File "C:\Python37\lib\site-packages\twint\run.py", line 213, in run
    get_event_loop().run_until_complete(Twint(config).main(callback))
  File "C:\Python37\lib\asyncio\base_events.py", line 584, in run_until_complete
    return future.result()
  File "C:\Python37\lib\site-packages\twint\run.py", line 154, in main
    await task
  File "C:\Python37\lib\site-packages\twint\run.py", line 198, in run
    await self.tweets()
  File "C:\Python37\lib\site-packages\twint\run.py", line 137, in tweets
    await self.Feed()
  File "C:\Python37\lib\site-packages\twint\run.py", line 62, in Feed
    print(self.init, file=open(self.config.Resume, "w", encoding="utf-8"))
OSError: [Errno 22] Invalid argument: 'D:\twitter_media_downloader-master\resume\

afterdelight commented 5 years ago

My list.txt contains one https://twitter.com/twitter/** link per line.
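The resume path in the OSError is cut off, so the cause cannot be read from the traceback. One possible explanation, stated here purely as an assumption, is that the resume file name was built from text containing characters Windows does not allow in file names (such as ":", "/" or "*"). A hypothetical helper that replaces those characters could look like this:

```python
import re

# Characters that Windows rejects in file names.
_FORBIDDEN = re.compile(r'[<>:"/\\|?*]')


def safe_filename(name, replacement="_"):
    """Replace characters that are invalid in Windows file names."""
    return _FORBIDDEN.sub(replacement, name)
```

`safe_filename` is not part of twint or of this project; it only illustrates the kind of check that would rule this cause in or out.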

11philip22 commented 5 years ago

OK, so I just reverted the changes I made. I think it was working correctly after all. Can you give it another try and check whether you see Crawling: 'username' in the output or in the logs?

afterdelight commented 5 years ago

Yes, it's crawling, but I only got the text logs, not the pictures and videos.

11philip22 commented 5 years ago

Ah OK, then it's working as intended. First it gets all the tweets of the first user. Once the tweets are collected, it appends them to a download queue and starts downloading. After the first account is crawled, crawling and downloading run simultaneously: the crawler collects media tweets and appends them to the download queue while the downloader empties the queue. Just be patient; after it has collected the first batch of tweets, the downloader spins up. The program is done after it reports "Done downloading!"
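The crawl-then-download flow described above is a producer/consumer queue. The sketch below shows the general pattern with the standard library only; it is not the project's code, and `crawler`, `downloader`, `run` and the fake media items are stand-ins invented for the example.

```python
import queue
import threading

SENTINEL = None  # put on the queue to tell the downloader there is nothing more to fetch


def crawler(accounts, download_queue):
    """Producer: collect media for each account and append it to the queue."""
    for account in accounts:
        # Stand-in for the real crawl: pretend every account has two media items.
        for i in range(2):
            download_queue.put(f"{account}/media{i}")
    download_queue.put(SENTINEL)


def downloader(download_queue, downloaded):
    """Consumer: empty the queue until the sentinel arrives."""
    while True:
        item = download_queue.get()
        if item is SENTINEL:
            break
        downloaded.append(item)  # stand-in for the real download


def run(accounts):
    download_queue = queue.Queue()
    downloaded = []
    worker = threading.Thread(target=downloader, args=(download_queue, downloaded))
    worker.start()  # the downloader runs while the crawler is still producing
    crawler(accounts, download_queue)
    worker.join()
    return downloaded
```

With a single downloader thread the items come out in the order they were queued, which is why nothing appears in the output folder until the first account has been crawled.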

11philip22 commented 5 years ago

Once the program has finished crawling its first batch and starts downloading, you'll see a message that says: "account: downloading blabla photos and blabla videos", followed by the crawling message for the next account.

afterdelight commented 5 years ago

Before the fix there was no "account: downloading blabla photos and blabla videos" message. I'm still crawling with the new version.

11philip22 commented 5 years ago

Does the program end before downloading? Crawling can sometimes take hours, depending on the number of tweets and your internet connection.

afterdelight commented 5 years ago

I got this error and no media just now:

INFO:twitter_media_downloader.module:crawling ***
CRITICAL:root:twint.get:User:'NoneType' object is not subscriptable
2019-10-03 21:38:15,556 - twitter_media_downloader.module - INFO - 0 items left in crawler queue
INFO:twitter_media_downloader.module:0 items left in crawler queue
2019-10-03 21:38:15,556 - twitter_media_downloader.module - INFO - Done crawling!
INFO:twitter_media_downloader.module:Done crawling!

afterdelight commented 5 years ago

There are 3 folders created: logs, resume and twitter. The twitter one is empty.

afterdelight commented 5 years ago

'twitter_media_downloader.module - INFO - crawling usernamestatusidstatus'

The output log is missing the https:// and the / separators between username, status and the status ID.

I think that's why it resulted in the 'CRITICAL:root:twint.get:User:'NoneType' object is not subscriptable' error.
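To check whether a line in the list is an account URL or a status URL before feeding it to the program, the URL can be split with the standard library. This is a sketch only; `parse_twitter_url` is a hypothetical helper, not the parsing the downloader actually uses.

```python
from urllib.parse import urlparse


def parse_twitter_url(url):
    """Return (username, status_id); status_id is None for an account URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if not parts:
        raise ValueError(f"no username in {url!r}")
    username = parts[0]
    # A status URL has the shape /<username>/status/<id>.
    status_id = parts[2] if len(parts) >= 3 and parts[1] == "status" else None
    return username, status_id
```

Lines for which `status_id` is not None point at a single tweet rather than at an account page.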

11philip22 commented 5 years ago

The twitter folder is where the downloaded media ends up. The resume folder is where twint keeps the resume files. You should remove the resume files if you want to restart after an unsuccessful run. Sometimes twint throws the 'CRITICAL:root:twint.get:User:'NoneType' object is not subscriptable' error. I believe this only affects a single tweet. I just did another test run on my Windows PC using Python 3.7. I created a text file with the URL of Twitter's own Twitter page ("https://twitter.com/twitter"). (This program only accepts URLs to an account page.) This is the output:

PS V:\twitter_media_downloader> python.exe .\core.py .\list.test
2019-10-03 20:17:09,420 - twitter_media_downloader.module - INFO - Starting twitter module with a queue size of 1
2019-10-03 20:17:09,422 - twitter_media_downloader.module - INFO - crawling twitter
2019-10-03 20:18:38,088 - twitter_media_downloader.module - INFO - 0 items left in crawler queue
INFO:twitter_media_downloader.module:0 items left in crawler queue
2019-10-03 20:18:38,088 - twitter_media_downloader.module - INFO - twitter: downloading 311 photos and 1579 videos
INFO:twitter_media_downloader.module:twitter: downloading 311 photos and 1579 videos
2019-10-03 20:18:38,089 - twitter_media_downloader.module - INFO - Done crawling!
INFO:twitter_media_downloader.module:Done crawling!

Within minutes it started downloading media.
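The doubled lines in this output come in two formats: one with a timestamp and one in the `INFO:name:message` form, which is the default format of `logging.basicConfig`. That pattern usually means the record is handled once by the module's own handler and once more by a handler on the root logger. Whether the root handler here is installed by the project or by a dependency is an assumption; the sketch below only reproduces the effect and shows that setting `propagate = False` removes the second line. `build_logger` and `demo` are names made up for the example.

```python
import io
import logging


def build_logger(name, stream, propagate):
    """Create a logger with one handler in the timestamped format seen above."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
    )
    logger.addHandler(handler)
    logger.propagate = propagate
    return logger


def demo(propagate):
    """Log one message and return the lines that were written."""
    stream = io.StringIO()
    root = logging.getLogger()
    # Simulate a root handler like the one basicConfig() installs.
    root_handler = logging.StreamHandler(stream)
    root_handler.setFormatter(logging.Formatter(logging.BASIC_FORMAT))
    root.addHandler(root_handler)
    try:
        logger = build_logger("twitter_media_downloader.module", stream, propagate)
        logger.info("Done crawling!")
    finally:
        root.removeHandler(root_handler)
    return stream.getvalue().splitlines()
```

`demo(True)` yields two lines for the one message and `demo(False)` yields one, so turning off propagation on the module logger (or not configuring the root logger at all) is one way to stop the duplicate output.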

afterdelight commented 5 years ago

Well, it's kind of useless for me, because I used twint to scrape and the output consists of specific status links: https://twitter.com/twitteruser/status/xxxxxxxx

Btw, does the script support downloading retweets?

11philip22 commented 5 years ago

No, it just downloads all the media of an account.

11philip22 commented 5 years ago

I am closing the issue now because the program is working as intended.