elvisyjlin / media-scraper

Scrapes all photos and videos in a web page / Instagram / Twitter / Tumblr / Reddit / pixiv / TikTok
MIT License
371 stars, 49 forks

Can't run the program #2

Purefreeman closed 6 years ago

Purefreeman commented 6 years ago

The program doesn't seem to work, and I'm not sure what the cause is:

C:\Users\dolap_000\Desktop\media-scraper-master>python -m mediascraper.twitter [3347813921]
Starting PhantomJS web driver...
C:\Users\dolap_000\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\phantomjs\webdriver.py:49: UserWarning: Selenium support for PhantomJS has been deprecated, please use headless versions of Chrome or Firefox instead
  warnings.warn('Selenium support for PhantomJS has been deprecated, please use headless '
Logging in as "Dolapofreeman"...
Traceback (most recent call last):
  File "C:\Users\dolap_000\AppData\Local\Programs\Python\Python36-32\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\dolap_000\AppData\Local\Programs\Python\Python36-32\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\dolap_000\Desktop\media-scraper-master\mediascraper\twitter.py", line 14, in <module>
    scraper.login('credentials.json')
  File "C:\Users\dolap_000\Desktop\media-scraper-master\mediascrapers.py", line 410, in login
    username.send_keys(credentials['username'])
  File "C:\Users\dolap_000\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\webelement.py", line 479, in send_keys
    'value': keys_to_typing(value)})
  File "C:\Users\dolap_000\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\webelement.py", line 628, in _execute
    return self._parent.execute(command, params)
  File "C:\Users\dolap_000\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 312, in execute
    self.error_handler.check_response(response)
  File "C:\Users\dolap_000\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 237, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidElementStateException: Message: {"errorMessage":"Element is not currently interactable and may not be manipulated","request":{"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"182","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:62282","User-Agent":"Python http auth"},"httpVersion":"1.1","method":"POST","post":"{\"text\": \"Dolapofreeman\", \"value\": [\"D\", \"o\", \"l\", \"a\", \"p\", \"o\", \"f\", \"r\", \"e\", \"e\", \"m\", \"a\", \"n\"], \"id\": \":wdc:1523809525391\", \"sessionId\": \"96a881f0-40c9-11e8-a71a-c10ae27e62ae\"}","url":"/value","urlParsed":{"anchor":"","query":"","file":"value","directory":"/","path":"/value","relative":"/value","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/value","queryKey":{},"chunks":["value"]},"urlOriginal":"/session/96a881f0-40c9-11e8-a71a-c10ae27e62ae/element/:wdc:1523809525391/value"}}
Screenshot: available via screen

Purefreeman commented 6 years ago

I still can't use it.

elvisyjlin commented 6 years ago

Hi @Purefreeman,

Thank you for using my tool!

I found a small bug behind the error you reported and fixed it.

However, I'm not sure which account you would like to scrape. For Twitter, the arguments are supposed to be usernames, written without brackets []. Also, 3347813921 does not look like a username.

For example, to scrape the account Twitter (https://twitter.com/Twitter), run:

python3 -m mediascraper.twitter Twitter

Please pull this repo for updates and test media-scraper with this command. Thanks!
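The argument format described above (bare usernames, no brackets) can be checked before anything is scraped. The snippet below is only a sketch and is not part of media-scraper: `normalize_username` and `USERNAME_PATTERN` are hypothetical names, and the pattern encodes an assumed rule that a Twitter username is 1 to 15 letters, digits, or underscores.

```python
import re

# Assumed handle rule (not taken from media-scraper):
# 1 to 15 letters, digits, or underscores.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,15}")


def normalize_username(arg):
    """Strip whitespace and surrounding square brackets from a CLI argument.

    Raises ValueError if the cleaned value does not match the assumed pattern.
    """
    cleaned = arg.strip().strip("[]").strip()
    if not USERNAME_PATTERN.fullmatch(cleaned):
        raise ValueError(f"not a plausible Twitter username: {arg!r}")
    return cleaned


print(normalize_username("[Twitter]"))
```

A purely numeric value such as 3347813921 still passes this pattern, so a check like this can catch stray brackets but cannot tell a numeric user ID from a username.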

Purefreeman commented 6 years ago

Thank you, but unfortunately I am still getting the same error:

python -m mediascraper.twitter purefreeman
Starting PhantomJS web driver...
Web driver ".\webdriver/phantomjsdriver_2.1.1_win32/phantomjs.exe" not found.
Start downloading the web driver...
Web driver ".\webdriver/phantomjsdriver_2.1.1_win32/phantomjs.exe" has been downloaded successfully.
C:\Users\dolap_000\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\phantomjs\webdriver.py:49: UserWarning: Selenium support for PhantomJS has been deprecated, please use headless versions of Chrome or Firefox instead
  warnings.warn('Selenium support for PhantomJS has been deprecated, please use headless '
Logging in as "Kheshig"...
Traceback (most recent call last):
  File "C:\Users\dolap_000\AppData\Local\Programs\Python\Python36-32\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\dolap_000\AppData\Local\Programs\Python\Python36-32\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\dolap_000\Desktop\media-scraper-master\mediascraper\twitter.py", line 14, in <module>
    scraper.login('credentials.json')
  File "C:\Users\dolap_000\Desktop\media-scraper-master\mediascrapers.py", line 415, in login
    username.send_keys(credentials['username'])
  File "C:\Users\dolap_000\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\webelement.py", line 479, in send_keys
    'value': keys_to_typing(value)})
  File "C:\Users\dolap_000\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\webelement.py", line 628, in _execute
    return self._parent.execute(command, params)
  File "C:\Users\dolap_000\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 312, in execute
    self.error_handler.check_response(response)
  File "C:\Users\dolap_000\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 237, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidElementStateException: Message: {"errorMessage":"Element is not currently interactable and may not be manipulated","request":{"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"146","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:53380","User-Agent":"Python http auth"},"httpVersion":"1.1","method":"POST","post":"{\"text\": \"Kheshig\", \"value\": [\"K\", \"h\", \"e\", \"s\", \"h\", \"i\", \"g\"], \"id\": \":wdc:1527762097196\", \"sessionId\": \"62fe91c0-64bc-11e8-a9aa-83661e939ba1\"}","url":"/value","urlParsed":{"anchor":"","query":"","file":"value","directory":"/","path":"/value","relative":"/value","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/value","queryKey":{},"chunks":["value"]},"urlOriginal":"/session/62fe91c0-64bc-11e8-a9aa-83661e939ba1/element/:wdc:1527762097196/value"}}
Screenshot: available via screen

elvisyjlin commented 6 years ago

I'm sorry about the Twitter login error. That was another bug.

The reason for this exception is that Twitter places 3 forms on the login page, and 2 of them do not accept input. I've updated the login function to locate the correct form and tested it by logging in to my personal account.
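For readers who hit the same InvalidElementStateException elsewhere, here is a rough sketch of the general idea: when a page holds several matching fields, pick the first one that is visible and enabled before sending keys. This is not the actual change made in mediascrapers.py. `first_interactable` and `FakeElement` are hypothetical names, and the only Selenium calls assumed are the standard WebElement methods `is_displayed()` and `is_enabled()`.

```python
def first_interactable(elements):
    """Return the first element that is visible and enabled, or None."""
    for element in elements:
        # is_displayed() and is_enabled() are standard Selenium WebElement methods.
        if element.is_displayed() and element.is_enabled():
            return element
    return None


class FakeElement:
    """Hypothetical stand-in for a WebElement so the sketch runs without a browser."""

    def __init__(self, label, displayed, enabled=True):
        self.label = label
        self._displayed = displayed
        self._enabled = enabled

    def is_displayed(self):
        return self._displayed

    def is_enabled(self):
        return self._enabled


# Three username fields, as on the login page described above; only one is usable.
candidates = [
    FakeElement("hidden form 1", displayed=False),
    FakeElement("hidden form 2", displayed=False),
    FakeElement("visible login form", displayed=True),
]
username_field = first_interactable(candidates)
print(username_field.label)  # prints: visible login form
```

With a real driver, the candidate list would come from a find-elements call on the login page, and `send_keys` would be called only on the element this helper returns.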

Please test it again, @Purefreeman. Thank you for your patience.

Purefreeman commented 6 years ago

No, thank you so much. Thank you for this wonderful tool. It works now.