Open a-ion314 opened 5 years ago
Hi, I cannot reproduce your error. I ran python3 -m mediascraper.twitter TwitterUser and got the following results:
Starting PhantomJS web driver...
./webdriver/phantomjsdriver_2.1.1_linux64/phantomjs
/home/elvis/.local/lib/python3.5/site-packages/selenium/webdriver/phantomjs/webdriver.py:49: UserWarning: Selenium support for PhantomJS has been deprecated, please use headless versions of Chrome or Firefox instead
warnings.warn('Selenium support for PhantomJS has been deprecated, please use headless '
Either username or password is empty. Abort login.
Crawling...
10 media are found.
Downloading...
50%|███████████████████████████ | 5/10 [00:04<00:04, 1.17it/s]The file download/twitter/TwitterUser/BYRkNbhCQAAaPzj.jpg exists. Skip it.
The file download/twitter/TwitterUser/BYRjjYyCIAE15FC.jpg exists. Skip it.
The file download/twitter/TwitterUser/BYRjCCECEAAuOhr.jpg exists. Skip it.
The file download/twitter/TwitterUser/BYRik0ICIAAHb57.jpg exists. Skip it.
The file download/twitter/TwitterUser/BYFE7p9CQAAqVaA.jpg exists. Skip it.
100%|█████████████████████████████████████████████████████| 10/10 [00:04<00:00, 2.36it/s]
Running ls download/twitter/TwitterUser/ then shows 5 pictures: BYFE7p9CQAAqVaA.jpg, BYRjCCECEAAuOhr.jpg, BYRkNbhCQAAaPzj.jpg, BYRik0ICIAAHb57.jpg, BYRjjYyCIAE15FC.jpg.
If you can provide more information, I can help further.
Sorry, I should have included the Twitter user. I got this specific error when running it against NerdCity's Twitter account, so the command I ran was:
python3 -m mediascraper.twitter nerdcity
When running the following command:
python3 -m mediascraper.twitter nerdcity
I get the following error:
Starting PhantomJS web driver...
./webdriver/phantomjsdriver_2.1.1_linux64/phantomjs
/home/User/.local/lib/python3.6/site-packages/selenium/webdriver/phantomjs/webdriver.py:49: UserWarning: Selenium support for PhantomJS has been deprecated, please use headless versions of Chrome or Firefox instead
warnings.warn('Selenium support for PhantomJS has been deprecated, please use headless '
Traceback (most recent call last):
File "/home/User/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen
chunked=chunked)
File "/home/User/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 384, in _make_request
six.raise_from(e, None)
File "<string>", line 2, in raise_from
File "/home/User/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 380, in _make_request
httplib_response = conn.getresponse()
File "/usr/lib/python3.6/http/client.py", line 1331, in getresponse
response.begin()
File "/usr/lib/python3.6/http/client.py", line 297, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.6/http/client.py", line 266, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/User/Desktop/git/media-scraper/mediascraper/twitter.py", line 18, in <module>
tasks = scraper.scrape(username)
File "/home/User/Desktop/git/media-scraper/mediascrapers.py", line 379, in scrape
self._connect('{}/{}/media'.format(self.base_url, username))
File "/home/User/Desktop/git/media-scraper/mediascrapers.py", line 51, in _connect
self._driver.get(url)
File "/home/User/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 333, in get
self.execute(Command.GET, {'url': url})
File "/home/User/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 319, in execute
response = self.command_executor.execute(driver_command, params)
File "/home/User/.local/lib/python3.6/site-packages/selenium/webdriver/remote/remote_connection.py", line 374, in execute
return self._request(command_info[0], url, body=data)
File "/home/User/.local/lib/python3.6/site-packages/selenium/webdriver/remote/remote_connection.py", line 402, in _request
resp = http.request(method, url, body=body, headers=headers)
File "/home/User/.local/lib/python3.6/site-packages/urllib3/request.py", line 72, in request
urlopen_kw)
File "/home/User/.local/lib/python3.6/site-packages/urllib3/request.py", line 150, in request_encode_body
return self.urlopen(method, url, extra_kw)
File "/home/User/.local/lib/python3.6/site-packages/urllib3/poolmanager.py", line 324, in urlopen
response = conn.urlopen(method, u.request_uri, **kw)
File "/home/User/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 638, in urlopen
_stacktrace=sys.exc_info()[2])
File "/home/User/.local/lib/python3.6/site-packages/urllib3/util/retry.py", line 368, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/home/User/.local/lib/python3.6/site-packages/urllib3/packages/six.py", line 685, in reraise
raise value.with_traceback(tb)
File "/home/User/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen
chunked=chunked)
File "/home/User/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 384, in _make_request
six.raise_from(e, None)
File "<string>", line 2, in raise_from
File "/home/User/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 380, in _make_request
httplib_response = conn.getresponse()
File "/usr/lib/python3.6/http/client.py", line 1331, in getresponse
response.begin()
File "/usr/lib/python3.6/http/client.py", line 297, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.6/http/client.py", line 266, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
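One way to check whether the disconnect in the traceback above is transient is to retry the page load a few times before giving up. The following is only a minimal sketch, not code from media-scraper: the helper name get_with_retry and its parameters (load_page, attempts, delay, retry_on) are made up for illustration, and it assumes the failing call is the self._driver.get(url) shown in _connect. It uses only the standard library, so by default it retries on http.client.RemoteDisconnected.

```python
import http.client
import time


def get_with_retry(load_page, url, attempts=3, delay=1.0,
                   retry_on=(http.client.RemoteDisconnected,)):
    """Call load_page(url), retrying when the connection is dropped.

    load_page is any callable that takes a URL (hypothetically, a
    Selenium driver's get method). retry_on lists the exception
    types that are treated as a dropped connection.
    """
    for attempt in range(1, attempts + 1):
        try:
            return load_page(url)
        except retry_on as exc:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            print('Attempt {} failed ({}); retrying...'.format(attempt, exc))
            time.sleep(delay)
```

Since the exception that finally reaches the scraper is urllib3.exceptions.ProtocolError, a real call would probably need to pass that type, for example get_with_retry(self._driver.get, url, retry_on=(ProtocolError,)) after importing ProtocolError from urllib3.exceptions. If the PhantomJS process itself has died, every retry will likely fail the same way and the driver would have to be restarted instead.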