tzuhsial / InstagramCrawler

A non-API Python program to crawl public photos, posts, or followers
https://github.com/iammrhelo/InstagramCrawler
MIT License
373 stars, 108 forks

BadStatusLine: '' exceptions #2

Open makusu2005 opened 7 years ago

makusu2005 commented 7 years ago

Hi Guys,

Thanks for the great project, which I use to get the followers of a user. It works in about 50% of cases, but sometimes I get the following error:

Traceback (most recent call last):
  File "crawler_tuintjedelen.py", line 364, in main
    crawler.browse(args.query,args.type).crawl(args.number,args.caption).save()
  File "crawler_tuintjedelen.py", line 160, in crawl
    self.followlist = self._crawl_follow()
  File "crawler_tuintjedelen.py", line 319, in _crawl_follow
    self.driver.execute_script(SCROLL_DOWN)
  File "/home/makusu/.local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 465, in execute_script
    'args': converted_args})['value']
  File "/home/makusu/.local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 234, in execute
    response = self.command_executor.execute(driver_command, params)
  File "/home/makusu/.local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 408, in execute
    return self._request(command_info[0], url, body=data)
  File "/home/makusu/.local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 478, in _request
    resp = opener.open(request, timeout=self._timeout)
  File "/usr/lib/python2.7/urllib2.py", line 429, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 447, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 407, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1228, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1201, in do_open
    r = h.getresponse(buffering=True)
  File "/usr/lib/python2.7/httplib.py", line 1136, in getresponse
    response.begin()
  File "/usr/lib/python2.7/httplib.py", line 453, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.7/httplib.py", line 417, in _read_status
    raise BadStatusLine(line)
BadStatusLine: ''

Exception urllib2.URLError: URLError(error(111, 'Connection refused'),) in <bound method InstagramCrawler.__del__ of <__main__.InstagramCrawler object at 0x7f81e3cf7bd0>> ignored

Any idea what could be going on? Running it on Ubuntu with PhantomJS.

Thanks!
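
One reading of the traceback above, offered as a sketch rather than a diagnosis: `BadStatusLine: ''` is raised by `httplib` when it reads an empty status line, which usually means the other end closed the connection without sending a response. Here the other end is the local PhantomJS driver that Selenium talks to over HTTP. The later `Connection refused` in `__del__` suggests nothing was listening on the driver port at cleanup time, which would be consistent with the PhantomJS process having exited, though the traceback does not show why. The helper below is hypothetical (the name `execute_script_guarded` and its `retries` and `delay` parameters are not part of InstagramCrawler or Selenium). It assumes a Selenium version that, like the one in the traceback, lets `BadStatusLine` and `URLError` propagate from `execute_script`, and it only shows how the scroll call could be guarded so the crawl can stop cleanly and keep what it has collected so far.

```python
import time

try:  # Python 3 module names
    from http.client import BadStatusLine
    from urllib.error import URLError
except ImportError:  # Python 2.7, as used in the traceback
    from httplib import BadStatusLine
    from urllib2 import URLError


def execute_script_guarded(driver, script, retries=2, delay=1.0):
    """Call driver.execute_script(script), retrying on connection-level errors.

    Hypothetical helper, not part of InstagramCrawler or Selenium.
    Returns (True, value) on success, or (False, last_error) if every
    attempt failed, so the caller can stop scrolling and save partial results.
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            return True, driver.execute_script(script)
        except (BadStatusLine, URLError) as exc:
            # The driver dropped or refused the connection; remember why.
            last_error = exc
            if attempt < retries:
                time.sleep(delay)
    return False, last_error
```

In `_crawl_follow`, the call `self.driver.execute_script(SCROLL_DOWN)` could be routed through such a helper, breaking out of the scroll loop when it returns `False`. If the driver process really has died, retrying will not bring it back; the benefit is only that the follower list gathered so far can still be saved.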