EFForg / badger-sett

Automated training for Privacy Badger. Badger Sett automates browsers to visit websites to produce fresh Privacy Badger tracker data.
https://www.eff.org/badger-pretraining
MIT License

Retry browser restart on failure #22

Closed. bcyphers closed this issue 6 years ago

bcyphers commented 6 years ago

Sometimes the browser hits a "Tried to run command without establishing a connection" error while it is being restarted and its data reloaded. If the restart itself then fails, the exception is not caught and the whole script crashes. For example:

2018-08-30 02:35:53,876 visiting 410: tinypic.com
2018-08-30 02:37:25,685 tinypic.com WebDriverException: Reached error page: about:neterror?e=netTimeout&u=https%3A//tinypic.com/&c=UTF-8&f=regular&d=The%20server%20at%20tinypic.com%20is%20taking%20too%20long%20to%20respond.
2018-08-30 02:37:26,339 visiting 411: entrepreneur.com
2018-08-30 02:37:44,386 Error loading extension page: Tried to run command without establishing a connection
2018-08-30 02:37:44,387 drupal.org SessionNotCreatedException: Tried to run command without establishing a connection
2018-08-30 02:37:44,387 restarting browser...
Traceback (most recent call last):
  File "./crawler.py", line 291, in crawl
    last_data = dump_data(driver, browser, ext_path)
  File "./crawler.py", line 199, in dump_data
    load_extension_page(driver, browser, ext_path, BACKGROUND)
  File "./crawler.py", line 169, in load_extension_page
    raise err
  File "./crawler.py", line 163, in load_extension_page
    driver.get(ext_url)
  File "/home/bennett/.local/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 326, in get
    self.execute(Command.GET, {'url': url})
  File "/home/bennett/.local/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 314, in execute
    self.error_handler.check_response(response)
  File "/home/bennett/.local/lib/python3.5/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.SessionNotCreatedException: Message: Tried to run command without establishing a connection

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./crawler.py", line 397, in <module>
    crawl(**vars(args))
  File "./crawler.py", line 314, in crawl
    driver = restart_browser(last_data)
  File "./crawler.py", line 272, in restart_browser
    load_user_data(driver, browser, ext_path, data)
  File "./crawler.py", line 188, in load_user_data
    driver.execute_script(script, json.dumps(data))
  File "/home/bennett/.local/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 629, in execute_script
    'args': converted_args})['value']
  File "/home/bennett/.local/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 314, in execute
    self.error_handler.check_response(response)
  File "/home/bennett/.local/lib/python3.5/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.JavascriptException: Message: TypeError: spec is undefined

Scan failed. See log.txt for details.

New code retries the restart action on failure up to 5 times.
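
A minimal sketch of what such a retry loop could look like, not the actual code in crawler.py. The names `restart_with_retries`, `restart`, `max_attempts`, `delay` and `retry_on` are hypothetical and chosen for illustration; only the limit of 5 attempts comes from the comment above.

```python
import time


def restart_with_retries(restart, max_attempts=5, delay=0.0, retry_on=(Exception,)):
    """Call restart() until it succeeds or max_attempts is used up.

    restart   -- hypothetical zero-argument callable that restarts the
                 browser and returns the new driver
    retry_on  -- exception types that should trigger another attempt
    """
    last_err = None
    for attempt in range(1, max_attempts + 1):
        try:
            return restart()
        except retry_on as err:
            last_err = err
            print("restart attempt %d of %d failed: %s"
                  % (attempt, max_attempts, err))
            # Pause before the next attempt, but not after the final one.
            if attempt < max_attempts:
                time.sleep(delay)
    # Every attempt failed: surface the last error to the caller.
    raise last_err
```

In the crawler, `restart` would presumably wrap the existing restart_browser(last_data) call, and `retry_on` could be narrowed to Selenium's WebDriverException, which is the base class of both exceptions in the traceback above. Both of those are assumptions about how the fix is wired in.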

bcyphers commented 6 years ago

This is included with #24.