flathunters / flathunter

A bot to help people with their rental real-estate search. 🏠🤖
GNU Affero General Public License v3.0

Chromedriver issue with Docker #99

Closed stormDE closed 2 years ago

stormDE commented 3 years ago

Hi People,

I'm trying to start a Docker container on an Ubuntu server without a GUI. The program crashes if I enable anticaptcha. Error:

Image used: oyzoursky/python-chromedriver:3.8-selenium

Traceback (most recent call last):
  File "flathunt.py", line 89, in <module>
    main()
  File "flathunt.py", line 68, in main
    config = Config(config_handle.name)
  File "/app/flathunter/config.py", line 29, in __init__
    self.searchers = [CrawlImmobilienscout(self),
  File "/app/flathunter/crawl_immobilienscout.py", line 43, in __init__
    self.driver = self.configure_driver(self.driver_executable_path, self.driver_arguments)
  File "/app/flathunter/abstract_crawler.py", line 51, in configure_driver
    driver = webdriver.Chrome(executable_path=driver_path, options=chrome_options)
  File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/chrome/webdriver.py", line 76, in __init__
    RemoteWebDriver.__init__(
  File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 157, in __init__
    self.start_session(capabilities, browser_profile)
  File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 252, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally
  (unknown error: DevToolsActivePort file doesn't exist)
  (The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)

Image used: oyzoursky/python-chromedriver:3.8

Traceback (most recent call last):
  File "flathunt.py", line 89, in <module>
    main()
  File "flathunt.py", line 68, in main
    config = Config(config_handle.name)
  File "/app/flathunter/config.py", line 29, in __init__
    self.searchers = [CrawlImmobilienscout(self),
  File "/app/flathunter/crawl_immobilienscout.py", line 43, in __init__
    self.driver = self.configure_driver(self.driver_executable_path, self.driver_arguments)
  File "/app/flathunter/abstract_crawler.py", line 51, in configure_driver
    driver = webdriver.Chrome(executable_path=driver_path, options=chrome_options)
  File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/chrome/webdriver.py", line 76, in __init__
    RemoteWebDriver.__init__(
  File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 157, in __init__
    self.start_session(capabilities, browser_profile)
  File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 252, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally
  (unknown error: DevToolsActivePort file doesn't exist)
  (The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)

Can anyone help me?

tldev-de commented 3 years ago

Hi @stormDE

I had the same problem when running with the given Dockerfile. The problem is that Chrome's sandbox feature does not work in the Docker environment. You have to disable the sandbox in the config.yaml file like this:

captcha:
  api_key: XXXX
  driver_path: /usr/local/bin/chromedriver
  driver_arguments:
    - "--headless"
    - "--no-sandbox"
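
For illustration, here is a minimal Python sketch of how a driver_arguments list like the one above could be completed and handed to Selenium. The helpers with_container_flags and build_chrome_options are hypothetical names and are not part of flathunter. Only Options.add_argument is Selenium's own API.

```python
from typing import Iterable, List

# The two flags from the config above (assumed to be what the container needs).
CONTAINER_FLAGS = ("--headless", "--no-sandbox")


def with_container_flags(driver_arguments: Iterable[str]) -> List[str]:
    """Return the arguments with any missing container flags appended.

    Hypothetical helper: input order is preserved and duplicates are dropped.
    """
    merged: List[str] = []
    for arg in list(driver_arguments) + list(CONTAINER_FLAGS):
        if arg not in merged:
            merged.append(arg)
    return merged


def build_chrome_options(driver_arguments: Iterable[str]):
    """Copy each argument into a Selenium Chrome Options object."""
    # Imported lazily so with_container_flags works without Selenium installed.
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in with_container_flags(driver_arguments):
        options.add_argument(arg)
    return options
```

The resulting object would then be passed as the options argument of webdriver.Chrome, as in the configure_driver frame of the traceback.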

Good luck!

stormDE commented 3 years ago

Thanks a lot for your advice!

I tried it, but now I get this error:

Traceback (most recent call last):
  File "flathunt.py", line 89, in <module>
    main()
  File "flathunt.py", line 86, in main
    launch_flat_hunt(config)
  File "flathunt.py", line 46, in launch_flat_hunt
    hunter.hunt_flats()
  File "/app/flathunter/hunter.py", line 42, in hunt_flats
    for expose in processor_chain.process(self.crawl_for_exposes(max_pages)):
  File "/app/flathunter/hunter.py", line 21, in crawl_for_exposes
    return chain([searcher.crawl(url, max_pages)
  File "/app/flathunter/hunter.py", line 21, in <listcomp>
    return chain([searcher.crawl(url, max_pages)
  File "/app/flathunter/abstract_crawler.py", line 136, in crawl
    return self.get_results(url, max_pages)
  File "/app/flathunter/crawl_immobilienscout.py", line 60, in get_results
    soup = self.get_page(search_url, self.driver, page_no)
  File "/app/flathunter/crawl_immobilienscout.py", line 120, in get_page
    return self.get_soup_from_url(search_url.format(page_no), driver=driver, captcha_api_key=self.captcha_api_key, checkbox=self.checkbox, afterlogin_string=self.afterlogin_string)
  File "/app/flathunter/abstract_crawler.py", line 75, in get_soup_from_url
    self.resolvecaptcha(driver, checkbox, afterlogin_string, captcha_api_key)
  File "/app/flathunter/abstract_crawler.py", line 153, in resolvecaptcha
    self._solve(driver, api_key)
  File "/app/flathunter/abstract_crawler.py", line 168, in _solve
    captcha_id = session.post(postrequest).text.split("|")[1]
IndexError: list index out of range

Any ideas?
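
The last frame of the traceback shows split("|")[1] failing, which means the text returned by the captcha service contained no "|". Below is a minimal sketch of a more defensive parse. It assumes the service answers with a pipe-delimited string of the form STATUS|ID on success and a plain error string otherwise, which is inferred from the split call and not confirmed in this thread. parse_captcha_id is a hypothetical helper, not flathunter code.

```python
def parse_captcha_id(response_text: str) -> str:
    """Extract the captcha ID from a pipe-delimited response.

    Assumed response shape: "<status>|<id>". Anything else is treated as
    an error and reported together with the raw response text.
    """
    parts = response_text.strip().split("|")
    if len(parts) < 2 or not parts[1]:
        # No second field: surface what the service actually said
        # instead of failing later with a bare IndexError.
        raise ValueError(f"Unexpected captcha service response: {response_text!r}")
    return parts[1]
```

With a check like this, a rejected request (for example one caused by a wrong value in the config file) would show the service's own reply in the error message, which makes the cause easier to spot.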

codders commented 2 years ago

@stormDE Closing this because it was a while ago. I hope you were able to resolve your issue (and find a flat! :))

stormDE commented 2 years ago

It finally works, thanks. The last traceback was an error on my side: something was misconfigured in my config file.