ultrafunkamsterdam / undetected-chromedriver

Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / Cloudflare IUAM)
https://github.com/UltrafunkAmsterdam/undetected-chromedriver
GNU General Public License v3.0
10.13k stars, 1.17k forks

How to use UC on Google Colaboratory? #1508

Open MattNewell-04 opened 1 year ago

MattNewell-04 commented 1 year ago

I'm trying to write a program that checks https://www.surfline.com/ every day and lets me know if there's good surf. When I run headless Selenium WebDriver from my own computer, the site's bot protection blocks it, but headless undetected-chromedriver (uc) gets through.

However, I've been unable to get uc working on Colab. This is my code:

!apt-get update
!pip install selenium
!pip install undetected-chromedriver
!apt-get install chromium-browser chromium-chromedriver
!apt-get upgrade

from selenium import webdriver
import undetected_chromedriver as uc

options = webdriver.ChromeOptions()
options.add_argument('--no-sandbox')
options.add_argument('--headless')

driver = uc.Chrome(options=options)

driver.get("https://surfline.com")
print(driver.page_source)

driver.quit()

Error message:

WebDriverException: Message: unknown error: cannot connect to chrome at 127.0.0.1:47329
from chrome not reachable

Traceback:

WebDriverException                        Traceback (most recent call last)
<ipython-input-9-6646ba55ac94> in <cell line: 15>()
     13 
     14 # driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
---> 15 driver = uc.Chrome(options=options)
     16 
     17 driver.get("https://surfline.com")

7 frames
/usr/local/lib/python3.10/dist-packages/undetected_chromedriver/__init__.py in __init__(self, options, user_data_dir, driver_executable_path, browser_executable_path, port, enable_cdp_events, desired_capabilities, advanced_elements, keep_alive, log_level, headless, version_main, patcher_force_close, suppress_welcome, use_subprocess, debug, no_sandbox, user_multi_procs, **kw)
    464         )
    465 
--> 466         super(Chrome, self).__init__(
    467             service=service,
    468             options=options,

/usr/local/lib/python3.10/dist-packages/selenium/webdriver/chrome/webdriver.py in __init__(self, options, service, keep_alive)
     43         options = options if options else Options()
     44 
---> 45         super().__init__(
     46             DesiredCapabilities.CHROME["browserName"],
     47             "goog",

/usr/local/lib/python3.10/dist-packages/selenium/webdriver/chromium/webdriver.py in __init__(self, browser_name, vendor_prefix, options, service, keep_alive)
     54 
     55         try:
---> 56             super().__init__(
     57                 command_executor=ChromiumRemoteConnection(
     58                     remote_server_addr=self.service.service_url,

/usr/local/lib/python3.10/dist-packages/selenium/webdriver/remote/webdriver.py in __init__(self, command_executor, keep_alive, file_detector, options)
    204         self._authenticator_id = None
    205         self.start_client()
--> 206         self.start_session(capabilities)
    207 
    208     def __repr__(self):

/usr/local/lib/python3.10/dist-packages/undetected_chromedriver/__init__.py in start_session(self, capabilities, browser_profile)
    722         if not capabilities:
    723             capabilities = self.options.to_capabilities()
--> 724         super(selenium.webdriver.chrome.webdriver.WebDriver, self).start_session(
    725             capabilities
    726         )

/usr/local/lib/python3.10/dist-packages/selenium/webdriver/remote/webdriver.py in start_session(self, capabilities)
    288 
    289         caps = _create_caps(capabilities)
--> 290         response = self.execute(Command.NEW_SESSION, caps)["value"]
    291         self.session_id = response.get("sessionId")
    292         self.caps = response.get("capabilities")

/usr/local/lib/python3.10/dist-packages/selenium/webdriver/remote/webdriver.py in execute(self, driver_command, params)
    343         response = self.command_executor.execute(driver_command, params)
    344         if response:
--> 345             self.error_handler.check_response(response)
    346             response["value"] = self._unwrap_value(response.get("value", None))
    347             return response

/usr/local/lib/python3.10/dist-packages/selenium/webdriver/remote/errorhandler.py in check_response(self, response)
    227                 alert_text = value["alert"].get("text")
    228             raise exception_class(message, screen, stacktrace, alert_text)  # type: ignore[call-arg]  # mypy is not smart enough here
--> 229         raise exception_class(message, screen, stacktrace)

Stacktrace:

#0 0x58214c3364e3 <unknown>
#1 0x58214c065b00 <unknown>
#2 0x58214c053436 <unknown>
#3 0x58214c0929be <unknown>
#4 0x58214c08a884 <unknown>
#5 0x58214c0c9ccc <unknown>
#6 0x58214c0c947f <unknown>
#7 0x58214c0c0de3 <unknown>
#8 0x58214c0962dd <unknown>
#9 0x58214c09734e <unknown>
#10 0x58214c2f63e4 <unknown>
#11 0x58214c2fa3d7 <unknown>
#12 0x58214c304b20 <unknown>
#13 0x58214c2fb023 <unknown>
#14 0x58214c2c91aa <unknown>
#15 0x58214c31f6b8 <unknown>
#16 0x58214c31f847 <unknown>
#17 0x58214c32f243 <unknown>
#18 0x7d4c51499b43 <unknown>

I've tried most of the suggestions in https://github.com/ultrafunkamsterdam/undetected-chromedriver/issues/743, which reports the same error message, but I consider this a different issue because it is specifically about running on Colab.

I'm open to suggestions for other platforms to move to if this turns out to be a Colab-specific problem.
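
One way to narrow down whether the failure is Colab-specific is to check which browser and driver binaries are actually on the PATH before starting uc. The sketch below is only a diagnostic aid, not a confirmed fix: the candidate binary names are assumptions, and the helper names `first_on_path` and `build_uc_kwargs` are hypothetical. The keyword names it produces (`browser_executable_path`, `driver_executable_path`, `headless`, `no_sandbox`) are taken from the `uc.Chrome.__init__` signature shown in the traceback above. On some Ubuntu images the `chromium-browser` apt package may install only a wrapper rather than a working browser, which could explain a "chrome not reachable" error, but that is a guess to verify.

```python
import shutil
from typing import Iterable, Optional

# Assumption: these are plausible binary names; adjust to what is installed.
BROWSER_CANDIDATES = (
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
)
DRIVER_CANDIDATES = ("chromedriver",)


def first_on_path(names: Iterable[str]) -> Optional[str]:
    """Return the absolute path of the first name found on PATH, else None."""
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None


def build_uc_kwargs(browser: Optional[str], driver: Optional[str]) -> dict:
    """Build keyword arguments for uc.Chrome, skipping any binary not found."""
    kwargs = {"headless": True, "no_sandbox": True}
    if browser:
        kwargs["browser_executable_path"] = browser
    if driver:
        kwargs["driver_executable_path"] = driver
    return kwargs


if __name__ == "__main__":
    browser = first_on_path(BROWSER_CANDIDATES)
    driver = first_on_path(DRIVER_CANDIDATES)
    print("browser:", browser)
    print("driver:", driver)
    print("uc.Chrome kwargs:", build_uc_kwargs(browser, driver))
```

If both paths are found, the result can be passed on as `uc.Chrome(**build_uc_kwargs(browser, driver))`. If the browser path comes back as `None`, the problem is probably the installation step rather than uc itself.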

ArtemBernatskyy commented 11 months ago

Have you been able to figure it out?

jpjacobpadilla commented 11 months ago

> Have you been able to figure it out?

https://github.com/jpjacobpadilla/Google-Colab-Selenium
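
For anyone landing here, a minimal sketch of how the linked package might be used in a Colab cell. It assumes the package has been installed with `pip install google-colab-selenium` and that it exposes an `UndetectedChrome` class, as I understand its README to describe. The wrapper function `fetch_page_source` is a hypothetical helper for illustration, and the import sits inside it so the definition itself does not require the package.

```python
def fetch_page_source(url: str) -> str:
    """Open `url` in an undetected Chrome session on Colab and return its HTML."""
    # Assumption: google-colab-selenium is installed and provides
    # UndetectedChrome; check the package README if this import fails.
    import google_colab_selenium as gs

    driver = gs.UndetectedChrome()
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always release the browser, even if the page load raises.
        driver.quit()
```

In a notebook this would be called as `html = fetch_page_source("https://surfline.com")`. Whether the site's bot protection accepts the session from Colab's IP range is a separate question that this sketch does not answer.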