ultrafunkamsterdam / undetected-chromedriver

Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)
https://github.com/UltrafunkAmsterdam/undetected-chromedriver
GNU General Public License v3.0

Kasada.io detecting undetected-chromedriver #1498

Open sagunsh opened 10 months ago

sagunsh commented 10 months ago

Kasada.io is detecting undetected chromedriver.

url: https://www.realestate.com.au/sold/list-1?source=refinement

Using Python 3.8 and undetected-chromedriver 3.5

It used to work before, so I believe the detection algorithm has changed. nike.com also uses Kasada, but it still seems to work fine there.

mdmintz commented 10 months ago

I'm avoiding detection on that site with https://github.com/seleniumbase/SeleniumBase in UC Mode (version 4.17.9):

pip install -U seleniumbase and then run with python:

from seleniumbase import DriverContext

with DriverContext(uc=True) as driver:
    driver.get("https://www.realestate.com.au/sold/list-1?source=refinement")
    import pdb; pdb.set_trace()

The script pauses at the breakpoint. Type c and press Enter to continue.

Here's another format you can use:

from seleniumbase import Driver

driver = Driver(uc=True)
driver.get("https://www.realestate.com.au/sold/list-1?source=refinement")
import pdb; pdb.set_trace()
driver.quit()


Romhast commented 10 months ago

How do I use chrome_options with SeleniumBase?

mdmintz commented 10 months ago

@Romhast Args are listed in https://github.com/seleniumbase/SeleniumBase/blob/master/seleniumbase/plugins/driver_manager.py if you scroll down a little. Use a comma-separated list without spaces for chromium_arg to pass in Chrome options that aren't listed. Most selenium-specific Chromium args will lead to detection, so you probably don't want to add any that you don't need.
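As a rough sketch of the "comma-separated list without spaces" format described above: the helper name build_chromium_arg and the two example flags are illustrative assumptions, not part of SeleniumBase. Only the Driver(uc=True, chromium_arg=...) call reflects the argument named in this comment.

```python
def build_chromium_arg(args):
    """Join Chrome command-line args into one comma-separated string
    with no spaces, the format described for chromium_arg.
    (Hypothetical helper; args that themselves contain commas are not handled.)"""
    cleaned = [arg.strip() for arg in args if arg.strip()]
    return ",".join(cleaned)


def make_driver(args):
    """Start a UC Mode driver with extra Chrome args.
    Requires seleniumbase and a local Chrome install."""
    from seleniumbase import Driver  # imported lazily so the helper above works without it

    return Driver(uc=True, chromium_arg=build_chromium_arg(args))
```

For example, build_chromium_arg(["--disable-gpu", "--lang=en-AU"]) gives "--disable-gpu,--lang=en-AU". As noted above, extra args can themselves lead to detection, so pass only the ones you need.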

sagunsh commented 10 months ago

@mdmintz I am getting this error

The chromedriver version (114.0.5735.90) detected in PATH at projectdir/venv/lib/python3.8/site-packages/seleniumbase/drivers/chromedriver might not be compatible with the detected chrome version (116.0.5845.96); currently, chromedriver 116.0.5845.96 is recommended for chrome 116.*, so it is advised to delete the driver in PATH and retry

I believe it downloads a chromedriver automatically. Is it possible to specify a compatible version, or to have it check the Chrome version installed on the system when downloading the driver? I believe undetected-chromedriver has a parameter called version_main. I'm looking for something like that.

mdmintz commented 10 months ago

@sagunsh How are you initiating your driver? That message is coming from selenium, not seleniumbase. See https://github.com/SeleniumHQ/selenium/blob/9163aea829669ad844285098783a11f772b445af/rust/src/lib.rs#L509

When using seleniumbase in UC Mode, it downloads chromedriver and renames it to uc_driver.

The raw selenium way of handling that right now is by deleting old drivers, but you shouldn't have to do that if using seleniumbase.
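To diagnose a mismatch like the one in the error message above, here is a minimal sketch that reads the major version from a version string and from the installed browser. The binary name "google-chrome" and both function names are assumptions for illustration. On other systems the executable may be named differently.

```python
import re
import subprocess


def major_version(version_text):
    """Return the major number of the first four-part version found
    in the text (e.g. 116 for '116.0.5845.96'), or None if absent."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_text)
    return int(match.group(1)) if match else None


def installed_chrome_major(binary="google-chrome"):
    """Run '<binary> --version' and return the major version,
    or None if the binary cannot be run."""
    try:
        result = subprocess.run(
            [binary, "--version"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return major_version(result.stdout)
```

Comparing major_version("114.0.5735.90") with installed_chrome_major() reproduces the 114 versus 116 mismatch reported in the message. A None result suggests the browser is not on PATH under the assumed name.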

sagunsh commented 10 months ago

@mdmintz

I copied the exact same code, and it worked on machine 1 but not on machine 2. Both run Ubuntu 22.04 with the same Python and seleniumbase versions.

Then I tried this:

driver = Driver(uc=True, driver_version=116)

Now I am getting this error:

Traceback (most recent call last):
  File "seleniumbase_test.py", line 3, in <module>
    driver = Driver(uc=True, driver_version=116)
  File "/home/project/venv/lib/python3.8/site-packages/seleniumbase/plugins/driver_manager.py", line 425, in Driver
    driver = browser_launcher.get_driver(
  File "/home/project/venv/lib/python3.8/site-packages/seleniumbase/core/browser_launcher.py", line 1358, in get_driver
    return get_local_driver(
  File "/home/project/venv/lib/python3.8/site-packages/seleniumbase/core/browser_launcher.py", line 3220, in get_local_driver
    driver = undetected.Chrome(
  File "/home/project/venv/lib/python3.8/site-packages/seleniumbase/undetected/__init__.py", line 222, in __init__
    options.binary_location = (
  File "/home/project/venv/lib/python3.8/site-packages/selenium/webdriver/chromium/options.py", line 55, in binary_location
    raise TypeError(self.BINARY_LOCATION_ERROR)
TypeError: Binary Location Must be a String

mdmintz commented 10 months ago

@sagunsh That means Chrome wasn't installed on that machine. Chrome must exist there first before you can run scripts.
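A quick way to check for this before launching a driver is to look for a browser executable on PATH. The sketch below uses only the standard library. The candidate names are assumed common Linux executable names, not a list taken from SeleniumBase, and find_chrome_binary is a hypothetical helper.

```python
import shutil

# Assumed common Linux executable names; adjust for your distribution.
CANDIDATE_NAMES = (
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
)


def find_chrome_binary(names=CANDIDATE_NAMES, which=shutil.which):
    """Return the path of the first candidate found on PATH, or None.
    The 'which' parameter exists only so the lookup can be swapped in tests."""
    for name in names:
        path = which(name)
        if path:
            return path
    return None
```

If find_chrome_binary() returns None on the failing machine, the "Binary Location Must be a String" error is most likely caused by the missing browser rather than by the driver version.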

tehneydobertz commented 10 months ago

How can I learn to write code that uses SeleniumBase?

davesc63 commented 7 months ago

I am trying to automate scraping some account data from Origin Energy. I started with the uc=True example here:

driver = Driver(uc_cdp=True, headless2=True)
driver.get("https://www.originenergy.com.au/my")

This worked. Initially.

I built out my app, and it was working as expected from PyCharm.

I moved the Python script to another machine, ran the app, and it did not work. I think the Kasada keystroke SDK is detecting the scraping.

I then ran the original script in PyCharm, which had been working, and now it has stopped working too.

What might be causing issues and is there a more successful way of avoiding these lockouts?