ultrafunkamsterdam / undetected-chromedriver

Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / Datadome / CloudFlare IUAM)
https://github.com/UltrafunkAmsterdam/undetected-chromedriver
GNU General Public License v3.0
10.13k stars 1.17k forks

Captcha is shown after chrome driver upgrade to 103. #714

Open arundhatid opened 2 years ago

arundhatid commented 2 years ago

We run automated tests deployed as a k8s cron job in a GCP cluster. Our application has IAP enabled. We use undetected-chromedriver and have applied every fix we could find to avoid bot detection, and so far this has worked. But since the chromedriver version was upgraded to 103, the test is detected as a bot and a captcha is displayed as soon as we enter the email ID.

Please note that we use headless Chrome (yes, I read the README and I know it's WIP, but I am testing my luck here :) ). We use the same Python + Selenium script in our CI/CD pipeline, and it runs fine there.

The following is the snippet of my Dockerfile used to deploy the test as a k8s cron job in the GCP cluster.

FROM python:3.7.9
RUN python3 -m pip install jsonpath-ng==1.4.3
RUN python3 -m pip install robotframework==4.1.1
RUN python3 -m pip install RESTinstance==1.0.2
RUN python3 -m pip install selenium==3.141.0
RUN python3 -m pip install webdriver-manager==2.4.0
RUN python3 -m pip install google-cloud-datastore==1.12.0
RUN python3 -m pip install google-cloud-storage==1.30.0
RUN python3 -m pip install azure-devops==6.0.0b2
RUN python3 -m pip install msrest==0.6.13
RUN python3 -m pip install keyboard==0.13.5
RUN python3 -m pip install onetimepass==1.0.1
RUN python3 -m pip install robotframework-jsonlibrary==0.3.1
RUN python3 -m pip install google.cloud==0.34.0
RUN python3 -m pip install undetected_chromedriver==2.0.0
RUN python3 -m pip install prometheus-client==0.14.1
RUN python3 -m pip install --upgrade google-cloud-storage
RUN python3 -m pip install selenium-stealth
RUN echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] http://packages.cloud.google.com/apt cloud-sdk main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list && curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | tee /usr/share/keyrings/cloud.google.gpg && apt-get update -y && apt-get install google-cloud-sdk -y

ENV exclude_tag exclude_tag
ENV include_tag include_tag

# install latest google chrome
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -
RUN sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list'
RUN apt-get -y update
RUN apt-get install -y google-chrome-stable

RUN sed -i -e 's/\r$//' /bin/start.sh
ENTRYPOINT ["/bin/bash", "-c", "/bin/start.sh"]

The following is the Python + Selenium script that launches the application, including the code used to launch the Chrome driver:

import time
import base64
import os
import logging
import undetected_chromedriver as uc
import platform
from selenium import webdriver
from selenium.webdriver.common.by import By
from robot.api import logger
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from gcs_bucket import copy_files_to_bucket
from selenium_stealth import stealth
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service

def set_chrome_driver(options, postfix):
    driver = uc.Chrome(options=options)

    # Log the browser and chromedriver versions reported by the session
    browser_version = driver.capabilities['browserVersion']
    driver_version = driver.capabilities['chrome']['chromedriverVersion'].split(' ')[0]
    logging.warning(
        f"The chrome version you are using is {browser_version} and chromedriver version you are using is {driver_version}")

    stealth(driver,
            user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.61 Safari/537.36",
            languages=["en-US", "en"],
            vendor="Google Inc.",
            platform="Win32",
            webgl_vendor="Intel Inc.",
            renderer="Intel Iris OpenGL Engine",
            fix_hairline=True
            )

    driver.delete_all_cookies()
    driver.get(os.getenv('URL') + postfix)
    time.sleep(2)

    return driver

def set_chrome_options():
    options = uc.ChromeOptions()
    options.add_argument("--headless")
    options.add_argument("--incognito")
    options.add_argument("--disable-gpu")
    options.add_argument("--lang=en-US")
    options.add_argument("--disable-extensions")
    options.add_argument("--disable-popup-blocking")
    options.add_argument("--profile-directory=Default")
    options.add_argument("--ignore-certificate-errors")
    options.add_argument("--disable-plugins-discovery")
    options.add_argument("--no-sandbox")
    options.add_argument("--disable-dev-shm-usage")
    options.add_argument("--no-first-run")
    options.add_argument("--no-service-autorun")
    options.add_argument("--no-default-browser-check")
    #options.add_argument("--disable-blink-features")
    options.add_argument("--disable-blink-features=AutomationControlled")
    options.add_argument("--disable-infobars")
    #options.add_argument("--window-size=1920,1080")
    options.add_argument("--window-size=1100,1000")

    # The two experimental options below are commented out because the latest
    # stable undetected_chromedriver (upgraded from 2.0.0 to 3.1.3) throws an
    # error on them.

    #options.add_experimental_option("excludeSwitches", ["enable-automation"])   
    #options.add_experimental_option("useAutomationExtension", False)

    options.add_argument(
        "user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.61 Safari/537.36")

    return options

def login(driver, user, password):

    # Add Sleep to avoid detection
    time.sleep(0.5)

    driver.save_screenshot("./reports/screenshots/img1.png")

    try:
        WebDriverWait(driver, 30).until(EC.element_to_be_clickable(
            (By.CSS_SELECTOR, "input[type='email']")), "input[type='email'] not visible and clickable").send_keys(user+"\n")
        logging.info("Email ID entered")
    except Exception as e:
        driver.save_screenshot("./reports/screenshots/img2.png")
        logging.error("input[type='email'] not visible and clickable")
        raise

    # Add Sleep to avoid detection
    time.sleep(1)

    # Fails at this point: the captcha is displayed

    try:
        WebDriverWait(driver, 30).until(EC.element_to_be_clickable(
            (By.ID, "userNameInput")), "User name field not visible and clickable").send_keys(user)
        logging.info("userName entered")
    except Exception as e:
        driver.save_screenshot("./reports/screenshots/img3.png")     
        raise

    # Add Sleep to avoid detection
    time.sleep(0.5)

    try:
        WebDriverWait(driver, 30).until(EC.element_to_be_clickable(
            (By.ID, "passwordInput")), "Password Input not visible and clickable").send_keys(base64.b64decode(password).decode())
        logging.info("password entered")
    except Exception as e:
        driver.save_screenshot("./reports/screenshots/img4.png")
        raise

    # Add Sleep to avoid detection
    time.sleep(0.5)

    try:
        WebDriverWait(driver, 30).until(
            EC.element_to_be_clickable((By.ID, "submitButton")), "Submit Button not visible and clickable").click()
        logging.info("submit button clicked")
    except Exception as e:
        driver.save_screenshot("./reports/screenshots/img5.png")
        logging.error("Submit Button not visible and clickable")

        raise

    # Add Sleep to avoid detection
    time.sleep(0.5)

def login_application(user, password):
    try:
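        options = set_chrome_options()
        # The URL postfix is not given in the original call; an empty string is assumed here
        driver = set_chrome_driver(options, "")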
        login(driver, user, password)

    except Exception as e:
        logger.console(e)

login_application("username","password")     

Is there anything I can try other than this to

sebdelsol commented 2 years ago

I don't think it has anything to do with v103. Correlation is not causation.

You're playing the headless cat-and-mouse game, and now you've lost. I don't think it's a game worth playing: since neither the mouse nor the cat has any incentive to open-source their efforts, you're on your own, especially if you don't give a minimal working example.

Selenium-stealth implements basic known evasions, but it's 2 years old and does one thing wrong: it deletes navigator.webdriver, and that's a red flag... At least UC does it properly, but you're overriding it.
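To make the "you're overriding it" point concrete, here is a minimal sketch that launches UC on its own, with no `stealth(...)` call and no custom user agent. `build_arguments` and `launch_plain_uc` are hypothetical helper names, and the flags are just a subset of the ones in the script above. This is an assumption about what a cleaner setup could look like, not a claim that it gets past any particular detection system.

```python
def build_arguments(headless=False):
    """Return a small set of Chrome flags for running inside a container.

    Deliberately leaves out any user-agent override so the browser reports
    its real one.
    """
    args = [
        "--no-sandbox",
        "--disable-dev-shm-usage",
        "--window-size=1100,1000",
    ]
    if headless:
        args.append("--headless")
    return args


def launch_plain_uc(headless=False):
    """Start undetected-chromedriver without selenium-stealth on top of it."""
    # Imported lazily so this module can be loaded without the package installed
    import undetected_chromedriver as uc

    options = uc.ChromeOptions()
    for arg in build_arguments(headless):
        options.add_argument(arg)
    return uc.Chrome(options=options)
```

Running it once with `headless=False` and once with `headless=True` against the same page is one way to see whether headless mode itself is what triggers the captcha.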

btw I'm quite sure spoofing a Windows user agent on a Linux system is easily detected too.
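A rough sketch of how such a mismatch could be caught before the test even runs. The helper name `ua_inconsistencies` is hypothetical, and the checks are only the two obvious ones: the OS token in the user agent versus the host OS, and the Chrome major version in the user agent versus the real browser version. Note that the script above hard-codes `Chrome/102.0.5005.61` in its user agent, so after the upgrade to 103 the version would no longer match either.

```python
import re


def ua_inconsistencies(user_agent, host_system, browser_version):
    """List obvious mismatches between a spoofed user agent and the real setup.

    host_system is the value of platform.system() ("Linux", "Windows", "Darwin").
    browser_version is a string such as driver.capabilities["browserVersion"].
    """
    problems = []

    # Guess which OS the user agent claims to be running on
    if "Windows NT" in user_agent:
        ua_system = "Windows"
    elif "Macintosh" in user_agent:
        ua_system = "Darwin"
    elif "Linux" in user_agent or "X11" in user_agent:
        ua_system = "Linux"
    else:
        ua_system = None

    if ua_system is not None and ua_system != host_system:
        problems.append(f"user agent claims {ua_system}, host is {host_system}")

    # Compare the Chrome major version in the user agent with the real one
    match = re.search(r"Chrome/(\d+)\.", user_agent)
    browser_major = browser_version.split(".")[0]
    if match and match.group(1) != browser_major:
        problems.append(
            f"user agent claims Chrome {match.group(1)}, browser is {browser_major}"
        )

    return problems
```

In the original script this could be called as `ua_inconsistencies(ua, platform.system(), driver.capabilities["browserVersion"])` right after the driver is created, logging a warning if the returned list is not empty.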