seleniumbase / SeleniumBase

📊 Python's all-in-one framework for web crawling, scraping, testing, and reporting. Supports pytest. UC Mode provides stealth. Includes many tools.
https://seleniumbase.io
MIT License

`headless2=True` is detectable #2733

Closed kaliiiiiiiiii closed 2 months ago

kaliiiiiiiiii commented 2 months ago

Since `--user-agent=` was added in https://github.com/seleniumbase/SeleniumBase/commit/044fc46eba9237876d4e406434411bc78192a270 (intended to fix https://github.com/seleniumbase/SeleniumBase/issues/2523), some other device metrics are left undefined when `headless2=True`, which makes the browser detectable.

Script to reproduce:

import seleniumbase
print(seleniumbase.__version__)

def run(headless=False):
    with seleniumbase.SB(uc=True, headless2=headless) as sb:
        platform_version = sb.execute_async_script("""
        navigator.userAgentData.getHighEntropyValues(
                [
                    "architecture",
                    "model",
                    "platform",
                    "platformVersion",
                    "fullVersionList",
                  ]).then((data) => {
                    arguments[0](data.platformVersion)
                  });
        """)
        print(f'headless:{headless}, PlatformVersion:"{platform_version}"')

run(headless=True)
run()

This prints:

4.26.0
headless:True, PlatformVersion:""
headless:False, PlatformVersion:"10.0.0"

tested on

Note: other values are affected as well; `platformVersion` is just one example.
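To see which other values differ, the same check can be extended to return every requested hint and compare the two runs. This is only a rough sketch, not part of SeleniumBase: `build_script`, `collect_hints`, `diff_hints`, and `report` are hypothetical helper names, and it assumes `sb.execute_async_script` hands the resolved object back as a plain dict, as it does for the single value in the script above.

```python
import json

# Hint names copied from the reproduction script above.
HINTS = ["architecture", "model", "platform", "platformVersion", "fullVersionList"]


def build_script(hints=HINTS):
    """Build an async script that returns all requested hints at once."""
    return (
        "const done = arguments[arguments.length - 1];"
        f"navigator.userAgentData.getHighEntropyValues({json.dumps(list(hints))})"
        ".then(done);"
    )


def collect_hints(headless):
    """Launch UC Mode (optionally with headless2) and return the hints as a dict."""
    # Imported lazily so diff_hints() can be used without a browser installed.
    from seleniumbase import SB

    with SB(uc=True, headless2=headless) as sb:
        return sb.execute_async_script(build_script())


def diff_hints(headed, headless):
    """Return {hint: (headed_value, headless_value)} for every hint that differs."""
    keys = sorted(set(headed) | set(headless))
    return {
        key: (headed.get(key), headless.get(key))
        for key in keys
        if headed.get(key) != headless.get(key)
    }


def report():
    """Run both modes and print each hint whose value differs between them."""
    for name, values in diff_hints(collect_hints(False), collect_hints(True)).items():
        print(f"{name}: headed={values[0]!r}, headless2={values[1]!r}")
```

Calling `report()` opens one headed and one `headless2` browser in turn; any hint it prints is a value that a site could use to tell the two modes apart.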

mdmintz commented 2 months ago

This script is still bypassing Cloudflare for me with headless2=True:

from seleniumbase import SB

with SB(uc=True, headless2=True, test=True) as sb:
    url = "https://gitlab.com/users/sign_in"
    sb.driver.uc_open_with_reconnect(url, 3)
    if not sb.is_text_visible("Username", '[for="user_login"]'):
        sb.driver.uc_open_with_reconnect(url, 4)
    sb.assert_text("Username", '[for="user_login"]', timeout=3)
    sb.assert_element('label[for="user_login"]')
    sb.highlight('button:contains("Sign in")')
    sb.highlight('h1:contains("GitLab.com")')
    sb.post_message("SeleniumBase wasn't detected", duration=4)

If for whatever reason headless2=True isn't enough, you can always use SB(uc=True, xvfb=True), which runs UC Mode in a regular (non-headless) browser on a virtual display:

from seleniumbase import SB

with SB(uc=True, xvfb=True, test=True) as sb:
    sb.driver.uc_open_with_reconnect("https://pixelscan.net", 10)