Closed zqxyus closed 2 weeks ago
Duplicate of https://github.com/seleniumbase/SeleniumBase/issues/1912.
When I run the following script, I get the same result as when using a regular Chrome browser, so the Inconsistent
value there isn't accurate.
from seleniumbase import SB

with SB(uc=True, incognito=True, test=True) as sb:
    url = "https://antoinevastel.com/bots/"
    sb.uc_open_with_reconnect(url, 8)
The https://pixelscan.net/ website is a better test for bots. SeleniumBase UC Mode goes undetected.
from seleniumbase import SB

with SB(uc=True, incognito=True, test=True) as sb:
    url = "https://pixelscan.net/"
    sb.uc_open_with_reconnect(url, 10)
    sb.remove_elements("jdiv")  # Remove chat widgets
    sb.assert_text("No automation framework detected", "pxlscn-bot-detection")
    not_masking = "You are not masking your fingerprint"
    sb.assert_text(not_masking, "pxlscn-fingerprint-masking")
    sb.highlight("span.text-success", loops=8)
    sb.sleep(1)
    sb.highlight("pxlscn-fingerprint-masking div", loops=9, scroll=False)
    sb.sleep(1)
    sb.highlight("div.bot-detection-context", loops=10, scroll=False)
    sb.sleep(2)
I used the following code, but access is blocked.
from seleniumbase import SB

with SB(uc=True, incognito=True, test=True) as sb:
    url = "https://rendezvousparis.hermes.com/client/welcome"
    sb.uc_open_with_reconnect(url, 10)
    sb.sleep(3)
That page blocked me in my regular Chrome browser (no Selenium). Also, that's not a Cloudflare page. UC Mode is specifically designed for Cloudflare-bypass right now, and some other anti-bot sites.
@mdmintz How can I get past it? Could you give me any ideas or guidelines? Thanks!
You can try changing your proxy settings, but otherwise there's not much that can be done if it blocks regular Chrome browsers.
@mdmintz Thank you! I have another question: the SeleniumBase documentation has little information about proxy server settings. The following demo code sets a proxy server with plain Selenium. How do I set the proxy server if I use SeleniumBase?
from selenium import webdriver

# Defined at module level so that both setup_driver() and main() can use it
TARGET_URL = "http://mywebsite.com/"


def setup_driver():
    # ScrapeOps Proxy setup
    proxy_url = "proxy.scrapeops.io:8000"
    api_key = "YOUR_API_KEY"  # Replace this with your ScrapeOps API key
    bypass_level = "generic_level_1"  # Choose the appropriate bypass level
    # Set up Selenium with the ScrapeOps Proxy
    proxy = f"http://{api_key}:{proxy_url}/?target_url={TARGET_URL}&bypass={bypass_level}"
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument(f"--proxy-server={proxy}")
    # Initialize the WebDriver
    driver = webdriver.Chrome(options=chrome_options)
    return driver


def main():
    driver = setup_driver()
    driver.get(TARGET_URL)
    # E.g. extract the title of the webpage
    print("Page title:", driver.title)
    driver.quit()


if __name__ == "__main__":
    main()
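For comparison, here is a minimal sketch of the same idea in SeleniumBase. It assumes the `proxy` argument of `SB()`, which takes a string of the form `SERVER:PORT` or `USER:PASS@SERVER:PORT`. The helper names `build_proxy_string` and `open_with_proxy` and the placeholder server and credentials are hypothetical, and whether a given proxy provider works in UC Mode is not something this sketch can confirm.

```python
def build_proxy_string(server, username=None, password=None):
    """Return "SERVER:PORT", or "USER:PASS@SERVER:PORT" if credentials are given."""
    if username and password:
        return f"{username}:{password}@{server}"
    return server


def open_with_proxy(url, proxy, reconnect_time=10):
    """Open url in UC Mode through the given proxy and print the page title."""
    # Imported here so build_proxy_string() can be used without a browser
    from seleniumbase import SB

    with SB(uc=True, proxy=proxy) as sb:
        sb.uc_open_with_reconnect(url, reconnect_time)
        print("Page title:", sb.get_title())
```

Under these assumptions, a call might look like `open_with_proxy("https://antoinevastel.com/bots/", build_proxy_string("proxy.example.com:8000", "USERNAME", "PASSWORD"))`, with the placeholder values replaced by real proxy details.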
I first used code similar to the code below to open a website that I need to crawl, but access was blocked. So I used the following code to visit https://antoinevastel.com/bots/ instead. The results show that the following code is detected as a bot.
from seleniumbase import SB

with SB(uc=True, incognito=True, test=True) as sb:
    url = "https://antoinevastel.com/bots/"
    server = "f.proxys5.net:6200"
    username = "00007-zone-custom-region-DE-sessid-NkivelA2-sessTime-15"  # scrapeops
    password = "tHx19d0nTan"
    sb.set_wire_proxy(f"{username}:{password}@{server}")
    driver = sb.driver.uc_open_with_reconnect(url, 21)
    sb.sleep(93)
PHANTOM_UA | Consistent | {"userAgent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36"}
PHANTOM_PROPERTIES | Consistent | {"attributesFound":[false,false,false]}
PHANTOM_ETSL | Consistent | {"etsl":33}
PHANTOM_LANGUAGE | Consistent | {"languages":["en-US"]}
PHANTOM_WEBSOCKET | Consistent | {}
MQ_SCREEN | Consistent | {}
PHANTOM_OVERFLOW | Consistent | {"depth":9649,"errorMessage":"Maximum call stack size exceeded","errorName":"RangeError","errorStacklength":711}
PHANTOM_WINDOW_HEIGHT | Consistent | {"wInnerHeight":709,"wOuterHeight":840,"wOuterWidth":1280,"wInnerWidth":1236,"wScreenX":80,"wPageXOffset":0,"wPageYOffset":0,"cWidth":1221,"cHeight":812,"sWidth":1920,"sHeight":1080,"sAvailWidth":1850,"sAvailHeight":1053,"sColorDepth":24,"sPixelDepth":24,"wDevicePixelRatio":1}
HEADCHR_UA | Consistent | {"userAgent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36"}
WEBDRIVER | Inconsistent | {}
HEADCHR_CHROME_OBJ | Consistent | {}
HEADCHR_PERMISSIONS | Consistent | {}
HEADCHR_PLUGINS | Consistent | {"plugins":["PDF Viewer::Portable Document Format::internal-pdf-viewer::application/pdf~pdf~Portable Document Format,text/pdf~pdf~Portable Document Format","Chrome PDF Viewer::Portable Document Format::internal-pdf-viewer::application/pdf~pdf~Portable Document Format,text/pdf~pdf~Portable Document Format","Chromium PDF Viewer::Portable Document Format::internal-pdf-viewer::application/pdf~pdf~Portable Document Format,text/pdf~pdf~Portable Document Format","Microsoft Edge PDF Viewer::Portable Document Format::internal-pdf-viewer::application/pdf~pdf~Portable Document Format,text/pdf~pdf~Portable Document Format","WebKit built-in PDF::Portable Document Format::internal-pdf-viewer::__application/pdf~pdf~Portable Document Format,text/pdf~pdf~Portable Document Format"]}
HEADCHR_IFRAME | Consistent | {}
CHR_DEBUG_TOOLS | Consistent | {}
SELENIUM_DRIVER | Consistent | {"attributesFound":[false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false]}
CHR_BATTERY | Consistent | {}
CHR_MEMORY | Consistent | {}
TRANSPARENT_PIXEL | Consistent | {"0":0,"1":0,"2":0,"3":0}
SEQUENTUM | Consistent | {}
VIDEO_CODECS | Consistent | {"h264":"probably"}
How can I get around it? Thanks!
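As a reading aid for the results table above, here is a small sketch that lists which checks are not marked `Consistent`. It assumes each row has the form `NAME | RESULT | DATA`, as pasted; the function name `failing_checks` is hypothetical, and the sample uses three rows copied from the table.

```python
def failing_checks(report):
    """Return the names of rows whose result column is not "Consistent"."""
    failures = []
    for line in report.splitlines():
        # Split on the first two pipes only, in case the data column contains "|"
        parts = [part.strip() for part in line.split("|", 2)]
        if len(parts) == 3 and parts[1] != "Consistent":
            failures.append(parts[0])
    return failures


sample = """\
PHANTOM_LANGUAGE | Consistent | {"languages":["en-US"]}
WEBDRIVER | Inconsistent | {}
CHR_DEBUG_TOOLS | Consistent | {}
"""
print(failing_checks(sample))  # prints ['WEBDRIVER']
```

In the table posted here, `WEBDRIVER` is the only row marked `Inconsistent`.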