wkeeling / selenium-wire

Extends Selenium's Python bindings to give you the ability to inspect requests made by the browser.
MIT License
1.9k stars, 254 forks

Undetected chromedriver v2 not working with selenium wire. #427

Closed: yemregundogmus closed this 2 years ago

yemregundogmus commented 2 years ago

Hi, I am trying to collect data from opensea.io. When I use undetected chromedriver by itself, I can access the site, but when I use it with selenium-wire, I cannot. What could be the problem?

import undetected_chromedriver.v2 as uc
chrome_options = uc.ChromeOptions() # new solution
chrome_options.add_argument("--window-size=1920,1080")
chrome_options.add_argument("--disable-extensions")
chrome_options.add_argument("--start-maximized")
#chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--ignore-certificate-errors')
chrome_options.add_argument("user-agent=Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.125 Safari/537.36")
chrome_options.binary_location = "C:/Program Files/Google/Chrome/Application/chrome.exe"
driver = uc.Chrome(executable_path='chromedriver.exe', options=chrome_options)
driver.get('https://opensea.io/activity/mekaverse')

[Screenshot: result with undetected chromedriver alone]

from seleniumwire import webdriver
from seleniumwire.undetected_chromedriver.v2 import Chrome, ChromeOptions

chrome_options = ChromeOptions()
chrome_options.add_argument("--window-size=1920,1080")
chrome_options.add_argument("--disable-extensions")
chrome_options.add_argument("--start-maximized")
#chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--ignore-certificate-errors')
chrome_options.add_argument("user-agent=Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.125 Safari/537.36")
chrome_options.binary_location = "C:/Program Files/Google/Chrome/Application/chrome.exe"

#options.add_argument('--headless')
driver = Chrome(options=chrome_options, seleniumwire_options={'disable_encoding': True})
driver.get('https://opensea.io/activity/mekaverse')

[Screenshot: result with undetected chromedriver and selenium-wire]

wkeeling commented 2 years ago

Thanks for raising this.

It's possible that this website is performing TLS fingerprinting of the request and has been able to detect Selenium Wire. TLS fingerprinting is a low-level way of identifying whether a client is a real browser or a bot.
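To make the idea concrete, here is a minimal sketch of one widely known fingerprinting scheme, JA3, which hashes a handful of fields from the TLS ClientHello. Whether this site uses JA3 or something else is an assumption; the function names and the numeric values below are illustrative only and were not captured from a real browser or from Selenium Wire.

```python
import hashlib


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build a JA3-style string: five comma-separated fields,
    with the values inside each list joined by dashes."""
    groups = (ciphers, extensions, curves, point_formats)
    fields = [str(version)]
    fields += ["-".join(str(value) for value in group) for group in groups]
    return ",".join(fields)


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """MD5 digest of the JA3 string, which is what gets compared."""
    raw = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()


# Hypothetical ClientHello parameters (illustrative values only).
browser_like = (771, [4865, 4866, 4867], [0, 23, 65281, 10, 11], [29, 23, 24], [0])
proxy_like = (771, [4866, 4867, 4865], [0, 11, 10, 23], [29, 23, 24], [0])

print(ja3_string(*browser_like))
print(ja3_hash(*browser_like))
print(ja3_hash(*proxy_like))
```

The two hashes differ even though both clients offer the same cipher suites, because the order of ciphers and the set of extensions are part of the fingerprint. A server can compare the hash it sees against those of known browsers without inspecting any HTTP headers, which is why changing the user-agent does not help here.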

Selenium Wire runs traffic through an internal proxy server in order to capture requests, but this means that it presents a different fingerprint to the server than that of regular Selenium/undetected chromedriver. That can then trigger defences such as what you're seeing above.
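As a rough illustration of why the fingerprint changes, the sketch below lists the cipher suites that Python's standard `ssl` module offers by default. It assumes the proxy's outgoing connections are made by a Python/OpenSSL TLS stack instead of Chrome's own; it does not inspect Selenium Wire itself, and the variable names are mine.

```python
import ssl

# Default client-side TLS context from the Python standard library.
context = ssl.create_default_context()

# Each entry is a dict describing one cipher suite the client would offer.
offered = [cipher["name"] for cipher in context.get_ciphers()]

print(len(offered), "cipher suites offered by this Python/OpenSSL build")
for name in offered[:5]:
    print(name)
```

The list and its order depend on the local OpenSSL build and will generally not match what Chrome sends in its ClientHello, so the server sees the proxy's TLS stack and not the browser's.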

Although using undetected chromedriver with Selenium Wire provides some shielding against bot detection, it can't currently shield against TLS fingerprinting. There's no solution for this at present, although you could try using the Tor Browser, which I understand will protect against fingerprinting.