Open MacMarde opened 2 years ago
It would help for the headless mode, which has barely any evasions (and from this package's code it seems to be a bunch of JS to evade headless detection?). I might check a little more later...
If you don't use headless, undetected-chromedriver spawns a regular Chrome before the driver and patches the driver, so it's very hard to detect (I don't know of ways to detect it, but I'm sure there are some).
@sebdelsol Honestly, I do not know what this package does in detail. To use it, you need to pass the Python webdriver object to it. That's all. But I do not know how it interferes with the uc webdriver object.
If you don't use headless, undetected-chromedriver spawns a regular Chrome before the driver and patches the driver, so it's very hard to detect (I don't know of ways to detect it, but I'm sure there are some).
I do not use headless chrome, but I do not understand what you mean.
@sebdelsol Can you please explain?
If you don't use headless Chrome, then the selenium-stealth JavaScript tricks, which mock a non-headless Chrome, are not that useful (but I need to read its code more thoroughly to be sure).
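Undetected-chromedriver relies on two simple, effective tricks to hide that it's Selenium-based:
- it spawns Chrome as a detached process so it behaves like your regular Chrome as much as possible.
- it patches the driver so that there are no detectable variables left. No need for further JavaScript injection!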
But you don't want to use the headless option: headless Chrome behaves very differently from regular Chrome, and there are many ways to detect it with simple client-side JavaScript. Anyway, selenium-stealth has some evasion techniques I was not aware of. But this package is 2 years old and anti-bot companies keep finding new techniques. This is an endless cat & mouse game.
TL;DR: non-headless undetected-chromedriver is a better strategy than relying on headless Chrome evasion techniques layered on a plain chromedriver spawned by Selenium.
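A minimal sketch of the non-headless setup described above, assuming the `undetected-chromedriver` package is installed and a local Chrome is available. The function name `launch_regular_chrome` is hypothetical; only `uc.ChromeOptions` and `uc.Chrome` come from the package.

```python
def launch_regular_chrome():
    """Start a visible (non-headless) Chrome through undetected-chromedriver.

    Hypothetical helper for illustration; it is not part of any library.
    """
    # Imported here so that merely defining the function does not require
    # the third-party package (pip install undetected-chromedriver).
    import undetected_chromedriver as uc

    options = uc.ChromeOptions()
    # Deliberately no "--headless" argument: the point made above is that a
    # regular Chrome window is harder to tell apart from a real user's browser.
    return uc.Chrome(options=options)
```

Calling `driver = launch_regular_chrome()` and then `driver.get(...)` should behave like any other Selenium Chrome driver; remember to call `driver.quit()` when done.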
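EDIT: the new game is not about having a stealth driver (undetected-chromedriver still works well), but it's all about:
- fingerprinting to allow rate limiting: when I can ID you, I can limit your usage of the site to what a human needs.
- the detection of bot behavior vs. regular human behavior: if you don't behave like a human, I serve you a captcha or worse...
- some techniques to prevent easy scraping (shadow-root, obfuscated DOM XPath).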
@sebdelsol Thank you very much for your detailed answer.
Undetected-chromedriver relies on two simple, effective tricks to hide that it's Selenium-based:
- it spawns Chrome as a detached process so it behaves like your regular Chrome as much as possible.
- it patches the driver so that there are no detectable variables left. No need for further JavaScript injection!
Thank you for that. I was not aware of this.
But you don't want to use the headless option: headless Chrome behaves very differently from regular Chrome, and there are many ways to detect it with simple client-side JavaScript. Anyway, selenium-stealth has some evasion techniques I was not aware of. But this package is 2 years old and anti-bot companies keep finding new techniques. This is a cat & mouse game.
I am not using headless Chrome, but you are right: selenium-stealth is basically about hiding headless Chrome, as described here: https://intoli.com/blog/making-chrome-headless-undetectable/ But anyway, there are some features that may also be important for non-headless Chrome. You can test it here and also here. Anyway, I am not sure how it works. But we cannot let the cats win ;-)
EDIT: the new game is not about having a stealth driver (undetected-chromedriver still works well), but it's all about:
- fingerprinting to allow rate limiting: when I can ID you, I can limit your usage of the site to what a human needs.
- the detection of bot behavior vs. regular human behavior: if you don't behave like a human, I serve you a captcha or worse...
- some techniques to prevent easy scraping (shadow-root, obfuscated DOM XPath).
I am aware of these techniques and am trying to get around them as well.
I'm just an amateur (and find this game fascinating). I have my own pet project that (fairly) scrapes some sites for fun.
Anyway, I've had to extend the Selenium ActionChains class to add some basic "human-like" actions: random pauses based on actual human reaction time, keys sent one by one, and mouse moves that take time to reach a point (useful for sliders). I've even seen some people using Bézier curves + random noise to make their mouse movements even more human... This is an endless endeavor... Good luck with your project(s).
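For illustration, here is a rough sketch of what such "human-like" actions could look like. It is not the extended ActionChains class mentioned above: it uses plain helper functions, and every name (`human_pause`, `bezier_path`, `to_offsets`, `human_type`, `human_drag`) and every timing value is an assumption chosen for the example. Only the `ActionChains` methods (`click`, `click_and_hold`, `move_by_offset`, `send_keys`, `pause`, `release`, `perform`) are real Selenium calls.

```python
import random


def human_pause(rng, mean=0.25, sd=0.08, floor=0.1):
    """Random pause in seconds; mean/sd are placeholder values, not measured data."""
    return max(floor, rng.gauss(mean, sd))


def bezier_path(start, end, steps=30, jitter=1.5, rng=None):
    """Integer points along a cubic Bezier curve from start to end, plus noise."""
    rng = rng or random.Random()
    (x0, y0), (x3, y3) = start, end
    # Two random control points placed around the straight line.
    spread = 0.25 * max(abs(x3 - x0), abs(y3 - y0), 1)
    x1 = x0 + (x3 - x0) / 3 + rng.uniform(-spread, spread)
    y1 = y0 + (y3 - y0) / 3 + rng.uniform(-spread, spread)
    x2 = x0 + 2 * (x3 - x0) / 3 + rng.uniform(-spread, spread)
    y2 = y0 + 2 * (y3 - y0) / 3 + rng.uniform(-spread, spread)
    points = []
    for i in range(1, steps + 1):
        t = i / steps
        mt = 1 - t
        x = mt**3 * x0 + 3 * mt**2 * t * x1 + 3 * mt * t**2 * x2 + t**3 * x3
        y = mt**3 * y0 + 3 * mt**2 * t * y1 + 3 * mt * t**2 * y2 + t**3 * y3
        if i < steps:  # add noise everywhere except on the final point
            x += rng.uniform(-jitter, jitter)
            y += rng.uniform(-jitter, jitter)
        points.append((round(x), round(y)))
    points[-1] = (x3, y3)  # land exactly on the target
    return points


def to_offsets(start, points):
    """Convert absolute points into relative (dx, dy) steps for move_by_offset."""
    offsets, prev = [], start
    for point in points:
        offsets.append((point[0] - prev[0], point[1] - prev[1]))
        prev = point
    return offsets


def human_type(driver, element, text, rng=None):
    """Click an element, then send keys one by one with a pause after each."""
    from selenium.webdriver.common.action_chains import ActionChains

    rng = rng or random.Random()
    chain = ActionChains(driver).click(element)
    for char in text:
        chain.send_keys(char).pause(human_pause(rng, mean=0.12, sd=0.04, floor=0.03))
    chain.perform()


def human_drag(driver, slider, distance, rng=None):
    """Drag a slider horizontally by `distance` pixels along a noisy curve."""
    from selenium.webdriver.common.action_chains import ActionChains

    rng = rng or random.Random()
    chain = ActionChains(driver).click_and_hold(slider).pause(human_pause(rng))
    path = bezier_path((0, 0), (distance, 0), rng=rng)
    for dx, dy in to_offsets((0, 0), path):
        chain.move_by_offset(dx, dy).pause(rng.uniform(0.005, 0.03))
    chain.release().perform()
```

The path is computed as absolute points and then converted to relative offsets because `move_by_offset` moves from the current pointer position; the offsets sum to the requested distance, so the slider ends where intended.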
I do not know much about web scraping and what it is good for.
But I have used some bots in the past to make money and for other things. At least it was working for some time. As you said it is a cat&mouse game. Atm I have no more working bots.
Until now I have used https://pypi.org/project/selenium-stealth/ which didn't help me.
But now I am asking myself if I could combine uc with selenium-stealth?
I guess this question is hard to answer, so is there a simple way to test the stealth techniques applied with uc? That way I can play around a bit and see if it is still working.
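One way to experiment with this, as an untested sketch: start uc, pass its driver to `stealth(...)`, and read back a few browser properties. Whether selenium-stealth's injections help or conflict with uc's own patching is not established here. `PROBE_JS`, `flag_signals`, and `probe_uc_with_stealth` are hypothetical names, the `stealth(...)` arguments are just example values, and the four checks are illustrative heuristics in the spirit of headless-detection write-ups, not what any anti-bot service is known to run.

```python
# JavaScript run in the page to read a few properties that detection
# scripts are commonly said to inspect (illustrative selection only).
PROBE_JS = """
return {
  webdriver: navigator.webdriver,
  userAgent: navigator.userAgent,
  pluginCount: navigator.plugins.length,
  languages: navigator.languages
};
"""


def flag_signals(probe):
    """Return a list of human-readable warnings for a probe result dict."""
    flags = []
    if probe.get("webdriver"):
        flags.append("navigator.webdriver is true")
    if "HeadlessChrome" in (probe.get("userAgent") or ""):
        flags.append("user agent contains HeadlessChrome")
    if not probe.get("pluginCount"):
        flags.append("no plugins reported")
    if not probe.get("languages"):
        flags.append("navigator.languages is empty")
    return flags


def probe_uc_with_stealth(url, use_stealth=True):
    """Open `url` with uc (optionally plus selenium-stealth) and return warnings."""
    # Third-party imports kept local: pip install undetected-chromedriver selenium-stealth
    import undetected_chromedriver as uc
    from selenium_stealth import stealth

    driver = uc.Chrome()
    try:
        if use_stealth:
            # Assumption: the uc driver is accepted here like a regular
            # Selenium Chrome driver; the argument values are examples.
            stealth(
                driver,
                languages=["en-US", "en"],
                vendor="Google Inc.",
                platform="Win32",
                webgl_vendor="Intel Inc.",
                renderer="Intel Iris OpenGL Engine",
                fix_hairline=True,
            )
        driver.get(url)
        return flag_signals(driver.execute_script(PROBE_JS))
    finally:
        driver.quit()
```

Running `probe_uc_with_stealth(url, use_stealth=False)` and then with `use_stealth=True` against the same page gives a simple before/after comparison; an empty list only means none of these four basic checks fired, not that the browser is undetectable.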