Closed hedior03 closed 4 years ago
mc-market.org seems to be working with 1.5.0 (it wasn't with the previous version), but I noticed that as soon as I activate the anti-captcha plugin (https://antcpt.com/eng/download/google-chrome-options.html) it gets blocked again.
There is a minor error in the code at https://github.com/ultrafunkamsterdam/undetected-chromedriver/blob/e8d4050f3cbd92979bf51683d7af1e370951ffb7/undetected_chromedriver/__init__.py#L152: it must be self, not self_. Does this maybe fix your issue?
Apparently passing your own Options() overrides the options set in __init__.py. You should add them manually until he fixes the issue:
instance.add_argument("start-maximized")
instance.add_experimental_option("excludeSwitches", ["enable-automation"])
instance.add_argument("--disable-blink-features=AutomationControlled")
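A minimal sketch of re-applying those three settings to an options object you build yourself. The helper name apply_stealth_defaults and the RecordingOptions stand-in are hypothetical (neither is part of undetected_chromedriver); the helper only assumes the object exposes add_argument and add_experimental_option, as Selenium's ChromeOptions does.

```python
def apply_stealth_defaults(options):
    """Re-add the three settings listed above to a caller-supplied options object.

    Hypothetical helper, not part of undetected_chromedriver. `options` is
    assumed to expose add_argument / add_experimental_option, as Selenium's
    ChromeOptions does.
    """
    options.add_argument("start-maximized")
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_argument("--disable-blink-features=AutomationControlled")
    return options


class RecordingOptions:
    """Stand-in for ChromeOptions so the sketch runs without a browser."""

    def __init__(self):
        self.arguments = []
        self.experimental_options = {}

    def add_argument(self, argument):
        self.arguments.append(argument)

    def add_experimental_option(self, name, value):
        self.experimental_options[name] = value


opts = apply_stealth_defaults(RecordingOptions())
print(opts.arguments)
print(opts.experimental_options)
```

With the real library you would pass a ChromeOptions instance instead of RecordingOptions and hand the result to the driver constructor, assuming it accepts an options argument the way Selenium's Chrome does.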
In my fork I took care of this one; it's not a very good solution, but it works: https://github.com/chwba/undetected-chromedriver/blob/master/undetected_chromedriver/__init__.py
Still, as mentioned above, when I activate that captcha solver plugin it still seems to be detected. It would be great to have a workaround for that, because if we can't solve captchas anymore it will be difficult to access some sites.
the _ after self is intended
instance.add_argument("start-maximized")
instance.add_experimental_option("excludeSwitches", ["enable-automation"])
instance.add_argument("--disable-blink-features=AutomationControlled")
These are the defaults, so there is no need to overwrite them. Anything non-standard is unsupported.
Regarding Platzi: try another IP, since yours might be flagged. I cannot reproduce the issue:
In [1]: import undetected_chromedriver as uc
In [2]: driver = uc.Chrome()
Selenium patched. Safe to import Chrome / ChromeOptions
Selenium patched. Safe to import Chrome / ChromeOptions
DevTools listening on ws://127.0.0.1:19576/devtools/browser/642a3b15-d112-42bf-b222-9b89ae83649b
In [3]: driver.get('https://platzi.com/')
In [4]: driver.save_screenshot('platzi.png')
Out[4]: True
mc-market.org seems to be working with 1.5.0 (and wasnt with the previous version) but I noticed, that as soon as I activate https://antcpt.com/eng/download/google-chrome-options.html it will lock again.
there is a minor error in the code, it must be self, not self_, does this maybe fix your issue?
No, it must not; in Python it can be whatever you want 👍 Of course an anti-captcha plugin is detected, since JavaScript can just check which plugins are active; this is by design. I guess an anti-captcha service and Cloudflare are by definition "incompatible". I've had good results in the past using 2captcha. Another way to do it is using this library: https://github.com/Anorov/cloudflare-scrape
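On the self_ point, a small illustration of why the name is not an error in itself: the first parameter of an instance method receives the instance, and "self" is only a convention. The Greeter class is a made-up example, not code from the library.

```python
class Greeter:
    # The first parameter of an instance method is bound to the instance.
    # "self" is only a naming convention, so "self_" behaves the same way.
    def __init__(self_, name):
        self_.name = name

    def greet(self_):
        return "hello " + self_.name


print(Greeter("uc").greet())  # prints "hello uc"
```

The name only has to be used consistently inside each method; mixing self and self_ within one method body would raise a NameError.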
I see. In my fork the self_ had caused issues, so this is probably due to my changes.
Hmm, but the plugin worked perfectly fine for a long time before Cloudflare patched their procedure. Isn't there maybe a script we could inject to return a fake list of plugins to bypass this?
Yes, I am aware of https://github.com/Anorov/cloudflare-scrape and had already tried to implement it, but if there is a captcha to be solved, invisible or visible, during the browser check, this library will fail even if he fixes it (it is currently broken). The only way I could imagine making the plugin work is to not load it when first starting the Chrome instance and to load it after the initial check is completed, but sadly it's not possible (afaik) to load a plugin after the Chrome instance has already started the browser.
Regarding 2captcha, I could not figure out how to use it to solve captchas and inject the solution while the browser is running. I do know how to use 2captcha with requests directly, but this doesn't help me in this scenario.
EDIT: Also, the nice part about the anti-captcha plugin is that it will simply solve any captcha that appears anywhere on the site, which is a huge comfort if configured correctly.
EDIT2: I found https://intoli.com/blog/making-chrome-headless-undetectable/ and am trying to implement the plugins and languages part, but I'm not a JS/Selenium pro.
I tried:
if instance.execute_script("return navigator.languages"):
    instance.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
        "source": """
            Object.defineProperty(navigator, 'languages', {
                get: function() {
                    return ['en-US', 'en'];
                },
            });"""
    })
if instance.execute_script("return navigator.plugins"):
    instance.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
        "source": """
            Object.defineProperty(navigator, 'plugins', {
                get: function () {
                    return [1, 2, 3, 4, 5];
                },
            });"""
    })
That gives me a 'circular reference error' on startup.
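A likely cause, offered as an assumption since no traceback was posted: execute_script has to serialize whatever the script returns, and navigator.plugins is a PluginArray whose entries reference each other, so returning the object itself can fail with a circular reference error. Asking only for a plain number such as navigator.plugins.length avoids serializing the object. The sketch below keeps the same two patches; install_navigator_patches and FakeDriver are hypothetical names, and the real driver is assumed to expose Selenium's execute_script and execute_cdp_cmd.

```python
CDP_METHOD = "Page.addScriptToEvaluateOnNewDocument"

LANGUAGES_PATCH = """
Object.defineProperty(navigator, 'languages', {
    get: function () { return ['en-US', 'en']; },
});
"""

PLUGINS_PATCH = """
Object.defineProperty(navigator, 'plugins', {
    get: function () { return [1, 2, 3, 4, 5]; },
});
"""


def install_navigator_patches(driver):
    """Register both overrides; probe with scalar values only.

    Hypothetical helper. `driver` is assumed to behave like a Selenium
    Chrome driver (execute_script and execute_cdp_cmd).
    """
    installed = []
    # Return a number, not the array-like object, so nothing complex is serialized.
    if driver.execute_script("return navigator.languages.length"):
        driver.execute_cdp_cmd(CDP_METHOD, {"source": LANGUAGES_PATCH})
        installed.append("languages")
    if driver.execute_script("return navigator.plugins.length"):
        driver.execute_cdp_cmd(CDP_METHOD, {"source": PLUGINS_PATCH})
        installed.append("plugins")
    return installed


class FakeDriver:
    """Stand-in driver so the sketch runs without a browser."""

    def __init__(self):
        self.cdp_calls = []

    def execute_script(self, script):
        return 2  # pretend both properties are non-empty

    def execute_cdp_cmd(self, method, params):
        self.cdp_calls.append((method, params))
        return {}


print(install_navigator_patches(FakeDriver()))
```

Scripts registered this way only apply to documents loaded afterwards, so the helper would need to run before driver.get(...) for the overrides to be in place on the target page.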
The driver is still detected by Cloudflare protection. Platzi is a website I'm trying to scrape, and I haven't been able to.