Open vyanezz opened 10 months ago
Hello, use the code from https://github.com/ultrafunkamsterdam/undetected-chromedriver/issues/1496#issue-1859296239. It works for CF (it worked for me).
Not working for me… any other solution? It was working correctly yesterday.
Thanks.
CF may have pushed another update yesterday. I had to make updates and release seleniumbase 4.17.9 last night for some existing scripts to bypass detection more frequently. Try pip install -U seleniumbase, and then run the following script with python (using https://github.com/seleniumbase/SeleniumBase):
from seleniumbase import Driver
import time
driver = Driver(uc=True)
driver.get("https://nowsecure.nl/#relax")
time.sleep(6)
driver.quit()
Navigating to https://hmaker.github.io/selenium-detector/ also bypasses detection with the latest version.
That works for me on https://nowsecure.nl/#relax
but it can't bypass https://hmaker.github.io/selenium-detector/
I also tested it on accounts.google.com, and Google detects that I am using Selenium or some kind of automation (I only created the driver and did the rest manually in the driver's browser).
@YOTYTeaM If you're using regular driver commands, of course you'll be detected. Follow the documentation to write scripts correctly. For example, a Gmail login:
from seleniumbase import SB
with SB(uc=True) as sb:
    sb.open("https://www.google.com/gmail/about/")
    sb.click('a[data-action="sign in"]')
    sb.type('input[type="email"]', "test123@gmail.com")
    sb.click('button:contains("Next")')
    import pdb; pdb.set_trace()
    # sb.type('input[type="password"]', PASSWORD)
    # sb.click('button:contains("Next")')
pip install seleniumbase but for some reason I get this error
ModuleNotFoundError: No module named 'seleniumbase'
@6echs Maybe you installed it into a different virtual environment, or you have different Python versions on your machine.
@mdmintz ,
I get the same error on VPS and other libraries do not give this error, it supports the latest version of Python, right?
@6echs Latest Python is supported.
Have you tried:
Mac/Linux:
python3 -m pip install -U seleniumbase
or Windows:
py -m pip install -U seleniumbase
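One way to test the "different environment" theory is to ask the interpreter that actually runs the script where it lives and whether it can see the package. The following is a minimal sketch using only the standard library; the helper name describe_install is made up for illustration and is not part of SeleniumBase.

```python
import importlib.util
import sys


def describe_install(package: str = "seleniumbase") -> dict:
    """Report which interpreter is running and whether `package` is importable from it."""
    spec = importlib.util.find_spec(package)
    return {
        "interpreter": sys.executable,
        "python_version": ".".join(map(str, sys.version_info[:3])),
        "installed": spec is not None,
        "location": spec.origin if spec is not None else None,
    }


# Run this with the same command used to launch the failing script.
print(describe_install())
```

If "installed" is False, running the pip command through that same interpreter (the path shown under "interpreter", followed by -m pip install -U seleniumbase) should put the package where the script can find it.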
Wow, Google can't detect me. Thank you!
from fake_useragent import UserAgent
from seleniumbase import SB
user_agent = UserAgent().random  # user agent from fake_useragent
with SB(uc=True, agent=user_agent) as sb:
    import pdb; pdb.set_trace()
But the selenium detector still detects it, flagging the same things I mentioned. Things I know (not sure):
These variables in window.document indicate that Selenium is in use (window-document-aux-vars): $chrome_asyncScriptInfo and ^\$cdc_[a-zA-Z0-9]{22}_$
Selenium uses Document.prototype.querySelector and Document.prototype.querySelectorAll to find elements. Websites can overwrite these to detect Selenium.
As for the execute_script detection, I don't know how it works (I don't know JS). This is the code the site uses to detect execute_script:
class JSCallStackTest extends SeleniumDetectionTest {
    constructor(name, desc, callStack, stackSignatures) {
        super(name, desc);
        this._callStack = callStack;
        this._stackSignatures = stackSignatures;
    }
    test() {
        if (this._callStack === null) return false;
        for (let i = 1; i < this._callStack.length; i++) {
            if (this._stackSignatures.some(signature => signature.test(this._callStack[i])))
                return true;
        }
        return false;
    }
}
/**
 * see https://source.chromium.org/chromium/chromium/src/+/main:chrome/test/chromedriver/js/execute_script.js;l=13
 * https://source.chromium.org/chromium/chromium/src/+/main:chrome/test/chromedriver/js/call_function.js;l=426
 */
class ExecuteScriptTest extends JSCallStackTest {
    constructor(window, name, desc) {
        super(name, desc, null, [/ executeScript /, / callFunction /]);
        this._createToken(window);
    }
    _createToken(window) {
        this.token = Math.random().toString().substring(2);
        const self = this;
        window.Object.defineProperty(window, 'token', {
            configurable: false,
            enumerable: false,
            get: function() {
                try {
                    null[0];
                } catch(e) {
                    if (self._callStack === null)
                        self._callStack = e.stack.split('\n');
                }
                return self.token;
            }
        });
    }
}
source: https://hmaker.github.io/selenium-detector/chromedriver.js
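For readers who, like the poster, don't read JavaScript, the two checks described in this post can be sketched in Python. This is only an illustration of the matching logic, not the detector's actual code: the patterns are the ones quoted above, while the function names and sample inputs are invented for the example.

```python
import re

# Patterns quoted in the post above; everything else here is illustrative.
AUX_VAR_PATTERNS = [
    re.compile(r"^\$chrome_asyncScriptInfo$"),
    re.compile(r"^\$cdc_[a-zA-Z0-9]{22}_$"),
]
STACK_SIGNATURES = [re.compile(r" executeScript "), re.compile(r" callFunction ")]


def find_aux_vars(property_names):
    """Return the property names that look like chromedriver helper variables."""
    return [n for n in property_names if any(p.match(n) for p in AUX_VAR_PATTERNS)]


def stack_looks_automated(stack_lines):
    """Mirror JSCallStackTest.test(): skip the first line, then look for a signature."""
    if stack_lines is None:
        return False
    return any(sig.search(line) for line in stack_lines[1:] for sig in STACK_SIGNATURES)


# Invented sample data: a document property list and a captured call stack.
names = ["title", "$cdc_" + "A" * 22 + "_", "$chrome_asyncScriptInfo"]
stack = [
    "TypeError: Cannot read properties of null",
    "    at get (page.js:10:5)",
    "    at executeScript (<anonymous>:1:1)",
]
print(find_aux_vars(names))
print(stack_looks_automated(stack))
```

The second check works because reading window.token runs the getter, which deliberately throws and records the call stack; if that stack contains a chromedriver function name such as executeScript, the read must have come from an injected script.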
@YOTYTeaM The tasks asked by https://hmaker.github.io/selenium-detector/ are not realistic for regular sites. No regular site is going to ask you to run commands directly in the console to get a value, as it says for you to run execute_script.
The anti-detection measures of SeleniumBase work for regular sites, especially page loads before you've clicked or entered text somewhere.
Works perfectly! Thx!
@mdmintz ,
I got it, bro, it works like a charm. I respect your work.
It's not working when I do driver.get on the page I want to scrape. What am I missing?
@vyanezz Is the page you're trying to scrape not returning a 403? Eg.
>>> import requests
>>> requests.get("https://nowsecure.nl/#relax").status_code
403
For 403s and some others, SeleniumBase modifies driver.get() so that the modified chromedriver disconnects from Chrome for a few seconds to avoid detection. This can be a noticeable slowdown, so it's only set if requests.get() is blocked (returns 403), which probably means that Selenium will be blocked too. If that's not the case, use the special driver.uc_open_with_tab(url) to get that same result, even if not a 403 (or similar).
driver.uc_open_with_tab(url) # Use this instead of driver.get(url)
(Note that only SeleniumBase tests have that new driver.uc_open_with_tab(url) command for UC Mode.)
I've tried driver.uc_open_with_tab(url) and it is still being detected. What I want to do is open a URL and then loop through other URLs, finding elements on each.
Thanks for all
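The "open a URL, then loop through others finding elements" workflow described here might look roughly like the sketch below. It assumes a SeleniumBase UC Mode driver that provides the uc_open_with_tab(url) method mentioned earlier, plus Selenium's standard find_elements(); the helper name collect_elements is hypothetical. Nothing in this thread guarantees that the approach avoids detection on a given site, since regular Selenium commands issued after the page loads may themselves be noticed.

```python
def collect_elements(driver, urls, css_selector):
    """Open each URL with uc_open_with_tab() and gather the text of matching elements."""
    results = {}
    for url in urls:
        # Per the reply above: use this instead of driver.get(url) in UC Mode.
        driver.uc_open_with_tab(url)
        elements = driver.find_elements("css selector", css_selector)
        results[url] = [element.text for element in elements]
    return results


# Intended use (requires seleniumbase and Chrome, so it is left commented out):
#     from seleniumbase import Driver
#     driver = Driver(uc=True)
#     try:
#         print(collect_elements(driver, ["https://nowsecure.nl/#relax"], "h1"))
#     finally:
#         driver.quit()
```

Keeping the driver as a parameter makes it easy to reuse one browser session across all URLs rather than launching a new one per page.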
Amazing stuff - your library is the best library that I have seen so far, works even with bet365 - I have just tested it:
https://github.com/hansalemaos/bet365_web_scraping/blob/main/betscrape2.py
A datacenter IP will show a captcha regardless, even when using a normal Chrome browser. Nothing will fix that.
This method isn't working for me on discogs.com -- any advice @mdmintz?
from seleniumbase import SB
with SB(uc=True) as sb:
    sb.open("https://www.discogs.com/")
    sb.click('#log_in_link')
    sb.type('#username', username)
    sb.type('#password', password)
    # sb.submit('#password')
@lizfischer After you've gotten here...
from seleniumbase import SB
with SB(uc=True) as sb:
    sb.open("https://discogs.com")
    # ...
...then use https://github.com/asweigart/pyautogui to type text and click without invoking selenium commands in the browser, which would get you detected. SeleniumBase gets you through the front door to any site looking like a human. After that, if you need to perform actions while avoiding detection, use a tool like pyautogui to perform those actions from outside a web browser. (I'm not the pyautogui expert, so perhaps see that documentation and examples for using that.)
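As a rough illustration of that suggestion, the sketch below drives a login form with pyautogui's click(), write(), and press() functions, which send operating-system-level mouse and keyboard events rather than Selenium commands. The function name fill_login_form, the screen coordinates, and the assumption that Tab moves to the password field and Enter submits the form are all hypothetical and would need adjusting for the real page. The pyautogui module is passed in as a parameter so the logic can be exercised without a display.

```python
def fill_login_form(gui, username, password, username_xy, interval=0.1):
    """Click the username field, then type credentials via OS-level input events.

    `gui` is expected to be the imported `pyautogui` module.
    """
    gui.click(x=username_xy[0], y=username_xy[1])  # focus the username field
    gui.write(username, interval=interval)         # type with a pause between keys
    gui.press("tab")                               # assumes Tab reaches the password field
    gui.write(password, interval=interval)
    gui.press("enter")                             # assumes Enter submits the form


# Intended use (needs a visible, focused browser window; coordinates are made up):
#     import pyautogui
#     fill_login_form(pyautogui, "my_user", "my_password", username_xy=(640, 360))
```

Because pyautogui acts on whatever window has focus, this only works with a non-headless browser that stays in the foreground while the function runs.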
You might want to try this. It is working on my PC as you can see here: https://www.youtube.com/watch?v=0H-BYbU8Gkg
import random
from time import sleep
from a_selenium_better_sendkeys import send_keys_alternative
from seleniumbase import Driver
import pandas as pd
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.support.wait import WebDriverWait
from a_selenium2df import get_df
from PrettyColorPrinter import add_printer
from myusernameandpass import user, password
def get_dataframe():
    df = pd.DataFrame()
    while df.empty:
        df = get_df(
            driver,
            By,
            WebDriverWait,
            expected_conditions,
            queryselector="*",
            with_methods=True,
        )
    return df
add_printer(1)
driver = Driver(uc=True)
driver.get("https://www.discogs.com/")
df = get_dataframe()
df.loc[df.aa_innerHTML.str.contains("log_in_link", na=False, regex=True)].iloc[
    -1
].se_click()
df = get_dataframe()
df2 = df.loc[df.aa_localName.str.contains("input")]
for q in user:
    send_keys_alternative(driver, df2.element.iloc[0], q)
    sleep(random.uniform(0.05, 0.1))
for q in password:
    send_keys_alternative(driver, df2.element.iloc[1], q)
    sleep(random.uniform(0.05, 0.1))
df.loc[df.aa_innerText.str.contains("Log in", na=False)].iloc[-1].se_click()
I am using SeleniumBase and 3 modules that I wrote:
https://github.com/hansalemaos/a_selenium_better_sendkeys (faster, more reliable send_keys)
https://github.com/hansalemaos/a_selenium2df (gets all webelements and their attributes/properties in one go [I hate using selectors - pandas is much better hahaha])
https://github.com/hansalemaos/PrettyColorPrinter (make the DataFrame prettier)
If it is not working, your IP might be blacklisted.
It gives me this error:
File "/home/benslimane/callcenterEPR/django/.venv/lib/python3.10/site-packages/selenium/webdriver/remote/errorhandler.py", line 229, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: cannot connect to chrome at 127.0.0.1:9222
from chrome not reachable
Stacktrace:
@NourEssalam Looks like either Chrome isn't installed, or you ran a different script. That stack trace is coming from raw selenium, not seleniumbase. (I would need to see a seleniumbase stack trace to help.)
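To check the "Chrome isn't installed" possibility on a server or VPS, one option is to look for a Chrome-like executable on PATH from Python. This is a minimal sketch using the standard library; the list of executable names is an assumption and is not exhaustive (installs outside PATH, such as the default Windows and macOS locations, will not be found this way).

```python
import shutil

# Common executable names; this list is an assumption and is not exhaustive.
CANDIDATE_NAMES = (
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
    "chrome",
)


def find_chrome(names=CANDIDATE_NAMES):
    """Return the path of the first Chrome-like executable on PATH, or None."""
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None


print(find_chrome() or "No Chrome/Chromium executable found on PATH")
```

A missing browser is only one possible cause of "chrome not reachable", so a path being printed here does not by itself rule out other problems.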
Thanks for the example, but I have a problem with it. If I run this script without any VPN connection, it works and bypasses Cloudflare, but if a VPN is enabled, it can't bypass it. Is there a solution for this? By the way, I am using the same VPN connections in my local browsers, and they don't cause any problems with Cloudflare.
@msaidztrk If the VPN is causing issues, you can change proxy settings to get around that.
There's a proxy option:
proxy=None, # Use proxy. Format: "SERVER:PORT" or "USER:PASS@SERVER:PORT".
sbase get chromedriver 116
(I don't really know how to use it)
def seleniumbase_test(self):
    driver = Driver(uc=True)
    driver.get("https://nowsecure.nl/#relax")
    time.sleep(6)
    driver.quit()
Here is the stack trace sir
Traceback (most recent call last):
  File "/home/benslimane/callcenterEPR/django/manage.py", line 22, in <module>
    scraper.seleniumbase_test()
  File "/home/benslimane/callcenterEPR/django/modules/scraping/BiziqueScraper.py", line 53, in seleniumbase_test
    driver = Driver(uc=True)
  File "/home/benslimane/callcenterEPR/django/.venv/lib/python3.10/site-packages/seleniumbase/plugins/driver_manager.py", line 425, in Driver
    driver = browser_launcher.get_driver(
  File "/home/benslimane/callcenterEPR/django/.venv/lib/python3.10/site-packages/seleniumbase/core/browser_launcher.py", line 1365, in get_driver
    return get_local_driver(
  File "/home/benslimane/callcenterEPR/django/.venv/lib/python3.10/site-packages/seleniumbase/core/browser_launcher.py", line 3227, in get_local_driver
    driver = undetected.Chrome(
  File "/home/benslimane/callcenterEPR/django/.venv/lib/python3.10/site-packages/seleniumbase/undetected/__init__.py", line 304, in __init__
    super().__init__(options=options, service=service)
  File "/home/benslimane/callcenterEPR/django/.venv/lib/python3.10/site-packages/selenium/webdriver/chrome/webdriver.py", line 45, in __init__
    super().__init__(
  File "/home/benslimane/callcenterEPR/django/.venv/lib/python3.10/site-packages/selenium/webdriver/chromium/webdriver.py", line 56, in __init__
    super().__init__(
  File "/home/benslimane/callcenterEPR/django/.venv/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 206, in __init__
    self.start_session(capabilities)
  File "/home/benslimane/callcenterEPR/django/.venv/lib/python3.10/site-packages/seleniumbase/undetected/__init__.py", line 430, in start_session
    super().start_session(capabilities)
  File "/home/benslimane/callcenterEPR/django/.venv/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 290, in start_session
    response = self.execute(Command.NEW_SESSION, caps)["value"]
  File "/home/benslimane/callcenterEPR/django/.venv/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 345, in execute
    self.error_handler.check_response(response)
  File "/home/benslimane/callcenterEPR/django/.venv/lib/python3.10/site-packages/selenium/webdriver/remote/errorhandler.py", line 229, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: cannot connect to chrome at 127.0.0.1:9222
from chrome not reachable
Stacktrace:
Thanks for your response, but can you help me with the line of code you gave me? Where should I put that line in my main Python code? Also, I am using VPNs via Chrome extensions. I installed the VPN in the Chrome driver and tried again after enabling it, and it won't bypass Cloudflare with the VPN on.
@msaidztrk For UC Mode with proxy:
from seleniumbase import Driver
import time
driver = Driver(uc=True, proxy="USER:PASS@IP:PORT") # With connection details
driver.get("https://nowsecure.nl/#relax")
time.sleep(6)
driver.quit()
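The two proxy string formats quoted above ("SERVER:PORT" and "USER:PASS@SERVER:PORT") can be assembled with a small helper so that credentials are not hand-typed into the string. This is only a sketch: build_proxy is a hypothetical name, and the address and credentials shown are placeholders.

```python
def build_proxy(server, port, user=None, password=None):
    """Build a proxy string in "SERVER:PORT" or "USER:PASS@SERVER:PORT" form."""
    if (user is None) != (password is None):
        raise ValueError("user and password must be given together")
    address = f"{server}:{port}"
    if user is None:
        return address
    return f"{user}:{password}@{address}"


# Placeholder values for illustration only.
print(build_proxy("203.0.113.5", 8080))
print(build_proxy("203.0.113.5", 8080, "alice", "s3cret"))
```

The result would then be passed as the proxy argument, for example Driver(uc=True, proxy=build_proxy("203.0.113.5", 8080, "alice", "s3cret")).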
@mdmintz Hello, is it possible to use it as multiple instances ? I tried to use threads but it doesn't work.
from seleniumbase import SB
with SB(uc=True) as sb:
    sb.open("https://discogs.com")
    # ...
@mdmintz Thanks for your reply again, but I guess it won't work in my case. I don't specify a proxy myself. After the driver opens Chrome, I install a VPN extension from the Chrome store, which then automatically connects to a proxy provided by the VPN plugin. So I don't know, and can't know, the details of the proxy I'll be connected to. And with any VPN connection, I can't bypass any Cloudflare protection.
@mdmintz Is there no solution for my last response?
CF is detecting Selenium.
some help? It's detecting from the first driver.get.
I'm using:
python 3.8
selenium 11.2
Chrome 116
UC with jdholtz PR
Thanks.