seleniumbase / SeleniumBase

📊 Blazing fast Python framework for web crawling, scraping, testing, and reporting. Supports pytest. Stealth abilities: UC Mode and CDP Mode.
https://seleniumbase.io
MIT License

Major updates have arrived in `4.28.0` (mostly for UC Mode) #2865

Open mdmintz opened 5 months ago

mdmintz commented 5 months ago

For anyone that hasn't been following https://github.com/seleniumbase/SeleniumBase/issues/2842, CF pushed an update that prevented UC Mode from easily bypassing CAPTCHA Turnstiles on Linux servers. Additionally, uc_click() was rendered ineffective for clicking Turnstile CAPTCHA checkboxes when clicking the checkbox was required. I've been working on solutions to these situations.

As I mentioned earlier in https://github.com/seleniumbase/SeleniumBase/issues/2842#issuecomment-2176310108, if CF detects either Selenium in the browser or JavaScript involvement in clicking the CAPTCHA, then they don't let the click through. (The JS-detection part is new.) I read online that CF employees borrowed ideas from https://github.com/kaliiiiiiiiii/brotector (a Selenium detector) in order to improve their CAPTCHA. Naturally, I was skeptical at first, but I have confirmed that the two algorithms do appear to get similar results. (Brotector was released 6 weeks ago, while the Cloudflare update happened 2 weeks ago.)

The solution to bypassing the improved CAPTCHAs requires using pyautogui to stay undetected. There was also the matter of how to make pyautogui work well on headless Linux servers. (Thanks to some ideas by @EnmeiRyuuDev in https://github.com/seleniumbase/SeleniumBase/issues/2842#issuecomment-2168829685, that problem was overcome by setting pyautogui._pyautogui_x11._display to Xlib.display.Display(os.environ['DISPLAY']) on Linux in order to sync up pyautogui with the X11 virtual display.)
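The display sync described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not SeleniumBase's actual implementation: the helper name `sync_pyautogui_with_display` is hypothetical, `pyautogui` and `python-xlib` are assumed to be installed, and `DISPLAY` is assumed to name the running X11 virtual display. Since `_pyautogui_x11._display` is a private attribute, it may change between pyautogui releases.

```python
import os
import sys


def sync_pyautogui_with_display():
    """Point pyautogui at the X11 display named by $DISPLAY (Linux only).

    Returns True if the sync was performed, False if it was skipped.
    (Hypothetical helper name, for illustration only.)
    """
    if "linux" not in sys.platform or not os.environ.get("DISPLAY"):
        return False  # Skip on macOS/Windows, or when no display is set
    # Imported lazily: on Linux, both imports need a reachable X11 display.
    import Xlib.display
    import pyautogui

    # Private attribute, as named in the comment above
    pyautogui._pyautogui_x11._display = Xlib.display.Display(
        os.environ["DISPLAY"]
    )
    return True
```

The helper would be called once, after the virtual display has started and before any pyautogui action is sent.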

The improved SeleniumBase UC Mode will have these new methods:

driver.uc_gui_press_key(key)  # Use PyAutoGUI to press the keyboard key

driver.uc_gui_press_keys(keys)  # Use PyAutoGUI to press a list of keys

driver.uc_gui_write(text)  # Similar to uc_gui_press_keys(), but faster

driver.uc_gui_handle_cf(frame="iframe")  # PyAutoGUI click CF Turnstile

It'll probably be easier to understand how those work via examples. Here's one for uc_gui_handle_cf based on the example in https://github.com/seleniumbase/SeleniumBase/issues/2842#issuecomment-2159004018:

import sys
from seleniumbase import SB

agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/126.0.0.0"
if "linux" in sys.platform:
    agent = None  # Use the default UserAgent

with SB(uc=True, test=True, rtf=True, agent=agent) as sb:
    url = "https://www.virtualmanager.com/en/login"
    sb.uc_open_with_reconnect(url, 4)
    sb.uc_gui_handle_cf()  # Ready if needed!
    sb.assert_element('input[name*="email"]')
    sb.assert_element('input[name*="login"]')
    sb.set_messenger_theme(location="bottom_center")
    sb.post_message("SeleniumBase wasn't detected!")

Above, I deliberately gave it an incomplete UserAgent so that CAPTCHA-clicking is required to advance. On macOS and Windows, the default UserAgent that SeleniumBase gives you is already enough to bypass the CAPTCHA screen entirely. The uc_gui_handle_cf() method is designed so that if there's no CAPTCHA that needs to be clicked on the current page, nothing happens. Therefore, you can add the line whether or not you expect to encounter a CAPTCHA. If there's more than one iframe on a website, you can specify the CSS Selector of the iframe as an arg when calling uc_gui_handle_cf(). There will be new examples in the SeleniumBase/examples/ folder for all the new UC Mode methods. To sum up, you may need to use the newer uc_gui_* methods in order to get past some CAPTCHAs on Linux where uc_click() worked previously.
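The platform check from the example can be isolated into a small helper, which makes the intent easier to see and to test. This is a sketch only: `choose_agent` and `INCOMPLETE_AGENT` are hypothetical names, and the UserAgent string is the incomplete one from the example.

```python
import sys

# Deliberately incomplete UserAgent from the example (forces a CAPTCHA)
INCOMPLETE_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/126.0.0.0"


def choose_agent(platform=sys.platform):
    """Return None on Linux (keep the default UserAgent), else the test agent."""
    if "linux" in platform:
        return None  # Let SeleniumBase set its default UserAgent
    return INCOMPLETE_AGENT
```

The result would be passed as `SB(uc=True, agent=choose_agent())`. For pages with several iframes, the CSS Selector goes in the `frame` argument, whose default is shown in the method list as `uc_gui_handle_cf(frame="iframe")`.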

On the topic of Brotector (the open source bot-detector library that CF borrowed ideas from), there is a huge opportunity: now that effective bot-detection software is available to the general public (all the code is open source!), anyone can build their own CAPTCHA services (or just add CAPTCHAs to sites without the "service" part). I've already jumped on this with the Brotector CAPTCHA: https://seleniumbase.io/apps/brotector. I've also created a few test sites that utilize it:

I did make some improvements to the original Brotector algorithm in order to be suitable for CAPTCHAs: I needed a definite Allow/Block answer, rather than a number between 0 and 1 determining the likelihood of a bot, etc. I've been using these new test sites for testing the improved UC Mode.
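One simple way to turn a likelihood score into a definite Allow/Block answer is a fixed threshold. The sketch below only illustrates that idea: the function name, the 0.5 cutoff, and the choice to block on out-of-range scores are all assumptions, and nothing here describes how the Brotector CAPTCHA actually makes its decision.

```python
def allow_or_block(bot_score, threshold=0.5):
    """Map a bot-likelihood score in [0, 1] to "Allow" or "Block".

    The threshold is a hypothetical value chosen for illustration.
    """
    if not 0.0 <= bot_score <= 1.0:
        return "Block"  # Fail closed on out-of-range scores (an assumption)
    return "Block" if bot_score >= threshold else "Allow"
```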

That covers the major updates from 4.28.0 (with the exception of Brotector CAPTCHA test sites, which were already available to the public at the URLs listed above).

There will also be some other improvements:

Now, when using UC Mode on Linux, the default setting is NOT using headless mode. If for some reason you decide to use UC Mode and Headless Mode together, note that although Chrome will launch, you'll definitely be detected by anti-bots, and on top of that, pyautogui methods won't work. Use xvfb=True / --xvfb in order to be sure that the improved X11 virtual display on Linux activates. You'll need that for the uc_gui_* methods to work properly.

Much of that will get covered in the 3rd UC Mode video tutorial on YouTube (expected sometime in the near future).

In case anyone has forgotten, SeleniumBase is still a Test Automation Framework at heart (one that includes an extremely popular stealth feature called "UC Mode"). UC Mode has gathered a lot of the attention, but SeleniumBase is more than just that.

mdmintz commented 5 months ago

4.28.0 has been released: https://github.com/seleniumbase/SeleniumBase/releases/tag/v4.28.0

The pyautogui example for a Cloudflare page with UC Mode:

Examples of bypassing the Brotector CAPTCHA with UC Mode:

Examples of how the Brotector CAPTCHA detects regular Selenium:

mdmintz commented 5 months ago

Here's an example script for Linux to prove it's working:

from seleniumbase import SB

with SB(uc=True, test=True) as sb:
    url = "https://www.virtualmanager.com/en/login"
    sb.uc_open_with_reconnect(url, 4)
    print(sb.get_page_title())
    sb.uc_gui_handle_cf()  # Ready if needed!
    print(sb.get_page_title())
    sb.assert_element('input[name*="email"]')
    sb.assert_element('input[name*="login"]')
    sb.set_messenger_theme(location="bottom_center")
    sb.post_message("SeleniumBase wasn't detected!")

The second print() should show "Virtual Manager", which means that the automation was able to get past the Turnstile.

vmolostvov commented 5 months ago

@mdmintz Same problem here on a Linux VDS (Ubuntu without GPU): SeleniumBase became unable to bypass the Cloudflare challenge. Using the latest sb version. On local macOS and Windows, it keeps working without any problem.

I can confirm that my issue on headless Linux Ubuntu was solved by 4.28.0


Appreciate your work sir @mdmintz

SSujitX commented 5 months ago

The issue has been resolved after restarting my PC, but I didn't understand why this error happened.

It seems that after the latest update, the script has not opened any websites. This is the first time this issue has happened to me. The driver opens successfully but can not access the provided URL.

I updated PyPI and SeleniumBase, and even created a fresh virtualenv, but nothing changed. Is this an issue with Chrome at the ip:port 127.0.0.1:9222?


error:

=============================================== {Login_Test_all.py:3:SB} starts ===============================================
Traceback (most recent call last):
  File "c:\Users\ssuji\OneDrive\Desktop\All Codes\Python Development\VFX Tool Telegram 6.0\Login_Test_all.py", line 3, in <module>
    with SB(uc=True, test=True, rtf=True) as sb:
  File "C:\Users\ssuji\AppData\Local\Programs\Python\Python312\Lib\contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "C:\Users\ssuji\OneDrive\Desktop\All Codes\Python Development\VFX Tool Telegram 6.0\.venv\Lib\site-packages\seleniumbase\plugins\sb_manager.py", line 949, in SB
    sb.setUp()
  File "C:\Users\ssuji\OneDrive\Desktop\All Codes\Python Development\VFX Tool Telegram 6.0\.venv\Lib\site-packages\seleniumbase\fixtures\base_case.py", line 14838, in setUp
    self.driver = self.get_new_driver(
                  ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ssuji\OneDrive\Desktop\All Codes\Python Development\VFX Tool Telegram 6.0\.venv\Lib\site-packages\seleniumbase\fixtures\base_case.py", line 4037, in get_new_driver
    new_driver = browser_launcher.get_driver(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ssuji\OneDrive\Desktop\All Codes\Python Development\VFX Tool Telegram 6.0\.venv\Lib\site-packages\seleniumbase\core\browser_launcher.py", line 1841, in get_driver
    return get_local_driver(
           ^^^^^^^^^^^^^^^^^
  File "C:\Users\ssuji\OneDrive\Desktop\All Codes\Python Development\VFX Tool Telegram 6.0\.venv\Lib\site-packages\seleniumbase\core\browser_launcher.py", line 3784, in get_local_driver
    driver = undetected.Chrome(
             ^^^^^^^^^^^^^^^^^^
  File "C:\Users\ssuji\OneDrive\Desktop\All Codes\Python Development\VFX Tool Telegram 6.0\.venv\Lib\site-packages\seleniumbase\undetected\__init__.py", line 312, in __init__
    super().__init__(options=options, service=service_)
  File "C:\Users\ssuji\OneDrive\Desktop\All Codes\Python Development\VFX Tool Telegram 6.0\.venv\Lib\site-packages\selenium\webdriver\chrome\webdriver.py", line 45, in __init__
    super().__init__(
  File "C:\Users\ssuji\OneDrive\Desktop\All Codes\Python Development\VFX Tool Telegram 6.0\.venv\Lib\site-packages\selenium\webdriver\chromium\webdriver.py", line 66, in __init__
    super().__init__(command_executor=executor, options=options)
  File "C:\Users\ssuji\OneDrive\Desktop\All Codes\Python Development\VFX Tool Telegram 6.0\.venv\Lib\site-packages\selenium\webdriver\remote\webdriver.py", line 212, in __init__
    self.start_session(capabilities)
  File "C:\Users\ssuji\OneDrive\Desktop\All Codes\Python Development\VFX Tool Telegram 6.0\.venv\Lib\site-packages\seleniumbase\undetected\__init__.py", line 475, in start_session
    super().start_session(capabilities)
  File "C:\Users\ssuji\OneDrive\Desktop\All Codes\Python Development\VFX Tool Telegram 6.0\.venv\Lib\site-packages\selenium\webdriver\remote\webdriver.py", line 299, in start_session
    response = self.execute(Command.NEW_SESSION, caps)["value"]
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ssuji\OneDrive\Desktop\All Codes\Python Development\VFX Tool Telegram 6.0\.venv\Lib\site-packages\selenium\webdriver\remote\webdriver.py", line 354, in execute
    self.error_handler.check_response(response)
  File "C:\Users\ssuji\OneDrive\Desktop\All Codes\Python Development\VFX Tool Telegram 6.0\.venv\Lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 229, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.SessionNotCreatedException: Message: session not created: cannot connect to chrome at 127.0.0.1:9222
from chrome not reachable

goldananas commented 5 months ago

Hey @mdmintz, do you think you will be working on making the uc_gui_handle_cf method compatible with the returnable Driver? It works just fine on headless Linux with SB, but I can't find a way to do the same with the returnable Driver instead.

mdmintz commented 5 months ago

@goldananas If using the Driver() format instead of SB(), you'll need to spin up the special X11 virtual display yourself before launching the driver. (See https://github.com/seleniumbase/SeleniumBase/issues/2842#issuecomment-2168392303.)

With the SB() format, SB(uc=True, xvfb=True) does all that for you when running on Linux.

NCLnclNCL commented 4 months ago

I think they detect when we switch to the tickbox frame

mdmintz commented 4 months ago

Windows users should upgrade to 4.28.3 or newer (Fixes https://github.com/seleniumbase/SeleniumBase/issues/2889 on 4.28.2)

JimKarvo commented 4 months ago

It seems that CF has detected the new way of bypassing.

Sometimes the click works (not always), but after that, the checkbox fails.

mdmintz commented 4 months ago

macOS: ✅
Windows: ✅
Linux with natural GUI on residential IP: ✅
Linux without GUI on non-residential IP: ❌
Linux without GUI on residential IP: ⚠️ / ❓

So much for the free pass on GitHub Actions CAPTCHA bypassing. 😄 I didn't expect that loophole to last long.

JimKarvo commented 4 months ago

@mdmintz I forgot to mention that I am running an Ubuntu server, no GUI

mdmintz commented 4 months ago

@JimKarvo Residential IP or non-residential?

OpsecGuy commented 4 months ago

@mdmintz In my case I also have some issues with the bypass. This is with version 4.28.3 of seleniumbase. On Windows there are no issues. However, on my Linux (Ubuntu 20) VM with a GUI, I used https://github.com/seleniumbase/SeleniumBase/blob/master/examples/raw_pyautogui.py and only changed the URL to a website that shows the Cloudflare CAPTCHA on first connect. The same IP that successfully bypasses the CAPTCHA on Windows doesn't work on Linux with a GUI. On the bare-metal server where I have Ubuntu 22 installed, I'm also stuck on the CF CAPTCHA page, and experimenting with the reconnect timeout doesn't solve my issue.

My user agent on both Linux machines is: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36
Google Chrome version is 126.0.6478.126
I tested with my residential IP and with other residential proxies.

I tested the original code from raw_pyautogui.py and it looks like it worked, but not on any other website I test.

mdmintz commented 4 months ago

The last successful GitHub Actions run for bypassing Cloudflare's Turnstile was https://github.com/mdmintz/undetected-testing/actions/runs/9748457978/job/26903480495 8 hours ago. Likely their QA Team did not initially catch that their Turnstiles were getting bypassed on GitHub Actions until they came over to the SeleniumBase repo and read the notes.

gabrielsim commented 4 months ago

Linux without GUI on residential IP: ⚠️ / ❓

@mdmintz fyi, Linux without GUI on residential IPs still works for me

mdmintz commented 4 months ago

@gabrielsim That's good news: it means the algorithm works right now when the IP address hasn't already been blocked. When it worked earlier on GitHub Actions, it was due to a bug on Cloudflare's end, where they forgot to check IP ranges for known non-residential server addresses. They finally fixed it, likely after reading this thread and learning about the loophole.

No changes are needed for UC Mode at this time. However, Brotector still has some bot-checks that Cloudflare hasn't picked up yet. These would allow them to detect switching into an iframe, as well as the JavaScript for making an element the active one. There's already a plan in place for that scenario, involving pyautogui for more things, and not just clicking the active element.

OpsecGuy commented 4 months ago

That's very strange. However, on every website that I test now, the bypass seems to be working properly... I don't know what the CF team is doing, but I believe they are preparing another big update for us.

enricodvn commented 4 months ago

Hey guys, just to add to this discussion, CF is now detecting residential proxies with ML:

https://blog.cloudflare.com/residential-proxy-bot-detection-using-machine-learning

The residential proxies I used became pretty much useless since mid June :/

amberbor commented 4 months ago

Hey guys

Can somebody help me? I am trying on macOS to load dexscreener and bypass Cloudflare, but it doesn't work.

import time

from seleniumbase import SB

with SB(uc=True, xvfb=True) as sb:
    url = "https://dexscreener.com"
    sb.uc_open_with_reconnect(url, 4)
    print(sb.get_page_title())
    sb.uc_gui_handle_cf()  # Ready if needed!
    print(sb.get_page_title())

    time.sleep(70)

JimKarvo commented 4 months ago

Updates: Nothing to do with blocked IP or proxies.

I have scripts running on Windows machines (headed) on my home IP. I forward all traffic from the Ubuntu server through my home IP. The first 3-5 pages are OK. After those pages, the CF challenge appears in my browser. SeleniumBase still can't bypass it there, while on my Windows PC I can bypass it with no problems.

amberbor commented 4 months ago

https://github.com/seleniumbase/SeleniumBase/assets/47393618/4b6f266d-bcb9-4083-a68a-f4b5b8d3346d

mdmintz commented 4 months ago

@amberbor The best user-agent to use is the default one that SeleniumBase sets for you automatically.

amberbor commented 4 months ago

@amberbor The best user-agent to use is the default one that SeleniumBase sets for you automatically.

@mdmintz Thanks for your reply. The first time I ran the code, it was without a custom user agent, but the problem is that Chrome doesn't show the Cloudflare checkbox when I run this code. I provided the video so you can see that it doesn't show the checkbox; it just keeps loading.

Another example is this one, and I get the same output as in the video I sent:

from seleniumbase import SB

with SB(uc=True, incognito=True) as sb:
    url = "https://dexscreener.com"
    sb.uc_open_with_reconnect(url, 10)
    print(sb.get_page_title())
    sb.uc_gui_handle_cf()
    print(sb.get_page_title())

digicodexx commented 4 months ago

@amberbor, add sb.sleep(5) before the sb.uc_gui_handle_cf() line so it doesn't click the checkbox instantly.

amberbor commented 4 months ago

@amberbor, add sb.sleep(5) before the sb.uc_gui_handle_cf() line so it doesn't check the checkbox instantly.

@digicodexx Still the same issue: even if I open it in incognito mode, it doesn't show the checkbox. Here is the example with sb.sleep(5). Even if I set sb.sleep(10), it still doesn't show the checkbox.

https://github.com/seleniumbase/SeleniumBase/assets/47393618/551a027c-2fb9-4bc0-87c1-93504c80d2be

chlwodud77 commented 4 months ago

To bypass the recent Cloudflare update, I used the locateOnScreen method from pyautogui.

It works fine on an AWS Windows server, but it didn't work on an AWS Ubuntu server. 😥 (without GUI)

I'm attaching how I did it to help you guys bypass Cloudflare.

Please let me know if anyone has managed to pass in an AWS Ubuntu server environment. @mdmintz I hope this is helpful for bypassing the recent Cloudflare update.

import sys
from seleniumbase import SB

agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/126.0.0.0"
if "linux" in sys.platform:
    agent = None

with SB(test=True, uc=True, rtf=True, locale_code="en") as sb:
    url = "https://dexscreener.com/"
    sb.uc_open_with_disconnect(url, 3.5)
    # Custom method added by hand (not part of SeleniumBase itself)
    sb.uc_gui_click_screenshot(screenshot_path=r"C:/data/test/checkbox_en.png")

I used the screenshot below:
![checkbox_en](https://github.com/seleniumbase/SeleniumBase/assets/22047622/28feb3dc-aeee-4804-af95-4f2f30c452d1)

https://github.com/seleniumbase/SeleniumBase/assets/22047622/24c85e59-f71a-4647-a32b-54051cb74379

xbexbex commented 4 months ago

It seems that CF has detected the new way of bypassing.

Sometimes the click works (not always), but after that, the checkbox fails.

I have this exact same issue. In fact, even manually clicking the checkbox fails.

This is using a Linux virtual machine in WSL2. I can bypass the CF just fine using chrome in the Windows side. With the same IP, obviously.

jens4626 commented 4 months ago

To bypass recent cloudflare, I used locateOnScreen method within pyautogui.

It works fine in aws window server. But, didn't work in aws ubuntu server. 😥 (without GUI)

I attach how I did to help you guys bypass cloudflare.

Please let me know if anyone did pass within aws ubuntu server environment. @mdmintz I hope it would be helpful bypassing recent cloudflare.

  • Add uc_gui_click_screenshot method (SeleniumBase > seleniumbase > core > browser_launcher.py)
def uc_gui_click_screenshot(driver, screenshot_path):
    install_pyautogui_if_missing(driver)
    import pyautogui
    pyautogui = get_configured_pyautogui(pyautogui)
    gui_lock = fasteners.InterProcessLock(
        constants.MultiBrowser.PYAUTOGUILOCK
    )
    with gui_lock:
        button7location = pyautogui.locateOnScreen(screenshot_path)
        buttonx, buttony = pyautogui.center(button7location)
        pyautogui.click(buttonx, buttony) 
  • Execute with test code
from seleniumbase import SB
import sys

agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/126.0.0.0"
if "linux" in sys.platform:
    agent = None

with SB(test=True, uc=True, rtf=True, locale_code="en") as sb:
    url = "https://dexscreener.com/"
    sb.uc_open_with_disconnect(url, 3.5)
    sb.uc_gui_click_screenshot(screenshot_path=r"C:/data/test/checkbox_en.png")

Did use screenshot below checkbox_en

seleniumbase_test.mp4

I just noticed that uc_gui_handle_cf stopped working on Windows for me, so I'm hoping this solves my issue as well! Regarding the screenshot you used, will it still work in case the browser is in dark mode, or will it require another screenshot?

max-frai commented 4 months ago

I don't use this library, but just found this thread because my scripts also failed a few days ago. I'm using slightly different method of controlling everything, but yes, they started detecting any js related clicks, that's the main problem. For me residential proxies + ubuntu + Xvfb + fluxbox works okay. I had some problems with moving everything into docker container but it works without docker. So I have checkbox and I can't click it with code, but I can click it with mouse using remote desktop control x11vnc and it works. So the main problem for now is click detection.

chlwodud77 commented 4 months ago

@jens4626 Yes, I think you should use a screenshot taken with Chrome in dark mode.
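If both light-mode and dark-mode screenshots are kept on hand, each can be tried in turn. The following is a rough sketch of that idea, not part of SeleniumBase: `click_first_match` is a hypothetical helper, and `gui` is assumed to be the `pyautogui` module (or a stand-in exposing the same `locateOnScreen`, `center`, and `click` functions). It also covers the case where nothing is found, which the original `uc_gui_click_screenshot` code does not handle.

```python
def click_first_match(gui, screenshot_paths):
    """Click the center of the first screenshot found on screen.

    Returns the path that matched, or None if no screenshot was found.
    """
    # Newer pyautogui versions raise ImageNotFoundException when nothing
    # matches; older ones return None. Both cases are handled here.
    not_found = getattr(gui, "ImageNotFoundException", ())
    for path in screenshot_paths:
        try:
            location = gui.locateOnScreen(path)
        except not_found:
            location = None
        if location is not None:
            x, y = gui.center(location)
            gui.click(x, y)
            return path
    return None
```

A call might look like `click_first_match(pyautogui, ["checkbox_en.png", "checkbox_en_dark.png"])`, where the second file name is a made-up example of a dark-mode screenshot.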

JimKarvo commented 4 months ago

============================================================================== {delete.py:9:SB} starts ===============================================================================
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/seleniumbase/plugins/sb_manager.py", line 964, in SB
    yield sb
  File "/root/delete.py", line 12, in <module>
    sb.uc_gui_click_screenshot(screenshot_path="checkbox_en.png")
AttributeError: 'BaseCase' object has no attribute 'uc_gui_click_screenshot'
========================================================================== {delete.py:9:SB} failed in 8.01s ==========================================================================
Traceback (most recent call last):
  File "/root/delete.py", line 12, in <module>
    sb.uc_gui_click_screenshot(screenshot_path="checkbox_en.png")
AttributeError: 'BaseCase' object has no attribute 'uc_gui_click_screenshot'

Maybe I have to register the function somewhere else?

chlwodud77 commented 4 months ago

For it to work properly, you need to add something else.

  1. SeleniumBase/seleniumbase/core/browser_launcher.py: Add the code below in the get_local_driver method (after line 4039)

    driver.uc_gui_click_screenshot = (
        lambda *args, **kwargs: uc_gui_click_screenshot(
            driver, *args, **kwargs
        )
    )

  2. SeleniumBase/seleniumbase/fixtures/base_case.py: Add the code below in the get_new_driver method (after line 4184)

    if hasattr(new_driver, "uc_gui_click_screenshot"):
        self.uc_gui_click_screenshot = new_driver.uc_gui_click_screenshot

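The two registration steps can be reproduced in isolation to show why the AttributeError appears without them. In this sketch, `FakeDriver`, `FakeBaseCase`, and the stub `uc_gui_click_screenshot` are stand-ins invented for illustration; they only mimic the shape of the real objects and do not touch SeleniumBase.

```python
def uc_gui_click_screenshot(driver, screenshot_path):
    """Stand-in for the module-level helper; it only records the call."""
    return (driver.name, screenshot_path)


class FakeDriver:
    """Stand-in for the driver object created in get_local_driver()."""
    name = "driver"


class FakeBaseCase:
    """Stand-in for BaseCase, which copies known methods off the driver."""

    def attach(self, new_driver):
        if hasattr(new_driver, "uc_gui_click_screenshot"):
            self.uc_gui_click_screenshot = new_driver.uc_gui_click_screenshot


driver = FakeDriver()
# Step 1: bind the module-level function to this driver instance.
driver.uc_gui_click_screenshot = lambda *args, **kwargs: (
    uc_gui_click_screenshot(driver, *args, **kwargs)
)

sb = FakeBaseCase()
# Step 2: expose the driver's method on the sb object.
sb.attach(driver)
result = sb.uc_gui_click_screenshot(screenshot_path="checkbox_en.png")
```

Skipping either step leaves `sb` without the attribute, which matches the `'BaseCase' object has no attribute 'uc_gui_click_screenshot'` error reported above.
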
mdmintz commented 4 months ago

Clicking via screenshot isn't needed since clicking via pyautogui spacebar works.

mike007-1 commented 4 months ago

It does not work anymore on Windows and Linux with a VPN / datacenter IP. It seems each action using "driver" is being detected now. The solution provided by chlwodud77 works fine on Windows without using multiprocessing.

mdmintz commented 4 months ago

Then you would use something like:

    sb.uc_open_with_disconnect(url, 4)
    sb.uc_gui_press_keys("\t" + " ")
    sb.reconnect(3)

There, you open a URL and stay disconnected. Then press tab and spacebar with pyautogui. And then reconnect / connect after that. No Selenium involved. There was already an example of that in case that scenario happened: SeleniumBase/examples/raw_hobbit.py. The only problem would be multithreading, because changing the active window would affect the click.
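For the multithreading caveat, one partial mitigation is to serialize GUI input so that only one thread sends keys or clicks at a time. The sketch below uses a plain `threading.Lock`; `GUI_LOCK` and `run_gui_action` are hypothetical names. A thread lock does not cover separate processes (the `uc_gui_click_screenshot` code earlier in this thread uses `fasteners.InterProcessLock` for that), and it does not bring the right browser window to the foreground by itself.

```python
import threading

# One lock shared by every thread that sends keyboard or mouse input
GUI_LOCK = threading.Lock()


def run_gui_action(action, *args):
    """Run a GUI action while holding the lock, so actions never overlap."""
    with GUI_LOCK:
        return action(*args)
```

Each thread would wrap its input in a call such as `run_gui_action(sb.uc_gui_press_keys, "\t" + " ")`.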

chlwodud77 commented 4 months ago

It seems that pressing tab + spacebar, both with pyautogui and by hand, doesn't work on Cloudflare anymore. 😥

I tried both cases without using seleniumbase on an AWS Windows server. (Windows / non-residential IP)

https://github.com/seleniumbase/SeleniumBase/assets/22047622/ccc5da39-b271-4ffe-89a3-8fa5e29ec640

https://github.com/seleniumbase/SeleniumBase/assets/22047622/11e04040-9df6-485c-a3d9-a97ba631aff9

mdmintz commented 4 months ago

Let's all standardize on the same test site so that we can compare notes more easily: Use https://www.virtualmanager.com/en/login for all examples. Show code. Eg:

from seleniumbase import SB

with SB(uc=True, test=True) as sb:
    url = "https://www.virtualmanager.com/en/login"
    sb.uc_open_with_reconnect(url, 4)
    sb.uc_gui_handle_cf()  # Ready if needed!
    sb.assert_element('input[name*="email"]')

chlwodud77 commented 4 months ago

Oh, okay.

I tried tab + spacebar, and it didn't work on https://www.virtualmanager.com/en/login (Windows / non-residential IP).

from seleniumbase import SB

with SB(uc=True, test=True) as sb:
    url = "https://www.virtualmanager.com/en/login"
    sb.uc_open_with_disconnect(url, 4)
    sb.uc_gui_press_keys("\t" + " ")
    sb.reconnect(3)

https://github.com/seleniumbase/SeleniumBase/assets/22047622/fcedbf04-f94f-45a3-97e5-d543dd415e8c

mdmintz commented 4 months ago

@chlwodud77 And did a different click work for you? Both the uc_gui_handle_cf() and the tab+spacebar ways worked for me on both macOS and Windows (Residential IP).

chlwodud77 commented 4 months ago

@mdmintz no, using uc_gui_handle_cf() also didn't work on Windows (non-residential IP)

from seleniumbase import SB

with SB(uc=True, test=True) as sb:
    url = "https://www.virtualmanager.com/en/login"
    sb.uc_open_with_reconnect(url, reconnect_time=2)
    sb.uc_gui_handle_cf()
    sb.assert_element("img#captcha-success", timeout=3)
    sb.set_messenger_theme(location="top_left")
    sb.post_message("SeleniumBase wasn't detected", duration=3)

https://github.com/seleniumbase/SeleniumBase/assets/22047622/d30eb10f-4c3a-4396-af95-5f1bcea80131

mdmintz commented 4 months ago

@chlwodud77 I would imagine that non-residential IPs have already been identified and flagged, which would cause all attempts to bypass those CAPTCHAs to fail, unless there is some evidence to indicate that those CAPTCHAs can still be bypassed by other means. Hence the reason for residential proxies being used from non-residential IPs.
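As a rough illustration of routing the same test through a residential proxy, here is a sketch that uses the `proxy` argument of `SB()` with the "USER:PASS@SERVER:PORT" format. The helper names `build_proxy_string` and `open_login_via_proxy` are hypothetical, and whether a given proxy passes depends on how its IP has been classified:

```python
def build_proxy_string(host, port, user=None, password=None):
    """Return "HOST:PORT", or "USER:PASS@HOST:PORT" if credentials are given."""
    if user and password:
        return f"{user}:{password}@{host}:{port}"
    return f"{host}:{port}"


def open_login_via_proxy(proxy):
    # Imported here so build_proxy_string() can be used without SeleniumBase.
    from seleniumbase import SB

    with SB(uc=True, test=True, proxy=proxy) as sb:
        url = "https://www.virtualmanager.com/en/login"
        sb.uc_open_with_reconnect(url, 4)
        sb.uc_gui_handle_cf()  # Ready if needed!
        sb.assert_element('input[name*="email"]')
```

Usage would look like `open_login_via_proxy(build_proxy_string("proxy.example.com", 8080, "user", "pass"))`, where the host, port, and credentials are placeholders.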

Allmight3 commented 4 months ago

@chlwodud77 I used your code below and it successfully worked for me:

from seleniumbase import SB

with SB(uc=True, test=True) as sb:
    url = "https://www.virtualmanager.com/en/login"
    sb.uc_open_with_disconnect(url, 4)
    sb.uc_gui_press_keys("\t" + " ")
    sb.reconnect(3)

I used it without a proxy and it instantly passed. I then used a mobile proxy and had to adjust the disconnect/reconnect timings, but it also passed. So, assuming you're completely up to date, perhaps the issue is your proxy or something else.
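Adjusting the disconnect/reconnect timings by hand can be replaced with a small retry loop. This is only a sketch: `open_with_timing_retries` and the timing pairs are made-up values for illustration, and `sb` is expected to be the object from `with SB(uc=True) as sb:`.

```python
def open_with_timing_retries(sb, url, selector,
                             timings=((4, 3), (6, 4), (8, 5))):
    """Try the disconnect / press keys / reconnect flow with growing waits.

    Returns the (disconnect, reconnect) pair that worked, or None if the
    expected element never became visible.
    """
    for disconnect_time, reconnect_time in timings:
        sb.uc_open_with_disconnect(url, disconnect_time)
        sb.uc_gui_press_keys("\t" + " ")
        sb.reconnect(reconnect_time)
        if sb.is_element_visible(selector):
            return (disconnect_time, reconnect_time)
    return None
```

For the test site above, `selector` could be `'input[name*="email"]'`. Slower proxies would likely need the later, longer pairs.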

chlwodud77 commented 4 months ago

@Allmight3 Without a proxy, were you on a residential IP or a non-residential IP?

Allmight3 commented 4 months ago

@chlwodud77 Without a proxy, I have a residential IP address from Korea, and it passed automatically without any need for clicking. With a mobile USA proxy, it did require the click, which the code did automatically, and it passed.

jens4626 commented 4 months ago

> @chlwodud77 I would imagine that non-residential IPs have already been identified and flagged, which would cause all attempts to bypass those CAPTCHAs to fail, unless there is some evidence to indicate that those CAPTCHAs can still be bypassed by other means. Hence the reason for residential proxies being used from non-residential IPs.

By that do you mean that even if we click manually it should fail?

I tested:

from seleniumbase import SB

with SB(uc=True, test=True) as sb:
    url = "https://www.virtualmanager.com/en/login"
    sb.uc_open_with_reconnect(url, 4)
    sb.uc_gui_handle_cf()  # Ready if needed!
    sb.assert_element('input[name*="email"]')

Failed

https://github.com/seleniumbase/SeleniumBase/assets/45258332/3bf62b41-9ce6-4977-beac-a06bed70565b

from seleniumbase import SB

with SB(uc=True, test=True) as sb:
    url = "https://www.virtualmanager.com/en/login"
    sb.uc_open_with_reconnect(url, 4)
    # (At this point, I clicked the checkbox manually.)

Bypass

https://github.com/seleniumbase/SeleniumBase/assets/45258332/e0001c03-f4b6-4ab5-9126-a66a201d1e42

mdmintz commented 4 months ago

@jens4626 If you want to perform manual actions on that page, then the driver has to be disconnected.

from seleniumbase import SB

with SB(uc=True, test=True) as sb:
    url = "https://www.virtualmanager.com/en/login"
    sb.uc_open_with_reconnect(url, "breakpoint")

Type c and press Enter to continue from the breakpoint.

That was in the UC Mode docs.

jens4626 commented 4 months ago

> @jens4626 If you want to perform manual actions on that page, then the driver has to be disconnected.
>
> from seleniumbase import SB
>
> with SB(uc=True, test=True) as sb:
>     url = "https://www.virtualmanager.com/en/login"
>     sb.uc_open_with_reconnect(url, "breakpoint")
>
> Type c and press Enter to continue from the breakpoint.
>
> That was in the UC Mode docs.

I don't want to perform manual actions. I just wanted to show that uc_open_with_reconnect on a non-residential IP doesn't get flagged by itself, since clicking the checkbox manually gets through. So it's uc_gui_handle_cf that gets detected.

mdmintz commented 4 months ago

In case anyone was wondering where my top stalkers are from:

Screenshot

Yeah... they're watching. 👀

max-frai commented 4 months ago

https://github.com/seleniumbase/SeleniumBase/assets/24208746/8cadeeb7-b0a3-408c-8d99-8aae84d95ea7

I will not reveal all the cards, but it's possible. We managed to bypass everything. We don't disconnect the CDP session, and we don't use image pattern matching to find the checkbox. We also have custom recorded mouse movements, which we slightly modify each time and replay (that doesn't seem to be required for Cloudflare, but it's useful for reCAPTCHA).

The only problem is running Chrome inside Docker. Docker would make everything easier to manage, but I wasn't able to figure out what exactly they are detecting and what's leaking.

So for me, Ubuntu + headful + ISP/residential IPs works reliably now.
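The idea of replaying a recorded mouse path with small changes each time could look roughly like the sketch below. This is not the implementation described above, which was not shared. `jitter_path` and `replay_path` are hypothetical helpers, and the offset and duration values are arbitrary:

```python
import random


def jitter_path(points, max_offset=3, rng=None):
    """Return a copy of points with each (x, y) shifted by up to max_offset px."""
    rng = rng or random.Random()
    return [
        (x + rng.randint(-max_offset, max_offset),
         y + rng.randint(-max_offset, max_offset))
        for x, y in points
    ]


def replay_path(points, step_duration=0.05):
    """Move the real mouse cursor through each point using PyAutoGUI."""
    # Imported here so jitter_path() works without PyAutoGUI or a display.
    import pyautogui

    for x, y in points:
        pyautogui.moveTo(x, y, duration=step_duration)
```

A recorded path would be a list of screen coordinates such as `[(100, 100), (120, 110), (150, 130)]`, passed through `jitter_path()` before each `replay_path()` call.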

OpsecGuy commented 4 months ago

> I will not reveal all the cards, but it's possible. We managed to bypass everything. We don't disconnect the CDP session, and we don't use image pattern matching to find the checkbox. We also have custom recorded mouse movements, which we slightly modify each time and replay (that doesn't seem to be required for Cloudflare, but it's useful for reCAPTCHA).
>
> The only problem is running Chrome inside Docker. Docker would make everything easier to manage, but I wasn't able to figure out what exactly they are detecting and what's leaking.
>
> So for me, Ubuntu + headful + ISP/residential IPs works reliably now.

gj, but bypassing their captcha through their callback is still the best option :)