g1879 / DrissionPage

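A Python-based web automation tool. It can both control browsers and send and receive data packets, combining the convenience of browser automation with the efficiency of requests. It is powerful, with many user-friendly designs and convenient features built in. The syntax is concise and elegant, and little code is needed.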
https://drissionpage.cn
BSD 3-Clause "New" or "Revised" License

Can't bypass CF with DrissionPage #294

Closed SaberTawfiq closed 1 week ago

SaberTawfiq commented 3 weeks ago

I can't bypass Cloudflare on this website: https://visa.vfsglobal.com/egy/en/hrv/login

Here is my code. I need help.

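import os
import subprocess
import time
from datetime import datetime
from time import sleep

import colorama
import psutil
from DrissionPage import ChromiumOptions, ChromiumPage

# filtered_df, excluded_words, excel_file_path, check_notification,
# remove_destraction and login are not shown in this post.


class CloudflareBypasser: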
    def __init__(self, driver: ChromiumPage):
        self.driver = driver

    def clickCycle(self):
        try:
            if self.driver.wait.ele_displayed('#turnstile-wrapper',timeout=1.5):
                sleep(1.5)
                self.driver.ele("#turnstile-wrapper", timeout=2.5).click()
                current_time = datetime.now().strftime("%I:%M:%S %p")
                print(colorama.Fore.YELLOW + f"{current_time} - Verify you are human Found bypass...")
        except Exception as e:
            current_time = datetime.now().strftime("%I:%M:%S %p")
            print(f"Error details: {e}")
            print(colorama.Fore.RESET + f"{current_time} - Verify you are human. Not Found")

    def bypass(self):
        time.sleep(2)
        # A click may be enough to bypass the captcha, if your IP is clean.
        # I haven't seen a captcha that requires more than 3 clicks.
        time.sleep(2)
        self.clickCycle()
# THE MAIN PROCESS
def main_process():
    current_process_id = os.getpid()
    MAX_RETRIES = 10000  # Set the maximum number of retries
    RETRY_DELAY = 2  # Set the delay between retries in seconds
    retry_count = 0
    error_log_file = "error_log.txt"
    for proc in psutil.process_iter(['pid', 'name']):
        if 'python.exe' in proc.info['name'].lower() and proc.info['pid'] != current_process_id:
            subprocess.run(f"taskkill /f /pid {proc.info['pid']}", shell=True, check=True)

    if len(filtered_df) == 0 or all(filtered_df["status"].isin(excluded_words)):
        print("The Excel file is empty or all lines contain filtered words. Closing Python and browser.")
        quit()

    while retry_count < MAX_RETRIES:
        try:
            # Chromium Browser Path
            browser_path = "/usr/bin/google-chrome"
            options = ChromiumOptions()
            options.set_paths(browser_path=browser_path)
            # Some arguments to make the browser better for automation and less detectable.
            arguments = [
                "-no-first-run",
                "-force-color-profile=srgb",
                "-metrics-recording-only",
                "-password-store=basic",
                "-use-mock-keychain",
                "-export-tagged-pdf",
                "-no-default-browser-check",
                "-disable-background-mode",
                "-enable-features=NetworkService,NetworkServiceInProcess,LoadCryptoTokenExtension,PermuteTLSExtensions",
                "-disable-features=FlashDeprecationWarning,EnablePasswordsAccountStorage",
                "-deny-permission-prompts",
                "-disable-gpu",
            ]

            for argument in arguments:
                options.set_argument(argument)

            driver = ChromiumPage(addr_or_opts=options)
            #driver.set.window.size(1366, 768)
            driver.get('https://visa.vfsglobal.com/egy/en/hrv/login')
            current_time = datetime.now().strftime("%I:%M:%S %p")
            print(colorama.Fore.RESET + f"{current_time} - Site: VFS Global OPEN LOGIN driver")
            print("Title of the page: ", driver.title)
            check_notification(driver, excel_file_path)
            remove_destraction(driver, excel_file_path)
            cf_bypasser = CloudflareBypasser(driver)
            cf_bypasser.bypass()
            #captcha(driver, excel_file_path)
            check_notification(driver, excel_file_path)
            login(driver, excel_file_path)

        except Exception as e:
            retry_count += 1  # Increment the retry count
            current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
            error_message = f"{current_time} - Error occurred: {e}. Retrying... (Attempt {retry_count}/{MAX_RETRIES})\n"
            detailed_error_message = f"Details of the error:\n{e}\n"

            # Log the error to a file
            with open(error_log_file, "a", encoding="utf-8") as log_file:
                log_file.write(error_message)
                log_file.write(detailed_error_message)

            print(error_message)
            print(detailed_error_message)

            sleep(RETRY_DELAY)

            # Close the current driver instance, if one was created
            if "driver" in locals():
                driver.quit()

            # Restart the browser and the script by reinitializing the driver
            print("Restarting the browser and the script...")

    else:
        print(f"Maximum retry attempts ({MAX_RETRIES}) reached. Exiting...")
        quit()
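
For readers following the retry logic above, here is a minimal sketch of the same idea in a smaller shape. It is not the poster's code: `run_with_retries` and its parameters are hypothetical names introduced for illustration. It differs from the script above in two ways: it stops as soon as an attempt succeeds, and it only closes a resource that was actually created (the script calls `driver.quit()` even when creating the browser failed). Note that this sketch also closes the resource after a successful attempt, which may or may not be what you want for a long-lived browser session.

```python
import time
from typing import Callable, Optional, TypeVar

R = TypeVar("R")  # resource type, e.g. a browser page object
T = TypeVar("T")  # result type of the task


def run_with_retries(
    make_resource: Callable[[], R],
    task: Callable[[R], T],
    close_resource: Callable[[R], None],
    max_retries: int = 3,
    retry_delay: float = 2.0,
) -> T:
    """Run task(resource), creating a fresh resource for every attempt."""
    last_error: Optional[Exception] = None
    for attempt in range(1, max_retries + 1):
        resource = None
        try:
            resource = make_resource()
            return task(resource)  # a successful attempt ends the loop
        except Exception as exc:
            last_error = exc
            print(f"Attempt {attempt}/{max_retries} failed: {exc}")
        finally:
            # Only close what was actually created in this attempt.
            if resource is not None:
                close_resource(resource)
        if attempt < max_retries:
            time.sleep(retry_delay)
    raise RuntimeError(f"Gave up after {max_retries} attempts") from last_error
```

With DrissionPage this could presumably be wired up as `run_with_retries(lambda: ChromiumPage(addr_or_opts=options), work, lambda page: page.quit())`, where `work` is a hypothetical function holding the per-attempt steps.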
KJHJason commented 3 weeks ago

If you look at the HTML elements of the website you provided using your browser's DevTools, you will not find an element with the ID turnstile-wrapper.

The logic you copied from another repo is intended for Cloudflare challenge pages, not for a Turnstile captcha embedded in a website, so you will have to modify your clickCycle method.
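
A quick way to confirm this from the script instead of DevTools is to ask the page which locators actually resolve before acting on them. The helper below is only a sketch: `report_present` is a hypothetical name, and it assumes the page object exposes `ele(locator, timeout=...)` as used in the code above and returns a falsy value when nothing matches (DrissionPage can be configured to raise instead, in which case this needs a try/except).

```python
from typing import Any, Dict, Iterable


def report_present(page: Any, locators: Iterable[str], timeout: float = 1.5) -> Dict[str, bool]:
    """Map each locator to True if page.ele() finds a matching element."""
    found = {}
    for locator in locators:
        # Assumption: a missing element comes back as a falsy object.
        found[locator] = bool(page.ele(locator, timeout=timeout))
    return found
```

For example, `report_present(driver, ["#turnstile-wrapper"])` should map `#turnstile-wrapper` to `False` on the login page in question, if the observation above holds.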

SaberTawfiq commented 3 weeks ago

It was working, and it clicked "Verify you are human" on both the CF 'Under Attack' page and the Cloudflare captcha, with this:

if self.driver.wait.ele_displayed('xpath://div/iframe', timeout=1.5):
    time.sleep(1.5)
    self.driver('xpath://div/iframe').ele("Verify you are human", timeout=2.5).click()

Now the iframe is not found, because they put the Cloudflare code in a shadow DOM.

How do I handle this?

KJHJason commented 3 weeks ago

Yes. With the iframe encapsulated in a closed shadow root, XPath cannot reach the iframe from the document root. That is why the original CloudflareBypassForScraping repo uses driver.wait.ele_displayed('#turnstile-wrapper',timeout=1.5): it obtains the parent element of the encapsulated iframe, which sits just outside the closed shadow root. A website, however, can embed several Turnstile captchas, which is why the unique #turnstile-wrapper element is not present there.

Following the same logic, you could obtain the app-cloudflare-captcha-container tag, or select the element with the class cf-turnstile-wrapper, and then call .click on it.

Tspm1eca commented 3 weeks ago

Does the CloudflareBypassForScraping method still work for you? Yes, it can locate #turnstile-wrapper, but it can't click on anything.

SaberTawfiq commented 2 weeks ago

That's right.

addame2 commented 2 weeks ago

@Tspm1eca @KJHJason @SaberTawfiq @yongchin0821 @g1879 Can anyone improve my code? I have a bot for this site and I can pay. If you are interested, please ping me on Telegram: @iamdev2, with a screenshot of this page. Thanks.

g1879 commented 2 weeks ago

I don't have Telegram. You can reach me at g1879@qq.com or on WeChat: green1879

memecodes commented 2 weeks ago

I added you on WeChat. Do you support contentDocument? Cloudflare added it.

addame2 commented 1 week ago

Any solution?

TheFalloutOf76 commented 1 week ago

https://github.com/TheFalloutOf76/CDP-bug-MouseEvent-.screenX-.screenY-patcher