vvanglro / cf-clearance

Purpose: to pass the Cloudflare v2 challenge successfully. The resulting cf_clearance cookie can then be used to get past Cloudflare. However, when using the cf_clearance, make sure you use the same IP and UA as when you got it.
https://github.com/vvanglro/cf_clearance
353 stars 58 forks

find some piece of cake #24

Closed zhajingwen closed 1 year ago

zhajingwen commented 1 year ago

I rewrote the retry module as below:

# -*- coding: utf-8 -*-
import logging
from traceback import format_exc
from playwright.async_api import Error
from playwright.async_api import Page as AsyncPage
from playwright.sync_api import Page as SyncPage

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger('verify humman')

async def async_cf_retry(page: AsyncPage, tries: int = 10) -> bool:
    success = False
    while tries != 0:
        await page.wait_for_timeout(1500)
        try:
            success = False if await page.query_selector("#challenge-form") else True
            # Confirm you are human
            click_button = await page.query_selector("#challenge-stage > div > input")
            if click_button:
                await click_button.click()
            logger.info('human first successfully')
            iframe = await page.query_selector(
                "xpath=//div[@class='hcaptcha-box']/iframe"
            )
            logger.info('try to verify twice')
            if iframe:
                switch_iframe = await iframe.content_frame()
                iframe_button = await switch_iframe.query_selector(
                    "xpath=//*[@id='cf-stage']//label/span"
                )
                if iframe_button:
                    await iframe_button.click()
                    logger.info('human twice click successfully')
                else:
                    logger.error('not iframe button')
            else:
                logger.error('twice without iframe')
        except Error:
            logger.error(format_exc())
            success = False
        if success:
            break
        tries -= 1
    return success

def sync_cf_retry(page: SyncPage, tries: int = 10) -> bool:
    success = False
    while tries != 0:
        page.wait_for_timeout(1500)
        try:
            success = False if page.query_selector("#challenge-form") else True
            click_button = page.query_selector("#challenge-stage > div > input")
            if click_button:
                click_button.click()
            iframe = page.query_selector("xpath=//div[@class='hcaptcha-box']/iframe")
            if iframe:
                iframe_button = iframe.content_frame().query_selector(
                    "xpath=//*[@id='cf-stage']//label/span"
                )
                if iframe_button:
                    iframe_button.click()
        except Error:
            success = False
        if success:
            break
        tries -= 1

    return success

When I run the test code below:

import asyncio
import time

from playwright.async_api import async_playwright
from cf_clearance import async_cf_retry, async_stealth
from pyvirtualdisplay import Display

async def main():
    with Display():

        async with async_playwright() as p:
            browser = await p.chromium.launch(headless=False)
            context = await browser.new_context(
                user_agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36'
            )
            page = await context.new_page()
            await async_stealth(page, pure=True)
            await page.goto('https://dexfilter.com/')
            res = await async_cf_retry(page)
            if res:
                cookies = await page.context.cookies()
                for cookie in cookies:
                    if cookie.get('name') == 'cf_clearance':
                        cf_clearance_value = cookie.get('value')
                        print(cf_clearance_value)
                ua = await page.evaluate('() => {return navigator.userAgent}')
                print(ua)
            else:
                print("cf challenge fail")
            time.sleep(3)
            await page.screenshot(path='t.png')
            await browser.close()

asyncio.get_event_loop().run_until_complete(main())

And I get a log like this:

root@011b80959cd5:/workspace# python test.py
2023-02-28 11:22:04,020 - verify humman - INFO - human first successfully
2023-02-28 11:22:04,027 - verify humman - INFO - try to verify twice
2023-02-28 11:22:04,057 - verify humman - ERROR - not iframe button
2023-02-28 11:22:05,572 - verify humman - INFO - human first successfully
2023-02-28 11:22:05,579 - verify humman - INFO - try to verify twice
2023-02-28 11:22:05,680 - verify humman - ERROR - not iframe button
2023-02-28 11:22:07,195 - verify humman - INFO - human first successfully
2023-02-28 11:22:07,201 - verify humman - INFO - try to verify twice
2023-02-28 11:22:07,284 - verify humman - INFO - human twice click successfully
2023-02-28 11:22:08,803 - verify humman - INFO - human first successfully
2023-02-28 11:22:08,806 - verify humman - INFO - try to verify twice
2023-02-28 11:22:08,806 - verify humman - ERROR - twice without iframe
2023-02-28 11:22:10,336 - verify humman - INFO - human first successfully
2023-02-28 11:22:10,339 - verify humman - INFO - try to verify twice
2023-02-28 11:22:10,339 - verify humman - ERROR - twice without iframe
2023-02-28 11:22:11,987 - verify humman - INFO - human first successfully
2023-02-28 11:22:11,990 - verify humman - INFO - try to verify twice
2023-02-28 11:22:11,990 - verify humman - ERROR - twice without iframe
2023-02-28 11:22:13,503 - verify humman - INFO - human first successfully
2023-02-28 11:22:13,509 - verify humman - INFO - try to verify twice
2023-02-28 11:22:13,536 - verify humman - ERROR - not iframe button
2023-02-28 11:22:15,049 - verify humman - INFO - human first successfully
2023-02-28 11:22:15,054 - verify humman - INFO - try to verify twice
2023-02-28 11:22:15,144 - verify humman - INFO - human twice click successfully
2023-02-28 11:22:17,917 - verify humman - ERROR - Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/cf_clearance/retry.py", line 16, in async_cf_retry
    success = False if await page.query_selector("#challenge-form") else True
  File "/usr/local/lib/python3.8/dist-packages/playwright/async_api/_generated.py", line 7864, in query_selector
    await self._impl_obj.query_selector(selector=selector, strict=strict)
  File "/usr/local/lib/python3.8/dist-packages/playwright/_impl/_page.py", line 352, in query_selector
    return await self._main_frame.query_selector(selector, strict)
  File "/usr/local/lib/python3.8/dist-packages/playwright/_impl/_frame.py", line 303, in query_selector
    await self._channel.send("querySelector", locals_to_params(locals()))
  File "/usr/local/lib/python3.8/dist-packages/playwright/_impl/_connection.py", line 44, in send
    return await self._connection.wrap_api_call(
  File "/usr/local/lib/python3.8/dist-packages/playwright/_impl/_connection.py", line 419, in wrap_api_call
    return await cb()
  File "/usr/local/lib/python3.8/dist-packages/playwright/_impl/_connection.py", line 79, in inner_send
    result = next(iter(done)).result()
playwright._impl._api_types.Error: Execution context was destroyed, most likely because of a navigation

2023-02-28 11:22:19,444 - verify humman - INFO - human first successfully
2023-02-28 11:22:19,448 - verify humman - INFO - try to verify twice
2023-02-28 11:22:19,448 - verify humman - ERROR - twice without iframe
e_pnROvzs76yTl.l4CQHRyq46qKuA0XFQImcPAQbibU-1677554536-0-250
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36
root@011b80959cd5:/workspace#

As you can see, it tried 9 times before it succeeded, and it failed many times along the way. So I guess the default of 10 tries is too low. I suggest raising it; for example, 15 may be suitable.

zhajingwen commented 1 year ago

When I restart my test script:

root@011b80959cd5:/workspace# python test.py
2023-02-28 11:42:26,687 - verify humman - INFO - human first successfully
2023-02-28 11:42:26,693 - verify humman - INFO - try to verify twice
2023-02-28 11:42:26,719 - verify humman - ERROR - not iframe button
2023-02-28 11:42:28,234 - verify humman - INFO - human first successfully
2023-02-28 11:42:28,239 - verify humman - INFO - try to verify twice
2023-02-28 11:42:28,247 - verify humman - ERROR - not iframe button
2023-02-28 11:42:29,761 - verify humman - INFO - human first successfully
2023-02-28 11:42:29,768 - verify humman - INFO - try to verify twice
2023-02-28 11:42:29,851 - verify humman - INFO - human twice click successfully
2023-02-28 11:42:31,363 - verify humman - INFO - human first successfully
2023-02-28 11:42:31,366 - verify humman - INFO - try to verify twice
2023-02-28 11:42:31,367 - verify humman - ERROR - twice without iframe
2023-02-28 11:42:32,895 - verify humman - INFO - human first successfully
2023-02-28 11:42:32,899 - verify humman - INFO - try to verify twice
2023-02-28 11:42:32,899 - verify humman - ERROR - twice without iframe
2023-02-28 11:42:34,491 - verify humman - INFO - human first successfully
2023-02-28 11:42:34,494 - verify humman - INFO - try to verify twice
2023-02-28 11:42:34,494 - verify humman - ERROR - twice without iframe
2023-02-28 11:42:36,007 - verify humman - INFO - human first successfully
2023-02-28 11:42:36,011 - verify humman - INFO - try to verify twice
2023-02-28 11:42:36,445 - verify humman - INFO - human twice click successfully
2023-02-28 11:42:37,958 - verify humman - INFO - human first successfully
2023-02-28 11:42:37,961 - verify humman - INFO - try to verify twice
2023-02-28 11:42:37,961 - verify humman - ERROR - twice without iframe
2023-02-28 11:42:39,474 - verify humman - INFO - human first successfully
2023-02-28 11:42:39,477 - verify humman - INFO - try to verify twice
2023-02-28 11:42:39,477 - verify humman - ERROR - twice without iframe
2023-02-28 11:42:40,990 - verify humman - INFO - human first successfully
2023-02-28 11:42:40,992 - verify humman - INFO - try to verify twice
2023-02-28 11:42:40,992 - verify humman - ERROR - twice without iframe
cf challenge fail

zhajingwen commented 1 year ago

I changed the default number of tries to 100 and reran the test script:

2023-02-28 11:47:12,570 - verify humman - ERROR - twice without iframe
2023-02-28 11:47:12,570 - verify humman - INFO - 17
2023-02-28 11:47:14,898 - verify humman - ERROR - Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/cf_clearance/retry.py", line 17, in async_cf_retry
    success = False if await page.query_selector("#challenge-form") else True
  File "/usr/local/lib/python3.8/dist-packages/playwright/async_api/_generated.py", line 7864, in query_selector
    await self._impl_obj.query_selector(selector=selector, strict=strict)
  File "/usr/local/lib/python3.8/dist-packages/playwright/_impl/_page.py", line 352, in query_selector
    return await self._main_frame.query_selector(selector, strict)
  File "/usr/local/lib/python3.8/dist-packages/playwright/_impl/_frame.py", line 303, in query_selector
    await self._channel.send("querySelector", locals_to_params(locals()))
  File "/usr/local/lib/python3.8/dist-packages/playwright/_impl/_connection.py", line 44, in send
    return await self._connection.wrap_api_call(
  File "/usr/local/lib/python3.8/dist-packages/playwright/_impl/_connection.py", line 419, in wrap_api_call
    return await cb()
  File "/usr/local/lib/python3.8/dist-packages/playwright/_impl/_connection.py", line 79, in inner_send
    result = next(iter(done)).result()
playwright._impl._api_types.Error: Execution context was destroyed, most likely because of a navigation

2023-02-28 11:47:14,898 - verify humman - INFO - 16
2023-02-28 11:47:16,425 - verify humman - INFO - human first successfully
2023-02-28 11:47:16,428 - verify humman - INFO - try to verify twice
2023-02-28 11:47:16,428 - verify humman - ERROR - twice without iframe
ji3inpiz.AB_9eqgqiW9F.WKyvpeomq4RWI7hHc39PI-1677556033-0-250
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36
root@011b80959cd5:/workspace#

It retried (100 - 16) = 84 times before it succeeded.

I think you should wait for the elements to load before clicking the button. It always fails at the second verification, like below:

2023-02-28 11:47:09,490 - verify humman - ERROR - twice without iframe

vvanglro commented 1 year ago

The number of retries can be customized, but I suggest not setting it too high. If you still fail the challenge after several retries, it is better to switch to another elite proxy or try again.

Human-click verification appears randomly. So far I have only collected locators for 2 click methods, namely the "first" and "second" verifications you refer to. These two methods do not appear at the same time; one of them appears at random each time.
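
To illustrate the point about a customizable but bounded retry count, here is a minimal sketch in plain asyncio. It is not the library's implementation: `retry_bounded` and `make_flaky` are hypothetical names, and `make_flaky` only simulates a check that starts succeeding on its N-th call. The sketch iterates over `range(max(tries, 0))`, so a zero or negative `tries` simply returns False, whereas a `while tries != 0` loop would never reach zero when given a negative value.

```python
import asyncio
from typing import Awaitable, Callable


async def retry_bounded(
    attempt: Callable[[], Awaitable[bool]],
    tries: int = 10,
    delay_s: float = 0.0,
) -> bool:
    """Call `attempt` until it returns True or `tries` attempts are used up."""
    for _ in range(max(tries, 0)):
        if await attempt():
            return True
        # Pause between failed attempts (0 here so the demo runs instantly).
        await asyncio.sleep(delay_s)
    return False


def make_flaky(succeed_on: int) -> Callable[[], Awaitable[bool]]:
    """Hypothetical stand-in for one check: succeeds from the N-th call on."""
    calls = {"n": 0}

    async def attempt() -> bool:
        calls["n"] += 1
        return calls["n"] >= succeed_on

    return attempt


async def demo() -> tuple:
    # A check that needs 12 attempts fails with the default budget of 10 ...
    ok_default = await retry_bounded(make_flaky(12))
    # ... and passes once the caller raises the budget to 15.
    ok_raised = await retry_bounded(make_flaky(12), tries=15)
    return ok_default, ok_raised


result = asyncio.run(demo())
print(result)
```

With these assumed inputs the caller, not the library default, decides how many attempts are acceptable, which matches the suggestion to pass a custom `tries` value rather than raise the default.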

zhajingwen commented 1 year ago

When the first click is done (image), the page jumps to a new page like the one below (image). This is an intermediate state. When it finishes running, the second verification page appears, and only then can the code below get the iframe:

            iframe = await page.query_selector(
                "xpath=//div[@class='hcaptcha-box']/iframe"
            )

Otherwise you will get None.

So I think we should give the page some additional time.
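
The difference between a one-shot query and a bounded wait can be sketched without a browser. This is only an illustration under stated assumptions: `poll_until_present` and `make_late_element` are hypothetical names, and the "element" is simulated by a probe that returns None until a delay has passed. In Playwright itself, `page.query_selector` returns None immediately when nothing matches, while `page.wait_for_selector` waits up to its `timeout` (in milliseconds) and raises a timeout error if the element never shows up.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def poll_until_present(
    probe: Callable[[], Awaitable[Optional[T]]],
    timeout_s: float = 5.0,
    interval_s: float = 0.1,
) -> Optional[T]:
    """Re-run `probe` until it returns a non-None value or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        found = await probe()
        if found is not None:
            return found
        if time.monotonic() >= deadline:
            return None
        await asyncio.sleep(interval_s)


def make_late_element(ready_after_s: float) -> Callable[[], Awaitable[Optional[str]]]:
    """Hypothetical stand-in for an element that only exists after a delay."""
    ready_at = time.monotonic() + ready_after_s

    async def probe() -> Optional[str]:
        return "iframe" if time.monotonic() >= ready_at else None

    return probe


async def demo() -> tuple:
    # A single immediate query runs before the element exists.
    immediate = await make_late_element(0.2)()
    # Polling with a generous deadline finds it once it appears.
    waited = await poll_until_present(
        make_late_element(0.2), timeout_s=2.0, interval_s=0.05
    )
    # A short deadline bounds the cost when the element never appears in time.
    gave_up = await poll_until_present(
        make_late_element(5.0), timeout_s=0.3, interval_s=0.05
    )
    return immediate, waited, gave_up


outcome = asyncio.run(demo())
print(outcome)
```

The explicit deadline is the design choice worth noting: waiting helps on a slow network, but an unbounded or very long wait would stall every loop iteration in which the element legitimately never appears.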

zhajingwen commented 1 year ago

And like this: image image

zhajingwen commented 1 year ago

Adding one new line of code can solve this problem:

            await page.wait_for_selector("xpath=//div[@class='hcaptcha-box']/iframe")
            iframe = await page.query_selector(
                "xpath=//div[@class='hcaptcha-box']/iframe"
            )

zhajingwen commented 1 year ago

Then a new intermediate page pops up: image

If it has not loaded yet, the code below will get None: image

vvanglro commented 1 year ago

Thanks for your research. I was unable to reproduce the situation you describe in repeated testing. The situation I tested is:

image

This click corresponds to:

click_button = page.query_selector("#challenge-stage > div > input")
if click_button:
    click_button.click()

image

This click corresponds to:

iframe = page.query_selector("xpath=//div[@class='hcaptcha-box']/iframe")
if iframe:
    iframe_button = iframe.content_frame().query_selector(
        "xpath=//*[@id='cf-stage']//label/span"
    )
    if iframe_button:
        iframe_button.click()

In my repeated tests, once the first case has appeared and been clicked, the second case does not appear, so there is no need to wait for this element. Likewise, the first case does not appear again after the second case has been clicked.

If both cases occur at the same time, it will also be clicked in the next loop.

vvanglro commented 1 year ago

I rewrote the retry module as below:

# -*- coding: utf-8 -*-
import logging
from traceback import format_exc
from playwright.async_api import Error
from playwright.async_api import Page as AsyncPage
from playwright.sync_api import Page as SyncPage

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger('verify humman')

async def async_cf_retry(page: AsyncPage, tries: int = 10) -> bool:
    success = False
    while tries != 0:
        await page.wait_for_timeout(1500)
        try:
            success = False if await page.query_selector("#challenge-form") else True
            # Confirm you are human
            click_button = await page.query_selector("#challenge-stage > div > input")
            if click_button:
                await click_button.click()
            logger.info('human first successfully')

The first success log should be placed after the click.

click_button = await page.query_selector("#challenge-stage > div > input")
if click_button:
    await click_button.click()
    logger.info('human first successfully')

zhajingwen commented 1 year ago

"#challenge-stage > div > input"

A poor network connection and low-end server hardware may cause this problem: slow loading.

zhajingwen commented 1 year ago

In my repeated tests, if the first case occurs after clicking

And thank you, now I understand what the click code means. I will reorganize my ideas.

zhajingwen commented 1 year ago

The first success log should be placed after the click.

click_button = await page.query_selector("#challenge-stage > div > input")
if click_button:
    await click_button.click()
    logger.info('human first successfully')

Yes, you are right.

zhajingwen commented 1 year ago

I find that there may be two verification methods in total, and which of them appears is random. It is possible that method 1 does not appear, and it is also possible that both 1 and 2 appear.

zhajingwen commented 1 year ago

image

This case happened. I guess the page had not loaded yet when click_button = await page.query_selector("#challenge-stage > div > input") ran, so click_button is None. One way to handle this case is the code below:

            success = False if await page.query_selector("#challenge-form") else True
            # First verification
            try:
                # Wait for the button to load; otherwise query_selector returns None
                await page.wait_for_selector("#challenge-stage > div > input")
            except Error:
                # Playwright's TimeoutError subclasses Error; `import time` is needed at module top
                await page.screenshot(path=f'{int(time.time())}no first verify page.png')
                logger.error('no waited first verify')
            # Confirm you are human
            click_button = await page.query_selector("#challenge-stage > div > input")
            if click_button:
                await click_button.click()
                logger.info('human first successfully')
            else:
                logger.info('first verify button not loaded')

image

zhajingwen commented 1 year ago

Of course, adding this await page.wait_for_selector line may slow down the whole verification process.

zhajingwen commented 1 year ago

I have a question: why not add a line of code that updates the success status value? I think that once the click is done, the whole verification process is done, so adding the new line would break out of the loop and complete the verification. image

vvanglro commented 1 year ago

You can customize the retry method in your code.

zhajingwen commented 1 year ago

method in your code.

OK, thank you very much.