sarperavci / CloudflareBypassForScraping

A Cloudflare verification bypass script for web scraping

I have tested this but it's not working for the gdtot site #29

Open Gujjugaming2k opened 4 weeks ago

Gujjugaming2k commented 4 weeks ago

I have tested this, but it's not working.

I tried to do the bypass using your code, but nothing happens: it reports that the bypass was successful, yet the page is not actually bypassed.

Below is the test link: https://new5.gdtot.dad/file/7340685321

sarperavci commented 4 weeks ago

Hello, please check lines 67 and 68 of https://github.com/sarperavci/CloudflareBypassForScraping/blob/main/test.py

So modify your code from:

        logging.info('Starting Cloudflare bypass.')
        cf_bypasser = CloudflareBypasser(driver)

        # If you are solving an in-page captcha (like the one here: https://seleniumbase.io/apps/turnstile), use cf_bypasser.click_verification_button() directly instead of cf_bypasser.bypass().
        # It will automatically locate the button and click it. Do your own check if needed.

        cf_bypasser.bypass()

To:

        logging.info('Starting Cloudflare bypass.')
        cf_bypasser = CloudflareBypasser(driver)

        # If you are solving an in-page captcha (like the one here: https://seleniumbase.io/apps/turnstile), use cf_bypasser.click_verification_button() directly instead of cf_bypasser.bypass().
        # It will automatically locate the button and click it. Do your own check if needed.

        cf_bypasser.click_verification_button()

Gujjugaming2k commented 4 weeks ago

Great, I was able to do that. Any idea how to save the HTML? I mean, how to save the HTML page before closing the browser.

sarperavci commented 4 weeks ago

Check this comment to get the HTML content: https://github.com/sarperavci/CloudflareBypassForScraping/issues/2#issuecomment-1963708098
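For readers who cannot open the linked comment, here is a minimal sketch of saving the page source before the browser is closed. It assumes `driver` is the DrissionPage `ChromiumPage` object used in test.py and that the rendered source is available as `driver.html`; the helper name `save_html` and the file name are hypothetical, not part of the repository.

```python
from pathlib import Path


def save_html(html: str, path: str = "page.html") -> Path:
    """Write the page source to disk and return the path that was written."""
    target = Path(path)
    target.write_text(html, encoding="utf-8")
    return target


# Hypothetical usage, assuming `driver` is the ChromiumPage from test.py
# and that it exposes the rendered source as `driver.html`:
#
#     cf_bypasser.click_verification_button()
#     save_html(driver.html, "gdtot.html")  # save first
#     driver.quit()                         # then close the browser
```

The only ordering constraint is that the HTML is read while the browser is still open, so the save call has to come before `driver.quit()`.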

Gujjugaming2k commented 4 weeks ago

Thank you, man. Can you please help with one last thing? I need to fetch the onclick value of a button on the page.

I want to extract "https://t.me/gdbot3_bot?start=v4w4a4a4EwKOf5D4KoGfDfw4" from the page above. I have bypassed Cloudflare, but after that I am not able to extract the info. Below is the HTML tag whose onclick value I want to fetch:

        <button onclick="myDl2('https://t.me/gdbot3_bot?start=v4w4a4a4EwKOf5D4KoGfDfw4')" id="dirdown" type="button" class="btn btn-outline-light btn-user font-weight-bold"> Telegram Download</button>
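The thread does not record which solution was eventually used, so the following is only one possible sketch. It uses the standard library alone: `html.parser` locates the element with `id="dirdown"`, and a regular expression pulls the first quoted argument out of its onclick handler. The class and function names are hypothetical, and the code assumes the saved HTML contains the button exactly as quoted above.

```python
import re
from html.parser import HTMLParser
from typing import Optional


class OnclickFinder(HTMLParser):
    """Remember the onclick attribute of the first element with a given id."""

    def __init__(self, element_id: str) -> None:
        super().__init__()
        self.element_id = element_id
        self.onclick: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if self.onclick is None and attr_map.get("id") == self.element_id:
            self.onclick = attr_map.get("onclick")


def extract_onclick_url(html: str, element_id: str = "dirdown") -> Optional[str]:
    """Return the first quoted argument of the element's onclick call, or None."""
    finder = OnclickFinder(element_id)
    finder.feed(html)
    if not finder.onclick:
        return None
    # Matches the quoted argument in a call such as myDl2('https://...')
    match = re.search(r"""\(\s*['"]([^'"]+)['"]\s*\)""", finder.onclick)
    return match.group(1) if match else None


sample = '''<button onclick="myDl2('https://t.me/gdbot3_bot?start=v4w4a4a4EwKOf5D4KoGfDfw4')" id="dirdown" type="button" class="btn btn-outline-light btn-user font-weight-bold"> Telegram Download</button>'''
print(extract_onclick_url(sample))
```

Matching on the element id rather than on the button text keeps the lookup stable if the label changes, but it will return None if the site renames the id or builds the link in JavaScript after page load.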

Gujjugaming2k commented 4 weeks ago

Done, I found the solution. Closing now.

Gujjugaming2k commented 4 weeks ago

Reopening the issue with the same code. I tried with HEADLESS = true, and in that mode we are not able to bypass Cloudflare. The log shows the verification button being clicked, but when I extract the HTML it still shows the "solve the captcha" page. Is anything else required on Linux?

INFO - Starting Cloudflare bypass. Verification button found. Attempting to click.

Gujjugaming2k commented 4 weeks ago

I have also taken screenshots before and after verification.

As far as I can tell, the Cloudflare challenge loads successfully before verification, but once the button is clicked it shows an error saying Cloudflare is having trouble. Both screenshots are attached.

[Screenshots: After_CF, Before_CF]

Ash-Olorenshaw commented 3 weeks ago

Reopening the issue with the same code. I tried with HEADLESS = true, and in that mode we are not able to bypass Cloudflare. The log shows the verification button being clicked, but when I extract the HTML it still shows the "solve the captcha" page. Is anything else required on Linux?

INFO - Starting Cloudflare bypass. Verification button found. Attempting to click.

Cloudflare usually blocks headless browsers automatically. You'll probably need to run the browser in non-headless mode through pyvirtualdisplay.
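A rough sketch of that suggestion follows. It assumes pyvirtualdisplay and Xvfb are installed on the Linux machine and that HEADLESS is set back to false, so the browser opens a normal window inside the virtual display. The wrapper name `run_with_virtual_display` and its `display_factory` parameter are hypothetical, and whether this is enough to pass the check on a given site is not confirmed in the thread.

```python
from typing import Any, Callable, Optional, Tuple


def run_with_virtual_display(
    task: Callable[[], Any],
    display_factory: Optional[Callable[[], Any]] = None,
    size: Tuple[int, int] = (1920, 1080),
) -> Any:
    """Run `task` while a virtual X display is active, then stop the display."""
    if display_factory is None:
        # Imported lazily so the module still loads where pyvirtualdisplay is absent.
        from pyvirtualdisplay import Display  # requires Xvfb on the system

        def display_factory() -> Any:
            return Display(visible=False, size=size)

    display = display_factory()
    display.start()
    try:
        return task()
    finally:
        # Always stop the display, even if the scraping task raises.
        display.stop()


# Hypothetical usage: wrap the existing non-headless flow from test.py.
#
#     def scrape():
#         ...  # create the driver with HEADLESS = False, bypass, save HTML, quit
#
#     run_with_virtual_display(scrape)
```

The `try`/`finally` guards against leaving a stray Xvfb process behind when the scraping step fails.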