omkarcloud / botasaurus

The All in One Framework to build Awesome Scrapers.
https://www.omkar.cloud/botasaurus/
MIT License
1.35k stars 123 forks

Cloudflare bypass doesn't work #81

Closed kreethandsouza closed 4 months ago

kreethandsouza commented 6 months ago

I just updated the package to the latest version. Before that I was on 3.2.7, which worked fine and bypassed Cloudflare about 9 times out of 10. Now it bypasses only about 1 time out of 10, and I am not able to get past the Cloudflare challenge.
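A rough way to make figures like "9/10" and "1/10" reproducible is to count successes over a fixed number of runs. This is only a minimal sketch: `bypass_rate` and the `attempt` callable are hypothetical names, and `attempt` is assumed to be whatever function opens the target page and returns True when the challenge was passed.

```python
from typing import Callable


def bypass_rate(attempt: Callable[[], bool], runs: int = 10) -> float:
    """Call `attempt` `runs` times and return the fraction that succeeded."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    successes = 0
    for _ in range(runs):
        try:
            if attempt():
                successes += 1
        except Exception:
            # A crashed attempt is counted as a failed bypass.
            pass
    return successes / runs
```

Running the same `attempt` against the old and the new package version gives two numbers that can be compared directly.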

Chetan11-dev commented 6 months ago

Yes, the moment we connect to Chrome via Selenium, we are detected because of the automation noise Selenium introduces.
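One widely known example of such a signal is the `navigator.webdriver` flag, which the W3C WebDriver specification requires a browser to expose as true while it is under WebDriver control. The sketch below assumes this is among the signals meant here. `looks_automated` is a hypothetical helper, and `driver` is any object with Selenium's `execute_script` method.

```python
def looks_automated(driver) -> bool:
    """Return True if the page reports that it is driven by WebDriver."""
    # navigator.webdriver is true in a WebDriver-controlled browser and
    # false or undefined in a normally launched one.
    return bool(driver.execute_script("return navigator.webdriver"))
```

Anti-bot services look at many other signals as well, so a False result here does not mean the browser is undetectable.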

kreethandsouza commented 6 months ago

@Chetan11-dev I am fairly new to Selenium, but I am trying my best. I found a package called "DrissionPage" (https://github.com/g1879/DrissionPage) that bypasses Cloudflare. The problem is that it bypasses Cloudflare only in headful mode, not in headless mode. If you could work out how its underlying layer works and integrate that into your package in headless mode, it would solve the problem of your package not bypassing Cloudflare. You have done splendid work on "botasaurus". I am still trying to understand the package. Thanks.

Chetan11-dev commented 6 months ago

I will look into it once I get the time. Also, does it bypass the captcha version as well when visiting a Cloudflare captcha page (https://www.g2.com/products/jenkins/reviews?page=5)?

kreethandsouza commented 6 months ago

@Chetan11-dev As far as I have tried, it didn't. I changed proxies as well as headers, but it still didn't. It was working fine and bypassing up to a point, but it recently stopped.

Chetan11-dev commented 6 months ago

So, I guess a JS-based Chrome extension solution should be more reliable. I am busy now, but I will look into it in the future.

kreethandsouza commented 6 months ago

@Chetan11-dev Do you know of such a JS-based Chrome extension? It would be really useful for me. Thanks.

luckybk93 commented 5 months ago

> So, I guess a JS-based Chrome extension solution should be more reliable. I am busy now, but I will look into it in the future.

Can you give us some suggestions?

luckybk93 commented 5 months ago

I'm thinking of a solution: using an extension to control the browser instead of Selenium. But I don't know where to start.

Chetan11-dev commented 5 months ago

I will look into it once I am available. I will attempt to resolve it next month, so kindly check back at the end of May.

luckybk93 commented 5 months ago

> I will look into it once I am available. I will attempt to resolve it next month, so kindly check back at the end of May.

Thanks! I'll wait

Chetan11-dev commented 4 months ago

Please run the following commands:

python -m pip install bota botasaurus_api botasaurus_driver bota botasaurus-proxy-authentication botasaurus_server --upgrade

And read the documentation at https://github.com/omkarcloud/botasaurus.
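To confirm that the upgrade actually took effect, the installed versions can be read back with the standard library. This is a minimal sketch: the package names are copied from the install command above, and `installed_versions` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the pip command in this comment.
PACKAGES = [
    "bota",
    "botasaurus_api",
    "botasaurus_driver",
    "botasaurus-proxy-authentication",
    "botasaurus_server",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name}: {ver or 'not installed'}")
```

Any package reported as not installed, or still on its old version, suggests that pip ran against a different Python environment than the one running the scraper.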