tenox7 / wrp

Web Rendering Proxy: Use vintage, historical, legacy browsers on modern web
Apache License 2.0

Some method of properly passing Cloudflare protections? #110

Open Kadigan opened 1 year ago

Kadigan commented 1 year ago

Hey.

So Cloudflare is everywhere now (enough to bring Discord down when they furk up), and it seems WRP doesn't know how to work with it. I can't get past Cloudflare's "verify you are human" check, and it seems to be used more and more aggressively.

Any suggestions?

TheTechRobo commented 1 year ago

Maybe Cloudflare is detecting headless Chrome?

Kadigan commented 1 year ago

The particular website (fanfiction.net) is actually known for being very aggressive about it (or maybe Cloudflare is particularly aggressive about that website? who knows). All I know is that I see 30-50 "checking the security of the connection" pages on an average day (and some days - at every page load), using a regular browser.

I don't think hiding the headless Chrome better will fix it. There has to be a way to pass this check.

TheTechRobo commented 1 year ago

Headless Chrome is often used for data scraping, so services like Cloudflare block it more aggressively.

What exactly happens when you go to the page? Is there an "I'm not a robot" checkbox?

tenox7 commented 1 year ago

yeah crap, we need to come up with some solution for this

Kadigan commented 1 year ago

@TheTechRobo Yes, there is. It does the whole song & dance, and then redirects. Assuming the check works, the "browser" would need to be able to navigate, and/or send screen updates or something.

tenox7 commented 10 months ago

do you have any handy examples of pages that fail this?

Kadigan commented 10 months ago

Anything on fanfiction.net should qualify. They pursue verification so aggressively that I have to go through it almost every other chapter on my desktop. It happens most often when switching chapters using the prev/next buttons.

That was actually the website I was hoping to use it with, as previously mentioned.

tenox7 commented 10 months ago

🤷 I have released 4.6.2. It has some basic anti-crawler-detection bits and also allows setting the user agent with a flag, which seems to help the most.

I have tried `./wrp -ua="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36"` and it seems better
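
For anyone curious what a user agent override amounts to at the HTTP level, here is a minimal Go sketch. It is not WRP's implementation (WRP drives headless Chrome, and its internals are not shown in this thread). It uses only the standard library, and the names `newRequestWithUA` and `seenUA` are hypothetical helpers made up for illustration. The `-ua` flag here only mirrors the name of the flag mentioned above. The request goes to a local test server that echoes back the User-Agent it received, so nothing touches a real site.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// defaultUA is the desktop Chrome user agent string quoted in the comment above.
const defaultUA = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36"

// newRequestWithUA (hypothetical helper) builds a GET request and, if ua is
// non-empty, replaces Go's default User-Agent header with it.
func newRequestWithUA(url, ua string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	if ua != "" {
		req.Header.Set("User-Agent", ua)
	}
	return req, nil
}

// seenUA (hypothetical helper) starts a local test server that echoes the
// User-Agent header it receives, sends one request, and returns the echo.
func seenUA(ua string) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.UserAgent())
	}))
	defer srv.Close()

	req, err := newRequestWithUA(srv.URL, ua)
	if err != nil {
		return "", err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	// Assumption: this flag only imitates the name of wrp's -ua flag.
	ua := flag.String("ua", defaultUA, "User-Agent header to send")
	flag.Parse()

	got, err := seenUA(*ua)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("server saw User-Agent:", got)
}
```

A plausible user agent is only one of the signals a bot check can look at, which fits the "seems better" wording above rather than a guaranteed pass.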