codemanki / cloudscraper

--DEPRECATED -- 🛑 🛑 Node.js library to bypass cloudflare's anti-ddos page
MIT License

Bypass IP / location lock of CloudFlare? #294

Closed minas90 closed 4 years ago

minas90 commented 4 years ago

Right now CloudFlare is active for the following website: https://censor.net.ua/. CloudScraper works fine when I run it locally, but on an EC2 instance in Frankfurt I get an error with the message "captcha".

When I do wget https://censor.net.ua locally I get ERROR 503: Service Temporarily Unavailable. On the ec2 instance I get ERROR 403: Forbidden.

My guess is that CloudFlare is blocking requests from certain locations or IP addresses, maybe when it's configured like that by the website owner. Any ideas how to fix that?

codemanki commented 4 years ago

@minas90 hey.

My guess is that CloudFlare is blocking requests from certain locations or IP addresses

I think this is a correct guess, given that it works on your local machine and you get 503 using wget. Your instance's IP address has probably been blacklisted. I don't think there is a way to fix that directly. Try rotating the IP address.
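One possible way to rotate the outgoing IP address, assuming you have access to several HTTP proxies, is to cycle through them between attempts. The sketch below only builds the options object; the proxy URLs and the helper names `makeProxyRotator` and `buildOptions` are hypothetical, and it assumes the options are forwarded to the underlying `request` library, which accepts a `proxy` option.

```javascript
// Hypothetical proxy endpoints; replace with proxies you actually control.
const PROXIES = [
  'http://proxy-a.example.com:8080',
  'http://proxy-b.example.com:8080',
];

// Returns a function that yields the next proxy on each call (round-robin).
function makeProxyRotator(proxies) {
  let index = 0;
  return function nextProxy() {
    const proxy = proxies[index % proxies.length];
    index += 1;
    return proxy;
  };
}

// Builds request-style options; `proxy` is the option name used by `request`.
function buildOptions(uri, proxy) {
  return { uri, proxy };
}

const nextProxy = makeProxyRotator(PROXIES);
const firstAttempt = buildOptions('https://censor.net.ua/', nextProxy());
const secondAttempt = buildOptions('https://censor.net.ua/', nextProxy());
console.log(firstAttempt.proxy, secondAttempt.proxy);
```

Each options object could then be passed to `cloudscraper(options)`, retrying with the next proxy when a request is rejected with 403.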

minas90 commented 4 years ago

Thanks for the quick response! I created a new EC2 instance and now I'm getting 503 with wget. The weird thing is that I'm now getting a different error from CloudScraper: 404 - "body of 404 from the website"

This is the exact resource URL I'm trying to get: https://censor.net.ua/includes/news_uk.xml

Any idea how this can happen?

codemanki commented 4 years ago

@minas90 I quickly looked into this, and it seems that if you do a plain curl or wget without any headers, Cloudflare responds with 503. But indeed, I get 404 for cloudscraper calls 🤔 I think it might be related to incorrect headers, or maybe the referrer?

codemanki commented 4 years ago

Oh, I think I see what is happening. Censor returns this in headers:

        href: 'https://m.censor.net.ua/includes/news_uk.xml',
        ntick: true,
        response: [Circular],
        originalHost: 'm.censor.net.ua',

I think they detect the UA as a mobile browser and redirect you to their m. subdomain for some reason.
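To check for this kind of redirect programmatically, one option is to compare the hostname you requested with the hostname of the final URL. This is a minimal sketch using only Node's built-in `URL` class; the helper name `hostChanged` is hypothetical, and it assumes the final URL is available to you, for example from the `href` field in the dump above.

```javascript
// Compares the requested hostname with the hostname of the final URL.
function hostChanged(requestedUrl, finalHref) {
  const requested = new URL(requestedUrl).hostname;
  const final = new URL(finalHref).hostname;
  return { requested, final, redirected: requested !== final };
}

const result = hostChanged(
  'https://censor.net.ua/includes/news_uk.xml',
  'https://m.censor.net.ua/includes/news_uk.xml'
);
console.log(result);
```

If `redirected` is true and the final host starts with `m.`, a desktop User-Agent header is worth trying.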

codemanki commented 4 years ago

@minas90 try running this code:

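const cloudscraper = require('cloudscraper');

// Send a desktop User-Agent so the site does not redirect to its mobile (m.) subdomain
cloudscraper('https://censor.net.ua/includes/news_uk.xml', {headers: {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'
}}).then((htmlStr) => {
  // Response body as a string (here, the XML feed)
  console.log(htmlStr)
}).catch((err) => {
  console.log(err)
})

minas90 commented 4 years ago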

It worked, thanks a lot! It's strange that locally it works without the User-Agent header.