lostisland / faraday_middleware

Various Faraday middlewares for Faraday-based API wrappers
MIT License

How to follow redirects prompted by DDoS Guards #277

Closed: doutatsu closed this issue 2 years ago

doutatsu commented 2 years ago

I've been trying to use the cookie_jar and follow-redirects middlewares to get past DDoS protection, whether Cloudflare's or another provider's. Since it isn't captcha protection, I thought it would be possible to simply follow the redirect mentioned in the 403 response: "This process is automatic. Your browser will redirect to your requested content shortly. Please allow up to 5 seconds..."

I'm not too familiar with the process, so I wanted to check whether this is achievable. Or do I need a JS-enabled browser to make the request, because plain HTTP requests will never get a redirect back?

Here is my configuration:

        Faraday.new(url: connection_url, request: { timeout: 6 }) do |f|
          f.use FaradayMiddleware::FollowRedirects, cookies: :all
          f.use :cookie_jar
          f.request :url_encoded
          f.request :retry, {
            max: 2,
            interval: 1,
            interval_randomness: 0.5,
            backoff_factor: 2,
            retry_statuses: [429, 403],
            exceptions: [
              Faraday::RetriableResponse,
              Faraday::TimeoutError,
              Faraday::ConnectionFailed
            ],
            methods: %i[get post]
          }
        end

iMacTia commented 2 years ago

Hi @doutatsu, I believe the confusion here is that FollowRedirects only follows "hard" redirects. That's when the server returns one of the 3xx status codes together with a Location header indicating the redirect location (see https://github.com/lostisland/faraday_middleware/blob/main/lib/faraday_middleware/response/follow_redirects.rb#L7).
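To make the "hard" redirect rule concrete, here is a minimal sketch of the check described above, written with the Ruby standard library only. The helper name `hard_redirect?` and the exact list of statuses are assumptions for illustration; the linked source file is the authority on which codes the middleware really follows.

```ruby
require 'set'

# Assumed list of redirect statuses, for illustration only.
REDIRECT_STATUSES = Set[301, 302, 303, 307, 308].freeze

# Hypothetical helper: true only when the response is a "hard" redirect,
# i.e. a redirect status code together with a non-empty Location header.
def hard_redirect?(status, headers)
  # Header names are case-insensitive, so compare without regard to case.
  location = headers.find { |name, _| name.to_s.casecmp?('location') }&.last
  REDIRECT_STATUSES.include?(status) && !location.to_s.empty?
end

hard_redirect?(302, { 'Location' => 'https://example.com/next' }) # => true
hard_redirect?(403, { 'Content-Type' => 'text/html' })            # => false
```

Under this rule a 403 challenge page is never treated as a redirect, whatever its HTML body says.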

To my knowledge, the pages you're referring to work differently. The response in that case is a 200 status containing an HTML body (which is why you can see the page in the browser), and that page normally redirects you to the right place after a few seconds, using either a meta tag or a JS snippet.

I'm afraid that in this case you'd need to parse the response body (HTML/JS) and look for the redirection URL, which is outside the scope of the FollowRedirects middleware. There might be another middleware that does what you need (though I've never encountered one), or you could write your own middleware and reuse an existing gem to do the heavy lifting.
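As a rough illustration of the "parse the response body" idea, the sketch below extracts the target of a meta refresh tag using only the Ruby standard library. The name `meta_refresh_target` and the regular expression are assumptions made for this example, not part of any middleware; the pattern is deliberately simplified (it expects `http-equiv` before `content` and an integer delay), and it cannot handle redirects performed by JavaScript.

```ruby
require 'uri'

# Simplified pattern for <meta http-equiv="refresh" content="5; url=/target">.
# Assumes http-equiv appears before content and the delay is an integer.
META_REFRESH = /<meta[^>]*http-equiv=["']?refresh["']?[^>]*content=["']\s*\d+\s*;\s*url=["']?([^"'>]+)["']/i

# Hypothetical helper: returns the absolute redirect URL found in the HTML,
# or nil when the body contains no meta refresh tag.
def meta_refresh_target(html, base_url)
  match = META_REFRESH.match(html)
  return nil unless match

  # Resolve relative targets against the URL that was originally requested.
  URI.join(base_url, match[1].strip).to_s
end

html = '<meta http-equiv="refresh" content="5; url=/landing">'
meta_refresh_target(html, 'https://example.com/page')
# => "https://example.com/landing"
```

A custom middleware could apply a helper like this to the response body and request URL, then issue a follow-up request when it returns a URL. If it returns nil, the page is most likely redirecting via JS, and a plain HTTP client will not be enough.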

Please let me know if the above helps and if you have any further questions!

doutatsu commented 2 years ago

Thanks for the detailed response @iMacTia - I suspected as much, but wanted to double-check. I'll look for a URL somewhere in the response, but I suspect they redirect using JS, which won't work with a plain HTTP request.

iMacTia commented 2 years ago

Glad that was at least helpful, and sorry I couldn't provide an actual solution. I'll close the issue for now, but if you find a way forward, could you kindly share it here as well? I'm sure other people may find it useful in the future 💪.

I'm of course available to answer any further questions, especially if you decide to go down the road of writing a middleware 🙌!