doutatsu closed this issue 2 years ago
Hi @doutatsu, I believe the confusion here is that FollowRedirects only follows "hard" redirects. That's when the server returns one of the 3xx status codes together with a Location header indicating the redirect location (see https://github.com/lostisland/faraday_middleware/blob/main/lib/faraday_middleware/response/follow_redirects.rb#L7).
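To make the "hard" redirect condition concrete, here is a minimal sketch in plain Ruby. It is not the middleware's actual code: the helper name hard_redirect? is hypothetical, and the status list is my assumption of the usual redirect codes, so check the linked source for the exact set the middleware uses.

```ruby
# Assumed set of redirect status codes; the linked source has the authoritative list.
REDIRECT_STATUSES = [301, 302, 303, 307, 308].freeze

# Hypothetical helper: true when a response looks like a "hard" redirect,
# i.e. a redirect status code together with a non-empty Location header.
# Header names are compared case-insensitively.
def hard_redirect?(status, headers)
  location = headers.find { |name, _| name.to_s.casecmp?("location") }&.last
  REDIRECT_STATUSES.include?(status) && !location.to_s.strip.empty?
end
```

A 200 or 403 response with an HTML body fails this check regardless of what the page says, which is why the middleware has nothing to follow in that case.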
To my knowledge, the pages you're referring to work differently. The response in that case is a 200 status containing an HTML body (which is why you can see the page in the browser), and that page normally redirects you to the right page after a few seconds. It uses either a meta tag or a JS snippet to accomplish the redirect.
I'm afraid that in such a case you'd need to parse the response body (HTML/JS) and look for the redirection URL, which is outside the scope of the FollowRedirects middleware.
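For the meta tag variant only, parsing the body could look roughly like the sketch below. The method name meta_refresh_url is hypothetical, the regexes are a simplification (a real HTML parser would be more robust), and it assumes the page uses a standard meta refresh tag with the http-equiv attribute before content. It does nothing for JS-driven redirects, which need a JS-capable client.

```ruby
require "uri"

# Hypothetical helper: extracts the target of a tag such as
#   <meta http-equiv="refresh" content="5; url=/real-page">
# and resolves it against base_url. Returns nil when no such tag is found.
def meta_refresh_url(html, base_url)
  # Find the first meta tag whose http-equiv attribute is "refresh".
  tag = html[/<meta\b[^>]*http-equiv\s*=\s*["']?refresh["']?[^>]*>/i]
  return nil unless tag

  # Pull the URL out of content="<seconds>; url=<target>".
  target = tag[/content\s*=\s*["']\s*\d+\s*;\s*url\s*=\s*['"]?([^"'>\s]+)/i, 1]
  return nil unless target

  URI.join(base_url, target).to_s
end
```

The returned URL could then be requested with the same connection (and cookie jar), for example from a small custom middleware or from the calling code.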
There might be another middleware accomplishing what you need (though I've never encountered it), or you might write your own middleware and reuse some existing gem that does the heavy lifting in the implementation.
Please let me know if the above helps and if you have any further questions!
Thanks for the detailed response @iMacTia - I suspected this, but wanted to double-check. I'll need to see if there is anything in the response with the URL, but I suspect they redirect using JS, which won't work with a plain HTTP request.
Glad that was at least helpful, and sorry for not being able to provide an actual solution. I'll close the issue for now, but should you find a way forward, could you kindly share it here as well? I'm sure other people may find it useful in the future 💪.
I'm obviously available to answer any further questions as well, especially if you decide to go down the road of writing a middleware 🙌!
I've been trying to make use of the cookie_jar + follow-redirects middlewares to bypass DDoS protection, whether Cloudflare or otherwise. As it's not captcha protection, I thought it would be possible to just follow the redirect provided, as mentioned in the 403 response: "This process is automatic. Your browser will redirect to your requested content shortly.<br>Please allow up to 5 seconds..."
I am not too familiar with the process, so I wanted to check whether this is possible to achieve. Or do I need a JS-enabled browser to make the request, because simple HTTP requests won't be able to get a redirect back?
Here is my configuration: