toland / patron

Ruby HTTP client based on libcurl
http://toland.github.com/patron/
MIT License

Redirect handling inconsistency #153

Closed zverok closed 6 years ago

zverok commented 6 years ago
require 'patron'
require 'typhoeus'

# patron:
resp = Patron::Session.new.get('https://instagram.com/p/Bd961ykHOUt/')
resp.headers
# => {"Location"=>"https://www.instagram.com/p/Bd961ykHOUt/", "Content-Type"=>"text/plain", "Server"=>"proxygen", "Date"=>"Wed, 17 Jan 2018 12:24
# So, should I handle redirects manually?..
resp.body.size
# => 34510 
# ...but then, why is the body here?..
# So, it HAD followed the redirects

# patron with no redirects
resp = Patron::Session.new(max_redirects: 0).get('https://instagram.com/p/Bd961ykHOUt/')
resp.body.size
# => 0
resp.headers
# => {"Location"=>"https://www.instagram.com/p/Bd961ykHOUt/", "Content-Type"=>"text/plain", "Server"=>"proxygen", "Date"=>"Wed, 17 Jan 2018 12:34:02 GMT", "Connection"=>"keep-alive", "Content-Length"=>"0"} 
# ^ EXACTLY the same as above

# typhoeus
resp = Typhoeus.get('https://instagram.com/p/Bd961ykHOUt/')
resp.headers
# => {"Location"=>"https://www.instagram.com/p/Bd961ykHOUt/", "Content-Type"=>"text/plain", "Server"=>"proxygen", "Date"=>"Wed, 17 Jan 2018 12:31:44 GMT", "Connection"=>"keep-alive", "Content-Length"=>"0"} 
resp.body.size
# => 0 

# typhoeus following redirects
resp = Typhoeus.get('https://instagram.com/p/Bd961ykHOUt/', followlocation: true)
resp.headers
# => {"Content-Type"=>"text/html", "X-Frame-Options"=>"SAMEORIGIN", "Vary"=>"Cookie, Accept-Language, Accept-Encoding", "Cache-Control"=>"private, no-cache, no-store, must-revalidate", "Pragma"=>"no-cache", "Expires"=>"Sat, 01 Jan 2000 00:00:00 GMT", "Content-Language"=>"en", "Date"=>"Wed, 17 Jan 2018 12:32:56 GMT", "Strict-Transport-Security"=>"max-age=86400", "Set-Cookie"=>["csrftoken=fq87p3wpdwOCLC0kZwaRayeQfAHLiGC0; expires=Wed, 16-Jan-2019 12:32:56 GMT; Max-Age=31449600; Path=/; Secure", "rur=ATN; Path=/", "mid=Wl9CeAAEAAHGFbJsMnz9-1PAOcNw; expires=Tue, 12-Jan-2038 12:32:56 GMT; Max-Age=630720000; Path=/", "urlgen=\"{\\\"time\\\": 1516192376\\054 \\\"94.179.93.165\\\": 6849}:1ebmtk:of6d9rD6ldNEohWdZt9dkjcJCqo\"; Path=/"], "Connection"=>"keep-alive", "Content-Length"=>"34584"} 
resp.body.size
# => 34584 

So, the problem is: when a redirect is followed, the response body comes from the final URL, but the response still carries the headers and status of the first URL fetched.
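One possible explanation, which this thread does not confirm: when libcurl follows redirects, its header callback receives the header lines of every response in the chain, not only the final one. A client that parses only the first block would show exactly this symptom. The sketch below is plain Ruby with no gems; `final_response_headers` is a hypothetical helper, not part of Patron or Typhoeus, and the raw header dump and `example.com` URL are made up for illustration.

```ruby
# Hypothetical helper (an assumption for illustration, not Patron's code).
# Takes a raw header dump that may contain several responses, one per hop
# in a redirect chain, and parses only the last one.
def final_response_headers(raw)
  # Header blocks are separated by an empty line (CRLF CRLF or LF LF).
  blocks = raw.split(/\r?\n\r?\n/).map(&:strip).reject(&:empty?)
  last = blocks.last
  return nil if last.nil?

  # First line of the block is the status line, the rest are header fields.
  status_line, *fields = last.split(/\r?\n/)
  headers = fields.each_with_object({}) do |line, acc|
    name, value = line.split(':', 2)
    next if value.nil? # skip malformed lines without a colon
    # Note: repeated fields such as Set-Cookie are not merged in this sketch.
    acc[name.strip] = value.strip
  end

  { status_line: status_line, headers: headers }
end

# Made-up dump: a 301 hop followed by the final 200 response.
raw = "HTTP/1.1 301 Moved Permanently\r\n" \
      "Location: https://www.example.com/final\r\n" \
      "Content-Length: 0\r\n" \
      "\r\n" \
      "HTTP/1.1 200 OK\r\n" \
      "Content-Type: text/html\r\n" \
      "Content-Length: 34584\r\n" \
      "\r\n"

result = final_response_headers(raw)
puts result[:status_line]
puts result[:headers].inspect
```

With this approach the `Location` header of the intermediate 301 hop does not leak into the headers reported for the final response, which is the behaviour Typhoeus shows with `followlocation: true`.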

julik commented 6 years ago

@zverok ohai, haven't heard from you in ages ;-) Thanks for reporting, I'll take care of this.

zverok commented 6 years ago

> @zverok ohai, haven't heard from you in ages

http://tvtropes.org/pmwiki/pmwiki.php/Main/WereStillRelevantDammit :trollface:

julik commented 6 years ago

Closed via #154