FriendsOfPHP / Goutte

Goutte, a simple PHP Web Scraper

Is there a reason why Goutte does not let Guzzle handle redirects? #273

Closed: georaldc closed this issue 8 years ago

georaldc commented 8 years ago

I ask because I'm running into a site whose 302 response has a Location value consisting only of query string parameters. It looks like BrowserKit just appends this value to the host name parsed from the URI, which redirects me to the wrong location. To make things clearer:

URL being accessed: http://website.com/foo/bar -> 302 redirects to "?test=123"

Results using BrowserKit, Guzzle with redirects, and a regular browser:

BrowserKit: http://website.com?test=123
Guzzle: http://website.com/foo/bar?test=123
Firefox: http://website.com/foo/bar?test=123

The Guzzle and Firefox results are what I would have expected. Is there any way to get BrowserKit to follow redirects the same way?
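For reference, the expected result can be checked independently of PHP. The following is a minimal sketch using Python's standard `urllib.parse.urljoin`, which resolves relative references in the way RFC 3986 describes. The variable names are illustrative only, and this is not code from Goutte, Guzzle, or BrowserKit.

```python
from urllib.parse import urljoin

# Values taken from the report above.
base = "http://website.com/foo/bar"
location = "?test=123"  # query-only Location header value

# A query-only reference keeps the base path and replaces only the query.
resolved = urljoin(base, location)
print(resolved)
```

This matches what Guzzle and Firefox produce: the path `/foo/bar` is kept and only the query string changes.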

georaldc commented 8 years ago

I made a mistake: it doesn't just append the value to the host name. Instead, if the path does not end with a forward slash, it strips the last path segment and appends the query string to what remains. So basically:

Browserkit: http://website.com/foo/?test=123

That is still not what I expected, though. I tracked down why this happens and opened an issue about it on the Symfony repo (https://github.com/symfony/symfony/issues/19303).
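To illustrate the difference, here is a rough Python sketch of two resolvers. `naive_resolve` is a hypothetical function that treats a query-only reference like a relative path, which reproduces the URL reported above. It is an assumption about the kind of logic involved, not BrowserKit's actual code. `query_only_resolve` (also hypothetical) keeps the base path, as RFC 3986 prescribes for a reference with an empty path.

```python
from urllib.parse import urlsplit, urlunsplit


def naive_resolve(base: str, location: str) -> str:
    """Hypothetical buggy resolver: handles "?query" like a relative path."""
    parts = urlsplit(base)
    # Keep the base path only up to and including its last "/",
    # which drops the final segment ("bar" in "/foo/bar").
    directory = parts.path[: parts.path.rfind("/") + 1]
    return f"{parts.scheme}://{parts.netloc}{directory}{location}"


def query_only_resolve(base: str, location: str) -> str:
    """Keep the full base path and replace only the query component."""
    parts = urlsplit(base)
    # location[1:] strips the leading "?" from the query-only reference.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, location[1:], ""))


base = "http://website.com/foo/bar"
location = "?test=123"

buggy = naive_resolve(base, location)
expected = query_only_resolve(base, location)
print(buggy)
print(expected)
```

The first function yields the `/foo/?test=123` form seen with BrowserKit, while the second yields the `/foo/bar?test=123` form seen with Guzzle and Firefox.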

georaldc commented 8 years ago

This can be closed now. The related fix has been merged into the Symfony repo.