Athlon1600 / php-proxy

A web proxy script written in PHP and built as an alternative to Glype.
https://www.php-proxy.com
MIT License
295 stars, 158 forks

Some urls / domains get redirected #137

Open Collusion opened 6 years ago

Collusion commented 6 years ago

Example domain: http://superbestaudiofriends.org/index.php

If I try to open any of the subforums through the proxified page, I am sent back to the front page (the URL above). I've tried to debug this myself, but haven't had any luck.

Example: http://superbestaudiofriends.org/index.php?threads/what-are-you-listening-to-right-now.6/page-254

At ProxifyPlugin.php in function onBeforeRequest(ProxyEvent $event):

$request = $event['request'];

If I dump the $request object, I get the following output:

object(Proxy\Http\Request)#18 (10) {
  ["method":"Proxy\Http\Request":private]=>
  string(3) "GET"
  ["url":"Proxy\Http\Request":private]=>
  string(42) "http://superbestaudiofriends.org/index.php"
  ["protocol_version":"Proxy\Http\Request":private]=>
  string(3) "1.1"
  ["params"]=>
  object(Proxy\Http\ParamStore)#19 (2) {
    ["data":protected]=>
    array(0) {
    }
    ["case_sensitive":protected]=>
    bool(false)
  }
  ....

So for some reason, unknown to me, the URL changes along the way. So far this is the only domain where I've encountered the issue, but there are bound to be more.

Athlon1600 commented 6 years ago

The issue is with the way we are parsing that forum URL:

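// Assumes php-proxy is installed via Composer; adjust the autoload path as needed.
require 'vendor/autoload.php';

use Proxy\Http\Request;

echo '<pre>';

$url = "http://superbestaudiofriends.org/index.php?threads/what-are-you-listening-to-right-now.6/page-254";
var_dump($url); // the URL as requested

// The initial URL is only a placeholder; setUrl() replaces it.
$request = new Request('GET', 'url.com');
$request->setUrl($url);

var_dump($request->getUri()); // the URL as rebuilt by Request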

which outputs:

string(97) "http://superbestaudiofriends.org/index.php?threads/what-are-you-listening-to-right-now.6/page-254"
string(102) "http://superbestaudiofriends.org/index.php?threads%2Fwhat-are-you-listening-to-right-now_6%2Fpage-254="

The http_build_query function from here:
https://github.com/Athlon1600/php-proxy/blob/master/src/Http/Request.php#L172

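percent-encodes all the slashes (the dots have already been turned into underscores by the time the query string has been parsed into an array) in a path-like query string such as: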

forums/tales-from-the-bully-pulpit.2/

and transforms it into something that the superbestaudiofriends.org server now treats as a completely different URL. The solution is not as easy as I thought it was going to be. I will look into it more this weekend.
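
The mangling can be reproduced without any php-proxy classes. The snippet below is a minimal sketch that assumes the query string is parsed with PHP's built-in parse_str() before being rebuilt with http_build_query(). It only uses the query portion of the forum URL from this thread.

```php
<?php
// Standalone reproduction using only PHP built-ins (assumption: the query
// string goes through parse_str() before http_build_query() rebuilds it).
$query = 'threads/what-are-you-listening-to-right-now.6/page-254';

// parse_str() sees no "=", so the whole string becomes one key with an
// empty value. PHP also rewrites the dot in the key to an underscore.
parse_str($query, $params);

// http_build_query() percent-encodes the slashes in the key and appends
// "=" for the empty value.
$rebuilt = http_build_query($params);

echo $rebuilt, PHP_EOL;
// threads%2Fwhat-are-you-listening-to-right-now_6%2Fpage-254=
```

The rebuilt string matches the query portion of the second var_dump output above, so the round trip through an array of parameters is lossy for query strings that are not plain key=value pairs.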

Collusion commented 6 years ago

I think I bumped into the same problem with this site:

https://headmania.org/2015/08/18/schiit-yggdrasil-dac-review/

The html loads fine, but two css files fail:

https://s1.wp.com/_static/??-eJyNkVtuAyEMRTdUF00b9fFRdS0M4xInGCPwKMru6wmK1KQV6g/yNff4Ae5UIEhWzOp4hZLWSLm5UwnC0JgSnu/UY2jtwf2NJTpicwfU4sMRLmpkr5i84gJFmt6pERakouW5eN0cjAt5TMhmG2FcXq7UFu5txGGbvvU8l4qtgZ1MK4PurdFvrqddWWenlVSyba/omp4T/tdMOXQALq2Hs9ES0d6sSSCfgMxyKzpMLov2y2swqhpRIEnwSpJvBHwlT3X8lXOSaGF05vohN+iTP6bd++vT2zQ9T4dv7wrr+w==?cssminify=yes

https://s2.wp.com/_static/??-eJx9i0EKAjEMRS9kDYUZGRfiWTKltpE0KU0Grz+4EBHF1X8f3oNHD0nFszi0LXTeCokBJieVFcebjsnsAL/1rubhxkgDrOIgKa/9V5kmQg6sRT/PV+Q1t2xQJyisK/JTuLZLnOPpvMxxWu47xX5JvA==?cssminify=yes

The URL gets modified along the way and what I get is a 400 response for those two files.
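
One possible direction, sketched here as an assumption and not as php-proxy's actual fix, is to keep the query string as raw text and never round-trip it through an array. The helpers splitRawQuery() and joinRawQuery() are hypothetical names, and the URL is a shortened stand-in for the wp.com stylesheet URLs above, not a real asset.

```php
<?php
// Hypothetical helpers, not part of php-proxy.

// Split a URL at the first "?" and return [base, raw query] without
// decoding or re-encoding anything.
function splitRawQuery(string $url): array
{
    // Drop the fragment; it is never sent to the server.
    $url = explode('#', $url, 2)[0];
    $parts = explode('?', $url, 2);

    return [$parts[0], $parts[1] ?? ''];
}

// Reassemble the URL from the untouched pieces.
function joinRawQuery(string $base, string $rawQuery): string
{
    return $rawQuery === '' ? $base : $base . '?' . $rawQuery;
}

// Shortened stand-in for the wp.com URLs in this comment (illustrative only).
$url = 'https://s1.wp.com/_static/??-eJyN+abc/def==?cssminify=yes';

[$base, $rawQuery] = splitRawQuery($url);
$rebuilt = joinRawQuery($base, $rawQuery);

var_dump($rebuilt === $url); // bool(true)
```

The trade-off is that code which needs to read or change individual GET parameters would still need a parsed view, so a real fix would probably keep the raw string and only rebuild it when a parameter has actually been modified.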