Closed danielbeardsley closed 6 years ago
Question why are you url encoding? The class does it for you. So don't use rawurlencode and you should be fine
why are you url encoding?
The docs didn't mention that the class would do it for me.
don't use rawurlencode and you should be fine
The library seems to try to guess if the content I'm giving it has already been encoded or not. If the content includes bits that resemble encoded characters, it can guess wrong and produce incorrect results.
When I drop the rawurlencode()
from above, I end up with the same incorrect output: http://blah.com/hi%20
. It seems to interpret the %20
as an already encoded space and then doesn't encode it further. This means there's no way to generate the valid url: http://blah.com/hi%2520
Actually the problem is somewhat deeper than that; the library actively mutates valid urls and changes their content:
$uri = League\Uri\Schemes\Http::createFromString("http://blah.com/hi%2520");
echo $result;
Produces:
http://blah.com/hi%20
Interestingly, it seems it parses it and runs each segment through the decoder up to two times:
$uri = League\Uri\Schemes\Http::createFromString("http://blah.com/%25252520/%252520/%2520/%20/ /");
echo $result;
Produces:
http://blah.com/%2520/%20/%20/%20/%20/
@danielbeardsley I've found the bug it's in another component and quite frankly I'm in a process of killing this uri-manipulations library as it's just a wrapper around another component where the bug is uri-components. So in the next major release this package will be killed/removed.
The library seems to try to guess if the content I'm giving it has already been encoded or not. If the content includes bits that resemble encoded characters, it can guess wrong and produce incorrect results.
Yes it does because the goal is to avoid double encoding like in PSR-7 UriInterface::withPath
method which accepts both encoded and non encoded data and returns the correct path with the correct encoding.
When I get enough time I'll tackle this bug in the uri-component package and this package will have the bug automatically fixed.
a bug fix has landed on the master branch of the URI Component package. the two reported bugs are fixed. the bug will be fixed one this package once I release a bug fix release of that component package (ie: 1.8.3).
bug fix is in prod with league/uri-components 1.8.2 being released
Issue summary
AppendSegment can incorrectly encode / decode if they include an encoded
%
sign. We should be able to url-encode any valid ascii characters, and include them in a url such that the server (after decoding) gets the same characters out.System information
Standalone code
Expected result
Actual result
Note: if the above
$path
washi%11
the Library would throw an Exception.