Closed dlecocq closed 8 years ago
LGTM
The url-py
behavior is such that relative
comes with an implicit abspath
(in fact, it's urljoin
that does this operation, so url-py
had no control over it). In order to keep the expectations of relative
and relative_to
the same, url-py
adds in an extra abspath
implicit to relative_to
(https://github.com/seomoz/url-py/blob/0215c7ca698c3f9bcbb7f6a5c3e290264a8bb784/url/url.pyx#L159)
However, these last few commits are to address the fact that the existing url-cpp
implementation would erroneously promote path
-> /path
. We didn't witness this before because we've predominately focused on what the URL looks like when turned back into a string, which hides this issue. But, witness (while trying to update the url-cpp
submodule in url-py
):
>>> URL.parse('path').abspath().path
'/path'
>>> URL.parse('path').abspath().utf8
'/path'
Or, in the test case I was working on:
>>> URL.parse('javascript:console.log("hello")').relative_to('http://moz.com/').utf8
'javascript:/console.log("hello")'
@b4hand @neilmb -- do you want to take another look at the last four commits?
LGTM
This brings
url-cpp
closer to the behavior of Python'surlparse
, in the handling of three classes of schemes:netloc
params
Before this PR:
With this PR (same as
urlparse
interpretation):@b4hand @neilmb @lindseyreno