aio-libs / yarl

Yet another URL library
https://yarl.aio-libs.org
Apache License 2.0

Over-zealous empty query/fragment normalization like bpo-37969 #332

Open nicktimko opened 5 years ago

nicktimko commented 5 years ago

I saw https://bugs.python.org/issue37969 and was curious how yarl handles an empty query string, given that RFC 3986 § 6.2.3 says:

Normalization should not remove delimiters when their associated component is empty unless licensed to do so by the scheme specification†. For example, the URI "http://example.com/?" cannot be assumed to be equivalent to any of the examples ["http://example.com", "http://example.com/", "http://example.com:/", "http://example.com:80/"].

† Is there a separate spec for HTTP that says you can? Would this impact a generic URL parser anyways?

>>> yarl.URL("http://example.com/?#") == yarl.URL("http://example.com/")
True
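
For reference, whether the delimiters were present at all can be read off the raw string before any parser touches it. The following is only a minimal sketch: `delimiter_flags` is a hypothetical helper (not part of yarl or the standard library), and it assumes the RFC 3986 rule that the fragment starts at the first "#" and the query at the first "?" before that.

```python
def delimiter_flags(url: str) -> tuple[bool, bool]:
    """Return (has_query_delimiter, has_fragment_delimiter) for a raw URL string.

    Hypothetical helper for illustration only; it does no validation.
    """
    # Everything after the first "#" is the fragment, so split that off first.
    before_fragment, hash_sep, _ = url.partition("#")
    # A "?" only introduces a query if it appears before the fragment.
    _, question_sep, _ = before_fragment.partition("?")
    return bool(question_sep), bool(hash_sep)


print(delimiter_flags("http://example.com/?#"))  # (True, True)
print(delimiter_flags("http://example.com/"))    # (False, False)
```

The two URLs compared above differ under this check even though their parsed components are identical, which is the distinction RFC 3986 § 6.2.3 asks normalizers to keep.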

I also can't see a way to use yarl.URL.build to end up with a blank (but delimited) query string or fragment.

I'm wondering whether this would just be a 'won't-fix' in yarl, or whether there's some clean design solution. Also, if any URL savants want to offer an opinion on the bugs.python.org issue, that would be welcome.


⁂ even more niche "issue":

>>> yarl.URL("http://example.com:80/") == yarl.URL("http://example.com:/")
False

bdraco commented 2 months ago

We use urlsplit from the standard library, so we likely need to wait for https://github.com/python/cpython/issues/99962
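
A small sketch of the underlying behavior, using only the standard library and default arguments: urlsplit reports an empty string for both an absent and an empty-but-delimited query or fragment, so the two inputs become indistinguishable once parsed.

```python
from urllib.parse import urlsplit, urlunsplit

with_delims = urlsplit("http://example.com/?#")
without_delims = urlsplit("http://example.com/")

# Both results carry query="" and fragment="", so they compare equal.
print(with_delims == without_delims)  # True

# Reassembling the parsed parts drops the bare "?" and "#".
print(urlunsplit(with_delims))  # http://example.com/
```

Any library that builds on this result inherits the same loss of information unless it inspects the raw string itself.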

bdraco commented 2 months ago

⁂ even more niche "issue":

>>> yarl.URL("http://example.com:80/") == yarl.URL("http://example.com:/")
False

We don't do any port normalizing in __eq__ or __hash__

#1033 did a bit of work on that effort, but I'm not sure whether we should be normalizing in __eq__ or __hash__, as it might be considered a breaking change.