Closed qsantos closed 1 year ago
You reference an RFC. This crate explicitly only implements the URL Standard.
According to the reference URL parser, `m:/.//\\` should parse successfully.
My bad. Thanks for the proper reference. Then, it looks like this comes from the serialization part, more specifically, from section 4.5, item 3, whose note explicitly mentions the case we are discussing. I have correspondingly moved the correction to the serialization.
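To make the serialization rule concrete, here is a minimal sketch of the URL Standard's serializer step mentioned above: when the host is null, the path is not opaque, the path has more than one segment, and the first segment is empty, `/.` is emitted before the path. `UrlRecord` and `serialize` are hypothetical names for illustration only, not rust-url's actual types or API, and credentials, port, query and fragment are left out.

```rust
/// Hypothetical, simplified URL record: only the fields the rule needs.
/// This is not rust-url's internal representation.
struct UrlRecord {
    scheme: String,
    host: Option<String>,
    /// Segments of a non-opaque path.
    path: Vec<String>,
}

/// Serializes the record, applying the "/." guard for host-less URLs
/// whose path would otherwise start with "//".
fn serialize(url: &UrlRecord) -> String {
    let mut output = format!("{}:", url.scheme);
    match &url.host {
        Some(host) => {
            output.push_str("//");
            output.push_str(host);
        }
        None => {
            // Without this prefix, the serialized path would begin with
            // "//" and be re-parsed as an authority.
            if url.path.len() > 1 && url.path[0].is_empty() {
                output.push_str("/.");
            }
        }
    }
    // Path serialization: each segment is preceded by "/".
    for segment in &url.path {
        output.push('/');
        output.push_str(segment);
    }
    output
}
```

Under these assumptions, a record with scheme `m`, no host, and two empty path segments serializes back to `m:/.//` rather than `m://`.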
Rebased to take into account the new tests
Patch coverage: 83.87% and project coverage change: -0.02 :warning:
Comparison is base (edeaea7) 82.74% compared to head (fd29f5f) 82.73%.
I have moved the handling of empty fragment to `with_query_and_fragment`. That way, it should be able to normalize URLs whether they are a fresh parse, or a combination of an existing URL and a relative path.
Thanks for the quick responses and the detailed comments!
This PR closes https://github.com/servo/rust-url/issues/799 by refusing to parse URLs with no authority whose normalized path would start with a double-slash.
The issue comes from the remove dot segments step. Let's consider the URI `m:/.//`. Then, according to the RFC:

So it should be normalized to `m://`, but this has different semantics (resulting in `\` being interpreted as being part of the authority in the original example).

I have conducted a few tests with some URI normalization libraries:
For PHP, I am using https://github.com/glenscott/url-normalizer. In short:
rust-url
I feel like Ruby's is the most consistent and straightforward solution.
This does mean that the `no_panic` test from https://github.com/servo/rust-url/issues/654 must be amended. However, we can cover the `m:/.//` case in a dedicated unit test.
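For reference, the remove dot segments step discussed above can be sketched as follows, assuming the RFC in question is RFC 3986 (section 5.2.4). This is an illustrative standalone function, not the code used by rust-url; `remove_dot_segments` and `pop_last_segment` are names chosen for this example.

```rust
/// Sketch of the remove_dot_segments algorithm, applied to a path only.
fn remove_dot_segments(path: &str) -> String {
    let mut input = path.to_string();
    let mut output = String::new();
    while !input.is_empty() {
        if input.starts_with("../") {
            // A: drop a leading "../"
            input.replace_range(..3, "");
        } else if input.starts_with("./") {
            // A: drop a leading "./"
            input.replace_range(..2, "");
        } else if input.starts_with("/./") {
            // B: replace a leading "/./" with "/"
            input.replace_range(..3, "/");
        } else if input == "/." {
            input = "/".to_string();
        } else if input.starts_with("/../") {
            // C: replace a leading "/../" with "/" and drop the last output segment
            input.replace_range(..4, "/");
            pop_last_segment(&mut output);
        } else if input == "/.." {
            input = "/".to_string();
            pop_last_segment(&mut output);
        } else if input == "." || input == ".." {
            // D: a lone dot segment is discarded
            input.clear();
        } else {
            // E: move the first segment (with its leading "/", if any),
            // up to but not including the next "/", to the output
            let start = if input.starts_with('/') { 1 } else { 0 };
            let end = input[start..]
                .find('/')
                .map_or(input.len(), |i| i + start);
            output.push_str(&input[..end]);
            input.replace_range(..end, "");
        }
    }
    output
}

/// Removes the last segment of `output` together with its preceding "/", if any.
fn pop_last_segment(output: &mut String) {
    match output.rfind('/') {
        Some(i) => output.truncate(i),
        None => output.clear(),
    }
}
```

Applied to the path `/.//`, this sketch yields `//`, so naively joining it back onto the scheme gives `m://`, which is the problematic form described above.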