lo48576 / iri-string

String types for URIs/IRIs.
Apache License 2.0

WHATWG URL Standard seems inconsistent with RFC 3986 when authority is absent and the path is relative #29

Closed (lo48576 closed 2 years ago)

lo48576 commented 2 years ago

After a bit of investigation, I notice that the WHATWG URL Standard treats the rootless path in foo:.///bar as opaque and does not require any dot segments to be removed from it: 2.8, 2.9 of scheme state; Live URL Viewer. So, I believe that the WHATWG spec is irrelevant when it comes to whether to preserve the relativity of paths or not, which brings us back to RFC 3986.

--- https://github.com/lo48576/iri-string/issues/28#issuecomment-1222402924

The primary goal of the string types in the iri-string crate is compliance with RFC 3986 and RFC 3987; I use the WHATWG URL Standard only as a supplement, to deal with the incompleteness of RFC 3986/3987. So foo:./bar being normalized into foo:bar is the expected result according to RFC 3986, even though the WHATWG URL Standard requires the path to be treated as "opaque".

However, foo:.///bar/./baz being normalized into foo:/.//bar/baz complies with neither RFC 3986 nor the WHATWG URL Standard. Applied literally, RFC 3986 yields foo://bar/baz, which is erroneous because bar would then be read as an authority, and the WHATWG URL Standard requires foo:.///bar/./baz to be kept unchanged, as reported above.
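To make the RFC 3986 side concrete, here is a minimal sketch of the "remove dot segments" algorithm from RFC 3986 section 5.2.4, using only the standard library. The function names (`remove_dot_segments`, `pop_last_segment`, `recompose_without_authority`) are hypothetical and this is not iri-string's actual implementation; it only illustrates how naive recomposition turns a rootless path into something that looks like an authority.

```rust
/// Sketch of RFC 3986 section 5.2.4 ("Remove Dot Segments").
/// Hypothetical helper, not part of the iri-string API.
fn remove_dot_segments(path: &str) -> String {
    let mut input = path;
    let mut output = String::new();
    while !input.is_empty() {
        if let Some(rest) = input
            .strip_prefix("../")
            .or_else(|| input.strip_prefix("./"))
        {
            // Step 2A: drop a leading "../" or "./".
            input = rest;
        } else if input.starts_with("/./") {
            // Step 2B: replace a leading "/./" with "/".
            input = &input[2..];
        } else if input == "/." {
            // Step 2B: a trailing "/." becomes "/".
            input = "/";
        } else if input.starts_with("/../") {
            // Step 2C: replace "/../" with "/" and drop the last output segment.
            input = &input[3..];
            pop_last_segment(&mut output);
        } else if input == "/.." {
            // Step 2C: a trailing "/.." becomes "/" and drops the last output segment.
            input = "/";
            pop_last_segment(&mut output);
        } else if input == "." || input == ".." {
            // Step 2D: a lone "." or ".." is removed.
            input = "";
        } else {
            // Step 2E: move the first segment (with its leading "/", if any)
            // from the input to the output.
            let start = if input.starts_with('/') { 1 } else { 0 };
            let end = input[start..]
                .find('/')
                .map_or(input.len(), |i| i + start);
            output.push_str(&input[..end]);
            input = &input[end..];
        }
    }
    output
}

/// Removes the last segment of `output` together with its preceding "/", if any.
fn pop_last_segment(output: &mut String) {
    match output.rfind('/') {
        Some(i) => output.truncate(i),
        None => output.clear(),
    }
}

/// Naive recomposition of a URI that has a scheme but no authority.
/// Deliberately has no guard against a path that starts with "//".
fn recompose_without_authority(scheme: &str, path: &str) -> String {
    format!("{}:{}", scheme, path)
}
```

With this sketch, `remove_dot_segments("./bar")` gives `bar`, and `remove_dot_segments(".///bar/./baz")` gives `//bar/baz`. Recomposing the latter with the scheme `foo` produces `foo://bar/baz`, in which `bar` has silently become an authority.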

The result required by the WHATWG URL Standard, foo:.///bar/./baz, does not seem to be consistent with the result RFC 3986 would ideally give (the path //bar/baz), so I should clearly define the behavior in this case and work to make it consistent.

lo48576 commented 2 years ago

Would adding an option to treat relative paths as "opaque" solve this problem? It might also be useful in other contexts.
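As a rough illustration of what such an option could look like, here is a minimal sketch. The `RelativePathMode` enum and the `should_remove_dot_segments` function are purely hypothetical names invented for this example; they are not part of iri-string, and the real design might differ.

```rust
/// Hypothetical option controlling how a rootless path without an
/// authority is handled during normalization.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RelativePathMode {
    /// Apply dot-segment removal as RFC 3986 describes.
    Normalize,
    /// Leave the path untouched, similar to how the WHATWG URL Standard
    /// treats an "opaque" path.
    Opaque,
}

/// Decides whether dot segments should be removed from `path`.
/// A path counts as rootless here when there is no authority and the
/// path does not start with "/".
fn should_remove_dot_segments(
    has_authority: bool,
    path: &str,
    mode: RelativePathMode,
) -> bool {
    let rootless = !has_authority && !path.starts_with('/');
    !(rootless && mode == RelativePathMode::Opaque)
}
```

Under this sketch, the path of foo:.///bar/./baz would be left as is in `Opaque` mode, while paths that start with "/" or follow an authority would still be normalized in either mode.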