Closed Chris00 closed 9 years ago
I believe Uri.(resolve "" uri (of_string "")) should do the normalization you require. It will handle scheme, host, path, and encoding normalization. This (like many behaviors) should be better documented and perhaps should be exposed directly. I'm leaving this open to track that.
Please let me know if you need different normalization than that provided by resolve, or need clarification about what exactly it does.
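For anyone who wants to try the suggestion above, here is a minimal sketch. It assumes the uri opam package is installed; the helper name normalize_via_resolve and the sample URIs are only illustrative. The output is printed rather than asserted, since exactly what gets normalized is the question under discussion.

```ocaml
(* Requires the `uri` opam package. *)

(* Hypothetical helper: resolve the empty reference against [uri].
   The "" argument is the fallback scheme used when [uri] has none. *)
let normalize_via_resolve (uri : Uri.t) : Uri.t =
  Uri.resolve "" uri (Uri.of_string "")

let () =
  List.iter
    (fun s ->
      let u = normalize_via_resolve (Uri.of_string s) in
      Printf.printf "%-32s -> %s\n" s (Uri.to_string u))
    [ "HTTP://Example.COM/a/./b/../c";
      "http://example.com";
      "http://example.com:80/" ]
```

Comparing the printed strings for inputs that should be equivalent is a quick way to see which normalizations are applied and which are not.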
Sorry, it does not work. One needs to provide the canonical form of the Uri.t so that comparison amounts to String.compare. Uri already removes any case in the hostname (a good thing). Section 6.2.3 says that an empty path should be normalized to a path of "/". Maybe Uri.compare needs some improvements too, in view of:
# Uri.compare (Uri.of_string "http://x.y") (Uri.of_string "http://x.y/");;
- : int = -1
The section also says that an explicit ":port", for which the port is empty or the default for the scheme, is equivalent to one where the port and its ":" delimiter are elided and thus should be removed by scheme-based normalization. Thus
http://example.com
http://example.com/
http://example.com:/
http://example.com:80/
are equivalent (and likewise for https, ...).
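The two rules quoted above (drop a default port, turn an empty path into "/") can be sketched on top of the existing accessors. This is only an illustration under stated assumptions, not the library's behaviour: scheme_normalize and default_port are hypothetical names, the default-port table covers only http and https, and the empty-port form (http://example.com:/) is left to whatever the parser does with it.

```ocaml
(* Requires the `uri` opam package. *)

(* Hypothetical table of default ports; only two schemes are covered. *)
let default_port = function
  | "http" -> Some 80
  | "https" -> Some 443
  | _ -> None

(* Hypothetical helper applying the two scheme-based rules of
   RFC 3986, section 6.2.3 that are discussed in this thread. *)
let scheme_normalize (uri : Uri.t) : Uri.t =
  (* Rule 1: remove an explicit port equal to the scheme's default. *)
  let uri =
    match Uri.scheme uri, Uri.port uri with
    | Some s, Some p when default_port (String.lowercase_ascii s) = Some p ->
      Uri.with_port uri None
    | _ -> uri
  in
  (* Rule 2: with an authority present, an empty path becomes "/". *)
  match Uri.host uri, Uri.path uri with
  | Some _, "" -> Uri.with_path uri "/"
  | _ -> uri

let canonical_string (s : string) : string =
  Uri.to_string (scheme_normalize (Uri.of_string s))
```

With such a helper, comparing canonical_string "http://example.com" with canonical_string "http://example.com:80/" reduces to plain string comparison, which is the property asked for here.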
I guess other normalizations should also be performed: for example, http://x.y/ is equivalent to http://x.y/?, ... The devil is in not forgetting any...
The failure to elide the port is a bug. The failure to normalize a missing path to the root path may also be a bug, but that is somewhat less obvious to me (e.g. the comparison example). Your query string example is not a valid normalization, as it changes bytes-on-the-wire and servers are free to interpret the ? and everything after it as they see fit.
Improving normalization is definitely an important issue that we should work on.
http://validator.w3.org/feed/ says that URIs used as identifiers should be in canonical form, as described by section 6 of RFC 3986. It would be great if Uri provided a function
that would return the canonical form. A function
would be as good for my needs.