Open adonig opened 6 months ago
Sorry, I don't understand the idea behind this change. Could you please explain?
It allows you to use a different normalization function without having to replace the whole UniqueRequest middleware. For instance, some search engines treat URLs with and without a trailing slash as identical, while others do not. Some go even further and treat /index.html the same way. Some lowercase the host part of the URL, and some even sort the query string parameters alphanumerically or resolve relative paths. For example, I use this normalization function:
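def normalize_url(url) do
  parsed = URI.parse(url)

  # Only http/https URLs that have a host are normalized; anything else yields nil.
  if parsed.scheme in ["http", "https"] and parsed.host do
    # Rebuild the URL with a lowercased host. Fields not listed here
    # (port, userinfo, fragment) are left out of the result.
    %URI{
      scheme: parsed.scheme,
      host: String.downcase(parsed.host),
      path: parsed.path,
      query: parsed.query
    }
    |> URI.to_string()
  else
    nil
  end
end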
Section 6 of RFC 3986 goes a bit deeper into the topic of URL normalization.
I found out that Erlang comes with an RFC 3986-compliant URL normalization function: :uri_string.normalize/1
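As a rough sketch of how that function could be called from Elixir: the wrapper module and function names below (`NormalizeSketch`, `normalize_url/1`) are hypothetical and not part of this change. It assumes the input is an Elixir binary and maps the error tuple that `:uri_string.normalize/1` returns for invalid input to `nil`.

```elixir
defmodule NormalizeSketch do
  # Hypothetical wrapper for illustration; the name is not from the PR.
  # :uri_string.normalize/1 returns the normalized URI in the same
  # string type as its input, or {:error, reason, detail} on bad input.
  def normalize_url(url) when is_binary(url) do
    case :uri_string.normalize(url) do
      normalized when is_binary(normalized) -> normalized
      {:error, _reason, _detail} -> nil
    end
  end
end

# Default port removed and empty path replaced by "/":
IO.inspect(NormalizeSketch.normalize_url("http://localhost:80"))
# Dot segments resolved:
IO.inspect(NormalizeSketch.normalize_url("/a/b/c/./../../g"))
```

Both inputs are taken from the examples in the Erlang `uri_string` documentation, which gives `http://localhost/` and `/a/g` as the results.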
I believe it's still a good idea to allow people to provide their own implementation, because some might want to extend the behavior of the RFC, as Cloudflare and Kaspersky do, for example.
This parameter lets you customize the normalization behavior by supplying a unary normalization function.
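To illustrate the idea of a pluggable unary normalizer, here is a minimal, self-contained sketch. It is not the middleware's actual API: `DedupSketch`, `check/3`, and the `strip_slash` function are hypothetical names, and the set of seen URLs is kept in a plain `MapSet` purely for demonstration.

```elixir
defmodule DedupSketch do
  # Hypothetical helper, not the real middleware interface.
  # `seen` is a MapSet of normalized URLs. `normalize` is any unary
  # function that returns a string, or nil to reject the URL.
  def check(seen, url, normalize \\ &Function.identity/1) do
    case normalize.(url) do
      nil ->
        {:drop, seen}

      key ->
        if MapSet.member?(seen, key) do
          {:drop, seen}
        else
          {:keep, MapSet.put(seen, key)}
        end
    end
  end
end

# A custom normalizer that treats URLs with and without a trailing
# slash as the same request.
strip_slash = fn url -> String.trim_trailing(url, "/") end

{:keep, seen} = DedupSketch.check(MapSet.new(), "http://example.com/a/", strip_slash)
{:drop, _seen} = DedupSketch.check(seen, "http://example.com/a", strip_slash)
```

With the default identity normalizer the two URLs above would count as distinct, which is the kind of difference the parameter is meant to let users control.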