sindresorhus / normalize-url

Normalize a URL
MIT License
840 stars 122 forks source link

Fix encoding consistency with sorted and non-sorted URL parameters #158

Closed ludofischer closed 2 years ago

ludofischer commented 3 years ago

Fix #149

Currently normalize-url encodes URL parameters when sortQueryParameters is true and leaves them as they are when not. This behaviour is confusing as a user would not expect sorting the parameters to affect the encoding. This PR tries to ensure the search parameters are always decoded, since normalize-url already always decodes the pathname and does not intend to sanitize the URL; https://github.com/sindresorhus/normalize-url/blob/6ea4038783d298d499583db07a820e5ca8ff2721/index.js#L160

sindresorhus commented 2 years ago

This really sounds like a WHATWG URL bug. It should always be consistent about encoding whether or not .sort() is called. If you can only reproduce this in Node.js, I recommend opening an issue on https://github.com/nodejs/node, otherwise https://github.com/whatwg/url

ludofischer commented 2 years ago

This really sounds like a WHATWG URL bug. It should always be consistent about encoding whether or not .sort() is called. If you can only reproduce this in Node.js, I recommend opening an issue on https://github.com/nodejs/node, otherwise https://github.com/whatwg/url

I can reproduce it in Firefox and Chrome too! And it seems the issue is well known: https://github.com/whatwg/url/issues/491:

And any attempts to manipulate it will change the contents of your URL's query string in unintended ways

ludofischer commented 2 years ago

See also https://github.com/whatwg/url/issues/478 where they decide to keep the current searchParams behaviour. Given that searchParams modifies the input by design and the working group does not seem to want to change it, would it be preferable to avoid using it altogether?

sindresorhus commented 2 years ago

Given that searchParams modifies the input by design and the working group does not seem to want to change it, would it be preferable to avoid using it altogether?

There's no easy way to avoid it as we need to keep the sort support and I don't want to bring in another library just for that.

sindresorhus commented 2 years ago

You could potentially only decode when !options.sortQueryParameters

I guess this is the best option then.

ludofischer commented 2 years ago

You could potentially only decode when !options.sortQueryParameters

I guess this is the best option then.

I've pushed an additional commit to check if the parameters have been sorted before decoding.

psamusev commented 2 years ago

The current behavior of urlObject.pathname = decodeURI(urlObject.pathname); doesn't work as expected. Since URL automatically always encode any value that we set, according to https://developer.mozilla.org/en-US/docs/Web/API/URL

URLs are encoded according to the rules found in [RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986)

sindresorhus commented 2 years ago

@psamusev It's better to open a new issue instead of commenting on an old pull request. Submitting a failing test would be even better.