sindresorhus / query-string

Parse and stringify URL query strings
MIT License
6.76k stars 450 forks source link

fix: Encode fragmentIdentifier with encodeURI method not encodeURIComponent at stringifyUrl #355

Closed hanbin9775 closed 1 year ago

hanbin9775 commented 1 year ago

Fixes: #346

FragmentIdentifier should be encoded with encodeURI method not encodeURIComponent method. Also added test code about this fix.

sindresorhus commented 1 year ago

Are you sure it's correct to use encodeURI? AFAIK, it's meant for complete URLs. Can you link to a source about it being correct to use in this situation?

hanbin9775 commented 1 year ago

I did some investigation, and found out something wrong about my PR. Thank you for the feedback!

URL hash setter in Node.js

Node.js implements url.hash setter with this following.

Invalid URL characters included in the value assigned to the hash property are percent-encoded.

Url fragments are percent encoded with following percent-encode set. (As known as fragment percent-encode set)

C0 control percent-encode set and code points U+0020, U+0022, U+003C, U+003E, and U+0060.

C0 control percent-encode set is

code points in range U+0000 to U+001F (inclusive) and all code points greater than U+007E.

You can find list of code points in here.

This is also described in URL Standard. (https://url.spec.whatwg.org/#fragment-state)

encodeURI, encodeURIComponent

The only difference with encodeURI and encodeURIComponent is whether the following character set is encoded: ;/?:@&=+$,# (https://tc39.es/ecma262/multipage/global-object.html#sec-encode)

Both methods encodes characters which is not the ASCII word characters and "-.!~*'()".

Conclusion: difference with encodeURI/encodeURIComponent and url.hash setter

Found out percent-encode sets are different. If we express the character set that will display different encoding result, it will be as follows

For example code point U+005B ( [ ) is not included on fragment percent-encode set but url.hash setter will percent-encode it.

const fragement = '[';
const x = new URL(url);
x.hash = fragment;

console.log(encodeURI(fragment)); // '%5B'
console.log(encodeURIComponent(fragment)); // '%5B'

console.log(url.hash); // '['

So I believe the easiest implement for fixing #355 issue will be like this.

as-is

if (object.fragmentIdentifier) {
    hash = `#${options[encodeFragmentIdentifier] ? encode(object.fragmentIdentifier, options) : object.fragmentIdentifier}`;
}

to-be

if (object.fragmentIdentifier) {
    const newURL = new URL(url);
    newURL.hash = object.fragmentIdentifier;
    hash = options[encodeFragmentIdentifier] ? newURL.hash : `#${object.fragmentIdentifier}`;
}

Or we could create a new function for encoding url fragment. encodeURIFragment