Closed ivan-tymoshenko closed 2 years ago
It appears encodeURI is encoding the [], but the URL standard does not do that. Try instead:
const inputPath = new URL('http://t.t/[]%23');
[ and ] are reserved characters and should be encoded before sending. https://datatracker.ietf.org/doc/html/rfc3986#section-2.2
URLPattern operates on URLs, not URIs. URLs only percent encode a few codepoints in the path:
https://url.spec.whatwg.org/#path-percent-encode-set
You can test this out on the live URL viewer here:
https://jsdom.github.io/whatwg-url/#url=aHR0cDovL3QudC9bXQ==&base=YWJvdXQ6Ymxhbms=
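The difference described above can also be checked locally by comparing encodeURI with the URL parser. This is only a small sketch; the variable names are illustrative and not part of the original report.

```javascript
// encodeURI percent-encodes square brackets.
const viaEncodeURI = encodeURI('/[]');

// The WHATWG URL parser leaves them alone in the path, because [ and ]
// are not in the path percent-encode set. The existing %23 is kept as-is.
const viaURL = new URL('http://t.t/[]%23').pathname;

console.log(viaEncodeURI); // "/%5B%5D"
console.log(viaURL);       // "/[]%23"
```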
Is there a way I can combine the real world, which sends me encoded URIs, with URLPattern?
Encoded and decoded URIs should be equal. (https://datatracker.ietf.org/doc/html/rfc2616#section-3.2.3) Is it not true for URLs?
Is there a way I can combine the real world, which sends me encoded URIs, with URLPattern?
I need more context about the use case.
Encoded and decoded URIs should be equal. (https://datatracker.ietf.org/doc/html/rfc2616#section-3.2.3) Is it not true for URLs?
URL and URLPattern don't do any automatic decoding. Not sure if that is what you are asking or not.
Yes, they don't decode URLs. They "canonicalize" URLs.
new URL('%7E', 'http://h.d/').pathname === '/~'
I understand that the problem is a little out of scope for the URLPattern implementation. In practice, when we receive a request URI, we need to decode it before matching. It seems that there is no correct way to decode a URI with encoded parameters so that it can be matched.
If I receive an encoded URI, what should I do with it before calling the URLPattern.exec function? I mean, if URLPattern doesn't do encoding/decoding, then I have to do it myself. I'm asking how I should do it.
Your example is a bug in Chrome and is not interoperable across browsers. Per the URL spec the pathname should remain /%7E:
https://jsdom.github.io/whatwg-url/#url=JTdF&base=aHR0cDovL2guZC8=
(Note, both Firefox and Safari correctly produce a pathname of /%7E.)
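To see how a particular runtime behaves here, the pathname can be inspected directly. This is a sketch only; per the URL spec the escape should be preserved, and a non-conformant implementation may report /~ instead.

```javascript
// Resolve the relative reference "%7E" against the base URL.
const resolved = new URL('%7E', 'http://h.d/');

// A spec-conformant parser does not decode %7E in the path.
const keepsEscape = resolved.pathname === '/%7E';

console.log(resolved.pathname, keepsEscape);
```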
Can you not call decodeURI() on the input prior to passing the value to URLPattern.exec()?
First of all, thanks very much for your help.
1) This is an RFC that describes how HTTP works (https://datatracker.ietf.org/doc/html/rfc2616#section-3.2.3). It says that /%7E is a correct path and equals /~. So Chrome doesn't support all valid HTTP URIs? I understand that URL is a subset of URI, but in practice it becomes a little vague for me.
2) If it's just a static path, then yes, decodeURI() works. But URLPattern supports params in the pathname. These params should be decoded by decodeURIComponent(), not by decodeURI(). And here we have a circular problem: I should decode a URL before matching, but to decode it correctly I need to know where the params are (which I can only know after matching).
Example:
pattern = /~:param
input url = /%7E%2523
/%7E should be decoded by the decodeURI function to /~
%2523 should be decoded by decodeURIComponent to %23
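The difference between the two decoding functions in the example above can be sketched as follows. The variable names are illustrative only; the functions are the standard JavaScript globals.

```javascript
// decodeURI leaves escapes of reserved characters (such as %23 for "#")
// untouched, but it does decode %7E to "~" and %25 to "%".
const wholePathA = decodeURI('/%7E%2523');
const wholePathB = decodeURI('/%7E%23');

// decodeURIComponent decodes every escape, including reserved characters.
const paramA = decodeURIComponent('%2523');
const paramB = decodeURIComponent('%23');

console.log(wholePathA); // "/~%23"
console.log(wholePathB); // "/~%23"
console.log(paramA);     // "%23"
console.log(paramB);     // "#"
```

Note that both inputs produce the same string after decodeURI, so the information needed to tell them apart is lost before matching.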
And why, when I put http://ABC.com/%7Esmith/home.html into Chrome, does it convert it to http://ABC.com/~smith/home.html? Sorry if it's a dumb question.
Using your example, this just seems to work for me:
const pattern = new URLPattern({pathname: '/~:param'});
const encoded = '/%7E%2523';
const decoded = decodeURI(encoded);
const result = pattern.exec({pathname: decoded});
result.pathname.groups.param === '%23';
You can of course re-encode the end result if you want it in that form. I'm not sure I quite understand.
And why when I put http://ABC.com/%7Esmith/home.html to the chrome it converts it to the http://ABC.com/~smith/home.html.
Browser URL bars can do extra decoding that APIs like URL() and URLPattern() do not do. Also, Chrome is not conformant at the API layer with other browsers.
Using your example, this just seems to work for me:
const pattern = new URLPattern({pathname: '/~:param'});
const encoded = '/%7E%2523';
const decoded = decodeURI(encoded);
const result = pattern.exec({pathname: decoded});
result.pathname.groups.param === '%23';
You can of course re-encode the end result if you want it in that form. I'm not sure I quite understand.
You skipped the decodeURIComponent step at the end.
From /%7E%2523 I want to get the param %23.
From /%7E%23 I want to get the param #.
/%7E%23
=> decodeURI('/%7E%23') == /~%23
=> URLPattern.exec('/~%23').param == %23
=> decodeURIComponent('%23') == #, which is correct

/%7E%2523
=> decodeURI('/%7E%2523') == /~%23
=> URLPattern.exec('/~%23').param == %23
=> decodeURIComponent('%23') == #, but it should be %23 (we double-decode it)
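One possible way around the double decoding shown above, sketched under assumptions rather than taken from this thread: decode only the escapes of unreserved characters before matching (in the style of RFC 3986 section 6.2.2.2 normalization), and apply decodeURIComponent to the matched param afterwards. The helpers normalizeUnreserved, matchParam, and extractParam are hypothetical, and matchParam is a plain regex stand-in for a URLPattern with pathname /~:param so that the sketch runs where URLPattern is unavailable.

```javascript
// Unreserved characters per RFC 3986 section 2.3:
// ALPHA / DIGIT / "-" / "." / "_" / "~"
const UNRESERVED = /^[A-Za-z0-9\-._~]$/;

// Decode only escapes of unreserved characters; every other escape
// (including %25) is left untouched, so nothing gets decoded twice later.
function normalizeUnreserved(path) {
  return path.replace(/%([0-9A-Fa-f]{2})/g, (escape, hex) => {
    const char = String.fromCharCode(parseInt(hex, 16));
    return UNRESERVED.test(char) ? char : escape;
  });
}

// Hypothetical stand-in for matching against the pattern /~:param.
// Returns the still-encoded param, or null when there is no match.
function matchParam(pathname) {
  const m = /^\/~([^/]+)$/.exec(pathname);
  return m ? m[1] : null;
}

// Normalize, match, then decode the param exactly once.
function extractParam(rawPath) {
  const raw = matchParam(normalizeUnreserved(rawPath));
  return raw === null ? null : decodeURIComponent(raw);
}

console.log(extractParam('/%7E%2523')); // "%23"
console.log(extractParam('/%7E%23'));   // "#"
```

The point of the design is that the param is only ever decoded once, after its boundaries are known, while the static part of the path is normalized just enough to match the literal ~ in the pattern.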
And it seems like I can fetch an unsupported URL.
await fetch('https://jsdom.github.io/whatwg-ur%6c/')
@wanderview @kenchris @ivan-tymoshenko From the discussion above I can't decide if this is working according to specs, or if there is some bug that needs fixing. If it is working according to spec, we should close this issue. Otherwise, we should distill some actionable issues out of this.
I think it's working per spec and there is nothing actionable here.
That was the point I was gravitating to also. I will close the issue. If it turns out there is some actionable issue in here, please open a new issue referring to this one.
Hi, I have a question about how I should work with encoded path parameters. There is one example that covers all the trouble spots I have.
I want to match the pattern URL and get a param equal to %23.