kenchris / urlpattern-polyfill

URLPattern polyfill
https://www.npmjs.com/package/urlpattern-polyfill
MIT License
268 stars 31 forks source link

Applying defaults with a base URL #26

Closed 43081j closed 2 years ago

43081j commented 3 years ago

https://github.com/kenchris/urlpattern-polyfill/blob/3b7dde94e60b9353bd54328c8b61078175a2c816/src/url-pattern.ts#L68-L73

I'm not sure if i misunderstood or this is a bug/awkwardness, but essentially:

(new URLPattern({pathname: '/foo'})).pattern;
// produces:
{
  "hash": "*",
  "hostname": "*",
  "password": "*",
  "pathname": "/foo",
  "port": "*",
  "protocol": "*",
  "search": "*",
  "username": "*"
}

(new URLPattern({pathname: '/foo', baseURL})).pattern;
// produces:
{
  "hash": "",
  "hostname": "",
  "password": "",
  "pathname": "/foo",
  "port": "",
  "protocol": "",
  "search": "",
  "username": ""
}

this means the computed regex patterns become ^$ in the latter, rather than ^(.*)$.

so we can't rely on defaults if we have a base URL? the comment i quoted seems to hint at the fact that its on purpose, but it feels like the behaviour should be the same to me

wanderview commented 3 years ago

URL parsing baseURL produces a specific value for every component. Writing https://example.com/foo is the same as writing https://example.com/foo?#. If you use a base URL or the string argument to the constructor then you must explicitly provide wildcards.

What I don't understand in your example, though, is how you can have a base URL with empty string for all components. That should fail to parse.

Also, the URLPattern.pattern property is an internal private field. You should instead just reference the URLPattern.hash, etc, getters instead.

wanderview commented 3 years ago

I can't reproduce your example:

const p = new URLPattern({ 'pathname': '/foo', baseURL: 'https://example.com' });
console.log(p.hostname);

Logs example.com for me.

43081j commented 3 years ago

ah sorry it was just a terrible example. the host etc will have been populated.

https://jsfiddle.net/4u76p0sv/

Just like in my example further up, you can see the pattern is * when initialised without a base URL, and "" when initialised with a base URL.

So it seems you are correct and we need to set every part to * manually if we have a base URL (when i say every, i mean the ones we care about, of course we'd take the non-empty ones from the base URL).

wanderview commented 3 years ago

Yes, that is intentional.

Early versions of URLPattern did not bring in the base URL values for search or hash, but I ran into problems with that when we added the constructor string behavior. Consider:

new URLPattern('#foo', 'https://example.com/?q=1');

Since the pattern is relative starting at the hash, we have to carry the search value over from the base URL.

Hopefully this is not too onerous to deal with.

43081j commented 3 years ago

that makes sense, but i do wonder if the inconsistency will confuse more than just me.

in your example, you essentially have:

{
  "hostname": "example.com",
  "hash": "foo",
  "password": "",
  "pathname": "/",
  "port": "",
  "protocol": "https",
  "search": "q=1"
}

the parts we care about have explicit values. the empty ones are parts we haven't defined.

the inconsistency comes when you do/don't have a base URL:

new URLPattern({ baseURL: 'https://example.com/' }); // everything is an exact match of empty string
new URLPattern({ }); // everything is a wildcard

maybe im an outlier but i would assume (and did assume) that the base URL provides defaults for the parts it contains, leaving the rest as URLPattern's defaults (wildcard).

your example would work exactly as it does now, if the default was wildcard rather than empty string. because everything you care about in that situation has an explicit value.

anyhow this is more of a question to wherever the draft repo lives more than the polyfill i suppose. i am happy to pass the wildcards myself but if that stays the behaviour, its probably worthwhile documenting it very clearly if it isn't already

wanderview commented 3 years ago

If we coerced empty string to * then there would be no way to write "please match where <component> is empty". For example, if you want to exactly match the default port or exactly match that there is no query string.

Ultimately empty string and undefined are different values. We map undefined to *, but we must support matching empty string as its a valid constraint.

wanderview commented 3 years ago

By the way, this discussion might be better as a https://github.com/WICG/urlpattern spec issue.

wanderview commented 3 years ago

@kenchris I think this can probably be closed unless you want to update some documentation within this repo.