whatwg / urlpattern

URL Pattern Standard
https://urlpattern.spec.whatwg.org/
Other
154 stars 20 forks source link

Add a section on other specs integrating with URLPattern #199

Closed jeremyroman closed 9 months ago

jeremyroman commented 9 months ago

A section which covers how other specifications should use URLPattern and how developer-facing APIs should work is added, along with helpful algorithms.

One of these is whether a pattern has regexp groups (which may require an ECMAScript regexp engine); a corresponding WebIDL attribute is added to expose this property to authors as well.

(See WHATWG Working Mode: Changes for more details.)


Preview | Diff

jeremyroman commented 9 months ago

This relates to whatwg/urlpattern#182 and whatwg/urlpattern#191.

jeremyroman commented 9 months ago

Found a few more nits, but it's looking quite nice to me.

It might be worth adding (either here or in a followup PR) a section on using URL patterns in HTTP fields, since that's another use case that's coming up in practice. Initial thoughts:

  • Encourage just using strings for now, and not necessarily accepting dictionaries
  • The algorithm can be a wrapper around "build a URLPattern from an Infra value" that is only ever called with a string
  • Give them some help on figuring out exactly which base URL to use. Maybe consider both the request and response header cases?
  • Maybe omit the realm argument and just use "an implementation-defined realm", because it's quite silly to make HTTP specs deal with realms? We can still have an XXX box, but I'd hate to have the realm as part of the public interface for this particular call site. (Related discussion regarding "implementation-defined realm": "Parse JSON to Infra" algorithms shouldn't require a current JS realm infra#625)

Yeah, I thought about including HTTP fields in this PR, but wasn't sure if we'd hammered out enough of the details yet.

At surface I would have thought the constructor string syntax would have been the most natural fit, but from what I understand of WICG/compression-dictionary-transport#52 they're currently looking at separate match-path and match-search (or match-query) params (contrary to your proposal here), and it's not clear to me if their reasons for doing so generalize. Like, here, it's somewhat pointless to match on things like protocol and port, but that's not necessarily the case for most hypothetical future header fields.

Besides that one I don't have a clear enough concept of the likely uses to be confident writing guidance, and would probably rather they file an issue or otherwise bring it up to us for now.

domenic commented 9 months ago

At surface I would have thought the constructor string syntax would have been the most natural fit, but from what I understand of WICG/compression-dictionary-transport#52 they're currently looking at separate match-path and match-search (or match-query) params (contrary to your proposal here), and it's not clear to me if their reasons for doing so generalize. Like, here, it's somewhat pointless to match on things like protocol and port, but that's not necessarily the case for most hypothetical future header fields.

Hmm, we should discuss with them what's going on. This seems contrary to the design of URL patterns...

jeremyroman commented 9 months ago

While I'm here, want to note -- the ignoreCase option has the potential to be an obstacle to the cases which can't handle regexp groups, but I think it's actually fine because I think the match algorithm ends up making everything ASCII. Otherwise, the fact that vi regexps do Unicode case folding (so k could match not only uppercase K but also kelvin symbol K , for instance) would be a potential issue for implementations that implemented matching without a regexp engine.

jeremyroman commented 9 months ago

For multi-implementer interest, @annevk is WebKit cool with this? You seemed supportive of at least whatwg/urlpattern#191 (which is the real behavior change here right now, since nothing depends on the whatwg/urlpattern#182 aspects as yet).

yoshisatoyanagisawa commented 9 months ago

Thanks for looping me in, @domenic. LGTM. I left some minor comments.

domenic commented 9 months ago

Given the comments like

Thanks. I'll have to wait for the PR to get resolved before it is something I could reference but as it looks right now, URLPatternCompatible is specific to the Javascript API.

over in https://github.com/WICG/compression-dictionary-transport/issues/52 it seems like it'd be good to add a section on HTTP header usage, either in this PR or as a near-future followup.

jeremyroman commented 9 months ago

Given the comments like

Thanks. I'll have to wait for the PR to get resolved before it is something I could reference but as it looks right now, URLPatternCompatible is specific to the Javascript API.

over in WICG/compression-dictionary-transport#52 it seems like it'd be good to add a section on HTTP header usage, either in this PR or as a near-future followup.

I agree it would be good to cover once we're agreed on what to do and feel fairly confident it will generalize. I was hoping to avoid entangling these two, so doing HTTP header fields in a followup. It feels slightly silly to write our opinions here and then turn around and use ourselves as authority to back up our own position, but if that's helpful we can do that too.

annevk commented 9 months ago

@jeremyroman I don't understand the references to Deno in https://github.com/whatwg/urlpattern/pull/199#issuecomment-1826056297.

WebKit is supportive of figuring out an interface to URLPattern so it can be reused in environments where regular expression engines are a no-no.

I'm not entirely sure that the JavaScript API needs to make the difference observable (as that won't be accessible in those environments), but I guess it's reasonable.

jeremyroman commented 9 months ago

@jeremyroman I don't understand the references to Deno in #199 (comment).

Those were supposed to be links to issues in this repository; for some reason Github autolinked to the wrong repository. (???)

WebKit is supportive of figuring out an interface to URLPattern so it can be reused in environments where regular expression engines are a no-no.

I'm not entirely sure that the JavaScript API needs to make the difference observable (as that won't be accessible in those environments), but I guess it's reasonable.

One reason might be registering service worker static routes, where the pattern is registered from a JavaScript environment but evaluated in the fetch handling path.

annevk commented 9 months ago

Makes sense, yeah. I do think we need a string-only version (realm-free) of the part you're not explicitly seeking feedback on yet, but we can get there through iteration.

jeremyroman commented 9 months ago

@yoshisatoyanagisawa @domenic still lgty both? I'll merge if so, as I think we've met the process requirements.

domenic commented 9 months ago

Still LGTM except I don't think you saw this comment:

You should also export the "match" definition https://urlpattern.spec.whatwg.org/#match

yoshisatoyanagisawa commented 9 months ago

Thanks for heads up. Still lgtm.

jeremyroman commented 9 months ago

Still LGTM except I don't think you saw this comment:

You should also export the "match" definition https://urlpattern.spec.whatwg.org/#match

Whoops, super easy to miss unfortunately because of how GitHub requires me to switch to Files to be able to start a review (and not send tons of individual emails), which hides comments not attached to lines. Will fix.