w3c / ServiceWorker

Service Workers
https://w3c.github.io/ServiceWorker/

scope pattern matching #1468

Open wanderview opened 4 years ago

wanderview commented 4 years ago

I'd like to propose some improvements to how service worker scopes match against client URLs. For example, the proposal would allow sites to match exact URLs instead of always using substring matching.

    // Service worker controlling '/' exactly.
    navigator.serviceWorker.register('/root-sw.js', {
        scope: new URLPattern({
            baseUrl: self.location,
            path: '/'
        })
    });

    // Product hosted at '/sub/path/product' is not controlled.
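To make the contrast concrete, here is a minimal sketch of the two behaviors. Legacy scope matching is a prefix comparison on the serialized URL; the exact comparison is only a stand-in for what the proposed pattern would achieve. Both function names and the `example.com` URLs are assumptions for illustration, not part of the proposal.

```javascript
// Legacy behavior: a client URL is in scope if it starts with the scope URL.
function legacyScopeMatches(scopeUrl, clientUrl) {
  return new URL(clientUrl).href.startsWith(new URL(scopeUrl).href);
}

// Hypothetical stand-in for an exact match: same origin, identical pathname.
// Query and fragment are ignored in this sketch.
function exactPathMatches(scopeUrl, clientUrl) {
  const scope = new URL(scopeUrl);
  const client = new URL(clientUrl);
  return client.origin === scope.origin && client.pathname === scope.pathname;
}

const rootScope = 'https://example.com/';
const productUrl = 'https://example.com/sub/path/product';

console.log(legacyScopeMatches(rootScope, productUrl)); // true: prefix match
console.log(exactPathMatches(rootScope, productUrl));   // false: not controlled
console.log(exactPathMatches(rootScope, rootScope));    // true: '/' itself
```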

The full explainer is a bit long, so I have published it in a separate repo for now:

https://github.com/wanderview/service-worker-scope-pattern-matching/blob/master/explainer.md

I'm hoping to discuss this at the face-to-face meeting at TPAC 2019.

This proposal is related to other issues such as #1272 and #287, but I filed this as a separate issue to have a clean discussion of this proposal. This proposal is orthogonal to declarative routing (since that takes place after scope matching.)

asakusuma commented 4 years ago

Is there a way to make an entire segment optional? For instance, what if the same team that manages the root URL also manages a couple of the subpaths? Something like path: /[home]?* would match both / AND /home.

Or would that behavior need to wait for multiple scopes?
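As a rough illustration of the behavior being asked about, the sketch below uses a plain JavaScript RegExp rather than the proposal's pattern syntax, which may express this differently or not at all. The name `isControlledPath` is hypothetical.

```javascript
// Plain RegExp stand-in: matches exactly '/' or '/home', nothing else.
const rootOrHome = /^\/(?:home)?$/;

// Hypothetical helper: would this pathname be controlled?
function isControlledPath(pathname) {
  return rootOrHome.test(pathname);
}

console.log(isControlledPath('/'));                 // true
console.log(isControlledPath('/home'));             // true
console.log(isControlledPath('/home/settings'));    // false
console.log(isControlledPath('/sub/path/product')); // false
```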

wanderview commented 4 years ago

We had a lively 1.5 hour discussion on this at TPAC. The notes are here:

https://docs.google.com/document/d/1q090ovJ4gd8wSfVtvuoZLMZ51YkiFDsEZ0Jiqi41Iys/edit#heading=h.d6e4ecicreet

My main takeaways:

wanderview commented 3 years ago

I've been thinking a lot about this recently and have written down a detailed design for Chromium here:

https://docs.google.com/document/d/17L6b3zlTHtyxQvOAvbK55gQOi5rrJLERwjt_sKXpzqc/edit#heading=h.7nki9mck5t64

From a spec proposal perspective, this makes a couple of largish changes:

  1. Aligns pattern syntax on the popular path-to-regexp library.
  2. Allows a scope to be an include-list of patterns.
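The include-list idea in item 2 can be sketched as "a client is in scope if any entry in the list matches". The sketch below assumes plain RegExp entries as stand-ins for path-to-regexp style patterns; `scopeList`, `matchesScopeList`, and the URLs are hypothetical names chosen for illustration.

```javascript
// Hypothetical include-list: '/' exactly, plus everything under '/home'.
const scopeList = [
  /^\/$/,
  /^\/home(?:\/.*)?$/,
];

// A URL is in scope if at least one pattern matches its pathname.
function matchesScopeList(patterns, url) {
  const { pathname } = new URL(url);
  return patterns.some((pattern) => pattern.test(pathname));
}

console.log(matchesScopeList(scopeList, 'https://example.com/'));                 // true
console.log(matchesScopeList(scopeList, 'https://example.com/home/feed'));        // true
console.log(matchesScopeList(scopeList, 'https://example.com/sub/path/product')); // false
```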

I have also updated the explainer:

https://github.com/wanderview/service-worker-scope-pattern-matching/blob/master/explainer.md

annevk commented 3 years ago

As it's very much focused on paths, it's still not really clear to me what things like a base URL buy us here.

wanderview commented 3 years ago

I'm not sure what you mean by "very much focused on paths". You can set values/patterns for scheme, hostname, etc. Indeed, service worker scope patterns must have a non-variable value set for all parts of the origin.

In addition, it seems supporting relative behavior is important? If we don't support that, then it becomes very difficult to build a site that can be hosted in different locations without requiring modifications. Supporting relative behavior seems like a core feature of the web and I'm hesitant to drop it. (It's also supported by legacy SW scopes, so it would be a backward compat issue.)

The baseURL concept seemed like a natural fit for these things to me, but happy to discuss alternative API shapes.
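A small sketch of why relative resolution matters for relocatable sites: the same relative scope string resolves to different absolute scopes depending on the base URL it is resolved against. This uses only the standard `URL` constructor; `resolveScope` and the hosting locations are assumptions for illustration.

```javascript
// Resolve a relative scope against a base URL using the standard URL parser.
function resolveScope(relativeScope, baseUrl) {
  return new URL(relativeScope, baseUrl).href;
}

// The same site deployed at the origin root and under a sub-directory.
const atRoot = resolveScope('./', 'https://example.com/index.html');
const atSubdir = resolveScope('./', 'https://example.com/hosted/app/index.html');

console.log(atRoot);   // https://example.com/
console.log(atSubdir); // https://example.com/hosted/app/
```

The site's code passes the same `'./'` in both deployments; only the base differs.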

youennf commented 3 years ago

The latest proposal is becoming a full-fledged URL matcher. This seems like a nice addition to the platform, for instance for fetch event handlers. I am wondering whether static routes could use it.

I'd like to better understand the benefits of reusing the URL matcher for service worker matching though, say compared to a dedicated API. In particular, the envisioned extensions to the service worker matching algorithm seem fairly limited, and for good reasons. As a developer, I would probably not like to look at the API, start thinking about all the potential uses of passing a URL matcher, before actually realising that I can only use a very restricted set of URL matchers.

annevk commented 3 years ago

I think there is something to that argument, but I wonder if we could still reuse the syntax and underlying model. And just have a different API entry point for them focused exclusively on service worker scopes.

wanderview commented 3 years ago

I guess I don't understand the concern. There are already limitations on what you can do with a service worker scope that are not obvious from the register() method. For example, we reject if the scope and script location are not in the proper relation to each other. I think the proposed restrictions for pattern scopes are much easier to understand than that. No matter what, developers have to read the API documentation beyond just looking at the signature of the register() method.
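For readers unfamiliar with the existing restriction mentioned here, the following is a simplified sketch of the check: by default the scope must be same-origin with the script and its path must fall under the script's directory. It deliberately ignores the `Service-Worker-Allowed` response header, which can widen the allowed scope. The function name is hypothetical.

```javascript
// Simplified sketch of the default "scope vs. script location" restriction.
// Assumption: no Service-Worker-Allowed header is present.
function scopeAllowedForScript(scopeUrl, scriptUrl) {
  const scope = new URL(scopeUrl);
  const script = new URL(scriptUrl);
  // The script's directory, e.g. '/app/' for '/app/sw.js'.
  const maxScopePath = new URL('./', script).pathname;
  return scope.origin === script.origin &&
         scope.pathname.startsWith(maxScopePath);
}

console.log(scopeAllowedForScript('https://example.com/app/', 'https://example.com/app/sw.js')); // true
console.log(scopeAllowedForScript('https://example.com/', 'https://example.com/app/sw.js'));     // false
```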

Also, I think there is a very strong ecosystem case for moving the web platform to a shared syntax that is also intuitive to developers. Having a shared primitive moves us in that direction, at least for new API surfaces. @kenchris, did you discuss this sort of thing in the TAG review at all?

In any case, @youennf and @annevk, it would be great if you could provide a more concrete counter-proposal of what you think would be better. In the abstract it's very hard for me to understand what end state you would prefer.

wanderview commented 3 years ago

Note, the repo has now moved to https://github.com/WICG/urlpattern.

annevk commented 3 years ago

E.g., you'd give register()'s second argument a scopePattern(s?) member that takes a list of (string) path patterns. No need for a base URL as it's irrelevant and no need for URLPattern or URLPatternList there as you don't need their expressiveness. The syntax would still be consistent with URLPattern though.

wanderview commented 3 years ago

Ok. If it's consistent with URLPattern, though, how does that improve the concern about developer expectations?

FWIW, after your comment at the meeting I am leaning towards exposing a URLPathnamePattern that only deals with pathname matching that service workers could use instead of the full URLPattern. (I haven't gotten to writing up that action item as an issue yet.)

I guess I don't have a strong opinion on whether service workers take a string and internally create a URLPathnamePattern or whether the developer passes one in. Ideally it would be nice to accept either.

annevk commented 3 years ago

I think the different API endpoint that only deals with a subset makes it quite clear to developers that it's not a URLPattern. It might not help with differences in path syntax though, but at least there is no suggestion your scope could encompass other origins.

youennf commented 3 years ago

A list of strings would look better to me than passing a URLPatternList, especially if URLPattern can do origin matching and so on. Reusing algorithms defined by the urlpattern spec in the service worker spec seems like a good idea.

Not sure about URLPathnamePattern, would need to look at it.

> how does that improve the concern about developer expectations?

There is higher expectation with more focused types than base types like strings.

wanderview commented 3 years ago

> There is higher expectation with more focused types than base types like strings.

I don't agree with this if we want developers to expect the strings to exactly match what the formal type takes. It seems we could easily make the API take either the string or the object. It's just syntactic sugar at that point.
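A minimal sketch of the "string or object" idea: the API normalizes its argument so that a string is simply shorthand for constructing the object. `PathPattern` below is a hypothetical stand-in for a pathname-only pattern type and does no real pattern parsing; `toPathPattern` is likewise an assumed helper name.

```javascript
// Hypothetical stand-in for a pathname-only pattern type.
class PathPattern {
  constructor(source) {
    if (typeof source !== 'string' || !source.startsWith('/')) {
      throw new TypeError('pattern must be a string starting with "/"');
    }
    this.source = source;
  }
}

// Accept either form; a string is sugar for `new PathPattern(string)`.
function toPathPattern(input) {
  return input instanceof PathPattern ? input : new PathPattern(input);
}

const fromString = toPathPattern('/home');
const existing = new PathPattern('/');
const fromObject = toPathPattern(existing);

console.log(fromString.source);       // /home
console.log(fromObject === existing); // true: objects pass through unchanged
```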

Anyway, once I write my thinking on the path specific change I'll come back and link here.

wanderview commented 3 years ago

I've written up more about URLPathnamePattern at https://github.com/WICG/urlpattern/issues/20.

wanderview commented 3 years ago

Just curious, is the desire to pass strings instead of objects based on the same principle as APIs taking strings instead of URL objects? https://url.spec.whatwg.org/#url-apis-elsewhere

Personally I've never understood this, as it seems to promote inefficiency in APIs. If the JavaScript has already created a URL object but can't pass it, then we end up having to parse and validate the URL twice. I've never understood why APIs don't take a string or a URL.

Anyway, if it's the same principle, then I am fine switching to strings as long as the pattern syntax remains the same. I would have a preference for taking a string or an object, though.

annevk commented 3 years ago

I think there might be something to say for that if URL objects were immutable. But they are not and they also stringify so it should not be a problem in practice for web developers (although making that parse once is likely more trouble than it's worth for implementers).
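The two properties mentioned here, mutability and stringification, can be seen with the standard `URL` class. The sketch below shows that a `URL` passed where a string is expected is converted at that moment, while the object itself can change afterwards; the variable names and `example.com` URL are assumptions for illustration.

```javascript
const url = new URL('https://example.com/a');

// Stringification: this is what a string-taking API would see.
const snapshot = String(url);

// Mutability: the object can change after it was handed over.
url.pathname = '/b';

console.log(snapshot);              // https://example.com/a
console.log(url.href);              // https://example.com/b
console.log(`${url}` === url.href); // true: template strings stringify too
```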

For me the main thing here is that given the intended use for service worker scopes, a URLPattern object does too much and there is no future where it will no longer do too much. I think that strongly argues for exposing the necessary subset directly.

Now, whether that subset needs to be wrapped in an object or not, I think I would not make people invoke a constructor first if a string suffices.

(I think @youennf's point is more that if something takes an object you kinda expect it to handle all/most flavors of that object whereas if something takes a string you expect some custom microsyntax you might have to look up.)

wanderview commented 3 years ago

Alright, I'll plan for a pathname string for now. I might make it optionally take a URLPathnamePattern to see what people think, but I don't feel too strongly about it. Thanks for the feedback and explaining.