w3c / ServiceWorker

Service Workers

https://w3c.github.io/ServiceWorker/

Declarative routing #1373

Open jakearchibald opened 5 years ago

jakearchibald commented 5 years ago

Here are the requirements I'm working towards:

Be able to bypass the service worker for particular requests.
Speed up simple offline-first, online-first routes by avoiding service worker startup time.
Be polyfillable – do not introduce things that cannot already be done in a fetch event.
Be extensible – consider what future additions to the API might look like.
Avoid state on the registration if possible – prefer state on the service worker itself.

I'm going to start with static routes, and provide additional ideas in follow-up posts.

The aim is to allow the developer to declaratively express a series of steps the browser should perform in an attempt to get a response.

The rest of this post is superseded by the second draft

Creating a route

WebIDL

// Install currently uses a plain ExtendableEvent, so we'd need something specific
partial interface ServiceWorkerInstallEvent {
  attribute ServiceWorkerRouter router;
}

[Exposed=ServiceWorker]
interface ServiceWorkerRouter {
  void add(ServiceWorkerRouterItem... items);
}

[Exposed=ServiceWorker]
interface ServiceWorkerRouterItem {}

JavaScript

addEventListener('install', (event) => {
  event.router.add(...items);
  event.router.add(...otherItems);
});

The browser will consider routes in the order declared, and will consider route items in the order they're given.

Route items

Route items fall into two categories:

Conditions – These determine if additional items should be considered.
Sources – A place to attempt to get a response from.

Sources

WebIDL

[Exposed=ServiceWorker, Constructor(optional RouterSourceNetworkOptions options)]
interface RouterSourceNetwork : ServiceWorkerRouterItem {}

dictionary RouterSourceNetworkOptions {
  // A specific request can be provided, otherwise the current request is used.
  Request request;
}

[Exposed=ServiceWorker, Constructor(optional RouterSourceCacheOptions options)]
interface RouterSourceCache : ServiceWorkerRouterItem {}

dictionary RouterSourceCacheOptions : MultiCacheQueryOptions {
  // A specific request can be provided, otherwise the current request is used.
  Request request;
}

[Exposed=ServiceWorker, Constructor(optional RouterSourceFetchEventOptions options)]
interface RouterSourceFetchEvent : ServiceWorkerRouterItem {}

dictionary RouterSourceFetchEventOptions {
  DOMString id = '';
}

These interfaces don't currently have attributes, but they could have attributes that reflect the options/defaults passed into the constructor.

Conditions

WebIDL

[Exposed=ServiceWorker, Constructor(ByteString method)]
interface RouterIfMethod : ServiceWorkerRouterItem {}

[Exposed=ServiceWorker, Constructor(USVString url, optional RouterIfURLOptions options)]
interface RouterIfURL : ServiceWorkerRouterItem {}

dictionary RouterIfURLOptions {
  boolean ignoreSearch = false;
}

[Exposed=ServiceWorker, Constructor(USVString url)]
interface RouterIfURLPrefix : ServiceWorkerRouterItem {}

[Exposed=ServiceWorker, Constructor(USVString url, optional RouterIfURLOptions options)]
interface RouterIfURLSuffix : ServiceWorkerRouterItem {}

[Exposed=ServiceWorker, Constructor(optional RouterIfDateOptions options)]
interface RouterIfDate : ServiceWorkerRouterItem {}

dictionary RouterIfDateOptions {
  // These should accept Date objects too, but I'm not sure how to do that in WebIDL.
  unsigned long long from = 0;
  // I think Infinity is an invalid value here, but you get the point.
  unsigned long long to = Infinity;
}

[Exposed=ServiceWorker, Constructor(optional RouterIfRequestOptions options)]
interface RouterIfRequest : ServiceWorkerRouterItem {}

dictionary RouterIfRequestOptions {
  RequestDestination destination;
  RequestMode mode;
  RequestCredentials credentials;
  RequestCache cache;
  RequestRedirect redirect;
}

Again, these interfaces don't have attributes, but they could reflect the options/defaults passed into the constructor.
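
As a rough illustration of how the RouterIfDate window could be evaluated, here is a minimal sketch. The function name matchesDateWindow, the plain-object options, and the inclusive bounds are all assumptions for illustration; the proposal itself only defines the from and to options.

```javascript
// Hypothetical evaluation of RouterIfDate options (not part of the proposal).
// `from` and `to` are millisecond timestamps, as in RouterIfDateOptions.
// Treating both bounds as inclusive is an assumption.
function matchesDateWindow({ from = 0, to = Infinity } = {}, now = Date.now()) {
  return now >= from && now <= to;
}

// A condition that only starts matching 25 hours after deployment.
const deployTime = 1543938484103;
const expiry = { from: deployTime + 1000 * 60 * 60 * 25 };

console.log(matchesDateWindow(expiry, deployTime)); // false
console.log(matchesDateWindow(expiry, deployTime + 1000 * 60 * 60 * 26)); // true
```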

Shortcuts

GET requests are the most common type of request to provide specific routing for.

WebIDL

partial interface ServiceWorkerRouter {
  void get(ServiceWorkerRouterItem... items);
}

Where the JavaScript implementation is roughly:

router.get = function(...items) {
  router.add(new RouterIfMethod('GET'), ...items);
};

We may also consider treating strings as URL matchers.

router.add('/foo/') === router.add(new RouterIfURL('/foo/')).
router.add('/foo/*') === router.add(new RouterIfURLPrefix('/foo/')).
router.add('*.png') === router.add(new RouterIfURLSuffix('.png')).

Examples

Bypassing the service worker for particular resources

JavaScript

// Go straight to the network after 25 hrs.
router.add(
  new RouterIfDate({ from: Date.now() + 1000 * 60 * 60 * 25 }),
  new RouterSourceNetwork(),
);

// Go straight to the network for all same-origin URLs starting '/videos/'.
router.add(
  new RouterIfURLPrefix('/videos/'),
  new RouterSourceNetwork(),
);

Offline-first

JavaScript

router.get(
  // If the URL is same-origin and starts '/avatars/'.
  new RouterIfURLPrefix('/avatars/'),
  // Try to get a match for the request from the cache.
  new RouterSourceCache(),
  // Otherwise, try to fetch the request from the network.
  new RouterSourceNetwork(),
  // Otherwise, try to get a match for the request from the cache for '/avatars/fallback.png'.
  new RouterSourceCache({ request: '/avatars/fallback.png' }),
);

Online-first

JavaScript

router.get(
  // If the URL is same-origin and starts '/articles/'.
  new RouterIfURLPrefix('/articles/'),
  // Try to fetch the request from the network.
  new RouterSourceNetwork(),
  // Otherwise, try to match the request in the cache.
  new RouterSourceCache(),
  // Otherwise, if the request destination is 'document'.
  new RouterIfRequest({ destination: 'document' }),
  // Try to match '/articles/offline' in the cache.
  new RouterSourceCache({ request: '/articles/offline' }),
);

Processing

This is very rough prose, but hopefully it explains the order of things.

A service worker has routes. The routes do not belong to the registration, so a new empty service worker will have no defined routes, even if the previous service worker defined many.

A route has items.

To create a new route containing items

If the service worker is not "installing", throw. Routes must be created before the service worker has installed.
Create a new route with items, and append it to routes.

Handling a fetch

These steps will come before handling navigation preload, meaning no preload will be made if a route handles the request.

request is the request being made.

Let routerCallbackId be the empty string.
RouterLoop: For each route of this service worker's routes:
1. For each item of route's items:
  1. If item is a RouterIfMethod, then:
    1. If item's method does not equal request's method, then break.
  2. Otherwise, if item is a RouterIfURL, then:
    1. If item's url does not equal request's url, then break.
  3. Etc etc for other conditions.
  4. Otherwise, if item is a RouterSourceNetwork, then:
    1. Let networkRequest be item's request.
    2. If networkRequest is null, then set networkRequest to request.
    3. Let response be the result of fetching networkRequest.
    4. If response is not an error, return response.
  5. Otherwise, if item is a RouterSourceCache, then:
    1. Let cacheRequest be item's request.
    2. If cacheRequest is null, then set cacheRequest to request.
    3. Let response be the result of looking for a match for cacheRequest in the cache, passing in item's options.
    4. If response is not null, return response.
  6. Otherwise, if item is a RouterSourceFetchEvent, then:
    1. Set routerCallbackId to item's id.
    2. Break RouterLoop.
Call the fetch event as usual, but with routerCallbackId as one of the event properties.
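
The steps above can be sketched roughly as follows. This is a simulation, not an implementation: items are plain objects rather than the proposed classes, the cache is a Map, the network is a synchronous function that returns null on failure, and URLs are simplified to path strings. All of those, and the name handleWithRoutes, are assumptions.

```javascript
// Sketch of the first-draft item loop. `cache` (a Map) and `network`
// (a function returning a response or null) are stand-ins.
function handleWithRoutes(routes, request, { cache, network }) {
  for (const route of routes) {
    itemLoop: for (const item of route) {
      switch (item.type) {
        case 'if-method':
          // A failed condition abandons this route and moves to the next one.
          if (item.method !== request.method) break itemLoop;
          break;
        case 'if-url-prefix':
          if (!request.url.startsWith(item.prefix)) break itemLoop;
          break;
        case 'source-cache': {
          const hit = cache.get(item.request ?? request.url);
          if (hit !== undefined) return { response: hit };
          break;
        }
        case 'source-network': {
          const response = network(item.request ?? request.url);
          if (response !== null) return { response };
          break;
        }
        case 'source-fetch-event':
          // Hand over to the fetch event, tagged with the item's id.
          return { fetchEvent: true, routerCallbackId: item.id ?? '' };
      }
    }
  }
  // No route produced a response: dispatch the fetch event as usual.
  return { fetchEvent: true, routerCallbackId: '' };
}

// The offline-first example, with the network simulated as unavailable.
const routes = [
  [
    { type: 'if-method', method: 'GET' },
    { type: 'if-url-prefix', prefix: '/avatars/' },
    { type: 'source-cache' },
    { type: 'source-network' },
    { type: 'source-cache', request: '/avatars/fallback.png' },
  ],
];
const cache = new Map([['/avatars/fallback.png', 'fallback image']]);
const offline = () => null;

const result = handleWithRoutes(
  routes,
  { method: 'GET', url: '/avatars/jake.png' },
  { cache, network: offline },
);
console.log(result.response); // 'fallback image'
```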

Extensibility

I can imagine things like:

RouterOr(...conditionalItems) – True if any of the conditional items are true.
RouterNot(condition) – Inverts a condition.
RouterIfResponse(options) – Right now, a response is returned immediately once one is found. However, the route could continue, skipping sources, but processing conditions. This condition could check the response and break the route if it doesn't match. Along with a way to discard any selected response, you could discard responses that didn't have an ok status.
RouterCacheResponse(cacheName) – If a response has been found, add it to a cache.
RouterCloneRequest() – It feels like RouterSourceNetwork would consume requests, so if you need to do additional processing, this could clone the request.

But these could arrive much later. Some of the things in the main proposal may also be considered "v2".

jakearchibald commented 5 years ago

Here's an alternative model suggested by @wanderview, which is focused on allowing developers to toggle the fetch event:

partial interface ServiceWorkerGlobalScope {
  attribute ServiceWorkerEventSubscriptions eventSubscriptions;
}

interface ServiceWorkerEventSubscriptions {
  Array<DOMString> get();
  void add(DOMString eventType);
  void remove(DOMString eventType);
}

Where get returns the set of event types to handle.

add adds to the set.

remove removes from the set.

This means the developer would be able to expire their fetch handler after some amount of time (I believe this is Facebook's use-case).

const deployTimestamp = 1543938484103;
const oneDay = 1000 * 60 * 60 * 24;

addEventListener('fetch', (event) => {
  if (Date.now() - deployTimestamp > oneDay) {
    eventSubscriptions.remove('fetch');
    return;
  }

  // …
});

This is a much simpler feature, but it doesn't allow skipping the service worker by route, or going straight to the cache by route.

jakearchibald commented 5 years ago

One more suggestion (again from @wanderview) would be to add a new CSP rule (or something similar) that defines which url-prefixes a service worker would intercept, if any.

This means the setting would be page-by-page. Navigations wouldn't be able to opt-out of service worker, but presumably navigation preload would solve a lot of performance issues there.

jakearchibald commented 5 years ago

cc @n8schloss, @jatindersmann, @aliams, @youennf, @asutherland

jakearchibald commented 5 years ago

There's a gotcha with the static routes: If you put them at top level, they'll execute without issue, but next time the service worker is started (after activating), they'll fail. Also, if the service worker starts multiple times before activation, you'll get duplicate routes.

Might make more sense to throw unless the routes are being added during the install event, so they'd always fail top-level.

Given that, it might make sense to put the API on the install event. Edit: I've moved it to the install event.

annevk commented 5 years ago

How does this relate to https://github.com/domenic/import-maps? Intuitively it feels like these should be the same thing.

wanderview commented 5 years ago

Just to clarify, the two suggestions I had were to support a subset of the use cases where the site might want to stop consulting the fetch handler until an update can occur. Some sites are blocked on these use cases right now and I wondered if there was a way we could unblock them without conflicting with any future possible static routes spec.

jakearchibald commented 5 years ago

@annevk I think they're different enough, and can be used together.

Import maps map URLs with scheme import: to one or more built-in modules or HTTP fetches. Those HTTP fetches then go through the service worker.

Then, the service worker's fetch event (or its static routes) can decide how to conduct those fetches.

jeffposnick commented 5 years ago

But these could arrive much later. Some of the things in the main proposal may also be considered "v2".

I can see developers eagerly adopting this approach if browsers started to implement it. Being able to string together complex routing rules with network/cache interactions without the overhead of starting up a service worker or pulling in extra runtime code (except for the polyfill scenario...) would be a nice performance win.

I have something of a meta-question about the longer term, v2+, plans though. There are some common concerns that production web apps need to think about, like cache maintenance/expiration policies for runtime caches. Today, that type of activity is likely to happen inside of a fetch handler. If we move to a model where you can either get

a) fast routing/response generation, but no fetch handler, or
b) the ability to run code to deal with housekeeping, but incur the overhead of a fetch handler

developers might feel conflicted about that tradeoff.

One approach could be to create a new type of event that's fired after a response has been generated by the native router, like routingcomplete, that allows developers to run their own bookkeeping code outside of the critical response generation flow.

Looking at it from a slightly different perspective, this proposal ends up creating a native implementation of a subset of things that have up until now been accomplished using service worker runtime libraries. Cache-expiration also falls into the category of things that developers commonly accomplish via service worker runtime libraries. The feeling I got from https://github.com/w3c/ServiceWorker/issues/863 is that there hasn't been an appetite for natively implementing that as part of the Cache Storage API. If this routing proposal moves forward, does that change the equation around natively implementing higher-level functionality like cache expiration?

jakearchibald commented 5 years ago

That's a good point. I had this in earlier drafts but didn't think we needed it yet. Should be relatively simple to add.

n8schloss commented 5 years ago

This is awesome! @jakearchibald, I really like the proposal you outlined in the first comment, it solves the two big issues that we're seeing 😃

1) There's a good amount of added overhead we are measuring for fetching user generated non-cached resources when a fetch event is enabled.

2) After a period of time the items cached in the service worker and served via the fetch event are invalid, so we end up skipping the cache and fetching from the network. However, we still pay the large cost of starting the service worker in that case and get no benefit.

@jakearchibald, @wanderview's solution that you outlined in the second comment above gives us a solution to issue 2 but not issue 1. Like @wanderview said in his comment, issue 2 impacts us more right now than issue 1. So with the first proposal, as long as the spec is written in such a way that vendors can quickly implement the expiry time condition without having to block on implementing the RouterIfURL conditions then I think this is really really great!

jakearchibald commented 5 years ago

@n8schloss thanks for the feedback! It feels like IfURL is pretty fundamental to other developers, but the prefix/suffix stuff could probably wait.

@jatindersmann, @aliams, @youennf, @asutherland, @mattto, @wanderview: How do you feel about this implementation-wise?

aaronsn commented 5 years ago

@jakearchibald - This is really great! I like the flexibility of the proposal. I’m interested in understanding better if this will extend to more complex scenarios in the future.

Would it be possible to have something like RouterIfTimeout which activates that route but cancels if the headers aren't received within a certain timeout? This would allow for preferring the network but falling back to cache if the network takes too long.
Can this API handle racing sources rather than doing them in sequence?
I think RouterIfResponse is an important feature, to allow for custom handling for error responses. However, things start to get more complicated with options that apply after the request has been made. For example, say you want to prefer network if the response is a 200, then try cache, then use the server response anyway if cached response is missing. Could you accomplish that with add(new RouterSourceNetwork(), new RouterIfResponse()); add(new RouterSourceCache(), new RouterSourceNetwork()) without it issuing multiple network requests? Or, say you want to do the same thing but if either response isn’t a 200 then use custom logic to determine which to use. Would the fetch event be able to access the responses that were already requested via the routes?
Avoiding controlling certain sub-scopes is something I'd like to see for service workers, but I'm not sure if this API can/should provide this support. I can sort of see how you could do this with a route like add(new RouterIfURLPrefix(), new RouterSourceNetwork()), but you'd have to know which sub-resources will be requested. Could there be a RouterSource that causes the service worker to not control that client? Eg add(new RouterIfURLPrefix(), new RouterSourceUnclaimClient()). Or would that be too much of an abuse of the API?

Also one thought about implementation - if a request is handled via a static route, would the service worker still be started up in the background even though it isn't needed for the request? On the one hand, this would ensure that the service worker is started for future resource requests. On the other hand, if an entire page load (main resource plus sub resources) can be handled by static routes, then it's nice to avoid the performance cost of starting up the service worker.

jakearchibald commented 5 years ago

@aaronsn

Would it be possible to have something like RouterIfTimeout which activates that route but cancels if the headers aren't received within a certain timeout? This would allow for preferring the network but falling back to cache if the network takes too long.

I think I'd make this an option to RouterSourceNetwork, as it's the only one that would benefit from a timeout right now.

Can this API handle racing sources rather than doing them in sequence?

Something like:

router.get(
  new RouterSourceAny([
    ...sources
  ])
);

This could also have a timeout option.
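
For comparison, this is roughly how racing with a timeout can be done in a fetch event today. The helper name raceSources and its signature are made up for this sketch; it relies on Promise.any, so it assumes a reasonably recent runtime.

```javascript
// Hypothetical helper: resolve with the first source that yields a response,
// or reject if every source fails or none responds within timeoutMs.
function raceSources(sources, timeoutMs) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('timeout')), timeoutMs);
  });
  const attempts = sources.map((source) =>
    Promise.resolve()
      .then(source)
      .then((response) => {
        // Treat "no response" (for example a cache miss) as a failed attempt.
        if (response == null) throw new Error('no response');
        return response;
      }),
  );
  return Promise.race([Promise.any(attempts), timeout]).finally(() =>
    clearTimeout(timer),
  );
}

// Possible use inside a service worker:
// addEventListener('fetch', (event) => {
//   event.respondWith(
//     raceSources(
//       [() => caches.match(event.request), () => fetch(event.request)],
//       3000,
//     ),
//   );
// });
```

If the race rejects, respondWith produces a network error, so a real handler would probably want a final fallback.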

I think RouterIfResponse is an important feature

I think everything you mention here is possible, although we might be hitting edge cases that are easier to read as logic in a fetch event, and uncommon enough that optimising them via a router doesn't bring much of a benefit.

Avoiding controlling certain sub-scopes is something I'd like to see for service workers, but I'm not sure if this API can/should provide this support

I don't think it should. Only things you can currently do in a fetch event are in scope. To do this, we either want a way to exclude routes as part of the call to serviceWorker.register, or a way to do it via a fetch event (after which we could look at adding something to the router).

if a request is handled via a static route, would the service worker still be started up in the background even though it isn't needed for the request? On the one hand, this would ensure that the service worker is started for future resource requests. On the other hand, if an entire page load (main resource plus sub resources) can be handled by static routes, then it's nice to avoid the performance cost of starting up the service worker.

Interesting! The spec is deliberately loose when it comes to when the service worker is started, and how long it stays alive for. Either behaviour would be spec compatible.

If a service worker is started, and not needed, it shouldn't affect page performance as nothing's blocked on it.

wanderview commented 5 years ago

It might also be useful to describe the default routes you get when a service worker is installed. Either route to FetchEvent or nowhere, depending on whether there is a fetch handler.

@wanderview: How do you feel about this implementation-wise?

Personal opinion that does not represent any actual implementation priorities:

I guess I'm most interested in how something like this could be incrementally approached. For example, if we started with:

1. ServiceWorkerRouter.add()
2. RouterIfDate
3. RouterSourceNetwork with default options

Or replace (2) with RouterIfURL. This minimal initial set might unblock certain use cases. We could then layer additional items later. I'm not sure how people would feel about having a partial "router" in the platform that doesn't provide a full routing capability.

To me the RouterIfDate is more compelling at the moment because we don't have a good alternative solution for avoiding service worker startup costs in an expired state. RouterIfURL is more "router-like" but it seems oriented at carving out exceptions for certain subresources which feels like less of a problem since typically service worker startup is not necessary for subresources.

I imagine, though, there is going to be a tension between how many of these options and extensions to implement vs using javascript in FetchEvent. For example, the list of logical combinations in the "extensibility" section seemed like perhaps something that should just be done in js. I'm not sure we want to implement and maintain a complex DSL when we can achieve the same thing with js.

jakearchibald commented 5 years ago

One thing that isn't clear to me yet:

router.add(
  new RouterIfURLPrefix('/avatars/'),
  new RouterSourceCache(),
);

router.add(
  new RouterIfURLSuffix('.jpg'),
  new RouterSourceNetwork(),
);

If /avatars/foo.jpg is requested, but it isn't in the cache, what happens? Does the request fall through to the next route?

jakearchibald commented 5 years ago

Having slept on it, I think it's important that a single route is selected based on conditions. I'll work on a new draft that uses a router.add(conditions, sources) pattern.

This matches other routers like Express, where "continue to other routes" is opt-in.

jakearchibald commented 5 years ago

Ok, here's a second draft:

Creating a route

WebIDL

// Install currently uses a plain ExtendableEvent, so we'd need something specific
partial interface ServiceWorkerInstallEvent {
  attribute ServiceWorkerRouter router;
}

[Exposed=ServiceWorker]
interface ServiceWorkerRouter {
  void add(
    (RouterCondition or sequence<RouterCondition>) conditions,
    (RouterSource or sequence<RouterSource>) sources,
  );
}

[Exposed=ServiceWorker]
interface RouterSource {}

[Exposed=ServiceWorker]
interface RouterCondition {}

JavaScript

addEventListener('install', (event) => {
  event.router.add(conditions, sources);
  event.router.add(otherConditions, otherSources);
});

The browser will consider routes in the order declared, and if all conditions match, each source will be tried in turn.

Conditions

These determine if a particular static route should be used rather than dispatching a fetch event.

WebIDL

[Exposed=ServiceWorker, Constructor(ByteString method)]
interface RouterIfMethod : RouterCondition {}

[Exposed=ServiceWorker, Constructor(USVString url, optional RouterIfURLOptions options)]
interface RouterIfURL : RouterCondition {}

dictionary RouterIfURLOptions {
  boolean ignoreSearch = false;
}

[Exposed=ServiceWorker, Constructor(USVString url)]
interface RouterIfURLStarts : RouterCondition {}

[Exposed=ServiceWorker, Constructor(USVString url, optional RouterIfURLOptions options)]
interface RouterIfURLEnds : RouterCondition {}

[Exposed=ServiceWorker, Constructor(optional RouterIfDateOptions options)]
interface RouterIfDate : RouterCondition {}

dictionary RouterIfDateOptions {
  // These should accept Date objects too, but I'm not sure how to do that in WebIDL.
  unsigned long long from = 0;
  // I think Infinity is an invalid value here, but you get the point.
  unsigned long long to = Infinity;
}

[Exposed=ServiceWorker, Constructor(optional RouterIfRequestOptions options)]
interface RouterIfRequest : RouterCondition {}

dictionary RouterIfRequestOptions {
  RequestDestination destination;
  RequestMode mode;
  RequestCredentials credentials;
  RequestCache cache;
  RequestRedirect redirect;
}

Again, these interfaces don't have attributes, but they could reflect the options/defaults passed into the constructor.
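
To make the URL conditions more concrete, here is a rough sketch of how they might be evaluated. The function names, the base option (standing in for the service worker's origin), and comparing serialised URLs with href are assumptions; the draft doesn't define matching at this level of detail.

```javascript
// Hypothetical matchers for RouterIfURL, RouterIfURLStarts and RouterIfURLEnds.
// Relative patterns are resolved against `base`.
function stripSearch(url) {
  const copy = new URL(url);
  copy.search = '';
  return copy;
}

function matchesURL(pattern, requestURL, { ignoreSearch = false, base } = {}) {
  let target = new URL(pattern, base);
  let actual = new URL(requestURL);
  if (ignoreSearch) {
    target = stripSearch(target);
    actual = stripSearch(actual);
  }
  return actual.href === target.href;
}

function urlStarts(prefix, requestURL, { base } = {}) {
  // Resolving the prefix first is what makes '/videos/' same-origin only.
  return new URL(requestURL).href.startsWith(new URL(prefix, base).href);
}

function urlEnds(suffix, requestURL, { ignoreSearch = false } = {}) {
  const actual = ignoreSearch ? stripSearch(requestURL) : new URL(requestURL);
  return actual.href.endsWith(suffix);
}

const base = 'https://example.com';
console.log(urlStarts('/videos/', 'https://example.com/videos/cat.mp4', { base })); // true
console.log(urlStarts('/videos/', 'https://other.example/videos/cat.mp4', { base })); // false
console.log(urlEnds('.jpg', 'https://example.com/a.jpg?w=100')); // false
console.log(urlEnds('.jpg', 'https://example.com/a.jpg?w=100', { ignoreSearch: true })); // true
```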

Sources

These determine where the route should try to get a response from.

WebIDL

[Exposed=ServiceWorker, Constructor(optional RouterSourceNetworkOptions options)]
interface RouterSourceNetwork : RouterSource {}

dictionary RouterSourceNetworkOptions {
  // A specific request can be provided, otherwise the current request is used.
  Request request;
  // Reject responses that do not have an ok status.
  boolean requireOkStatus;
}

[Exposed=ServiceWorker, Constructor(optional RouterSourceCacheOptions options)]
interface RouterSourceCache : RouterSource {}

dictionary RouterSourceCacheOptions : MultiCacheQueryOptions {
  // A specific request can be provided, otherwise the current request is used.
  Request request;
}

[Exposed=ServiceWorker, Constructor(optional RouterSourceFetchEventOptions options)]
interface RouterSourceFetchEvent : RouterSource {}

dictionary RouterSourceFetchEventOptions {
  DOMString id = '';
}

These interfaces don't currently have attributes, but they could have attributes that reflect the options/defaults passed into the constructor.

Shortcuts

GET requests are the most common type of request to provide specific routing for.

WebIDL

partial interface ServiceWorkerRouter {
  void get(/* same as add */);
}

Where the JavaScript implementation is roughly:

router.get = function(conditions, sources) {
  if (conditions instanceof RouterCondition) {
    conditions = [conditions];
  }
  router.add([new RouterIfMethod('GET'), ...conditions], sources);
};

We may also consider treating strings as URL matchers.

router.add('/foo/', sources) === router.add(new RouterIfURL('/foo/'), sources).
router.add('/foo/*', sources) === router.add(new RouterIfURLStarts('/foo/'), sources).
router.add('*.png', sources) === router.add(new RouterIfURLEnds('.png'), sources).
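
One possible reading of the string shortcuts, as a sketch. conditionFromString is a made-up name, and it returns a plain descriptor rather than a real condition instance. Rejecting patterns with a wildcard at both ends is an assumption, and wildcards elsewhere in the string are not handled.

```javascript
// Hypothetical normaliser for the three string shortcuts.
function conditionFromString(pattern) {
  const leading = pattern.startsWith('*');
  const trailing = pattern.endsWith('*');
  if (leading && trailing) {
    throw new TypeError(`Ambiguous pattern: ${pattern}`);
  }
  if (trailing) return { type: 'RouterIfURLStarts', url: pattern.slice(0, -1) };
  if (leading) return { type: 'RouterIfURLEnds', url: pattern.slice(1) };
  return { type: 'RouterIfURL', url: pattern };
}

console.log(conditionFromString('/foo/'));  // { type: 'RouterIfURL', url: '/foo/' }
console.log(conditionFromString('/foo/*')); // { type: 'RouterIfURLStarts', url: '/foo/' }
console.log(conditionFromString('*.png'));  // { type: 'RouterIfURLEnds', url: '.png' }
```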

Examples

Bypassing the service worker for particular resources

JavaScript

// Go straight to the network after 25 hrs.
router.add(
  new RouterIfDate({ from: Date.now() + 1000 * 60 * 60 * 25 }),
  new RouterSourceNetwork(),
);

// Go straight to the network for all same-origin URLs starting '/videos/'.
router.add(
  new RouterIfURLStarts('/videos/'),
  new RouterSourceNetwork(),
);

Offline-first

JavaScript

router.get(
  // If the URL is same-origin and starts '/avatars/'.
  new RouterIfURLStarts('/avatars/'),
  [
    // Try to get a match for the request from the cache.
    new RouterSourceCache(),
    // Otherwise, try to fetch the request from the network.
    new RouterSourceNetwork(),
    // Otherwise, try to get a match for the request from the cache for '/avatars/fallback.png'.
    new RouterSourceCache({ request: '/avatars/fallback.png' }),
  ],
);

Online-first

JavaScript

router.get(
  // If the URL is same-origin and starts '/articles/'.
  new RouterIfURLStarts('/articles/'),
  [
    // Try to fetch the request from the network.
    new RouterSourceNetwork(),
    // Otherwise, try to match the request in the cache.
    new RouterSourceCache(),
    // Otherwise, try to match '/articles/offline' in the cache.
    new RouterSourceCache({ request: '/articles/offline' }),
  ],
);

Processing

This is very rough prose, but hopefully it explains the order of things.

A service worker has routes. The routes do not belong to the registration, so a new empty service worker will have no defined routes, even if the previous service worker defined many.

A route has conditions and sources.

To create a new route containing conditions and sources

If the service worker is not "installing", throw. Routes must be created before the service worker has installed.
Create a new route with conditions and sources, and append it to routes.
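
A minimal sketch of those two steps. SketchRouter and the getState callback are invented for illustration (in a real worker the browser tracks the state), and the error type is an assumption since the draft only says "throw".

```javascript
// Sketch of "create a new route". `getState` stands in for reading the
// service worker's current state.
class SketchRouter {
  constructor(getState) {
    this.getState = getState;
    this.routes = [];
  }

  add(conditions, sources) {
    if (this.getState() !== 'installing') {
      throw new Error('Routes can only be added while the service worker is installing');
    }
    // Accept a single item or a sequence, as the WebIDL union allows.
    const toArray = (value) => (Array.isArray(value) ? [...value] : [value]);
    this.routes.push({
      conditions: toArray(conditions),
      sources: toArray(sources),
    });
  }
}

let state = 'installing';
const router = new SketchRouter(() => state);
router.add('condition', ['cache', 'network']);
console.log(router.routes.length); // 1

state = 'activated';
try {
  router.add('condition', 'network');
} catch (error) {
  console.log(error.message);
}
```

Because the routes live on the router object rather than on the registration, a new worker version starts with an empty list.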

Handling a fetch

These steps will come before handling navigation preload, meaning no preload will be made if a route handles the request.

request is the request being made.

RouterLoop: For each route of this service worker's routes:
1. For each condition of route's conditions:
  1. If condition is a RouterIfMethod, then:
    1. If condition's method does not equal request's method, then continue RouterLoop.
  2. Otherwise, if condition is a RouterIfURL, then:
    1. If condition's url does not equal request's url, then continue RouterLoop.
  3. Etc etc for other conditions.
2. For each source of route's sources:
  1. If source is a RouterSourceNetwork, then:
    1. Let networkRequest be source's request.
    2. If networkRequest is null, then set networkRequest to request.
    3. Let response be the result of fetching networkRequest.
    4. If response is not an error, return response.
  2. Otherwise, if source is a RouterSourceCache, then:
    1. Let cacheRequest be source's request.
    2. If cacheRequest is null, then set cacheRequest to request.
    3. Let response be the result of looking for a match for cacheRequest in the cache, passing in source's options.
    4. If response is not null, return response.
  3. Otherwise, if source is a RouterSourceFetchEvent, then:
    1. Set routerCallbackId to source's id.
    2. Call the fetch event as usual, but with source's id as one of the event properties.
    3. Return.
3. Return a network error.
Call the fetch event as usual.
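
The second-draft steps can be simulated like this. It is a sketch under heavy simplification: conditions are predicate functions, sources are objects with a synchronous get that returns a response or null, and handleFetch and the result shapes are invented names.

```javascript
// Sketch of second-draft processing with synchronous stand-ins.
function handleFetch(routes, request) {
  for (const { conditions, sources } of routes) {
    // A route is selected only if every condition matches.
    if (!conditions.every((condition) => condition(request))) continue;

    for (const source of sources) {
      if (source.fetchEvent) {
        return { kind: 'fetch-event', id: source.id ?? '' };
      }
      const response = source.get(request);
      if (response !== null) return { kind: 'response', response };
    }
    // The selected route ran out of sources: no fall-through to later routes.
    return { kind: 'network-error' };
  }
  // No route matched: dispatch the fetch event as usual.
  return { kind: 'fetch-event', id: '' };
}

const cacheMiss = { get: () => null };
const network = { get: (request) => `network:${request.url}` };
const routes = [
  { conditions: [(r) => r.url.startsWith('/avatars/')], sources: [cacheMiss] },
  { conditions: [(r) => r.url.endsWith('.jpg')], sources: [network] },
];

console.log(handleFetch(routes, { url: '/avatars/foo.jpg' }).kind); // 'network-error'
console.log(handleFetch(routes, { url: '/photos/bar.jpg' }).kind);  // 'response'
console.log(handleFetch(routes, { url: '/index.html' }).kind);      // 'fetch-event'
```

The first call shows the single-route behaviour: /avatars/foo.jpg selects the first route, misses the cache, and ends in a network error rather than trying the .jpg route.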

Extensibility

I can imagine things like:

RouterOr(...conditionalItems) – True if any of the conditional items are true.
RouterNot(condition) – Inverts a condition.
RouterFilterResponse(options) – Right now, a response is returned immediately once one is found. However, the route could continue, skipping sources, but processing filters. This could check the response and discard it if it doesn't match. An example would be discarding responses that don't have an ok status.
RouterCacheResponse(cacheName) – If a response has been found, add it to a cache.

But these could arrive much later. Some of the things in the main proposal may also be considered "v2".
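
As a sketch of how RouterOr and RouterNot would compose, here they are as plain predicate combinators. The lower-case helper names and the idea of modelling conditions as functions of a request are assumptions made for illustration.

```javascript
// Hypothetical combinators over condition predicates.
const routerOr = (...conditions) => (request) =>
  conditions.some((condition) => condition(request));
const routerNot = (condition) => (request) => !condition(request);

const urlStarts = (prefix) => (request) => request.url.startsWith(prefix);
const urlEnds = (suffix) => (request) => request.url.endsWith(suffix);

// "A .jpg or .png on any origin other than photos.example.com".
const conditions = [
  routerNot(urlStarts('https://photos.example.com/')),
  routerOr(urlEnds('.jpg'), urlEnds('.png')),
];
// All conditions in a route must hold, as in the processing steps.
const matches = (request) => conditions.every((condition) => condition(request));

console.log(matches({ url: 'https://cdn.example.net/a.jpg' }));    // true
console.log(matches({ url: 'https://photos.example.com/a.jpg' })); // false
console.log(matches({ url: 'https://cdn.example.net/a.gif' }));    // false
```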

jeffposnick commented 5 years ago

Here are a couple of things that, based on what we've run into with Workbox's routing, tend to crop up in the real world. I wanted to draw them to your attention so that you could think about them earlier rather than later.

Cross-origin routing

Developers sometimes want to route cross-origin requests. And sometimes they don't. Coming up with a syntax for conditions that supports both scenarios can be difficult. It looks like the current proposal assumes that the URL prefix/suffix will only work for same-origin requests, so be prepared for folks asking for a cross-origin syntax at some point.

Workbox has a few different ways of specifying routing conditions, but the most common is RegExp-based, and we settled on the following behavior: if the RegExp matches the full URL starting with the first character [the 'h' in 'https://...'] then we assume that it can match cross-origin requests. If the RegExp matches, but the match starts on anything other than the first character in the full URL, then it will only trigger if it's a same-origin request.
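
A sketch of that rule, to make it concrete. This follows the description above rather than Workbox's actual source; regExpRouteMatches and its parameters are made-up names, and it assumes a RegExp without the global or sticky flag.

```javascript
// Sketch of the same-origin vs. cross-origin RegExp rule described above.
function regExpRouteMatches(regExp, requestURL, workerOrigin) {
  const url = new URL(requestURL);
  const result = regExp.exec(url.href);
  if (result === null) return false;
  // A match starting at the first character may apply to any origin.
  if (result.index === 0) return true;
  // Otherwise the route only applies to same-origin requests.
  return url.origin === workerOrigin;
}

const origin = 'https://example.com';
console.log(regExpRouteMatches(/\/images\//, 'https://example.com/images/a.png', origin));     // true
console.log(regExpRouteMatches(/\/images\//, 'https://cdn.example.net/images/a.png', origin)); // false
console.log(regExpRouteMatches(/^https:\/\/cdn\.example\.net\//, 'https://cdn.example.net/images/a.png', origin)); // true
```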

I would imagine folks wanting to see at least a RouterIfURLOrigin condition that they could use to customize this behavior, and that might need to support wildcards to deal with CDN origins that don't have fixed naming conventions.

Non-200 OK responses sometimes need to be treated as errors

The proposal for RouterSourceNetwork currently reads If response is not an error, return response. I think some developers are going to find this too limiting. There are folks who will end up needing different behavior based on the status of the response, not just whether a NetworkError occurred.

I'm not sure what the cleanest approach is there in terms of your proposal—maybe adding in a way of setting a list of "acceptable" status codes, including letting folks opt-in or opt-out to status code 0 from an opaque response being considered a success?

jakearchibald commented 5 years ago

I've renamed prefix/suffix to starts/ends to match str.startsWith.

jakearchibald commented 5 years ago

I wrote https://jakearchibald.com/2019/service-worker-declarative-router/ to seek wider feedback.

Also see my tweet about it for replies.

jakearchibald commented 5 years ago

@jeffposnick

Developers sometimes want to route cross-origin requests. And sometimes they don't.

router.add(
  [
    new RouterIfURLStarts('https://photos.example.com/'),
    new RouterIfURLEnds('.jpg', { ignoreSearch: true }),
  ]
  // …
);

The above would match on URLs that start https://photos.example.com/ and end .jpg. It gets trickier if you want to match URLs on all other origins that end .jpg; you'd need RouterNot for that.
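For anyone polyfilling this in a fetch event, the two conditions could be evaluated roughly as follows. The helper names are invented, and fragment handling is deliberately left alone.

```javascript
// Sketch of how the two conditions might be evaluated in a fetch event.
// Helper names are invented. Fragments are not treated specially here.
function urlStarts(requestUrl, prefix, baseUrl) {
  // Relative prefixes are resolved against a base URL, such as the worker's URL.
  return new URL(requestUrl).href.startsWith(new URL(prefix, baseUrl).href);
}

function urlEnds(requestUrl, suffix, { ignoreSearch = false } = {}) {
  const url = new URL(requestUrl);
  // ignoreSearch drops the query string before comparing.
  if (ignoreSearch) url.search = '';
  return url.href.endsWith(suffix);
}

const base = 'https://example.com/sw.js';
const requestUrl = 'https://photos.example.com/cats/1.jpg?w=200';

urlStarts(requestUrl, 'https://photos.example.com/', base) &&
  urlEnds(requestUrl, '.jpg', { ignoreSearch: true }); // true
urlEnds(requestUrl, '.jpg'); // false, the query string gets in the way
```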

RegExp is tricky here as there's no spec for how it could outlive JavaScript. We could try and standardise globbing, but I worry that would take a big chunk of time.

Non-200 OK responses sometimes need to be treated as errors

I've added requireOkStatus as an option.

nhoizey commented 5 years ago

Hi @jakearchibald, this is really interesting!

Reading the post on your blog, I wondered how I could add the RouterSourceNetwork to the cache, because it looks like the fetch event would not be fired. But I see here that you suggest adding RouterCacheResponse to a future "v2" version.

IMHO, this would be great in the "v1" to help build offline experiences without manually preloading.

But I understand there must be some priorities. 😅

jakearchibald commented 5 years ago

Yeah, we want to avoid trying to do everything at once. In terms of fetching and caching, see https://jakearchibald.com/2019/service-worker-declarative-router/#routersourcefetchevent.

WORMSS commented 5 years ago

This is not going to be a popular view, but I am not a massive fan of the 'string' shortcut for conditions. I understand you want it to be as close as possible to existing APIs like Express, but not everyone understands Express all the time.
I was really enjoying the very explicit class approach, where everything in the condition list must evaluate to true, rather than this or thistoo or that.

I wondered if there could be some optimisation for that, behind the scenes. Since they all have to be true to be considered true, then the order they are executed could be arbitrary? (I am of course assuming they are all stateless operations).

So ones that are a little more expensive to calculate could be done last, and the dirt-cheap ones could be done first, rather than in the order the service worker author wrote them. That way different browsers can optimise in their own way: what might be expensive for one is cheap for another, and the developer doesn't need to know this when setting up the conditions.

For example,

router.add(
  [
    new RouterIfURLMatchesRegex('lets assume some horrible regex'),
    new RouterIfMethod('GET'),
  ]
  // …
);

I know RouterIfURLMatchesRegex doesn't exist, but if something like it were added in the future, it would surely be far cheaper to check the method condition first, rather than evaluate the regex only to find it was a POST request anyway.

jakearchibald commented 5 years ago

@WORMSS

I am not a massive fan of the 'string' shortcut

I see what you're saying about the string thing. Also, it isn't all that similar to express. It might be better to drop that shortcut until something like globbing can be properly spec'd.

I wondered if there could be some optimisation for that, behind the scenes. Since they all have to be true to be considered true, then the order they are executed could be arbitrary?

In the case of multiple conditions, I'd spec them as a sequence, but if a browser decided to check them in a different order, or in parallel, there'd be no observable difference.
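A small sketch of why reordering is unobservable: as long as each condition is a stateless predicate, an 'and' over them gives the same answer in any order. The `cost` hint is invented here to stand in for whatever heuristic a browser might use.

```javascript
// Each condition is a pure predicate with an invented `cost` hint.
const conditions = [
  { cost: 10, test: (req) => /^https:\/\/example\.com\/(a|b)+\/x$/.test(req.url) },
  { cost: 1, test: (req) => req.method === 'GET' },
];

// Evaluate in the order the developer wrote them.
function matchesInOrder(conds, req) {
  return conds.every((c) => c.test(req));
}

// Evaluate the cheapest conditions first; `every` stops at the first failure.
function matchesCheapestFirst(conds, req) {
  const sorted = [...conds].sort((a, b) => a.cost - b.cost);
  return sorted.every((c) => c.test(req));
}

const post = { method: 'POST', url: 'https://example.com/ab/x' };
matchesInOrder(conditions, post) === matchesCheapestFirst(conditions, post); // true
```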

jakearchibald commented 5 years ago

Some feedback I've received offline:

The string shortcut might lead developers to think something like /articles/*.jpg would 'work'. Also, it wouldn't be backwards compatible to add support for that later. It might be better to hold off on that shortcut, and later explore standardising globbing, which would be useful in other parts of the platform.

When conditions and sources are sequences, conditions is an 'and', whereas sources is more like an 'or'. This is weird. It might be better to drop sequences here in favour of explicit grouping like RouterIfAny(...conditions), RouterIfAll(...conditions), RouterSourceFirst(...sources), RouterSourceRace(...sources) etc etc.

yoavweiss commented 5 years ago

My feedback:

I've seen cases where the request destination would have been extremely helpful as a filter.
I found the use of get() as a shortcut for GET requests confusing at first, as I expected it to be a way to get the set routes or something similar. Maybe it's just me, but if not, it might be worthwhile to rename.

shortercode commented 5 years ago

Really liking the concept of avoiding the overhead of starting the ServiceWorker where possible, but I feel like not adding an additional Source type that utilises a callback is a missed opportunity. It would effectively be the same behaviour as the fetch event fallback, but specific to the given conditions. For all other source types we would still avoid starting up the service worker.

Also I think syntax wise I feel it reads better to have static methods on a RouterCondition/RouterSource class that instantiate the specific type. Although I'm not sure how in keeping that is with other web APIs.

const { router } = e;

router.get(
  RouterCondition.startsWith("/avatars/"),
  [ RouterSource.cache(), RouterSource.network() ]
);

router.get(
  RouterCondition.fileExtension(".mp4"),
  RouterSource.network()
);

router.get(
  RouterCondition.URL("/"),
  RouterSource.cache("/shell.html")
);

router.add(
  RouterCondition.any(),
  RouterSource.custom(fetchEvent => {

  })
);

tomayac commented 5 years ago

@yoavweiss: I think the .get() is heavily inspired by Express.js’ routing: http://expressjs.com/en/guide/routing.html.

domenic commented 5 years ago

I apologize that this is not a very substantial contribution, but I think I'm having an allergic reaction to the "Java-esque" (or Dart-esque) nested class constructors. I'd encourage thinking about what this design would look like if done in the opposite direction: a purely JSON format. For example, something like

router.route([
  {
    condition: {
      method: "GET",
      urlStartsWith: "/avatars"
    },
    destination: ["cache", "network"]
  },
  {
    condition: {
      method: "GET",
      urlPathEndsWith: ".mp4"
    },
    destination: ["network"]
  },
  {
    condition: {
      method: "GET",
      urlPath: "/"
    },
    destination: [{ cache: "/shell.html" }]
  }
]);

I don't think this extreme is right either (in particular the [{ cache: "/shell.html" }] seems weird and underdeveloped) but I think it'd be a valuable exercise to see what it would look like to have a purely declarative routing format expressed just in JS objects. Then, you could figure out where it would make sense to strategically convert some POJSOs into class instances, versus where the POJSOs are simpler.

Another way to think about this is, when are these types ever useful? If they're only ever consumed by the router system, and not manipulated, composed, or accessed by users, then perhaps it's better to just pass the data they encapsulate directly to the system. (That data is roughly a "type tag" plus their constructor arguments.)

jakearchibald commented 5 years ago

I'm similarly not keen on the use of classes.

As far as I know, WebIDL doesn't let an object's type be conditional on the value of a previous argument. This makes it tricky when it comes to source options. But you might be able to do it with object keys like above. I'll have a play.

The main thing I'm worried about is feature detection.

jakearchibald commented 5 years ago

@shortercode You probably saw my reply on my blog, but I'll post it here for others:

I feel like not adding an additional Source type that utilises a callback is a missed opportunity

I explored that idea, but here's the issue, (spooky music) the callback ceases to exist. It only ever exists when the install event callback is called. Then later, the service worker closes, the callback is gone. Even later, when the service worker is started again, the install event callback isn't called again, so the callback still doesn't exist (and even if we called that callback again, the callback would be a different instance).

To overcome this, the callback would need to be somewhere where it'd exist every time the service worker runs, which pretty much means the top level. But how would the service worker be able to tell between two instances of the function (since a new instance is created each time the service worker is booted)? Ideally it'd be associated with something primitive, like a string. This is how events work, so you can see how I arrived at RouterSourceFetchEvent as the next best thing.
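The idea of tying a handler to a string rather than to a function instance can be sketched in plain JavaScript. Everything below (`registerHandler`, `storedRoute`, `runStoredRoute`) is hypothetical and only models the lifetime problem; it is not the proposed API.

```javascript
// Hypothetical registry: handlers are registered at the top level of the
// service worker script, so they are recreated on every startup.
const handlers = new Map();

function registerHandler(id, fn) {
  handlers.set(id, fn);
}

// A route that outlives the script can only hold plain data, such as a string id.
const storedRoute = { condition: { urlStartsWith: '/articles/' }, handlerId: 'articles' };

// Runs on every boot, so the id always resolves to the current function instance.
registerHandler('articles', (request) => `handled ${request.url}`);

function runStoredRoute(route, request) {
  const fn = handlers.get(route.handlerId);
  return fn ? fn(request) : undefined;
}

runStoredRoute(storedRoute, { url: '/articles/1' }); // 'handled /articles/1'
```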

I think syntax wise I feel it reads better to have static methods on a RouterCondition/RouterSource class that instantiate the specific type

I explored this too. What does RouterCondition.startsWith("/avatars/") return? Does that type have a constructor? That's how I landed at constructors.

nhoizey commented 5 years ago

Yeah, we want to avoid trying to do everything at once.

I understand.

In terms of fetching and caching, see https://jakearchibald.com/2019/service-worker-declarative-router/#routersourcefetchevent.

Ah, yes, sorry, I forgot about this possibility.

richardkazuomiller commented 5 years ago

I like this!

I often find when I'm working with changing behavior based on the URL in a Service Worker, it's easier to make an instance of URL and look at the parts instead of the original string. It would be nice to have conditions for the pathname, hostname, protocol, query string, etc. as well as other parts of the request. I also agree that there could be an easier way to set the conditions than a constructor.

For example:

router.add(
  [
    // same as the RouterIfURLStarts example
    new RouterCondition.url.pathname.startsWith('/avatars/'),
    // true if url.hostname === 'example.com'
    new RouterCondition.url.hostname.is('example.com'),
    // everything should be HTTPS anyway, but since we're here
    new RouterCondition.url.protocol.is('https:'),
    // Skip this route if a queryString is set
    new RouterCondition.url.search.isEmpty(),
    // User has logged in
    new RouterCondition.request.headers.has('X-Authentication-Token'),
    // I'm sure you GET the idea
    new RouterCondition.request.method.is('GET')
  ],
  [
    // ...sources
  ],
)

You could also make it so conditions could be easily chained together, similar to how test frameworks do assertions.

const profilePictureCondition = RouterCondition.url.pathname
  .startsWith('/users/')
  .endsWith('/profile.jpg');
const profilePageCondition = RouterCondition.url.pathname
  .startsWith('/users/')
  .endsWith('profile.html');
const dateCondition = RouterCondition.date
  .from(someDay)
  .to(anotherDay);
const localizedApiCondition = RouterCondition.url.hostname
  .startsWith('api.')
  .endsWith('.jp')
const globalApiCondition = RouterCondition.url.hostname
  .startsWith('api.')
  .endsWith('.com')

This would make code easier to read and I think makes more sense than constructors taxonomically.

All of the RouterCondition.something.something()s would return an instance of RouterCondition. To play devil's advocate in favor of constructors, I suppose it could have a constructor ...

const condition = new RouterCondition({
  url: {
    hostname: 'example.com',
    pathname: {
      startsWith: '/avatars/'
    },
    protocol: 'https:',
    search: ''
  },
  request: {
    method: 'GET',
    hasHeaders: ['X-Authentication-Token']
  }
});

... but if somewhere down the line a new condition gets added, I don't know how one would go about checking whether it's supported, so I'd prefer a completely constructor-less API.

(Edit: router.get and GET condition were redundant)

jakearchibald commented 5 years ago

I'm not keen on the use of constructors in this proposal, but I didn't think there was another way. I'd like to explore it further, and at least show my working.

router.add(conditions, sources);

Conditions

This could be a simple object:

router.add({
  method: 'GET',
  url: { endsWith: '.mp4', ignoreSearch: true },
  date: { to: Date.now() + 1000 * 60 * 60 * 24 * 5 },
}, sources);

This seems pretty nice and easy to spec. You could also add some sensible defaults like method: 'GET', where the developer would need to set this to an empty string if they wanted to cover all methods.

I'm worried about feature detection. The usual trick is to give the API an object with a getter, and see which properties are read. It isn't a great user experience, and especially horrible in this case as you'd have to add a dummy route just to feature detect. I think we'd want something like:

router.supportsCondition('date', { to: Date.now() }); // boolean
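For contrast, the getter trick mentioned above looks roughly like this. `fakeAddRoute` is an invented stand-in for a platform API that only understands some option names, so the sketch can run outside a browser.

```javascript
// Stand-in for a platform API that only reads the options it understands.
// `fakeAddRoute` is invented purely to make the sketch runnable.
function fakeAddRoute(condition) {
  void condition.method;
  void condition.url;
}

// Pass an object whose getters record which properties the API reads.
function detectReadProperties(api, names) {
  const read = new Set();
  const probe = {};
  for (const name of names) {
    Object.defineProperty(probe, name, {
      get() {
        read.add(name);
        return undefined;
      },
    });
  }
  api(probe);
  return read;
}

const supported = detectReadProperties(fakeAddRoute, ['method', 'url', 'date']);
supported.has('url'); // true
supported.has('date'); // false
```

With a real router, the probe call would also register a dummy route, which is the part that makes this approach unpleasant.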

In terms of extensibility, here's 'or'…

router.add({
  url: { endsWith: '.mp4', ignoreSearch: true },
  or: {
    url: { endsWith: '.jpg', ignoreSearch: true },
    or: {
      url: { endsWith: '.gif', ignoreSearch: true },
    },
  },
}, sources);

// vs:

router.add(
  new RouterIfAny(
    new RouterIfURL({ endsWith: '.mp4', ignoreSearch: true }),
    new RouterIfURL({ endsWith: '.jpg', ignoreSearch: true }),
    new RouterIfURL({ endsWith: '.gif', ignoreSearch: true }),
  ),
  sources,
);

…and 'not':

router.add({
  method: '',
  not: { method: 'POST' },
}, sources);

// vs:

router.add(
  new RouterNot(new RouterIfMethod('POST')),
  sources,
);

I was worried it might be difficult to figure out the order of operations, but it seems to read ok, provided we can have something like supportsCondition.

If we support 'or' and 'not', we'd probably want 'and' too:

router.add({
  method: 'POST',
  and: {
    url: {
      endsWith: '.mp4', ignoreSearch: true,
      or: {
        url: { endsWith: '.gif', ignoreSearch: true },
      },
    },
  },
}, sources);
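To check that the nesting reads sensibly, here is a polyfill-style evaluator for these condition objects. The semantics are an assumption, since the thread hasn't settled them: keys on one object are combined with 'and', `not` and `and` join that conjunction, and `or` offers an alternative when the rest fails. It only handles `method`, `url.endsWith` and `ignoreSearch`, only looks for combinators at the condition-object level, and treats a missing method as "any method" rather than applying a default.

```javascript
// Polyfill-style sketch under assumed semantics; not part of the proposal.
function testUrl(spec, href) {
  const url = new URL(href);
  if (spec.ignoreSearch) url.search = '';
  return spec.endsWith === undefined || url.href.endsWith(spec.endsWith);
}

function matches(condition, request) {
  const { or, and, not, method, url } = condition;
  let ok = true;
  // An empty or missing method is treated as "any method" in this sketch.
  if (method && method !== request.method) ok = false;
  if (ok && url && !testUrl(url, request.url)) ok = false;
  if (ok && not && matches(not, request)) ok = false;
  if (ok && and && !matches(and, request)) ok = false;
  // `or` is only consulted when everything else on this object failed.
  if (!ok && or) return matches(or, request);
  return ok;
}

const media = {
  url: { endsWith: '.mp4', ignoreSearch: true },
  or: { url: { endsWith: '.jpg', ignoreSearch: true } },
};

matches(media, { method: 'GET', url: 'https://example.com/a.jpg?x=1' }); // true
matches({ method: '', not: { method: 'POST' } }, { method: 'POST', url: 'https://example.com/' }); // false
```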

Sources

Unlike conditions, the order matters here. It could be:

router.add(conditions, [
  // Where each source is an enum string:
  'network',
  // Or an array of [source, options],
  ['cache', { request: '/shell.html' }],
]);
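A sketch of how a polyfill might walk such a list, trying each source in order and returning the first response. The `sourceImpls` object is invented and stands in for the real cache and network lookups so the example can run anywhere; requests are plain strings here.

```javascript
// Tries each source in order and returns the first response that is found.
async function respondFromSources(sources, request, sourceImpls) {
  for (const entry of sources) {
    // Each entry is either 'name' or ['name', options].
    const [name, options = {}] = Array.isArray(entry) ? entry : [entry];
    const response = await sourceImpls[name](request, options);
    if (response !== undefined) return response;
  }
  return undefined;
}

// Invented stand-ins for the real cache and network lookups.
const sourceImpls = {
  cache: async (request, { request: override }) =>
    (override ?? request) === '/shell.html' ? 'cached shell' : undefined,
  network: async (request) => `network response for ${request}`,
};

respondFromSources([['cache', { request: '/shell.html' }], 'network'], '/', sourceImpls)
  .then((response) => console.log(response)); // logs 'cached shell'
```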

It's a bit of a weird convention, but I guess using classes is weird too. But, is it possible to spec this? The sticking point seems to be [enumValue, options] where the type of options depends on the value of enumValue. This is pretty easy to do in TypeScript, but I don't think it's possible in WebIDL (@domenic, am I correct here?).

This has the same feature detection problem, so we'd also want:

router.supportsSource('cache', { request: '/shell.html' }); // boolean

If the WebIDL thing becomes a sticking point, sources could be a sequence of enums or RouterSource instances:

router.add(conditions, [
  'network',
  new RouterSourceCache('/shell.html'),
]);

If every enum had an equivalent constructor, we wouldn't need router.supportsSource.

Are instances useful?

If we stuck with classes (in either a full or partial way as mentioned above), source instances could have a method to perform their action, and condition instances could have a way to test them.

const condition = new RouterIfURL({ startsWith: '/article/' });
condition.test(request); // boolean
const source = new RouterSourceCache({ ignoreSearch: true });
const response = await source.doYourThing(request);

This might be useful when writing tests, but I'm not sure if it's useful beyond that.

Examples

Taking the initial examples from my blog post:

router.get(
  new RouterIfURLStarts('/avatars/'),
  [new RouterSourceCache(), new RouterSourceNetwork()],
);

// becomes:

router.add(
  { url: { startsWith: '/avatars/' } },
  ['cache', 'network'],
);

router.get(
  new RouterIfURLEnds('.mp4', { ignoreSearch: true }),
  new RouterSourceNetwork(),
);

// becomes:

router.add(
  { url: { endsWith: '.mp4', ignoreSearch: true } },
  'network',
);

router.get(
  new RouterIfURL('/', { ignoreSearch: true }),
  new RouterSourceCache('/shell.html'),
);

// becomes:

router.add(
  { url: { matches: '/', ignoreSearch: true } },
  new RouterSourceCache('/shell.html'),
);

// or:

router.add(
  { url: { matches: '/', ignoreSearch: true } },
  [['cache', { request: '/shell.html' }]],
);

That final example looks a bit weird, so I'm leaning towards keeping classes for sources, but allowing a string in cases where options aren't needed.

shortercode commented 5 years ago

@jakearchibald

I feel like not adding an additional Source type that utilises a callback is a missed opportunity

I explored that idea, but here's the issue, (spooky music) the callback ceases to exist. It only ever exists when the install event callback is called. Then later, the service worker closes, the callback is gone. Even later, when the service worker is started again, the install event callback isn't called again, so the callback still doesn't exist (and even if we called that callback again, the callback would be a different instance).

Ah I totally get you, mmm there's no good workaround for that really. Which is a shame. I can imagine the condition matching behaviour being duplicated in a user land library, just to provide a callback style source. I actually already have a project which could benefit from that.

I think syntax wise I feel it reads better to have static methods on a RouterCondition/RouterSource class that instantiate the specific type

I explored this too. What does RouterCondition.startsWith("/avatars/") return? Does that type have a constructor? That's how I landed at constructors.

I was thinking it would either return a new instance of RouterCondition, with a { startsWith: "/avatars/" } configuration or an instance of RouterIfURLStarts ( which would inherit/implement RouterCondition ). RouterCondition itself would be exposed on serviceworkerglobalscope but as an illegal constructor.

If I was to polyfill the feature in JS:

class RouterCondition {

    constructor (fn) {
        this._test = fn;
    }

    _check (fetchEvent) {
        return !!this._test(fetchEvent);
    }

    static startsWith(str) {
        return new RouterCondition(fetchEvent => {
            const url = new URL(fetchEvent.request.url);
            return url.pathname.startsWith(str);
        });
    }

}

Reading feedback from @domenic I agree that it does feel somewhat over the top to have a dozen userland classes for what is effectively immutable configuration information. Can anyone think of any methods that Conditions or Sources would actually need? The type object pattern would probably work here.

The static method suggestion I'm putting forward helps with feature detection, and stops developers needing to go Full Java. But I admit your latest suggestion reads very cleanly.

router.add(
  { url: { matches: '/', ignoreSearch: true } },
  new RouterSourceCache('/shell.html'),
);

jakearchibald commented 5 years ago

@shortercode

I was thinking it would either return a new instance of RouterCondition, with a { startsWith: "/avatars/" } configuration or an instance of RouterIfURLStarts ( which would inherit/implement RouterCondition ). RouterCondition itself would be exposed on serviceworkerglobalscope but as an illegal constructor.

RouterCondition isn't specific enough, so it'd have to be RouterIfURLStarts. But then, why have a method that returns an instance + an illegal constructor, when a valid constructor can do the same job. That's how I ended up using constructors.

Can anyone think of any methods that Conditions or Sources would actually need?

Yep, I documented some above. See "Are instances useful?".

shortercode commented 5 years ago

[...] why have a method that returns an instance + an illegal constructor, when a valid constructor can do the same job[...]

Well it's not entirely without precedent in Web APIs, BaseAudioContext.createBuffer for example ( although AudioBuffer is a valid ctor as well ). You are probably right about just using valid constructors though, I think the difference basically comes down to my preference in namespacing methods/classes.

Can anyone think of any methods that Conditions or Sources would actually need?

Yep, I documented some above. See "Are instances useful?".

Missed that, going to be honest I only scanned that post over my morning red bull. I guess condition instances could also be used for routing in the fetch handler.

const markdownCondition = new RouterIfURLEnds('.md');

addEventListener("fetch", event => {
  if (markdownCondition.test(event)) {
    event.respondWith(renderMarkdown(event));
  }
});

Not a great example use case for serviceworkers, but a plausible one.

Pajn commented 5 years ago

This might be a bit bikeshedding but I'm not very keen on the matching of the URL as a string and would much prefer an API similar to what @richardkazuomiller proposes. URL parsing is hard and having to disable query parameters to match on the path feels very fragile. What if there is a hash in the url for example?

jakearchibald commented 5 years ago

@Pajn

The difference between service worker and other routing systems is that service worker handles requests to other origins. If you create a system that only matches on URL path, would you expect that to match on any origin?

At some point it'd be nice to standardise Express-style path matching, but I don't think we'll do that as part of v1. The benefit of { url: { startsWith: '/articles/' } } is we can resolve that string against the service worker's URL, as we do with URLs passed to fetch().

We could also add things like { url: { pathStartsWith: '/foo/' } } to match against particular components of the URL, but you'd be matching against all origins unless you specify otherwise.
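The difference is easy to see with the URL API. The service worker URL below is a made-up example.

```javascript
// The script URL below is a made-up example.
const serviceWorkerUrl = 'https://example.com/sw.js';

// Resolving the string against the worker's URL pins it to that origin.
const prefix = new URL('/articles/', serviceWorkerUrl).href;
// prefix is 'https://example.com/articles/'

const sameOrigin = 'https://example.com/articles/1';
const otherOrigin = 'https://other.example/articles/1';

sameOrigin.startsWith(prefix); // true
otherOrigin.startsWith(prefix); // false

// A path-only check, like the hypothetical pathStartsWith, ignores the origin.
new URL(otherOrigin).pathname.startsWith('/articles/'); // true
```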

What if there is a hash in the url for example?

That's a good question. We flip-flopped on that a bit and I can't remember where we landed.

domenic commented 5 years ago

This is pretty easy to do in TypeScript, but I don't think it's possible in WebIDL (@domenic, am I correct here?).

Right, you'd need to add a bit of prose. The type would be sequence<EnumType, OptionsDictionaryType> and then you'd use prose to validate that the sequence is length 2, and each index is as expected.

jakearchibald commented 5 years ago

@domenic the problem is more that the type of OptionsDictionaryType is dictated by the value of EnumType.

router.add(conditions, [
  // Valid
  ['network', { request: '/' }],
  // Valid
  ['cache', { ignoreSearch: true }],
  // Invalid. ignoreSearch is not an option for the network source.
  ['network', { ignoreSearch: true }],
]);

domenic commented 5 years ago

Yeah, so either OptionsDictionaryType would be a union, or you would just use object and convert using prose.

But even then, the "invalid" should not cause an error; it just gets ignored. That's how it would work with constructor options as well.

I'll note that you could also consider the design

[
  { type: 'network', request: '/' },
  { type: 'cache', ignoreSearch: true },
  { type: 'network' }
]

which will still require some prose I think, but is a bit less weird.
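As an illustration of the "just gets ignored" behaviour, the prose step could amount to something like the following. The option lists are taken from the examples in this thread, and throwing on an unknown `type` is an assumption modelled on how invalid enum values usually behave.

```javascript
// Option names per source type, taken from the examples in this thread.
const KNOWN_OPTIONS = {
  network: ['request'],
  cache: ['request', 'ignoreSearch'],
};

// Keeps the options a source type understands and silently drops the rest.
function normalizeSource(source) {
  const allowed = KNOWN_OPTIONS[source.type];
  // Assumption: an unknown type is an error, as with an invalid enum value.
  if (!allowed) throw new TypeError(`Unknown source type: ${source.type}`);
  const normalized = { type: source.type };
  for (const key of allowed) {
    if (key in source) normalized[key] = source[key];
  }
  return normalized;
}

normalizeSource({ type: 'network', ignoreSearch: true }); // { type: 'network' }
normalizeSource({ type: 'cache', ignoreSearch: true }); // { type: 'cache', ignoreSearch: true }
```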

annevk commented 5 years ago

(Note that there's some precedent for object and conversion to the appropriate dictionary decided based on another argument in prose. https://github.com/heycam/webidl/issues/568 has some pointers. At some point there might be enough for an abstraction to be added to IDL.)

jeffposnick commented 5 years ago

In case prior art is useful, here's the syntax that Workbox uses for expressing runtime routing information inside of a JS object. This is read by Workbox's build tools and translated into calls to the underlying Workbox libraries when generating a service worker file.

That syntax has a lot of baggage from the earlier syntax used by sw-precache, which in turn had baggage due to how sw-toolbox was implemented. So... I'm not suggesting that it's actually something that should be used as-is. Maybe as a counter-example.

Also, I could be wrong here, but a syntax based on object properties rather than one based on classes seems like it would offer more flexibility when it comes time to write the inevitable JavaScript polyfill.

jakearchibald commented 5 years ago

@domenic

I'll note that you could also consider the design…

I generally try to avoid objects that mix required properties (type) with optional ones. I realise we do this all over the platform though.

AmeliaBR commented 5 years ago

I'm worried about feature detection. [ @jakearchibald re using a configuration object instead of classes and method calls]

What if the API contained a setConfig / getConfig pair, and the getConfig drops any unrecognized properties? So the initial service worker script could run a quick test case and then decide which configuration file to load.

Alternatively, the setConfig method could take an options parameter that defined error-checking rules: does the setter throw an error for unrecognized configuration rules/properties, or does it silently drop them, or make that rule trigger the service worker, or…?

hewholived commented 5 years ago

@jakearchibald I like the second proposal of using a condition object compared to a list of constructors. I suggest creating a RouterCondition class and a RouterPredicate class to help better express the conditions. The RouterCondition class instantiates a self-contained condition object that is passed to the router.add function. It also exposes a test method that can be used to validate whether a request satisfies the condition. The RouterPredicate class can list all the predicates that are allowed on a particular route.

const condition = new RouterCondition((route /* of type RouterPredicate */) => {
  return route.method('GET') &&
         route.urlStartsWith('/avatar') && 
         !route.urlEndsWith('.mp4');
// return !route.method('POST');
});

router.add(condition, sources);
condition.test(request); // returns boolean; useful for other purposes

What do you think?

jakearchibald commented 5 years ago

@AmeliaBR

What if the API contained a setConfig / getConfig pair…

You'd also need a way to remove the route if it no longer makes sense with particular features unsupported. Seems that the supportsCondition API might be simpler?

jakearchibald commented 5 years ago

@hewholived unfortunately that can't work for the same reasons as https://github.com/w3c/ServiceWorker/issues/1373#issuecomment-452222806