w3c / ServiceWorker

Service Workers
https://w3c.github.io/ServiceWorker/
Other
3.63k stars 313 forks source link

Add async iteration to the cache API #1066

Open ConradIrwin opened 7 years ago

ConradIrwin commented 7 years ago

Retreiving all the keys on a large cache can be very very slow, or in some browsers (as filed against Chrome https://bugs.chromium.org/p/chromium/issues/detail?id=688268) can just throw an exception.

This happened to us as our cache pruning algorithm uses the keys of the cache to determine which files are safe to delete and which must be kept. (It's perhaps an odd use-case, we're building an email client in the browser, and the attachment cache stores not only downloaded attachments which we can trivially re-download, but also uploaded attachments for which it is currently the primary source; we'll move the latter out so we can just call caches.delete(), but this seems likely to bite other people after a while in production).

jakearchibald commented 7 years ago

I've been putting off thinking about this kind of stuff until async iteration lands. Would that help you here?

As in, for await (const cache of caches)

wanderview commented 7 years ago

I agree that at some point the API will fail to scale, but I think some of the issues here are down to implementation. For example, I wrote a test that isolates the read time from the write time and see this in firefox:

1) browser does not crash 2) cache.keys() takes ~900ms to complete

Your current test case did not wait for the cache.put() writes to complete before starting the cache.keys().

This is the code I ran:

function fill(cache) {
  let list = [];
  for (let i = 0; i < 30000; ++i) {
    list.push(cache.put('https://example.com/probably-crash-' + i, new Response('ok')))
  }
  return Promise.all(list);
}

function read(cache) {
  let start = performance.now();
  cache.keys().then(keys => {
    let end = performance.now();
    console.log(end - start);
  })
}

caches.open('foo').then(cache => { fill(cache).then(_ => read(cache) ) })
ConradIrwin commented 7 years ago

Thanks @wanderview, I've updated the test here: https://rawgit.com/ConradIrwin/a634359060cbbde0e90a30e5744aa1a1/raw/94c524dcc451f7104b5fdf8999a56cc70db86ff2/storage.html (the time was actually less of a concern for me, as it's a relatively rare async event, the crash was the big problem in chrome).

@jakearchibald What does that syntax desugar to? In general happy to manually iterate over the collection, but the API doesn't support that yet.

function paginate(cache, callback, offset) {
  return cache.keys(null, {limit: 1000, offset: offset || 0}).then((keys) =>
    for (key in keys) {
      callback(key);
    }
    if (keys.length === 1000) {
      return paginate(cache, callback, offset + 1000)
    }
  }
}
jakearchibald commented 7 years ago

https://github.com/tc39/proposal-async-iteration#async-iterators-and-async-iterables

Similar interface to iterators, but returns promises.

jakearchibald commented 7 years ago

Renaming, because I'm pretty sure async iteration is the right solution here. Objections welcome!