Using BackgroundSync for purely "processing" tasks

jakearchibald commented 9 years ago

I've been against using background sync for pure processing tasks because of battery usage and effecting the performance of whatever app is running at the time, but maybe I'm being too conservative & it's something we should add post-MVP (and ensure it can be added in a sensible way).

https://github.com/slightlyoff/BackgroundSync/issues/70 shows a usecase from Mozilla - processing emails into threads.

Cache optimisation & cleanup is another use-case.

These tasks should only happen when the device is idle, to avoid janking other apps, and avoid battery draining.

I only have one-off use-cases at the moment, but perhaps there are others?

For the Mozilla case, from the page:

navigator.serviceWorker.ready.then(reg => {
  return reg.periodicSync.register('check-email');
});

From the ServiceWorker:

self.addEventListener('perodicsync', event => {
  event.waitUntil(
    checkEmail().then(emails => {
      return filterOutAlreadyCachedEmails(emails);
    }).then(newEmails => {
      if (!newEmails.length) return;

      self.registration.showNotification("You've got mail", {
        tag: "new-email",
        icon: "/email.png"
      });

      return storeEmails(newEmails).then(function() {
        self.registration.processingSync.register({tag: 'emails'});
      });
    })
  )
});

function processEmail(email) {
  return createThread(email).then(thread => {
    return caches.open('email-threads').then(cache => {
      return cache.put(thread.url, thread.createResponse()));
    }); 
  });
}

function processEmails() {
  return getUnprocessedEmails().then(emails => {
    return Promise.all(
      emails.map(email => processEmail(email))
    )
  })
}

self.addEventListener('processingsync', event => {
  if (event.tag == 'emails') {
    event.waitUntil(processEmails());
    return;
  }
});

self.addEventListener('fetch', event => {
  event.respondWith(
    caches.match(event.request).then(request => {
      if (request) return request;

      if (isThreadedEmailRequest(event.request)) {
        return processEmails().then(_ => caches.match(event.request));
      }
    })
  );
});

The aim here is to allow the periodic sync to go to the network & cache, and have a separate "processing" sync that will do the heavy lifting while the device is charging & idle. If the email thread is requested before processing gets a chance to complete, processing happens just in time.

annevk commented 9 years ago

What is the use case? Parsing and storing email but not downloading it?

jakearchibald commented 9 years ago

The feedback I got from https://github.com/slightlyoff/BackgroundSync/issues/70 is that sorting emails into threads was the heavy JS work that would require a different worker - is this right @wanderview @jrburke?

jrburke commented 9 years ago

I posted a bit more on why we want to use a shared worker for email data layer, for managing the network connections and JS data object creation and IDB storage in this comment in issue 70.

Mainly: we want to use that model even without backgroundsync in play. backgroundsync just is another way to trigger the email sync work, in addition to the UI button the user can tap to manually start a sync.

On the topic of backgroundsync should only do enough to sync the email data from network and use a "processingsync" to process it more later:

That would mean loading a lot of the same code the shared worker would have already, for the network stuff, and result in more IDB overhead since it implies a separate DB just for the "raw" data received from the network, and I do not believe it saves that much more time doing a bit more processing to properly convert the raw data to the JS object data used by the app.

Particularly since the sync can also trigger notification to the user which can then lead to the launch of the email app. The user has asked for background email sync to get timely notifications of new emails. Splitting the sync work into two different pieces, with the guarantees on "processingsync" runs being different than the backgroundsync runs seem to make it harder to understand when the user will see notifications.

Instead of introducing another sync entry point, if the concern is that some backgroundsyncs could be expensive and block UI work, it seems like the browser can decide how to manage those tasks better: it can track how long syncs for a given app take, and if taking longer, delay their operation to better times, give that service worker less priority, maybe even pause it. Those are not concrete suggestions, just illustrating that the browser, as the user agent for the user, can decide to regulate background tasks to optimize UI concerns.

I believe this will be needed anyway because with two sync entry points, there is no guarantee that developers will be disciplined to split the work between two sync tasks. In the email case, it just seems like more work/logic to code and debug, and more code for the app to load, particularly if service workers cannot spin up the shared worker.

jakearchibald commented 9 years ago

Thanks @jrburke! Yeah, I think this is better left to scheduling.

How heavy is the processing work that Mozilla's email client does? As in, how long does it lock the thread up for? Are we talking seconds?

jrburke commented 9 years ago

Using a dual core phone on wifi, talking to gmail imap for an account that is already synced, the background sync (via request-sync which is not service worker backed yet, but plans are to move to SW-backed backgroundsync) the sync data flow without adding any new messages, only confirming sync is up to date, is around 2.3 seconds.

That includes some IMAP work to download some data to compare to locally synced state (not just a simple ping), and loading some IDB state to compare. So that is a kind of baseline. That work is done on a worker right now, so not on the main thread for the email UI.

Doing a quick test with a 5 message sync took just under 5 seconds. We are in the middle of overhauling the worker code though to support more things like conversations, and likely some specializations per imap server type, so I can see that number improving.

WICG / background-sync

Using BackgroundSync for purely "processing" tasks #77