ServiceWorker lifetime and respondWith() with a user-constructed ReadableStream

tyoshino commented 8 years ago

Branched from https://github.com/yutakahirano/fetch-with-streams/issues/63#issuecomment-207453714

respondWith() with a user-constructed ReadableStream

When a user-constructed ReadableStream is being consumed by respondWith(), respondWith() doesn't know what's happening inside the ReadableStream. The ReadableStream interface of the instance facing respondWith() doesn't allow either scraping out its contents or knowing ReadableStreamDefaultController.close() is invoked (note that a ReadableStream doesn't always have ReadableStreamDefaultController behind it).

Until all the chunks are read out from the ReadableStream, it's holding some JS objects representing the underlying source and chunks in its queue. So, we want to keep the ServiceWorker alive until they're drained.

Also, note that only respondWith() can observe the ReadableStream's state when respondWith() is consuming it. It could work but is not good that we tell the instance to wait for to SW like evt.waitUntilClosed(aStream) as Ben suggested at https://github.com/yutakahirano/fetch-with-streams/issues/63#issuecomment-207487623.

evt.respondWith(response, { waitForBody: true }); (https://github.com/yutakahirano/fetch-with-streams/issues/63#issuecomment-207489289) by Ben looks better.

There we could have a timeout to abort receiving chunks even when waitForBody is specified so that badly behaving stream doesn't prolong SW lifetime forever.

respondWith() with a ReadableStream constructed by Fetch API

I agree with Ben's point at https://github.com/yutakahirano/fetch-with-streams/issues/63#issuecomment-207483430. This is also related to the discussion how we realize efficient pipeTo() algorithm. I.e. for certain operations, streams would behave like a simple identifier of data source / destination like file descriptor.

I'll try to standardize all the interfaces for these issues by resuming the pipeTo() discussion threads.

wanderview commented 8 years ago

At the face-to-face today we decided the code providing the js stream must use waitUntil().

You are right, though, there will probably be a window between when waitUntil is resolved and the browser consumes the js buffers. It seems unlikely, though, that draining the buffers would last longer than the grace timer.

Perhaps we should say the service worker should be kept alive long enough to consume the stream. That makes the default behavior to hold the worker alive. It could be an equivalent optimization, though, to let the SW die if a c++ stream is being consumed.

yutakahirano commented 8 years ago

F2F meeting notes: https://lists.w3.org/Archives/Public/public-webapps/2016AprJun/0043.html

domenic commented 8 years ago

Perhaps we should say the service worker should be kept alive long enough to consume the stream. That makes the default behavior to hold the worker alive. It could be an equivalent optimization, though, to let the SW die if a c++ stream is being consumed.

This sounds pretty good to me, if people are OK with it...

wanderview commented 8 years ago

I guess that works as long as there is no way for js to produce a c++ stream that is really sourced from js. For example, write to a file via a js source stream and return the file c++ stream to respondWith(). To be honest, this scares me a little.

Overall, though, I think requiring waitUntil() might be better since it covers all cases. We don't need to add magical keep-alive logic if there is a new way to transfer a stream from js-to-c++. We just need a note that when the extended promises settle the service worker should try to live until all js buffers enqueued in transferring streams are consumed.

jakearchibald commented 8 years ago

Pre F2F notes: Don't think there's anything to discuss here. I think we're happy with developers using waitUntil, but we should make this as easy as possible.

jakearchibald commented 8 years ago

F2F:

We could add the implicit waitUntil later - in the meantime advocate for waitUntil

jakearchibald commented 7 years ago

Here's what I'd like to see here:

self.addEventListener('fetch', event => {
  event.respondWith(
    fetch(event.request)
  );
});

self.addEventListener('fetch', event => {
  event.respondWith(
    caches.match(event.request)
  );
});

self.addEventListener('fetch', event => {
  event.respondWith((async () => {
    const response = await fetch(event.request);
    return new Response(response.body, {
      headers: response.headers
    });
  })());
});

In the above cases, the service worker should be able to terminate once the response has been provided, before the resulting stream has been fully consumed by the browser, without breaking the stream.

So if the response is a 3 hour movie, the SW does not need to stay alive while the user watches the 3 hour movie.

Whereas:

self.addEventListener('fetch', event => {
  const responsePromises = [
    caches.match('/shell-start.inc'),
    fetch(event.request.url + '.inc'),
    caches.match('/shell-end.inc')
  ];

  const {writable, readable} = new TransformStream();

  (async () => {
    for await (const response of responsePromises) {
      await response.body.pipeTo(writable, {preventClose: true});
    }
    writable.close();
  })();

  event.respondWith(
    new Response(readable, {
      headers: {'Content-Type': 'text/html'}
    })
  );
});

…this may not work, as the SW may terminate before all of the pipeTos are called. When a stream is GC'd that cannot complete, we need to figure out if this is an error or a cancellation. My gut feeling is error, as if a server simply cut off its response & reset the connection, but I'd like to hear from @domenic.

To avoid the above, you'd need to do:

self.addEventListener('fetch', event => {
  const responsePromises = [
    caches.match('/shell-start.inc'),
    fetch(event.request.url + '.inc'),
    caches.match('/shell-end.inc')
  ];

  const {writable, readable} = new TransformStream();

  event.waitUntil((async () => {
    for await (const response of responsePromises) {
      await response.body.pipeTo(writable, {preventClose: true});
    }
    writable.close();
  })());

  event.respondWith(
    new Response(readable, {
      headers: {'Content-Type': 'text/html'}
    })
  );
});

Now the waitUntil explicitly holds the SW open for all the stream operations.

I'm curious about this case:

event.waitUntil((async () => {
  for (let i, response; response = await responsePromises[i]; i++) {
    const isLast = i == responsePromises.length - 1;
    if (isLast) {
      response.body.pipeTo(writable);
    }
    else {
      await response.body.pipeTo(writable, {preventClose: true});
    }
  }
})());

Here waitUntil holds the SW open until pipeTo has been called for the last stream, but it doesn't hold it open until the whole stream has been consumed. However, at this point no further JS needs to be executed to complete the stream. We've piped a behind-the-scenes stream to an identity stream, and the identity stream will auto-close when piping completes.

Should the SW be able to close, allowing the stream to complete successfully? @domenic?

jungkees commented 7 years ago

@jakearchibald, let me refer to the examples in your last comment as "fetched streams case", "JS-generated streams case", "JS-generated streams case with waitUntil", and "pipeTo-stream after waitUntil promise resolve case".

I was confused with "fetched streams case" in our chat yesterday. (@jakearchibald's last comment started to clarify that discussion.) The steps that enqueue chunks to fetchEvent's potential response in respondWith(r) run in parallel. So it doesn't hold SW open until the whole body is handled. And in Handle Fetch, SW basically waits until the fetchEvent's wait to respond flag is unset. That means SW stays up until the fetchEvent's potential response's reference is copied to the internal variable that will be returned to fetch. Hence, I don't think we have a problem with this case.

Also, for "JS-generated streams case", "JS-generated streams case with waitUntil" is the pattern that we made our consensus on in the last f2f.

So, I think we can close this issue when "pipeTo-stream after waitUntil promise resolve case" is clarified.

domenic commented 7 years ago

When a stream is GC'd that cannot complete, we need to figure out if this is an error or a cancellation. My gut feeling is error, as if a server simply cut off its response & reset the connection, but I'd like to hear from @domenic.

I'm not sure GCed is the right word here, since the reference is being held on to (by the implicit promise chain created by the for-await-of). I'm also not sure if the right approach here is to, on termination of the service worker, reach into the objects owned and created by the web developer and move them into a new state (i.e. error them).

Maybe the level this should be handled at is when we "pipe" across the boundary to the main thread. (https://w3c.github.io/ServiceWorker/#dom-fetchevent-respondwith step 8.3.2.5.) The ReadableStream created on the main thread is owned by the user agent, who could decide to error it. In the service worker thread, either the ReadableStream should be canceled (using the reader that's being read from), or the thread should just shut down entirely, destroying all objects, not running any more JavaScript code, and forcibly releasing any resources like network socket handles. I guess you would spec that as a cancel anyway.

Then the question becomes whether you want to error the main-thread ReadableStream, or just close it prematurely. I agree this should act as if the server simply cut off its response and reset the connection, but I'm not 100% sure this would result in an error today. But I trust you if you are sure that is the behavior.

I'm curious about this case:

It seems like as written the service worker should be able to close, since you didn't wait for the pipeTo to finish. But whether or not the stream will complete successfully is then up to the browser, I think. If the browser decides to destroy the service worker thread, the behind-the-scenes stream (and the identity transform it is passing through) will be killed, so the process of piping it to the main thread will be interrupted.

jakearchibald commented 7 years ago

F2F: what should happen if the SW terminates before the stream is complete. Error or early close?

jakearchibald commented 7 years ago

F2F: If the stream is "JS created", we should consider keeping the SW alive after a fetch event until the stream has been consumed.

We talked about this in https://github.com/w3c/ServiceWorker/issues/882#issuecomment-235987553, but after observing usage, it seems like we really need this.

@domenic can you post your code example?

domenic commented 7 years ago

With an identity transform stream (easy to understand): https://gist.github.com/domenic/ea5ebedffcee27f552e103963cf8585c
With only readable streams:https://gist.github.com/domenic/e1e2238f1447e6f6783f1fba3ee601a6

jungkees commented 5 years ago

Most cases are taken care of with waitUntil(), but we should still solve/conclude this. We'll address this in the next milestone.

jimmywarting commented 1 year ago

Don't know if there is any Firefox developer in here. but fyi, FF v102 started to support transferable ReadableStream and that broke quite many downloads for many fellow user of my StreamSaver.js lib.

I deliberately didn't do any postMessage hack to keep the service worker alive. cuz chrome could just take a ReadableStream from the main thread, pass it to the service worker and respond with that ReadableStream and then terminate the service worker cuz it was not in use. the service worker didn't try to read or enqueue any data from the stream inside of the service worker. it just forwared the transfered readable stream from the main thread -> service worker -> eventually evt.respondWith(new Response(stream))

i expected that FF would follow suit. but it didn't. the transferable stream detection took over how my script behaved differently. FF would transfer the stream and no postMessages where then sent to the service worker to keep it alive. b/c i tough it would not be necessary. but it was 😞 Firefox killed the service worker after 30s and terminated the response along with it that was instantiated with a ReadableStream coming from the main thread.

now all browser use the postMessage hack to keep the service worker alive.

with that said, can we shine any light on how the service worker should behave if "ServiceWorker lifetime and respondWith() with a user-constructed ReadableStream" should behave?

I would say:

As long as there is any resource (tab or download) that consumes the response should not deliberately terminate the service worker after a hard timeout (it should let it finish, even if it's a 3h long movie that is being streamed for 3h)
if the stream is released and can be piped in the background, then fine go ahead and kill the service worker after x sec.
or fix the #1566 and honor the request of evt.waitUntil as long as there is any tab open.

w3c / ServiceWorker

ServiceWorker lifetime and respondWith() with a user-constructed ReadableStream #882

respondWith() with a user-constructed ReadableStream

respondWith() with a ReadableStream constructed by Fetch API