whatwg / fetch

Fetch Standard
https://fetch.spec.whatwg.org/
Other
2.1k stars 325 forks source link

How response bodies get from a service worker to the main page is not very clear #330

Closed domenic closed 8 years ago

domenic commented 8 years ago

Right now the spec says that the internal response has a ReadableStream object. Since the internal response is shared between the service worker and the page, the ReadableStream is implicitly shared.

This is not very good. For example it implies sharing arbitrary JavaScript objects, since you can enqueue arbitrary JavaScript objects in the readable stream. Right now Chrome ignores all objects except Uint8Arrays. It should instead error in some way, but exactly how is not clear, since there is no spec.

At first I thought this would be related to structured cloning of readable streams (https://github.com/whatwg/streams/issues/276 and https://github.com/whatwg/streams/issues/244) but that might not be true. Our plan for structured cloning was to lock the stream. But that will not work in this case, as we want the service worker code to be able to enqueue at the same time the main thread code is reading.

I guess what we need to do is specify how each Response object has a ReadableStream object. Then respondWith() roughly does the following:

Apologies if we already have some of this and I missed it.

yutakahirano commented 8 years ago
annevk commented 8 years ago

Transfer means copying and then detaching on the side you copied from.

yutakahirano commented 8 years ago

That means a script author will lose the access to the buffer wrapped by the Uint8Array object passed to controller.enqueue. Is it desirable?

annevk commented 8 years ago

I'm not sure. Given how requests and responses work with streams today it kind of makes sense to me that if you want to retain a value you clone it first. But we do need to make it deterministic when the transfer happens.

yutakahirano commented 8 years ago

Another point:

We (Chrome) serialize the buffers to a C++ byte array and create Uint8Array objects from the byte array in the page. That means buffer's boundary is not preserved (i.e., buffers may be merged / split). As an implementer I would like the spec to allow us doing it.

yutakahirano commented 8 years ago

@wanderview fyi

wanderview commented 8 years ago

This is not very good. For example it implies sharing arbitrary JavaScript objects, since you can enqueue arbitrary JavaScript objects in the readable stream. Right now Chrome ignores all objects except Uint8Arrays. It should instead error in some way, but exactly how is not clear, since there is no spec.

I agree we need to spec types. Can we do something in webidl like ReadableStream<Uint8Array> with defined semantics if it hits a different chunk type?

At first I thought this would be related to structured cloning of readable streams (whatwg/streams#276 and whatwg/streams#244) but that might not be true. Our plan for structured cloning was to lock the stream. But that will not work in this case, as we want the service worker code to be able to enqueue at the same time the main thread code is reading.

I don't understand this one. Just because the outer ReadableStream is locked does not prevent the inner source from continuing to provide data, right?

If we're using a Transform like a pipe, then the readable and writable sides really need to be individually lockable. Locking the reader side should not block other code from locking/using the writable side. This should be achievable via the inner source object.

Otherwise, transfer it across realms/event loops, and enqueue it in the page-side ReadableStream.

I guess we can do this, but its likely just going to be a spec thing. I doubt we will ever represent actual browser images, stylesheets, etc as ReadableStreams. We're just going to consume them in c++. The only place it will actually happen this way is in an outer fetch() call that returns a Response.body.

We (Chrome) serialize the buffers to a C++ byte array and create Uint8Array objects from the byte array in the page. That means buffer's boundary is not preserved (i.e., buffers may be merged / split). As an implementer I would like the spec to allow us doing it.

Yea, I think its totally up to the browser what they do with the buffers. I don't want the spec to require a one-to-one buffer correspondence between a service worker provided Response.body and the outer fetch()'s Response.body that triggered the service worker interception.

domenic commented 8 years ago

That means a script author will lose the access to the buffer wrapped by the Uint8Array object passed to controller.enqueue. Is it desirable?

I think that's fine. Once you have enqueued a buffer, the consumer is allowed to read it and modify it arbitrarily (including transfering and detaching it). In this case the consumer is the UA.

We (Chrome) serialize the buffers to a C++ byte array and create Uint8Array objects from the byte array in the page. That means buffer's boundary is not preserved (i.e., buffers may be merged / split). As an implementer I would like the spec to allow us doing it.

This is an important point. I think that fits easily into the framework I outlined above. We just need to be a bit careful how things are specced. Instead of directly reading and enqueuing chunks as objects, we need to add some wiggle room in that process. Read the chunk as a Uint8Array, detach it, and enqueue new Uint8Array objects that point to the same sequence of bytes in the same order.

I agree we need to spec types. Can we do something in webidl like ReadableStream<Uint8Array> with defined semantics if it hits a different chunk type?

I am not sure how this would work. I guess my instinct is to first formalize the semantics and then see if any of them are boilerplatey enough that they could be taken care of by a Web IDL type / by bindings generators.

I don't understand this one. Just because the outer ReadableStream is locked does not prevent the inner source from continuing to provide data, right?

If by "inner source" you mean "underlying source", that's under the control of the author, not the UA.

What I mainly meant here was that we couldn't just keep doing what the spec currently does, and pretend there's a single ReadableStream object which gets transferred around.

I guess we can do this, but its likely just going to be a spec thing. I doubt we will ever represent actual browser images, stylesheets, etc as ReadableStreams. We're just going to consume them in c++. The only place it will actually happen this way is in an outer fetch() call that returns a Response.body.

Basically there are two options. We could create a parallel "conceptual stream" type, with all the appropriate operations (reading, enqueueing, etc.) and say that is what is in play most of the time. Then the response.body getter lazily initializes a real ReadableStream, and there's some procedure for coping the "conceptual stream" into the ReadableStream so that operating on the latter impacts the former and vice versa.

Or we could keep doing what the current spec does, and use a ReadableStream to represent the streaming body in all cases. We just need to be a bit more careful about how things work across realms and event loops.

The latter sounds better to me.

yutakahirano commented 8 years ago

Thank you.

I would like to use a transform stream to explain that in the spec. Let me try:

Let {ws, rs} be a transform stream whose transform function

Note that each Uint8Array object's boundary needs not to be preserved, but the concatenation of all written bytes to ws needs to equal to the concatenation of all bytes that would be read from rs.

In respondWith, we need to create a response r' from the given response r by setting r''s body stream to rs and calling pipeTo on r's body stream with ws.

Does that make sense? Am I using streams' words correctly?

domenic commented 8 years ago

That all makes sense. Given that we still haven't finished defining writable and transform streams (or piping) work in very much detail, I'm not sure how that's going to translate in the spec. That is why I suggested just describing it in terms of reading from r's body stream and enqueuing into r's body stream. But I guess we could describe it in terms of transform streams, and just keep things sufficiently vague?

yutakahirano commented 8 years ago

Thank you. I made a pull-request https://github.com/slightlyoff/ServiceWorker/pull/934 based on @domenic's original post. I added a note to say that UAs need not preserve buffer's boundary. Hopefully we will replace the description with a more implicit one (i.e., transform stream + pipeTo).