whatwg / streams

Streams Standard
https://streams.spec.whatwg.org/
Other
1.35k stars 160 forks source link

Add support for ReadableStream "owning" type #1271

Open youennf opened 1 year ago

youennf commented 1 year ago

Implement part of streams-for-raw-video-explainer.md.

(See WHATWG Working Mode: Changes for more details.)


Preview | Diff

youennf commented 1 year ago

We can bikeshed whether type: 'transfer' or a new optional flag transfer: true is better.

I went with type: 'transfer' as transfer does not make sense for byob streams. I guess a separate flag might be useful if we anticipate other types where transfer:true would be useful.

youennf commented 1 year ago

A few thoughts:

ricea commented 1 year ago

I went with type: 'transfer' as transfer does not make sense for byob streams.

That makes sense to me.

I guess a separate flag might be useful if we anticipate other types where transfer:true would be useful.

I'm hoping we won't have to add any more controller types, since they are costly in terms of specification and implementation size.

ricea commented 1 year ago
  • The current algorithm tries to transfer/serialize only if it is sure to be able to enqueue. This seems a good property to me.

Agreed.

  • If transfer/serialization fails, we silently terminate without either throwing or erroring the stream. I wonder whether this will be surprising to web developers.

I would lean towards erroring the stream. Silently losing data is something I want to avoid.

  • Cloning a transfer:true stream with transfer-only objects would succeed but the second branch would be errored. This seems ok to me.

It makes me a bit uncomfortable but I think it's acceptable.

  • The current transfer-or-clone algorithm has surprising and potentially undesired effects:

    • enqueue(videoFrame) would make sure to transfer videoFrame but enqueue({ videoFrame }) would clone it.
    • We could introduce a second optional parameter to controller.enqueue, something like enqueue(any message, sequence<object> transfer = []) and update accordingly the StructuredTransferOrClone algorithm. This would make it closer to postMessage. In that case, should we keep enqueue(videoFrame) as a shortcut to enqueue(videoFrame, [videoFrame])?

If we add two-argument enqueue() we should also add two-argument write(). I would like the two-argument form to work even when not using type: transfer as it is extremely useful in combination with transferable streams. When I was thinking about this previously I was concerned about the extra complexity it adds to pipeTo(), but maybe it's really not that bad now that we use ReadRequest internally.

youennf commented 1 year ago

I would lean towards erroring the stream. Silently losing data is something I want to avoid.

Done.

If we add two-argument enqueue() we should also add two-argument write()

Yes, I plan to do a separate PR for WritableStream. And probably another for transform stream.

I would like the two-argument form to work even when not using type: transfer as it is extremely useful in combination with transferable streams.

I did not add this in this PR but it should be fairly easy to extend it as a follow-up. I am a bit hesitant to do this at this point:

youennf commented 1 year ago

I think the current version of the PR is ready for reference implementation prototyping. Should I update the reference implementation in this PR as well?

ricea commented 1 year ago

I did not add this in this PR but it should be fairly easy to extend it as a follow-up. I am a bit hesitant to do this at this point:

  • Adding a transfer list would trigger cloning of the data (observable in non transferable streams case)
  • ReadableStreamTee might fail with those transfer-only values.
  • Adding transfer:true explicitly makes it clear that the web page opts into that behavior. It is not a big adoption step, filling the transferList is anyway needed.

These are good points. Let's do it your way.

MattiasBuelens commented 1 year ago
  • We could introduce a second optional parameter to controller.enqueue, something like enqueue(any message, sequence<object> transfer = []) and update accordingly the StructuredTransferOrClone algorithm. This would make it closer to postMessage.

I like that a lot! 🙂 I understand that it's a bit out-of-scope for this PR, but I'd love to see that in a follow-up PR.

When we do add this, I would suggest we use an options object though, to align with postMessage(message, { transfer }). That makes it easier to add extra options for enqueue() and write() in the future.

ResetQueue would then transfer away ownership of all transfer lists still in the queue, without needing a separate [[isTransferring]] flag. In essence, type: "transfer" would merely force enqueue(chunk) to always be handled as enqueue(chunk, { transfer: [chunk] }).

I think the current version of the PR is ready for reference implementation prototyping. Should I update the reference implementation in this PR as well?

Yes, please. Each spec change on the Streams standard should be accompanied with new WPT tests, and the reference implementation should pass those new tests.

ricea commented 1 year ago

Should I update the reference implementation in this PR as well?

Yes, please do.

youennf commented 1 year ago

OK, I'll start doing it. Rethinking a bit, I wonder the following:

youennf commented 1 year ago

For testing, it seems we use node.js with WPT and the ref implementation. Are there any transferable object like MessagePort supported in node.js? Ditto for serializable, with say VideoFrame?

ricea commented 1 year ago

OK, I'll start doing it. Rethinking a bit, I wonder the following:

  • Would type:'serialize' be a better name than type:'transfer' (which conflicts a bit with transform streams, transferable streams and we sometimes serialize and not transfer chunks)?

This maybe needs a bit more bikeshedding. Another alternative I can think of is type: 'owning'.

  • Is the syntactic sugar for transferable objects (enqueue(videoFrame) instead of enqueue(videoFrame, [videoFrame])) a good idea now that we have the additional transfer parameter?

I really like the syntactic sugar personally, and I think being a different type of stream is a good enough excuse to use it. I'd like to get other people's input on this because there may be footguns I'm overlooking.

domenic commented 1 year ago

I really like the syntactic sugar personally, and I think being a different type of stream is a good enough excuse to use it. I'd like to get other people's input on this because there may be footguns I'm overlooking.

I would be most comfortable with requiring enqueue(videoFrame, { transfer: [videoFrame] }). That is the most like other web platform APIs that do transferring, such as structuredClone() and postMessage(). (Note: the { transfer: [x] } form is the modern version that replaces [x], so indeed, @MattiasBuelens's suggestion in https://github.com/whatwg/streams/pull/1271#issuecomment-1504867016 is a good one.)

However, I recognize it is verbose. So we might want to blaze a new path here and have enqueue() be the first sort of "auto-transfer" API. We should think carefully about how that works, and what pattern we might be setting up for future such APIs.

For testing, it seems we use node.js with WPT and the ref implementation. Are there any transferable object like MessagePort supported in node.js? Ditto for serializable, with say VideoFrame?

You might run up against the limits of the reference implementation here, for which I apologize. It appears that for transferrable streams https://github.com/whatwg/streams/pull/1053 we did not make any reference implementation changes and just skipped running the relevant tests against the reference implementation. But that was mostly fine because it didn't change the streams API itself. This change is more intrusive to the streams API so it would be a real shame to let the reference implementation get out of sync.

Your main tools for trying to tackle this are:

youennf commented 1 year ago

Another alternative I can think of is type: 'owning'.

I'll change to this, it is less confusing than transfer.

I would be most comfortable with requiring enqueue(videoFrame, { transfer: [videoFrame] }).

Let's start with this and let's do the shortcut as a follow-up.

  • ReadableStreamTee might fail with those transfer-only values.

ReadableStreamTee would error both branches according the current algorithm. Maybe we should relax this rule to only error the second branch. Let's also look at this in a follow-up.

youennf commented 1 year ago

PR is ready for review. If moving forward, this could be completed by the following tasks:

youennf commented 1 year ago

I think we need at least the pipeTo changes before we can land this, which means we need the WritableStream changes too.

I'll create another PR for WritableStream.

youennf commented 1 year ago

I think we need at least the pipeTo changes before we can land this, which means we need the WritableStream changes too.

https://github.com/whatwg/streams/pull/1272 adds WritableStream support. It includes guidelines to handle pipeTo read-write in case of the four isOwning cases. It also includes a draft reference implementation of pipeTo (not tested) that handles dropped chunks and the optimized ownership transfer in case of pipeTo between an owning ReadableStream and an owning WritableStream.

ricea commented 1 year ago

Sorry I didn't get to this today. I will do a proper review tomorrow.

youennf commented 1 year ago

Apologies, still missed a few undefineds in the reference implementation.

Ah thanks, I wanted to do that and forgot about it.

youennf commented 1 year ago

@ricea, any thoughts? Is this good to merge?

ricea commented 1 year ago

We need interest from one other browser besides Chrome. I should also review the tests before landing, but that's not strictly a blocker.

youennf commented 1 year ago

We need interest from one other browser besides Chrome.

There is WebKit interest. I think there is Mozilla interest from past discussions but the standard position issue is not active. @jan-ivar, can you comment here?

jan-ivar commented 1 year ago

The examples in the streams-for-raw-video-explainer.md seem outdated now. Can we update them? That would make wide review easier.

If I follow this PR right, we seem to have:

  new ReadableStream({
    type: "owning",
    pull(controller) {
      const frame = new VideoFrame(offscreenCanvas);
      controller.enqueue(frame, {transfer: [frame]});
    }
  });

Did I get that right?

youennf commented 1 year ago

The examples in the streams-for-raw-video-explainer.md seem outdated now. Can we update them?

Will do.

Did I get that right?

Yes. A follow-up may introduce a shorter syntax controller.enqueue(frame}); but this is out of scope for this first version.

youennf commented 1 year ago

Filed https://github.com/whatwg/streams/pull/1278 to update the explainer

youennf commented 1 year ago

I updated tests in https://github.com/web-platform-tests/wpt/pull/39520. Should we merge these WPT tests first so that I update this PR to use the WPT commit?

ricea commented 1 year ago

I updated tests in web-platform-tests/wpt#39520. Should we merge these WPT tests first so that I update this PR to use the WPT commit?

Yes, please.

MattiasBuelens commented 1 year ago

And now we're blocked on #1264 to get WPT working again... 😛

youennf commented 1 year ago

And now we're blocked on #1264 to get WPT working again... 😛

Ah, how hard is it to fix it? Is it fine to merge the PR for with spec/ref impl changes and split the test update to another PR?

youennf commented 1 year ago

PR is green and was approved. Is it ready for being merged?

youennf commented 1 year ago

LGTM, with a remaining question about Symbol.dispose.

Oh sorry, I missed the question. The plan is to support this at some point in a follow-up PR, we will concentrate on ReadableStream and others first.

youennf commented 1 year ago

Is it ready to merge?

ricea commented 1 year ago

Sorry for the radio silence. I'm trying to catch up on the latest changes, but I've been a bit swamped.

jan-ivar commented 6 months ago

Any updates on this? We're trying to solve https://github.com/w3c/webtransport/issues/507.

MattiasBuelens commented 5 months ago

@ricea Friendly ping. If you could find some time to review this, that'd be appreciated. 😉

jasnell commented 5 months ago

Overall I'm happy with the general idea but this feels rather heavyweight for the use case. Perhaps transfer can be built into the existing model without introducing a new stream "type".

For instance, ReadableStreamDefaultController can have a new method controller.clone(value, transferList) that is like controller.enqueue(...) but clones/transfers the value when enqueued.

MattiasBuelens commented 4 months ago

@jasnell That would imply that a stream can hold both "owned" and "unowned" chunks at the same time. This would mean a bunch more bookkeeping and passing around extra parameters:

All in all, I think it's easier to have a separate type for owning streams.

MattiasBuelens commented 4 months ago

Hmm, how does this interact with transferable streams?

Sure, we transfer in controller.enqueue() to push the chunk onto the stream's queue. But if the stream has been transferred, then:

To support this, I think we'll have to store the transfer list alongside each chunk in the queue, so we can re-transfer the chunk later on. 🤔

jasnell commented 4 months ago

@MattiasBuelens ... sorry I wasn't clear in my comment here https://github.com/whatwg/streams/pull/1271#issuecomment-2094296861 ... what I was suggesting is ONLY cloning/transferring at the point the chunk is enqueued, forgoing the rest of the proposed "owning" semantics.

saschanaz commented 4 months ago

Random thought about being heavyweight, what about:

let readable = new ReadableStream({
  start(controller) {
     // Wrap the chunk with a new interface (name chosen randomly)
     controller.enqueue(new Ownership(chunk));
  }
});
// ResetQueue checks whether the chunk is Ownership, and if yes,
// run dispose steps of the inner data of Ownership.
readable.cancel();
// Same for WritableStream, it would handle chunks that are Ownership.
readable.pipeTo(writable)
youennf commented 4 months ago

I also think a stream that only handles owned chunks is simpler to reason about and as powerful. Some cases for individually owned chunks can be covered with structuredClone though not for disposing in case of cancel.

saschanaz commented 4 months ago

I think having a separate representation of ownership instead of changing the internal of streams would be easier; when you pipe you pass the ownership without having to worry about whether the destination is owning stream or not.

(But doing so would have bigger effect in the web platform...)