youennf closed this issue 3 years ago
Can you elaborate on why we should specify an explicit threading model? I think different implementations may use different models, as long as enqueuing respects the user requirements, if any.
The threading model is important for this API, as it is for other APIs that process real-time data chunks (say Web Audio or WebCodecs). Specifying it makes sure that important threading decisions are correctly discussed, described and shared with all spec users (implementors, web devs). This allows implementors to get it right more easily and reduces potential interop issues.
For instance, it is unclear where the maxBufferSize check is done, which can alter the API's behavior in various ways depending on the implementation. Similarly, this API currently relies on transferring WhatWG streams to workers without being blocked on the thread where the streams were created. This is neither straightforward from reading the spec nor easy to fix in practice.
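For reference, the transfer pattern in question looks roughly like the sketch below. This is illustrative only: `frame-worker.js` and the processing step are placeholders, and the MediaStreamTrackProcessor shape is the one shipped in Chromium, not settled spec text.

```ts
// Main thread: capture a camera track and hand its frames to a worker.
async function startCapture() {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  const [track] = stream.getVideoTracks();
  const processor = new MediaStreamTrackProcessor({ track });

  const worker = new Worker("frame-worker.js");
  // The ReadableStream is transferred, not cloned. The open question in
  // this thread is whether reads in the worker can proceed without
  // hopping back through the thread that created the stream.
  worker.postMessage({ readable: processor.readable }, [processor.readable]);
}

// frame-worker.js: consume frames off the main thread.
self.onmessage = async ({ data: { readable } }) => {
  const reader = readable.getReader();
  for (;;) {
    const { value: frame, done } = await reader.read();
    if (done) break;
    // ...process the VideoFrame...
    frame.close(); // frames wrap scarce media resources; close promptly
  }
};
```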
All these implementation details are core to achieving correct behavior and efficiency. So core that it is important to have consensus on them and describe them properly. Web Audio and WebCodecs are examples of specs that describe their model.
See https://webaudio.github.io/web-audio-api/#rendering-loop for Web Audio, and https://w3c.github.io/webcodecs/#dom-audiodecoder-decode for Web Codecs (only the audio decoding side at this link; there are lots of other examples).
This is the bare minimum to do if this is going to ship anywhere.
Can you elaborate on why that is the bare minimum if this is going to ship anywhere?
I'm not sure if I agree or disagree with you, but my point is that I don't see the need to hardcode a threading model with specific threads. In practice, implementations might not necessarily follow it.

In Chrome, for example, there are several processes involved in the capture (do we need a process model as well?), and there are different ways to set up threading in the process running JS that achieve the same results. There might be some performance differences depending on the approach, but they would all be correct. For example, we could implement it by saying we have a "media" thread where frames are delivered and explain the interactions between this thread and the worker thread. But in Chrome it would also be feasible, with some work, to make the worker thread the thread where frames are delivered to a MediaStreamTrackProcessor. Would a threading model prevent one of these approaches?

For a MediaStreamTrackGenerator, does it really matter to which thread the frames are sent once they are written? In many cases it will depend on which sinks are connected to the generator.
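As a concrete reference point, a typical processor-to-generator pipeline looks roughly like this (a sketch using the API shapes shipped in Chromium; the `track` variable is assumed to be in scope):

```ts
declare const track: MediaStreamTrack; // a video track, assumed in scope

const processor = new MediaStreamTrackProcessor({ track });
const generator = new MediaStreamTrackGenerator({ kind: "video" });

const transform = new TransformStream({
  transform(frame, controller) {
    // Script only observes the order in which frames can be read here;
    // which internal thread delivered them to the processor's queue is
    // not visible at this level.
    controller.enqueue(frame);
  },
});

processor.readable.pipeThrough(transform).pipeTo(generator.writable);
// `generator` is itself a MediaStreamTrack; which thread its frames end
// up on depends on the sinks attached to it.
```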
It is not possible to write a compatible implementation if the respective order of operations is not specified normatively. The links above don't mandate any particular thread setup; they mandate a specific order in which things need to appear to the user, using a thread setup to explain it.
Web Codecs uses various threads internally (and even hardware decoders, other chips, or network-attached devices; it is irrelevant), but is specified in a certain way. Web Audio also offloads a bunch of work to other threads and even ping-pongs between threads in various scenarios, but is specified using only a single thread, as you can see.
You might think that you can just describe things roughly, but that won't work; this has real-world consequences. Things need to be described in a way that allows more than one implementation to be written without looking at each other's source code, while passing the same tests.
Things matter if they are observable, in which case they need to be specified.
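To illustrate with Web Codecs (a sketch; the chunks are assumed to come from elsewhere): decode() appends to a control message queue that the spec requires to be processed in order, and outputs are delivered back to script at spec-defined points, no matter how the implementation spreads the work internally.

```ts
const decoder = new VideoDecoder({
  output(frame) {
    // The spec pins down when and in what order this callback runs;
    // the internal thread (or hardware unit) that produced the frame
    // is not observable and therefore not specified.
    frame.close();
  },
  error(e) {
    console.error(e);
  },
});

decoder.configure({ codec: "vp8" });

declare const chunk1: EncodedVideoChunk; // obtained elsewhere (assumed)
declare const chunk2: EncodedVideoChunk;
decoder.decode(chunk1);
decoder.decode(chunk2); // queued behind chunk1; processed in order
```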
So what you mean is ordering of operations, and not a specific threading model. I agree with that. Looking forward to the specific ordering-of-operations bugs that we need to address in order for this to ship anywhere.
I still see no need for a threading model defining specific threads.
The claim that we need a threading model because Web Audio and Web Codecs define one, or this can't ship anywhere, is just a faulty generalization fallacy. There isn't a single MediaStreamTrack sink spec that defines a threading model, and they have all shipped. The closest would be Web Audio, which does define a threading model, but it does not specify how the Web Audio MediaStreamTrack sinks (MediaStreamAudioSourceNode and MediaStreamTrackAudioSourceNode) and source (MediaStreamAudioDestinationNode) interact with it (at least not explicitly), much like a lot of the other observable behavior of those objects, which is also not specified (yet it shipped!).
I guess we could put a note somewhere saying that the model is that all operations/algorithms must execute sequentially and that multithreaded implementations must be careful to ensure this property (e.g., to avoid races), but I don't see a strong need for it since that seems to be the default for a lot of specs and they usually don't have this kind of note.
For the reasons given above, I'm closing this bug. Now that the spec is using more specific algorithms to define behaviors, please file separate issues if you find ordering problems in them.
In particular, it is not clear how enqueuing of frames actually happens. The frames are most probably generated on a media thread, but they are consumed on the ReadableStream's thread. The spec should clarify how this actually works. That will make it possible to understand where the maxBufferSize checks happen and what the impact of blocking the ReadableStream's thread is.
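To make the question concrete, here is a hypothetical internal model (not spec text; all names are made up): the media thread enqueues, and the maxBufferSize check runs at enqueue time by dropping the oldest frame. Whether an implementation does the check there, or serializes it onto the stream's thread, is exactly what the spec currently leaves open.

```ts
interface Frame { close(): void; } // stand-in for VideoFrame

class FrameQueue {
  private frames: Frame[] = [];
  constructor(private maxBufferSize: number) {}

  // Called for every captured frame, conceptually on the media thread.
  enqueue(frame: Frame): void {
    if (this.frames.length >= this.maxBufferSize) {
      // One plausible policy: drop the oldest frame when the buffer is
      // full. Where and when this runs is observable through which
      // frames a slow consumer actually receives.
      this.frames.shift()!.close();
    }
    this.frames.push(frame);
  }

  // Called from the thread that services the ReadableStream's pull().
  dequeue(): Frame | undefined {
    return this.frames.shift();
  }
}
```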