WebAudio / web-audio-api

The Web Audio API v1.0, developed by the W3C Audio WG
https://webaudio.github.io/web-audio-api/

[audioworklet] AudioWorkletProcessor ondata should just be a callback rather than an EventHandler #1194

Closed bfgeek closed 7 years ago

bfgeek commented 7 years ago

I believe the intention here was to deliver any data sent from the main thread (via ondata) directly before the process method is called for that node?

If so, AudioWorkletProcessor's ondata shouldn't be an EventHandler but just a callback, similar to process.

hoch commented 7 years ago

Yeah, these were loose ends that we'll have to figure out. I assume its interface should be loosely defined in prose, not in the form of IDL?

How about sendData()? Perhaps we can start from postMessage() and subtract the things we don't need?

annevk commented 7 years ago

@domenic should probably be involved here to some extent if you're going to redesign postMessage(). What exactly are the goals of this new method?

hoch commented 7 years ago

@domenic @annevk

There are two classes: AudioWorkletNode (main thread) and AudioWorkletProcessor (worklet thread). When a user creates an AWN object, an instance of AWP is also created in the worklet global scope. They are tightly coupled from construction and communicate with each other using sendData() and ondata.

If we can send serialized data between them, that's good enough for our purpose. So making ondata a UA-invoked callback makes sense, because we don't need the whole nine yards of the Event class. Also, the destination of sendData() is fixed at the construction of the AWN/AWP pair and cannot be changed.
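A minimal sketch of the proposed pairing (the sendData()/ondata names are the ones under discussion here; the registration shape and constructor arguments are illustrative):

// Main thread: constructing the node also constructs its processor
// counterpart in the AudioWorkletGlobalScope.
const node = new AudioWorkletNode(context, 'noise');
node.sendData({ volume: 0.6, type: 'noise' }); // serialized, delivered to the processor

// Worklet thread: ondata is a plain UA-invoked callback, not an EventHandler;
// it runs before the next process() call for this node.
registerProcessor('noise', class extends AudioWorkletProcessor {
  ondata(data) { this.volume = data.volume; }
  process(inputs, outputs, parameters) { /* render using this.volume */ return true; }
});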

Please let me know if you need more clarification!

annevk commented 7 years ago

I see, the goal is not exposing EventTarget and friends in worklets. Okay.

  1. Let's not call the callbacks something that starts with "on". Within the platform all of those are event handlers. We should stay true to that.
  2. Are these worklets always same-origin? Or do we need a security story?
  3. If they are same-origin and we don't need a security story this should indeed be pretty easy.
hoch commented 7 years ago

By 2, do you mean by "same-origin" that a worklet script from a different origin cannot be used? Hmm, that actually sounds bad for our use cases. Can you explain what kind of 'security story' we need?

annevk commented 7 years ago

@hoch I guess it depends on what origin is in charge of that worklet script. I noticed in the worklet specification that worklets are loaded with CORS, so I guess they end up being CORS-same-origin, in which case we probably don't need anything special. Is that correct?

hoch commented 7 years ago

That's correct and I missed that. Just to be sure, I am including our spec editors here: @rtoy and @padenot. I believe @bfgeek also has some thoughts on this.

annevk commented 7 years ago

Note also that StructuredDeserialize can fail. You need to consider that and either use a different error callback or some other way to signal failure on the receiver end. See also https://github.com/whatwg/html/issues/936.

annevk commented 7 years ago

cc @bakulf

hoch commented 7 years ago

Okay, here's my conclusion from this thread:

  1. Change the EventHandler ondata to a void callback named processordata. This callback won't be part of the WebIDL; its behavior will be described in prose, just like AudioWorkletProcessor's process().
  2. For V1, a script from a different origin is not allowed.
  3. For a StructuredDeserialize failure, I think we should channel the exception to the main scope. The same should happen when a process() call fails due to an exception.
domenic commented 7 years ago

For a StructuredDeserialize failure, I think we should channel the exception to the main scope. The same should happen when a process() call fails due to an exception.

It's possible it makes sense to do this for web audio; I am not sure. But I just wanted to note that for HTML and postMessage(), we will not tell the originator about deserialization failures, but instead we will tell the target that they were supposed to be getting a message, but did not. (Via a new "messageerror" event.)

hoch commented 7 years ago

@domenic So you mean we should signal the error to the AudioWorkletGlobalScope, not the main scope. I believe that's logically correct, but how can developers see/monitor the error message in that case?

domenic commented 7 years ago

Using self.addEventListener("messageerror", e => ...) for example

(Or using the developer tools, which should show all such errors.)

padenot commented 7 years ago

For V1, a script from a different origin is not allowed.

Why would that be?

annevk commented 7 years ago

As long as there is CORS, which there is per the worklet loading pipeline, it should be fine.

hoch commented 7 years ago

In that case, we can simply refer to what the worklet spec does. We don't want to repeat how CORS works for worklets in the Web Audio spec.

hoch commented 7 years ago

PTAL @rtoy, @padenot and @joeberkovitz:

First, we need to take the ondata handler out of the IDL.

interface AudioWorkletNode : AudioNode {
  readonly attribute AudioParamMap parameters;
  void sendData(any data);
};

interface AudioWorkletProcessor {
  readonly attribute AudioContextInfo contextInfo;
  void sendData(any data);
};

Secondly, we need prose that describes how processordata or nodedata works.

I guess this also covers: https://github.com/WebAudio/web-audio-api/issues/1193.

hoch commented 7 years ago

self.addEventListener("messageerror", e => ...)

@domenic @bfgeek WorkletGlobalScope does not inherit from EventTarget, so we don't have such functionality. Perhaps we should consider adding something like this to the global scope?

domenic commented 7 years ago

I see, that is a bit of a blocker. I am curious what @bfgeek thinks then. It seems better to have such events than to have each type of worklet have its own bespoke way of receiving errors from the main thread.

annevk commented 7 years ago

Another question, @bfgeek suggested worklets should not support shared memory, but @flagxor said on blink-dev audio worklets should. So which is it?

flagxor commented 7 years ago

For audio worklets in particular, access to shared memory seems key to allowing low-latency audio from game engines, audio codecs, or speech synthesizers (I just spoke to folks who do that one today). Currently, engines like Unity use different audio on the web than on all other platforms, because engines like Wwise can't be supported with WebAudio's event-loop-driven style.

A typical pattern would likely be to use a SharedArrayBuffer + an audio worklet + a web worker running wasm code, to mirror something like what I understand AudioWorkers to have originally looked like. Where JS running on an audio thread would be in danger of not being able to keep up with hard deadlines, wasm code can avoid GC pauses and presumably deliver reliable audio rates. The first non-main-thread thing we ended up needing for NaCl was audio, so I rather expect to need to agitate for something to fill the API gap (and am hoping audio worklets will fill that need).

I believe blocking operations like Atomics.wait can be excluded, similar to the main UI thread, as typically these kinds of applications will just want to keep a lock-free queue full.

I understand you guys want to avoid a pattern where audio data is posted. With SABs, there would be a single post, and then audio data is read directly from the buffer, so it should be fast assuming the sender can keep up.
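A rough sketch of that single-post pattern (all names illustrative, and real code needs proper overflow handling): one SharedArrayBuffer holds a lock-free single-producer/single-consumer ring buffer, set up with a single message and then read directly on the audio thread.

// Setup, posted once from the producer to the audio worklet.
const FRAMES = 4096;
const sab = new SharedArrayBuffer(8 + FRAMES * 4); // two Int32 indices + samples
const indices = new Int32Array(sab, 0, 2);         // [0] = write, [1] = read
const samples = new Float32Array(sab, 8, FRAMES);

// Producer side (e.g. a worker running wasm): copy samples in, then publish.
function push(buf) {
  let w = Atomics.load(indices, 0);
  for (const s of buf) { samples[w] = s; w = (w + 1) % FRAMES; }
  Atomics.store(indices, 0, w);
}

// Consumer side (audio thread): drain what's available and never block —
// no Atomics.wait here, per the exclusion suggested above.
function pull(out) {
  let r = Atomics.load(indices, 1);
  const w = Atomics.load(indices, 0);
  for (let i = 0; i < out.length && r !== w; i++) {
    out[i] = samples[r]; r = (r + 1) % FRAMES;
  }
  Atomics.store(indices, 1, r);
}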

My general sense is that as more of the Web API surface is hardened to handle shared memory, more surfaces will make sense to support SharedArrayBuffer in general. But audio in particular is an area where Unity and others have raised concerns that the current solution isn't performant (as with current WebAudio you simply can't both generate your own audio and know you won't skip).

By the way, I had spoken to @rtoy a while back about this and had the impression you guys were approaching v1 completeness, so I certainly didn't mean to derail that. Rather, I'm suggesting this is likely to be something folks will want to try this year, so it's worth planning out in some general way.

annevk commented 7 years ago

I'd prefer we write it down now as we think it should behave going forward. In particular, I'd rather not revise the HTML Standard several times on this. (The HTML Standard is "in charge" of making sure Agent / Agent Cluster actually have grounding in browsers.)

(Aside: please don't use guys as a gender-neutral term; folks works if I can make a suggestion.)

padenot commented 7 years ago

Audio Worklets will have to support shared memory at some point to be competitive, that's for sure, for doing the kinds of things @flagxor describes above.

However, simply having transferables (for now), until a couple of vendors ship an implementation, while making sure there is nothing blocking worklets from using shared memory in the future (since it's opt-in), seems like a reasonable path to take.

Transferables, while generating some GC pressure, avoid copies (doing what is essentially a pointer swap), and with the event loop used as a synchronization mechanism, this can go a long way.
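For instance (a sketch; port stands in for whatever channel ends up being specified):

// Transferring moves the underlying buffer to the receiver without a copy;
// `samples` is left detached (zero-length) on the sending side.
const samples = new Float32Array(128);
port.postMessage(samples, [samples.buffer]);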

annevk commented 7 years ago

Shared memory is actually not that opt-in, specification-wise. If your variant of postMessage() is equally generic and we make worklets part of the agent cluster of their owner (always a Window/document, as I understand it), you'd just have it.

padenot commented 7 years ago

I've been told that APIs need to opt in to be able to receive a SharedArrayBuffer, which is the reason the Web Audio API doesn't use them: we haven't prioritized speccing/implementing it.

I can't really find an authoritative source at the moment, but this line has been authored by the person who did SharedArrayBuffer.

If this is not the case, then we don't have much to do.

annevk commented 7 years ago

It's opt-in unless you use any (in IDL), I think, but the specifics aren't settled just yet.

hoch commented 7 years ago

We did not spend much time discussing the feasibility of SAB along with AudioWorklet or its previous iteration, partially because it was not ready for other APIs to pick up at that moment. The use case suggested by @padenot seems compelling, but it might be difficult to change what we currently have to accommodate SAB at this point.

SAB aside, AudioWorklet does not support Transferable with the current design of sendData(). We intentionally minimized the functionality of the method to keep things simple but good enough to pass control data, not audio sample data.

FWIW, sendData() takes an argument of any, so it is opted out of SAB.

annevk commented 7 years ago

Transferables seem needed if you want to support the use cases from @padenot. Transferables are not needed for SAB, however.

What I said is that if you use any, you have opted in. If you don't use any, you have to opt in explicitly through [AllowShared], but that isn't finalized yet.

hoch commented 7 years ago

Oh, I got it backwards. Well, then we have to use something other than any for sendData().

I am against adopting Transferable for the V1 AudioWorklet. Introducing SAB in the next iteration seems rather reasonable.

annevk commented 7 years ago

What makes it so hard?

Also, is it really that bad to support EventTarget, Event, and MessageEvent in (audio) worklets? Then this could just reuse the postMessage() API.

hoch commented 7 years ago

@bfgeek and I thought about that path.

For one thing, the structure and the layer are completely different. postMessage() operates at the global scope level, so it would have to belong to AudioWorkletGlobalScope, not to an individual AudioWorkletProcessor/AudioWorkletNode instance, if we were to introduce it. Unlike postMessage(), sendData() specifically targets the associated instance of a processor or a node.

Also, it works at a different scale. AWP/AWN are supposed to be instantiated just like AudioNodes (e.g. 10–1000 instances concurrently). Making them support the full "Event and family" machinery will cause more overhead. Imagine 100 processors firing events every 2–3 ms.

bfgeek commented 7 years ago

Sorry for the slow replies; I was on a plane for most of yesterday.

@domenic

I see, that is a bit of a blocker. I am curious what @bfgeek thinks then. It seems better to have such events than to have each type of worklet have its own bespoke way of receiving errors from the main thread.

Ah I wasn't aware that this could fail...

So I don't think this actually wants an event on the global scope in the case of worklets; instead it makes sense to have a callback (or event handler), e.g.:

class AudioNode {
  ondata(data) {
    this.data = data;
  }

  onmessageerror() { // ondataerror instead?
    this.data = null;
  }
}

The primary reason being that the error is local to the node it was performed on, and not relevant to the global scope like it is for workers.

I'm also hesitant to allow generic event handlers on the global scope, as we'd need to add self, which breaks the "really try not to let people store state on the global"(tm) principle. :)

Does that make sense? @hoch thoughts?

domenic commented 7 years ago

That makes sense to me. It's a pretty different programming model than workers, but it seems internally consistent, and the motivation of trying to make it per-node and avoid global state in every way is a good one.

bfgeek commented 7 years ago

@annevk re: SABs. This discussion was clarifying; initially I was primarily thinking about the Houdini specs, and audio wasn't top of mind.

So I think it makes sense to allow SABs for worklets initially (this means that worklets are in the same group(?) as windows and workers?). Allowing them for all worklets should be fine, as we won't be providing any thread-to-thread communication channel for the css-{layout,paint,animation}-apis. I.e. there isn't any "sendData" analog in those specs; everything is controlled by the engine.

Punting on transferables still seems reasonable, I think, in this specific case?

@hoch One thing that we should be clear about in the spec is not allowing futexes on the audio thread; I'm assuming this would be a "bad thing"? :P

bfgeek commented 7 years ago

Finally, after thinking about this a little, I'm personally 60/40 on whether this should be a callback or an event handler.

I think a simple callback is slightly nicer than having an event attached to an object? I.e. ondata(event) { this.data = event.data; } vs. ondata(data) { this.data = data; }, especially if we don't think any of the other fields on the event will be used.

My personal default model for worklets has been to use callbacks by default, but I'm really curious what others think.

Also, as a historical note, I think this feature was initially called postMessage but morphed into sendData as it didn't support transferables and had instance-to-instance semantics.

domenic commented 7 years ago

Note that if you're talking about a real event handler, the syntax ondata(event) { ... } doesn't work, as that does a DefineProperty, not a Set, so it doesn't trigger the setter. Instead you need to do this.ondata = ... in the constructor.
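To see why (a self-contained sketch; SomeTarget is a hypothetical stand-in for an interface with an ondata event handler attribute):

// Event handler attributes are accessor properties on the prototype;
// registration happens in the setter.
class SomeTarget {
  #handler = null;
  get ondata() { return this.#handler; }
  set ondata(fn) { this.#handler = fn; }
  dispatch(data) { if (this.#handler) this.#handler(data); }
}

// A class method is installed with DefineProperty, shadowing the accessor;
// the setter never runs, so nothing is registered.
class Broken extends SomeTarget {
  ondata(data) { console.log('never called as a handler'); }
}

// Assigning in the constructor performs a Set, which triggers the setter.
class Works extends SomeTarget {
  constructor() {
    super();
    this.ondata = (data) => console.log('got', data);
  }
}

new Broken().dispatch(1); // logs nothing
new Works().dispatch(2);  // logs "got 2"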

bfgeek commented 7 years ago

Ah thanks.

annevk commented 7 years ago

Couple notes:

  1. postMessage() isn't necessarily global; with MessageChannel it's very much locally bound. I'm personally okay with not reusing it or events, though it might be a bit of a learning curve if developers have to learn all kinds of new APIs just for worklets. It's not immediately clear to me that it's that much more expensive to support events. It will also be more maintenance for everyone involved; each novel variant of serializing and deserializing adds cost.
  2. I'm okay with not doing transferables; I was just noting that @padenot very much seems to expect them to be there.
  3. SAB: worklets would be their own Agent, in the same Agent Cluster as their parent document/Window (and its associated Windows) and any associated dedicated workers. The other thing here is that we'd make [[CanBlock]] false for the Worklet Agent, per comments from @flagxor above.
domenic commented 7 years ago

Random idea, haven't thought it through: could each worklet class get a MessagePort instance? One on the "inside" (via the base class presumably), and one on the outside (via the worklet class)?

flagxor commented 7 years ago

Thanks all for talking through the SAB use case + impact.

FWIW, coming to this particular topic without all the context, here's one more thing to consider with Events vs. simple callbacks. I've heard concern expressed that for audio worklets, in practice, anything that could cause a garbage collection has the potential to pop the audio stream (as collections could delay timely audio delivery). I assume more extensive measures would be required to avoid generating garbage for each Event?

With general JS running in the worklet, there'd be other things that might trigger GC, but Wasm at least has the potential to avoid it if care is applied, assuming the worklet plumbing / event loop itself doesn't generate garbage.

(Aside: please don't use guys as a gender-neutral term; folks works if I can make a suggestion.)

Yes indeed, quite so.

hoch commented 7 years ago

@bfgeek Yes, @domenic pointed out a while ago that we'd have to use a callback, not an event handler. I can't find that comment now, but I forgot to fix our IDL to reflect the discussion.

@annevk Since AudioWorklet is the first to implement an instance-to-instance messaging mechanism, I understand that you want to explore every option possible. However, our constraints are very different from other Worker or Worklet variants. The audio rendering thread will run a JS VM, so GC triggered by user-supplied code will stop the entire audio rendering operation, which will eventually cause glitches. (At least this is unavoidable in Chrome.) If exposing Event/EventTarget/postMessage increases the chance of GC on that thread, it already defeats the purpose of the AudioWorklet project. It needs to provide the simplest way of sending messages, and that's why @bfgeek and I came up with sendData().

smaug---- commented 7 years ago

GC may be triggered anyhow, whether or not Events are used.

If you really want to optimize out possible GCs, the passed data must not be a new object, but some mutating array containing raw data, something that changes in place whenever new data is received. If that is not done, I don't quite see the argument for not using Events.

(In Gecko the wrappers for Events are allocated from the nursery heap to try to avoid major GCs. Minor GCs should be fast.)
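A sketch of that mutating-array idea (names hypothetical):

// One long-lived buffer, allocated once; the sender overwrites it in place,
// so receiving new data allocates nothing.
const controlData = new Float64Array(4);

function ondata() {              // hypothetical UA-invoked callback
  const volume = controlData[0]; // read raw slots; no new objects created
  // ...apply to processor state...
}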

annevk commented 7 years ago

Since AudioWorklet is the first to implement the instance-to-instance messaging mechanism

Just to reiterate, MessageChannel already does this.

And yeah, if you really want to avoid GC, you should have just a SharedArrayBuffer object and some kind of signal for when it's updated (per @wanderview).

hoch commented 7 years ago

Perhaps I have not been clear about this, but the primary purpose of sendData() and ondata is not sharing data between two threads. It is being able to send a light-weight "control message" to the processor.

Having two different layers for different signal rates (a-rate and k-rate) has been a successful pattern in computer music programming (e.g. SuperCollider or Max/MSP). From my perspective, sendData() is designed to handle the k-rate data. Control (k-rate) data is commonly very small and ephemeral; it is used and then thrown away. For example,

// From the main global scope to the associated processor.
myNode.sendData({ volume: 0.6, type: 'noise' });

Supposedly you can send whatever you want (and it'll be deserialized at the receiving end), but sending thousands of audio samples across threads is definitely not the use case here. If it were our intention to build that API, we would have chosen SAB or Transferable for sure.

It might make sense to have a more generalized messaging mechanism across the web platform, but I would really like to argue that light-weight messaging is super important for audio.

@padenot @rtoy @joeberkovitz WDYT?

smaug---- commented 7 years ago

myNode.sendData({ volume: 0.6, type: 'noise' }); already creates garbage, not much different from postMessage(). Are you saying the object created by sendData is somehow OK, but the Event object created by postMessage isn't?

Sure, with postMessage the data would be event.data.

domenic commented 7 years ago

Yes, if the argument is about lightweightness or GC, I don't see any reason to avoid postMessage/MessageChannels.

If you need to avoid GCs, then as @annevk said, you should be using shared memory. If you just want to have small lightweight messages, postMessage/MessageChannels are fine for that purpose; the additional Event object is not a significant overhead.

hoch commented 7 years ago

I would like to take a stab at using "the implicit MessageChannel" for AudioWorkletNode and AudioWorkletProcessor. @domenic noted that the Worker interface already uses this pattern, and I believe it could work for the AudioWorklet case.

// From the main scope.
myNode.port.postMessage(...);

// In AudioWorkletGlobalScope
class MyProcessor extends AudioWorkletProcessor {
  constructor() {
    super();
    this.port.onmessage = () => { ... };
  }
  ...
}

@domenic also suggested exposing port as a member of the class instead of having the processor/node extend MessagePort. I think it has to be this way, because we don't want the start()/stop() methods to leak into the node or the processor; we expect developers to use start()/stop() methods on their own custom nodes.

Two issues on my mind:

  1. We cannot define the onmessage handler as part of the class definition; this.port.onmessage must be assigned in the constructor (sketched below). Not ideal, but it's not the end of the world, and there seems to be no workaround.
  2. postMessage() allows Transferable by definition. Not sure what other WG members think about this.
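With a per-instance port, a deserialization failure could also surface on that port rather than on the global scope (a sketch, assuming MessagePort gains the messageerror event from the HTML discussion above):

// In the processor's constructor, next to the onmessage assignment:
this.port.onmessageerror = () => {
  // A message was sent to this instance but could not be deserialized here.
};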
rtoy commented 7 years ago

I believe we very much do want to avoid GC. Any garbage generated on the JS instance for the WebAudio thread will eventually cause the thread to pause for GC, glitching the audio. While we can't do anything about user code, I think we really need to design this so that the messages/callbacks/whatever don't themselves generate garbage.

And since WebAudio supports sample rates of 96 kHz (Chrome supports 384 kHz), any pause that is a significant fraction of 1.33 ms (128/96000) (0.33 ms for Chrome) will cause a glitch.
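Worked out, that glitch budget is just the render-quantum duration:

// 128 frames per render quantum:
const quantumMs = (sampleRate) => (128 / sampleRate) * 1000;
quantumMs(44100);  // ≈ 2.90 ms
quantumMs(96000);  // ≈ 1.33 ms
quantumMs(384000); // ≈ 0.33 ms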


padenot commented 7 years ago

tl;dr, we should use MessagePort and postMessage.

Long version:

Big GC pauses are to be avoided, so being able to use Transferables or shared memory is a must. For control messages, the control objects in @hoch's examples should be either short-lived or very long-lived (i.e. reused across callbacks), and the GC pauses should be minimal (considering the techniques implemented in modern browsers).

In any case, minimizing GC churn is a worthwhile goal, and not only for audio; WebGL, WebVR, and the new Vulkan-like API that's shaping up have those constraints too, although it's a bit less critical for anything but audio, considering the latency we're dealing with here.

If we really want to pick something that has good performance characteristics for our use case, we should be doing some hands-on analysis to quantify the impact of various APIs that require creating js objects.

For example, we should measure, for a single IPC transmission of an empty object:

For all of those absolute metrics, we should compute the standard deviation (because we're interested in the worst case, the Web Audio API being a real-time system). Then we can compare API styles (plain MessagePort + postMessage, a custom sendData, etc.) against each other and decide.
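A rough sketch of that kind of measurement (illustrative only): time N empty-object round trips over a MessageChannel and report the mean and standard deviation.

const { port1, port2 } = new MessageChannel();
const N = 10000;
const times = new Float64Array(N);
let i = 0;
let t0 = 0;

port2.onmessage = () => port2.postMessage({}); // echo back

port1.onmessage = () => {
  times[i++] = performance.now() - t0;
  if (i < N) {
    t0 = performance.now();
    port1.postMessage({});
  } else {
    const mean = times.reduce((a, b) => a + b) / N;
    const sd = Math.sqrt(times.reduce((a, b) => a + (b - mean) ** 2, 0) / N);
    console.log(`round trip: mean ${mean.toFixed(3)} ms, stddev ${sd.toFixed(3)} ms`);
  }
};

t0 = performance.now();
port1.postMessage({});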

However, my guess is that any communication API that exists today on the web platform allocates memory and causes GC churn, and we need to pick an API that allows minimizing this.

Going with our custom API might make some of this easier, and might even allow GC-less processing on the worklet side, but it's probably more complicated, and it's unclear that it would make things faster in the long run.

Indeed, allowing SharedArrayBuffer and communicating via the usual lock-free mechanisms is an option that solves all this, but it is more involved for authors. However, we're dealing with a spectrum of authors with different needs and knowledge. Being able to use a slower, easier method of communication (postMessage with transferables) is not in opposition to doing things the "right" way (SharedArrayBuffer + a lock-free algorithm, where you'd use postMessage once to set things up and then build a custom, fast communication side channel).

In conclusion, I like @hoch's and @domenic's view, summarized in https://github.com/WebAudio/web-audio-api/issues/1194#issuecomment-293986023, for the following reasons:

padenot commented 7 years ago

And since WebAudio supports sample rates of 96 kHz (Chrome supports 384 kHz), any pause that is a significant fraction of 1.33 ms (128/96000) (0.33 ms for Chrome) will cause a glitch.

It is technically true that on some OSes you can open an audio stream with a buffer size of 128 frames (or fewer, but the Web Audio API has a fixed block size of 128 frames) at a very high sample rate, which would mean having a callback every 0.33 ms in the worst case (this is not specific to Chrome; it mostly depends on the capabilities of the sound card, the OS, and the system configuration). Depending on the OS, you then have a buffer queue or buffer ping-ponging.

However, in practice, the measured loopback latency of a modern browser is around 30 ms on OSX (this is from a paper that will probably be published at the next Web Audio Conference; it's not public yet. The figure is for OSX, measures input-to-output latency, and holds for both Firefox and Chrome). This is because browsers use conservative buffer sizes to allow for higher callback load (i.e., being able to do more expensive processing without glitching) and to work around device-specific bugs, and also because modern browsers use a multi-process, multi-threaded architecture, bouncing the audio between multiple threads and processes that may or may not be high-priority/real-time threads.

This allows for some slack, but I clearly agree we should design this API in a way that does not prevent optimizing things in the future.