WebAudio / web-audio-api

The Web Audio API v1.0, developed by the W3C Audio WG
https://webaudio.github.io/web-audio-api/

Audio Workers #113

Closed olivierthereaux closed 7 years ago

olivierthereaux commented 10 years ago

Originally reported on W3C Bugzilla ISSUE-17415, Tue, 05 Jun 2012 12:43:20 GMT. Reported by Michael[tm] Smith.

Audio-ISSUE-107 (JSWorkers): JavaScriptAudioNode processing in workers [Web Audio API]

http://www.w3.org/2011/audio/track/issues/107

Raised by: Marcus Geelnard On product: Web Audio API

https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#JavaScriptAudioNode

It has been discussed before (see [1] and [2], for instance), but I could not find an issue for it, so here goes:

The JavaScriptAudioNode should do its processing in a separate context (e.g. a worker) rather than in the main thread/context. This could mean very low overhead for JavaScript-based audio processing, and it seems to be a fundamental requirement for making the JavaScriptAudioNode really useful.

[1] http://lists.w3.org/Archives/Public/public-audio/2012JanMar/0225.html [2] http://lists.w3.org/Archives/Public/public-audio/2012JanMar/0245.html

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Thu, 26 Jul 2012 15:05:30 GMT

(In reply to comment #50)

Grant, it seems to me that there are at least two options for main-thread audio generation even if there's no JavaScriptAudioNode.

  1. Generate your audio into AudioBuffers and schedule these to play back-to-back with AudioBufferSourceNodes. (I haven't tried whether the WebKit implementation handles this gaplessly, but I don't see why we shouldn't support this in the spec.)
  2. Generate your audio into AudioBuffers and postMessage these to a WorkerAudioNode. If ownership of the buffer is transferred it should be cheap and there's no reason why this should incur a large delay, particularly not half a second like you've seen. That sounds like a browser bug to be fixed.

In both cases one will have one new object per buffer to GC: in the first case it's an AudioBufferSourceNode, and in the second case it's the event object on the worker side.

Option 2 is not viable due to I/O lag between web workers and the main thread. I tried web worker audio generation with the MediaStreamProcessing API (an experimental API by roc; he even had builds for it), sent buffers from main->worker, and the latency was around a third of a second or more.

Option 1 does not make the situation for gapless audio any better here. We're just making it harder to push out audio. The browser knows best when to fire audio refills. Forcing the JS code to schedule audio will make audio buffering and drop outs worse.
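
For concreteness, a minimal sketch of the transfer step in roc's option 2, assuming a hypothetical worker script name; the WorkerAudioNode he mentions was never specified, and only the zero-copy postMessage transfer itself is standard.

// Hypothetical sketch of option 2's transfer step. "worker-audio-node.js" stands in
// for the WorkerAudioNode roc describes, which was never actually specified; the
// zero-copy transfer of the ArrayBuffer via postMessage is the standard part.
var workerNode = new Worker("worker-audio-node.js");

function pushGeneratedBlock(samples /* Float32Array of interleaved samples */) {
  // Listing samples.buffer in the transfer list moves ownership to the worker
  // instead of copying it, which is what makes the hand-off cheap.
  workerNode.postMessage({ samples: samples }, [samples.buffer]);
  // After this call, samples is detached on the main thread and must not be reused.
}

Whether the latency Grant saw comes from this hop itself or from the experimental implementation he tested is exactly the point of contention in the following comments.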

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Thu, 26 Jul 2012 15:09:36 GMT

There's always some lag with comm to a webworker, and losing the direct ability to be called for refill on the same thread for audio makes the entire web app vulnerable to lag spikes in webworker communication. I saw this with the MediaStream API, where sync backs would be bunched up too much causing too much inconsistency and thus gaps.

olivierthereaux commented 10 years ago

Original comment by Robert O'Callahan (Mozilla) on W3C Bugzilla. Thu, 26 Jul 2012 21:43:56 GMT

(In reply to comment #51)

Option 1 does not make the situation for gapless audio any better here. We're just making it harder to push out audio. The browser knows best when to fire audio refills. Forcing the JS code to schedule audio will make audio buffering and drop outs worse.

I don't follow this. You're already using some JS scheduling to drive the progress of the emulator (requestAnimationFrame? setInterval?). Each step of the emulator generates some audio samples that you want to play back as soon as possible. So stuffing those samples into an output pipe somehow should be good for you.

Or are you actually trying to drive the progress of the emulator off the audio clock somehow?

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Thu, 26 Jul 2012 23:42:29 GMT

(In reply to comment #53)

(In reply to comment #51)

Option 1 does not make the situation for gapless audio any better here. We're just making it harder to push out audio. The browser knows best when to fire audio refills. Forcing the JS code to schedule audio will make audio buffering and drop outs worse.

I don't follow this. You're already using some JS scheduling to drive the progress of the emulator (requestAnimationFrame? setInterval?). Each step of the emulator generates some audio samples that you want to play back as soon as possible. So stuffing those samples into an output pipe somehow should be good for you.

Or are you actually trying to drive the progress of the emulator off the audio clock somehow?

I drive off both. I use the audio clock and setInterval. If I used setInterval alone, almost every browser wouldn't run the code often enough, so the right thing to do is to drive with the audio clock in addition to the setInterval.

olivierthereaux commented 10 years ago

Original comment by Robert O'Callahan (Mozilla) on W3C Bugzilla. Thu, 26 Jul 2012 23:51:07 GMT

What frequency do you need to run the code at? With setInterval you should be able to get 4ms.

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Fri, 27 Jul 2012 03:04:44 GMT

(In reply to comment #55)

What frequency do you need to run the code at? With setInterval you should be able to get 4ms.

I use a 4 ms setInterval as the base driver. The rate of the setInterval is not the problem. The problem lies with the fact that setInterval was designed, rightfully so, to skip a call if there is a buildup of delays. Audio generation by this method should therefore never be done without audio clock heuristics. Proper audio generation should always check the respective buffering levels, because the system can skip a setInterval callback due to unexpected latency. Also, some browsers, like Google Chrome, have a lot of jitter in their setInterval and skip an unnecessarily large amount of setInterval calls. If I disabled the audio clock input and timed purely by setInterval, we would hear crackles and pops across the board in every browser.

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Fri, 27 Jul 2012 03:06:50 GMT

*I use a 4 ms timer already, in addition to the less frequent Web Audio callbacks for 2048-sample frames (this is also the minimum frame count per callback that Adobe Flash uses).

olivierthereaux commented 10 years ago

Original comment by Robert O'Callahan (Mozilla) on W3C Bugzilla. Fri, 27 Jul 2012 04:02:30 GMT

Is the writing part of the Audio Data API --- mozSetup, mozWriteAudio, mozCurrentSampleOffset --- what you really want?

olivierthereaux commented 10 years ago

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Fri, 27 Jul 2012 06:42:06 GMT

(In reply to comment #51)

Option 1 does not make the situation for gapless audio any better here. We're just making it harder to push out audio. The browser knows best when to fire audio refills. Forcing the JS code to schedule audio will make audio buffering and drop outs worse.

It seems to me that you're not really interested in doing audio processing in the audio callback (which is what it was designed for). Am I right in assuming that you're looking for some kind of combination of an audio data push mechanism and a reliable event mechanism for guaranteeing that you push often enough?

AFAICT, the noteOn & AudioParam interfaces were designed for making it possible to schedule sample accurate audio actions ahead of time. I think that it should be possible to use it for providing gap-less audio playback (typically using a few AudioBuffers in a multi-buffering manner and scheduling them with AudioBufferSourceNodes). The problem, as it seems, is that you need to accommodate for possible jittering and event drops, possibly by introducing a latency (e.g, would it work if you forced a latency of 0.5s?).

Would the following be a correct conclusion?:

  • Audio processing in JavaScript should be done in workers.
  • We need a reliable main-context event system for scheduling audio actions (setInterval is not up to it, it seems).

olivierthereaux commented 10 years ago

Original comment by Robert O'Callahan (Mozilla) on W3C Bugzilla. Fri, 27 Jul 2012 06:50:44 GMT

  • We need a reliable main-context event system for scheduling audio actions (setInterval is not up to it, it seems).

There can't be any truly reliable way to schedule audio actions on the main thread.

I doubt you can do better than setInterval plus some way to measure audio progress, such as mozCurrentSampleOffset.

olivierthereaux commented 10 years ago

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Fri, 27 Jul 2012 07:54:13 GMT

(In reply to comment #52)

There's always some lag with comm to a webworker, and losing the direct ability to be called for refill on the same thread for audio makes the entire web app vulnerable to lag spikes in webworker communication. I saw this with the MediaStream API, where sync backs would be bunched up too much causing too much inconsistency and thus gaps.

I haven't looked into the worker communication lag part, but I don't really see a fundamental reason why it has to be a problem.

On the other hand, doing audio callbacks on the main thread will always be problematic, and I don't see how it could ever be solved. I envision a design where the audio mixer is running in a separate thread (or possibly multiple threads for graphs with independent processing chains). Whenever a node is encountered that needs data from a main thread callback, the mixing thread will have to halt and wait until the main thread callback has fired and finished, which is very sensitive to other things that may be going on in the main thread (e.g. a setInterval/requestAnimationFrame event).

I'd much prefer a solution that does not require the audio mixing thread to be halted just to fire an event on the main thread.

olivierthereaux commented 10 years ago

Original comment by Olivier Thereaux on W3C Bugzilla. Fri, 27 Jul 2012 11:14:16 GMT

(In reply to comment #59)

Would the following be a correct conclusion?:

  • Audio processing in JavaScript should be done in workers. […]

I think that is a reasonable conclusion, yes - although there may be objections by e.g Philip who seemed to be quite adamant about it (See Comment #2).

Any more suggestions on how this could be made obvious to developers (as you mentioned in Comment #4)? Would it suffice to have the default interface be constructed in the way suggested by Chris in http://lists.w3.org/Archives/Public/public-audio/2012JanMar/0225.html - and to require a conscious choice not to use a worker?

olivierthereaux commented 10 years ago

Original comment by Olli Pettay on W3C Bugzilla. Fri, 27 Jul 2012 11:22:41 GMT

(In reply to comment #62)

That tends to not work. If some API is available, whether it is bad or not, it will be used. (sync XHR in the main thread is a good example)

olivierthereaux commented 10 years ago

Original comment by Philip Jägenstedt on W3C Bugzilla. Fri, 27 Jul 2012 13:44:45 GMT

(In reply to comment #62)

(In reply to comment #59)

Would the following be a correct conclusion?:

  • Audio processing in JavaScript should be done in workers. […]

I think that is a reasonable conclusion, yes - although there may be objections by e.g Philip who seemed to be quite adamant about it (See Comment #2).

Nope, I think that JavaScript processing should be done in workers, and only workers.

olivierthereaux commented 10 years ago

Original comment by Olivier Thereaux on W3C Bugzilla. Fri, 27 Jul 2012 13:57:16 GMT

(In reply to comment #64)

(In reply to comment #59)

Would the following be a correct conclusion?:

  • Audio processing in JavaScript should be done in workers. […]

I think that is a reasonable conclusion, yes - although there may be objections by e.g Philip who seemed to be quite adamant about it (See Comment #2).

Nope, I think that JavaScript processing should be done in workers, and only workers.

I might have misunderstood Marcus' point then.

Marcus, was your conclusion that it should be done in workers (and only there, hence a MUST) or that it should be done in workers (and could be done in the main thread at your own risk)?

olivierthereaux commented 10 years ago

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Fri, 27 Jul 2012 14:02:51 GMT

(In reply to comment #65)

(In reply to comment #64)

(In reply to comment #59)

Would the following be a correct conclusion?:

  • Audio processing in JavaScript should be done in workers. […]

I think that is a reasonable conclusion, yes - although there may be objections by e.g Philip who seemed to be quite adamant about it (See Comment #2).

Nope, I think that JavaScript processing should be done in workers, and only workers.

I might have misunderstood Marcus' point then.

Marcus, was your conclusion that it should be done in workers (and only there, hence a MUST) or that it should be done in workers (and could be done in the main thread at your own risk)?

Sorry for the vague language. What I meant was that JavaScript processing MUST be done in workers, and only in workers, never in the main thread.

olivierthereaux commented 10 years ago

Original comment by Olivier Thereaux on W3C Bugzilla. Fri, 27 Jul 2012 15:57:37 GMT

Marcus wrote (In reply to comment #66):

Sorry for the vague language. What I meant was that JavaScript processing MUST be done in workers, and only in workers, never in the main thread.

Ah, OK, thanks for the clarification. I guess we do have something approaching consensus here. Jussi seemed unsure at first but given Comment #30, the doubts seem to have been dispelled. Ditto for Grant.

I will send a Call for Consensus on this to the list early next week - in particular I'd like to give ChrisR a chance to expand on what he said in Comment #3: “Some developers have expressed concerns that JavaScriptAudioNode only happens in workers.”

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Fri, 27 Jul 2012 16:24:34 GMT

(In reply to comment #58)

Is the writing part of the Audio Data API --- mozSetup, mozWriteAudio, mozCurrentSampleOffset --- what you really want?

Yes

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Fri, 27 Jul 2012 16:29:16 GMT

(In reply to comment #59)

(In reply to comment #51)

Option 1 does not make the situation for gapless audio any better here. We're just making it harder to push out audio. The browser knows best when to fire audio refills. Forcing the JS code to schedule audio will make audio buffering and drop outs worse.

It seems to me that you're not really interested in doing audio processing in the audio callback (which is what it was designed for). Am I right in assuming that you're looking for some kind of combination of an audio data push mechanism and a reliable event mechanism for guaranteeing that you push often enough?

AFAICT, the noteOn & AudioParam interfaces were designed for making it possible to schedule sample accurate audio actions ahead of time. I think that it should be possible to use it for providing gap-less audio playback (typically using a few AudioBuffers in a multi-buffering manner and scheduling them with AudioBufferSourceNodes). The problem, as it seems, is that you need to accommodate for possible jittering and event drops, possibly by introducing a latency (e.g, would it work if you forced a latency of 0.5s?).

No, 0.5 seconds is trash. Needs to be no worse than 100 ms or it sounds like poop to the user.

Would the following be a correct conclusion?:

  • Audio processing in JavaScript should be done in workers.
  • We need a reliable main-context event system for scheduling audio actions (setInterval is not up to it, it seems).

The main thread needs access to the audio clock to drive audio correctly. Doing audio generation by the Date object is bad design; we need actual input from the browser that we played x number of samples just now. Making me generate multiple buffers to schedule with no respect to the machine's actual buffering seems bound to fail.

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Fri, 27 Jul 2012 16:38:56 GMT

(In reply to comment #60)

  • We need a reliable main-context event system for scheduling audio actions (setInterval is not up to it, it seems).

There can't be any truly reliable way to schedule audio actions on the main thread.

I doubt you can do better than setInterval plus some way to measure audio progress, such as mozCurrentSampleOffset.

Funny story, mozCurrentSampleOffset is essentially non-functional on Linux Firefox. Timing audio off of it has been broken since it was introduced.

The callback-for-more-samples method is best because it tells me exactly how many samples it needs. With mozCurrentSampleOffset and mozWriteAudio we were actually exposed to each OS's buffering edge cases, and still are! I user-agent sniff for Windows Firefox, for instance, to insert a shadow 100 ms of extra buffering; otherwise Windows Firefox stops the audio clock. This is in contrast to Mac Firefox, which is perfect at all buffer levels.

Why even a faulty mozCurrentSampleOffset is better than no buffering determination: being forced to generate audio from timestamps, for lack of audio clock access, would be much, much worse, as things like sleep events and time zone changes will kill the timed logic.
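
For context, a rough sketch of the push pattern being discussed, written against the (Firefox-only, long since removed) Audio Data API; the 100 ms target and the generateSamples() helper are illustrative, not the actual values or code XAudioJS used.

// Illustrative only: mozSetup/mozWriteAudio/mozCurrentSampleOffset were a Firefox
// experiment and no longer exist in any shipping browser.
var audio = new Audio();
audio.mozSetup(2, 44100);                        // 2 channels at 44.1 kHz

var samplesWritten = 0;
var TARGET_BUFFER = 44100 * 2 * 0.1;             // keep roughly 100 ms buffered, in samples

setInterval(function () {
  var played = audio.mozCurrentSampleOffset();   // the audio clock: samples consumed so far
  var buffered = samplesWritten - played;
  if (buffered < TARGET_BUFFER) {
    var block = generateSamples(TARGET_BUFFER - buffered);  // app-specific generator
    samplesWritten += audio.mozWriteAudio(block);            // returns how many samples were accepted
  }
}, 4);

The key property Grant is after is the buffered-level check: refills are driven by how much the audio clock has actually consumed, not by wall-clock time.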

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Fri, 27 Jul 2012 16:44:12 GMT

(In reply to comment #61)

(In reply to comment #52)

There's always some lag with comm to a webworker, and losing the direct ability to be called for refill on the same thread for audio makes the entire web app vulnerable to lag spikes in webworker communication. I saw this with the MediaStream API, where sync backs would be bunched up too much causing too much inconsistency and thus gaps.

I haven't looked into the worker communication lag part, but I don't really see a fundamental reason why it has to be a problem.

On the other hand, doing audio callbacks on the main thread will always be problematic, and I don't see how it could ever be solved. I envision a design where the audio mixer is running in a separate thread (or possibly multiple threads for graphs with independent processing chains). Whenever a node is encountered that needs data from a main thread callback, the mixing thread will have to halt and wait until the main thread callback has fired and finished, which is very sensitive to other things that may be going on in the main thread (e.g. a setInterval/requestAnimationFrame event).

I'd much prefer a solution that does not require the audio mixing thread to be halted just to fire an event on the main thread.

Provide mozAudio-like access to the API for the main thread. Only sync up when the main thread pushes new audio or asks for the buffering amount. Firefox is known to keep playing samples even through GC pauses with mozAudio. It lets the JS dev pass samples to the browser AND lets the browser handle the buffering, without issuing callbacks.

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Fri, 27 Jul 2012 16:47:11 GMT

Also, why using the Date object is bad: the sample rate might not be exactly what is reported, so a long-running synth could be affected by the error margin. Only audio clock sniffing accounts for this correctly (mozCurrentSampleOffset and the firing of each JS audio node event).

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Fri, 27 Jul 2012 16:51:31 GMT

Add to that the computer's own clock skew, and the fact that master clocks don't always divide evenly into 44100 Hz. Long-running synths without actual audio clock metrics will be thrown off by date and time readjustments when the computer corrects its skewed clock against a time server.

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Fri, 27 Jul 2012 16:56:18 GMT

(In reply to comment #64)

(In reply to comment #62)

(In reply to comment #59)

Would the following be a correct conclusion?:

  • Audio processing in JavaScript should be done in workers. […]

I think that is a reasonable conclusion, yes - although there may be objections by e.g Philip who seemed to be quite adamant about it (See Comment #2).

Nope, I think that JavaScript processing should be done in workers, and only workers.

The problem lies with the web worker being restricted from accessing all the APIs the main thread has access to. If I house everything in the worker, doing so will mess up some other components that are not audio related.

olivierthereaux commented 10 years ago

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Sun, 29 Jul 2012 06:54:11 GMT

(In reply to comment #69)

No, 0.5 seconds is trash. Needs to be no worse than 100 ms or it sounds like poop to the user.

Obviously. I meant it more as an exercise: How low can you practically go? (out of curiosity)

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Sun, 29 Jul 2012 16:56:03 GMT

(In reply to comment #75)

(In reply to comment #69)

No, 0.5 seconds is trash. Needs to be no worse than 100 ms or it sounds like poop to the user.

Obviously. I meant it more as an exercise: How low can you practically go? (out of curiosity)

It doesn't matter, because we would not be continuing from the previous stream, and there is no audio clock. Firing sample buffers based on time rather than the audio clock is something I would never do. I'd rather pipe audio into the web worker and force users of browsers that have to do that to live with the high copy overhead of the piping and its audio lag. I'd always rather deal with crazy audio lag than arbitrarily schedule new audio that isn't driven from previous stream endings.

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Sun, 29 Jul 2012 17:05:49 GMT

Let's not forget that placing the JS code in a worker does not solve the blocking issue. JS code can still be very inefficient in a worker and block it for a long period of time; GC pauses apply too. Inefficient JS can cause blocking for longer periods than the UI would normally block for. Putting canvas support in a worker can also block audio, as non-accelerated users are very common with most browsers today.

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Sun, 29 Jul 2012 17:08:56 GMT

I'd recommend the mozAudio style, with no callbacks, but rather a mozCurrentSampleOffset function to determine the audio clock position. This would let the API avoid waiting on callbacks for a refill. Blocking would only occur at the write function and the offset read-back function.

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Sun, 29 Jul 2012 17:10:32 GMT

The mozAudio style would be for a special JS node that only acts as a source rather than a processing node.

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Sun, 29 Jul 2012 17:14:12 GMT

If you implement a mozAudio-style API as a source-only node, then I'd be fine with the JS audio node being locked to a worker, as long as the mozAudio-style node is accessible from the main thread too.

olivierthereaux commented 10 years ago

Original comment by Robert O'Callahan (Mozilla) on W3C Bugzilla. Mon, 30 Jul 2012 00:49:41 GMT

(In reply to comment #59)

AFAICT, the noteOn & AudioParam interfaces were designed for making it possible to schedule sample accurate audio actions ahead of time. I think that it should be possible to use it for providing gap-less audio playback (typically using a few AudioBuffers in a multi-buffering manner and scheduling them with AudioBufferSourceNodes). The problem, as it seems, is that you need to accommodate for possible jittering and event drops, possibly by introducing a latency (e.g, would it work if you forced a latency of 0.5s?).

I think a solution that uses setInterval() to schedule frequent JS callbacks, checks AudioContext.currentTime to measure the progress of the audio clock, and uses AudioBufferSourceNodes to queue playback of generated audio buffers, should work as well as any other, providing the Web Audio implementation is adequate. Incremental GC might also be required.
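
A minimal sketch of the scheme roc outlines, using today's method names (start() rather than the noteOn() of the 2012 drafts); the block size, the 100 ms look-ahead, and generateInto() are placeholders.

var ctx = new AudioContext();
var BLOCK = 2048;                            // samples per queued buffer
var LOOKAHEAD = 0.1;                         // keep ~100 ms scheduled ahead of the audio clock
var writeTime = ctx.currentTime + LOOKAHEAD;

setInterval(function () {
  // Decide how much to queue from the audio clock (currentTime), not the system clock.
  while (writeTime - ctx.currentTime < LOOKAHEAD) {
    var buf = ctx.createBuffer(2, BLOCK, ctx.sampleRate);
    generateInto(buf);                       // app-specific: fill both channels with samples
    var src = ctx.createBufferSource();
    src.buffer = buf;
    src.connect(ctx.destination);
    src.start(writeTime);                    // sample-accurate start, back-to-back with the previous block
    writeTime += BLOCK / ctx.sampleRate;
  }
}, 4);

Seamlessness then rests entirely on the implementation honoring sample-accurate start times and on the 4 ms timer not being starved for longer than the look-ahead.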

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Mon, 30 Jul 2012 16:14:46 GMT

(In reply to comment #81)

(In reply to comment #59)

AFAICT, the noteOn & AudioParam interfaces were designed for making it possible to schedule sample accurate audio actions ahead of time. I think that it should be possible to use it for providing gap-less audio playback (typically using a few AudioBuffers in a multi-buffering manner and scheduling them with AudioBufferSourceNodes). The problem, as it seems, is that you need to accommodate for possible jittering and event drops, possibly by introducing a latency (e.g, would it work if you forced a latency of 0.5s?).

I think a solution that uses setInterval() to schedule frequent JS callbacks, checks AudioContext.currentTime to measure the progress of the audio clock, and uses AudioBufferSourceNodes to queue playback of generated audio buffers, should work as well as any other, providing the Web Audio implementation is adequate. Incremental GC might also be required.

I seriously don't like the idea of using setInterval callbacks to drive the refill procedure instead of an actual callback derived from the audio clock. I thought we were about reducing power consumption as well? Deriving callbacks from the audio clock means we don't waste checks with setInterval, since we'd need to run setInterval faster than the buffer scheduling rate, versus running at exactly the rate of an audio-clock-derived callback. I still don't like the idea of driving with a float-based timestamp; I get chills thinking about possible implementation errors and off-by-one errors.

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Mon, 30 Jul 2012 16:19:51 GMT

Can we add callbacks for when a buffer is done playing? I could drive off that then. Also auto-attaching (scheduled attaching) another buffer to the end of a previously playing buffer would avoid checking currentTime in the first place.

olivierthereaux commented 10 years ago

Original comment by Robert O'Callahan (Mozilla) on W3C Bugzilla. Mon, 30 Jul 2012 21:32:31 GMT

(In reply to comment #83)

Can we add callbacks for when a buffer is done playing?

No, because they will fire too late for you to queue the next buffer and get seamless playback.

(In reply to comment #82)

I seriously don't like the idea of using setInterval callbacks to drive the callback procedure instead of an actual callback derived from the audio clock.

You're going to have to schedule timeouts early anyway to maximise the chance that your callback runs in time to schedule the next buffer seamlessly, so timer accuracy doesn't matter; and using the audio clock instead of the system clock doesn't matter either, since there will be negligible drift over the intervals we're talking about.

olivierthereaux commented 10 years ago

Original comment by Chris Rogers on W3C Bugzilla. Mon, 30 Jul 2012 23:45:01 GMT

I think there's been a misunderstanding that somehow the JavaScript code rendering audio in a JavaScriptAudioNode callback will block the audio thread! This is not the case. An implementation should use buffering (producer/consumer model) where the JS thread produces and the audio thread consumes (with no blocking). This is how it's implemented in WebKit.

Additionally, the JS callbacks should all be clocked/scheduled from the audio system (in the implementation), and not rely on setTimeout() or require client polling/querying of a timestamp from javascript (which is a much less ideal approach).
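
A conceptual sketch of the non-blocking producer/consumer handoff Chris describes; this is plain JS for illustration only, not how the WebKit internals (which are native code) are actually written.

// Single-producer / single-consumer ring buffer: the JS callback writes, the
// audio thread reads, and the reader never blocks or takes a lock. On underrun
// it emits silence, which is the "glitch" referred to later in this thread.
function RingBuffer(capacity) {
  this.data = new Float32Array(capacity);
  this.readIndex = 0;    // advanced only by the consumer
  this.writeIndex = 0;   // advanced only by the producer
}

RingBuffer.prototype.write = function (samples) {    // producer side (JS thread)
  for (var i = 0; i < samples.length; i++) {
    this.data[this.writeIndex] = samples[i];
    this.writeIndex = (this.writeIndex + 1) % this.data.length;
  }
};

RingBuffer.prototype.read = function (out) {         // consumer side (audio thread)
  for (var i = 0; i < out.length; i++) {
    if (this.readIndex === this.writeIndex) {
      out[i] = 0;                                    // underrun: output silence, never wait
    } else {
      out[i] = this.data[this.readIndex];
      this.readIndex = (this.readIndex + 1) % this.data.length;
    }
  }
};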

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Tue, 31 Jul 2012 03:02:13 GMT

(In reply to comment #84)

(In reply to comment #83)

Can we add callbacks for when a buffer is done playing?

No, because they will fire too late for you to queue the next buffer and get seamless playback.

(In reply to comment #82)

I seriously don't like the idea of using setInterval callbacks to drive the callback procedure instead of an actual callback derived from the audio clock.

You're going to have to schedule timeouts early anyway to maximise the chance that your callback runs in time to schedule the next buffer seamlessly, so timer accuracy doesn't matter; and using the audio clock instead of the system clock doesn't matter either, since there will be negligible drift over the intervals we're talking about.

I would be listening for a finish event on buffers ahead of the last one scheduled; one wouldn't wait until the system is dry to refill. I'm talking about firing events for multiple queued buffers, so that if we run past a low point in the queue, we can prep and load new buffers. Using setInterval timers is not foolproof here, as some browsers that implement the Web Audio API spec might not allow high-resolution JS timers. Numerous browsers either clamp heavily or have too much timer jitter (Opera people, I'm looking at you with that timer jitter on OS X...).

crogers: I see what you're getting at. I'm just trying to make sure we all understand that audio input from the JS side can be blocked inside a worker or on the main thread either way. Slow JS is slow JS.

Also, how would sleep events interact with the currentTime counting? Would it be paused, or would it track the computer's time system? For the audio to be foolproof it requires pure time clocking from the audio card/chip (so sleep events would pause it); otherwise the audio buffering state can be thrown off by time changes or sleep events.

olivierthereaux commented 10 years ago

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Tue, 31 Jul 2012 05:52:58 GMT

(In reply to comment #85)

I think there's been a misunderstanding that somehow the JavaScript code rendering audio in a JavaScriptAudioNode callback will block the audio thread! This is not the case. An implementation should use buffering (producer/consumer model) where the JS thread produces and the audio thread consumes (with no blocking). This is how it's implemented in WebKit.

How does this work in a subgraph similar to this?:

+------------+      +---------------------+      +------------------+
| SourceNode |----->| JavaScriptAudioNode |----->| BiquadFilterNode |
+------------+      +---------------------+   +->|                  |
                                              |  +------------------+
+------------+      +---------------------+   |
| SourceNode |----->| AudioGainNode       |---+
+------------+      +---------------------+

(hope this ASCII art works)

I assume that without the input from the SourceNode, the JavaScriptAudioNode will not be able to produce anything (hence its callback will not be fired until enough data is available), and likewise the BiquadFilterNode can not produce any sound until data is available from both the JavaScriptAudioNode and the AudioGainNode.

In other words, if the JavaScriptAudioNode callback in the main thread is delayed by a setInterval event, for instance, I guess that at least the BiquadFilterNode (and all nodes following it?) will need to halt until the JS callback gets fired and finished so that it has produced the necessary data for the graph to continue?

I guess that the lower part of the graph (source + gain) could produce data ahead of time while the JS node is blocking, but I assume that it could be problematic in some cases (e.g. if there are loops or other intra-node dependencies, such as a panner node somewhere controlling the pitch of the source node), so perhaps it's simpler to just let it be blocked too?

olivierthereaux commented 10 years ago

Original comment by Chris Rogers on W3C Bugzilla. Tue, 31 Jul 2012 06:20:54 GMT

(In reply to comment #87)

(In reply to comment #85)

I think there's been a misunderstanding that somehow the JavaScript code rendering audio in a JavaScriptAudioNode callback will block the audio thread! This is not the case. An implementation should use buffering (producer/consumer model) where the JS thread produces and the audio thread consumes (with no blocking). This is how it's implemented in WebKit.

How does this work in a subgraph similar to this?:

+------------+      +---------------------+      +------------------+
| SourceNode |----->| JavaScriptAudioNode |----->| BiquadFilterNode |
+------------+      +---------------------+   +->|                  |
                                              |  +------------------+
+------------+      +---------------------+   |
| SourceNode |----->| AudioGainNode       |---+
+------------+      +---------------------+

(hope this ASCII art works)

I assume that without the input from the SourceNode, the JavaScriptAudioNode will not be able to produce anything (hence its callback will not be fired until enough data is available), and likewise the BiquadFilterNode can not produce any sound until data is available from both the JavaScriptAudioNode and the AudioGainNode.

In other words, if the JavaScriptAudioNode callback in the main thread is delayed by a setInterval event, for instance, i guess that at least the BiquadFilterNode (and all nodes following it?) will need to halt until the JS callback gets fired and finished so that it has produced the necessary data for the graph to continue?

No, this is not the case. We're talking about a real-time system with an audio thread having realtime priority with time-constraints. In real-time systems it's very bad to block in a realtime audio thread. In fact no blocking calls are allowed in our WebKit implementation, including the taking of any locks. This is how pro-audio systems work. In your scenario, if the main thread is delayed as you describe then there will simply be a glitch due to buffer underrun in the JavaScriptAudioNode, but the other graph processing nodes which are native will continue processing smoothly. Obviously the glitch from the JavaScriptAudioNode is bad, but we already know that this can be possible due to things such as setInterval(), GC, etc. In fact, it's one of the first things I described in some detail in my spec document over two years ago. Choosing larger buffer sizes for the JavaScriptAudioNode can help alleviate this problem.

olivierthereaux commented 10 years ago

Original comment by Jussi Kalliokoski on W3C Bugzilla. Tue, 31 Jul 2012 07:52:15 GMT

(In reply to comment #88)

(In reply to comment #87)

(In reply to comment #85)

I think there's been a misunderstanding that somehow the JavaScript code rendering audio in a JavaScriptAudioNode callback will block the audio thread! This is not the case. An implementation should use buffering (producer/consumer model) where the JS thread produces and the audio thread consumes (with no blocking). This is how it's implemented in WebKit.

How does this work in a subgraph similar to this?:

+------------+      +---------------------+      +------------------+
| SourceNode |----->| JavaScriptAudioNode |----->| BiquadFilterNode |
+------------+      +---------------------+   +->|                  |
                                              |  +------------------+
+------------+      +---------------------+   |
| SourceNode |----->| AudioGainNode       |---+
+------------+      +---------------------+

(hope this ASCII art works)

I assume that without the input from the SourceNode, the JavaScriptAudioNode will not be able to produce anything (hence its callback will not be fired until enough data is available), and likewise the BiquadFilterNode can not produce any sound until data is available from both the JavaScriptAudioNode and the AudioGainNode.

In other words, if the JavaScriptAudioNode callback in the main thread is delayed by a setInterval event, for instance, i guess that at least the BiquadFilterNode (and all nodes following it?) will need to halt until the JS callback gets fired and finished so that it has produced the necessary data for the graph to continue?

No, this is not the case. We're talking about a real-time system with an audio thread having realtime priority with time-constraints. In real-time systems it's very bad to block in a realtime audio thread. In fact no blocking calls are allowed in our WebKit implementation, including the taking of any locks. This is how pro-audio systems work. In your scenario, if the main thread is delayed as you describe then there will simply be a glitch due to buffer underrun in the JavaScriptAudioNode, but the other graph processing nodes which are native will continue processing smoothly. Obviously the glitch from the JavaScriptAudioNode is bad, but we already know that this can be possible due to things such as setInterval(), GC, etc. In fact, it's one of the first things I described in some detail in my spec document over two years ago. Choosing larger buffer sizes for the JavaScriptAudioNode can help alleviate this problem.

Hmm? Convolution with big kernels is just as, if not more, susceptible to glitches as a JS node, so do you mean that if any of the nodes fails to deliver, the others still keep going?

It seems to me that the current behavior in the WebKit implementation is that if the buffer fill stops happening in time, it will start looping the previous buffer, whereas when a convolution node fails to deliver, it just glitches and jumps all over the place. Is this correct?

Seems a bit weird to treat parts of the graph differently, but I think I might have misunderstood something.

olivierthereaux commented 10 years ago

Original comment by Robert O'Callahan (Mozilla) on W3C Bugzilla. Tue, 31 Jul 2012 09:05:12 GMT

Having the audio thread never make blocking calls is good, but it leaves open the spec/API question of how a node should behave when one of its inputs can't produce data in a timely manner. It is possible to have the node pause while the input catches up, and it's possible to fill the input with silence as necessary. I think there are valid use-cases for both, especially when you throw media element and MediaStreams into the mix.

Either way, it needs to be spelled out in the spec.

olivierthereaux commented 10 years ago

Original comment by Olivier Thereaux on W3C Bugzilla. Mon, 06 Aug 2012 14:10:31 GMT

I am still rather uncomfortable with this issue, not only because we have not yet reached consensus, but because I haven't really seen the right points made on either side of the debate.

On the "must happen in Workers" side, I mostly see (in e.g Marcus' Comment #3) the idea that we should make sure not to let developers do anything bad. I am not fully confortable with that - thought our job was to enable and empower, and if need be to document best practices. There were a few mentions that limiting custom processing to workers would solve other issues (see recent thread on the delay caused by the custom nodes) - which are more interesting IMHO and should probably be developed.

On the other side of the debate, Chris mentioned a few times that developers were not comfortable with using workers - something that should be taken seriously if we are to follow our pyramid of constituencies - developers over implementors over spec designers; but IIRC we haven't really validated with a significant sample of developers that it would indeed be an issue. I know the W3C doesn't really do user testing, but that may be a good way to prove (or disprove) Chris' point.

olivierthereaux commented 10 years ago

Original comment by Philip Jägenstedt on W3C Bugzilla. Mon, 06 Aug 2012 14:46:03 GMT

(In reply to comment #91)

On the "must happen in Workers" side, I mostly see (in e.g Marcus' Comment #3) the idea that we should make sure not to let developers do anything bad. I am not fully confortable with that - thought our job was to enable and empower, and if need be to document best practices.

I don't understand what those best practices would be, since Web developers aren't in a position to avoid the problems with a main thread callback API. They cannot control what goes on in other tabs, and as I mentioned in comment 9 and comment 19, those tabs can block the main thread for longer than even the maximum buffer size allowed by JavaScriptAudioNode.

That being said, perhaps we should begin to focus on what a WorkerAudioNode would look like, since that's what we want to actually implement.

olivierthereaux commented 10 years ago

Original comment by Srikumar Subramanian (Kumar) on W3C Bugzilla. Mon, 06 Aug 2012 15:39:43 GMT

Many points have been raised for and against running JS audio code in web workers. This is an important issue for developers (like me) and so I thought having a summary of the points raised thus far might be useful [with some extra notes in square brackets]. Hope this helps reach consensus.

Please add/correct/clarify as you see fit.

== Arguments for JS audio in workers ==

  1. Audio calculation will not be interrupted by the goings on on the main thread such as GC, graphics, layout reflow, etc.
  2. Possibility for tighter integration with the audio thread. Perhaps we will be able to remove the one buffer delay currently in place? [Seems unlikely based on what ChrisR says - "cannot have blocking calls in system audio callbacks". Also cannot risk JS code in high priority system callbacks.]
  3. Permits multiple audio contexts to operate independently of each other. If all JS audio is on the main thread, then the JS nodes of all audio contexts are multiplexed onto the same thread. [Thus an offline context that uses a JS node will interfere with a realtime context if all JS audio code runs on the main thread.]
  4. With a "main thread only" design for JS audio nodes, applications will not be able to control the effects on audio of the browser multiplexing multiple tabs into a single thread - i.e. audio generation/processing can be affected by what happens in other browser tabs, with no scope for developer control.

== Arguments against JS audio in workers ==

  1. API access to JS audio code will be severely limited. We may need to lobby for access to specific APIs (ex: WebCL).
  2. More complicated to program. DOM and JS state access will require additional communication with the main thread.
  3. Event communication latency between the main thread and workers is currently high (up to 0.5s). The communication overhead is also dependent on OS characteristics and consistent behaviour between OSes may be hard to implement. [Perhaps because workers are intended for "background" tasks and so minimizing the communication latency for events from the main thread to a worker isn't a priority for browser vendors?]
  4. JS audio in workers cannot be synchronously executed with the high priority audio thread for security reasons. So the same delay in the current webkit implementation of JS audio nodes will likely apply.
  5. Some applications such as device emulators become hard or impossible to program if all JS audio were to be forced to run in workers.
  6. [If JS audio code is to be exclusively run in workers, then that demands that browser implementations cannot include support for the web audio api unless they also have support for web workers. Mobile browsers may choose not to support web workers but will likely want to support web audio.]

== Options ==

  1. Permit JS audio nodes that run in the main thread and have a separate node type to run code in workers. A suggested con is that developers will tend to take the easier route which in this case (i.e. JS on main thread) is less reliable. OTOH, "give them some rope and some will hang themselves and others will build bridges" (Jussi).
  2. It may be possible and acceptable to use AudioBufferSourceNodes to implement alternative schemes for "pumping" audio to the output, albeit at higher latencies.
  3. [Live with JS on main thread for v1 of the spec and postpone worker nodes to later versions after prototyping. JS nodes on main thread are quite usable with long enough buffers.]

olivierthereaux commented 10 years ago

Original comment by Olivier Thereaux on W3C Bugzilla. Wed, 27 Mar 2013 23:02:02 GMT

Resolution at the 2013-03-27 f2f meeting:

  1. We will add a way to use ScriptProcessorNode with workers
  2. The creation method will be similar to what was proposed in MediaStreamProcessing

See: http://www.w3.org/TR/streamproc/#stream-mixing-and-processing and example 9

document.getElementById("out").src = new ProcessedMediaStream(new Worker("synthesizer.js"));

http://people.mozilla.org/~roc/stream-demos/worker-generation.html

  3. We will not specify a buffer size; the UA will choose appropriately.

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Fri, 29 Mar 2013 04:15:52 GMT

I remember writing code for my JS audio lib to support the MediaStream API and being required to output audio in a different thread: https://github.com/grantgalitz/XAudioJS/blob/master/XAudioServerMediaStreamWorker.js

It worked with an experimental Firefox build that was published. The problem is that there's up to a quarter second of latency, which kills it. The end user will notice a giant delay between an in-game event and the audio for it.

I have no problem with a spec that has off-thread audio, just don't cripple things by removing on-thread audio. It's been mentioned many times that there is work on canvas support in workers, so that it can be done alongside the audio, off the UI thread. The problem I have with that is it complicates keeping the audio libraries separate from the code that uses them. I want to support legacy audio APIs that use the main UI thread, and this will complicate that greatly.

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Fri, 29 Mar 2013 04:19:44 GMT

And when I say "worked with," I do mean it was able to play gapless audio being streamed in from the UI thread to the worker. The only issue was latency for passing the stream around.

olivierthereaux commented 10 years ago

Original comment by Olivier Thereaux on W3C Bugzilla. Fri, 26 Jul 2013 14:49:21 GMT

Changed title of the issue to reflect renaming of JavaScriptAudioNode to ScriptProcessorNode.

olivierthereaux commented 10 years ago

Related: the TAG review has the suggestion of an “AudioWorker”. See Issue #253.

cwilso commented 10 years ago

Aren't these issues duplicates, not just related?

olivierthereaux commented 10 years ago

Yeah, they're strictly speaking duplicates. Let me copy the suggested interface here, and close #253.


The following issue was raised by the W3C TAG as part of their review of the Web Audio API

ScriptProcessorNode is Unfit For Purpose

We harbor deep concerns about ScriptProcessorNode as currently defined. Notably:

This can be repaired. Here's a stab at it:

[Constructor(DOMString scriptURL, optional unsigned long bufferSize)]
interface AudioWorker : Worker {
};

interface AudioProcessingEvent : Event {
    readonly attribute double playbackTime;
    transferrable attribute AudioBuffer buffer;
};

interface AudioWorkerGlobalScope : DedicatedWorkerGlobalScope {
    attribute EventHandler onaudioprocess;
};

interface ScriptProcessorNode : AudioNode {
    attribute EventHandler onaudioprocess;
    readonly attribute long bufferSize;
};

partial interface AudioContext {
    ScriptProcessorNode createScriptProcessor(
        DOMString scriptURL,
        optional unsigned long bufferSize = 0,
        optional unsigned long numberOfInputChannels = 2,
        optional unsigned long numberOfOutputChannels = 2);
};

The idea here is that to ensure low-latency processing, no copying of resulting buffers is done (using the Worker's Transferrable mechanism).

Scripts are loaded from external URLs and can control their inbound/outbound buffering with a constructor argument.

Under this arrangement it's possible for the system to start to change the constraints that these scripts run under. GC can be turned off, runtime can be tightly controlled, and these scripts can even be run on the (higher priority) audio-processing thread.

All of this is necessary to ensure that scripts are not second-class citizens in the architecture; attractive nuisances which can't actually be used in the real world due to their predictable down-sides.
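
A hypothetical usage sketch of the proposed interface above; the script name, buffer size, and the exact worker-side event shape are assumptions drawn from the IDL as written, not from any shipped API.

// Main thread: create the node from a script URL, per the proposed createScriptProcessor().
var context = new AudioContext();
var node = context.createScriptProcessor("synth.js", 2048, 0, 2);
node.connect(context.destination);

// synth.js, running in the proposed AudioWorkerGlobalScope:
onaudioprocess = function (e) {
  var left = e.buffer.getChannelData(0);
  var right = e.buffer.getChannelData(1);
  for (var i = 0; i < left.length; i++) {
    left[i] = right[i] = Math.random() * 0.1 - 0.05;   // fill the block with quiet noise
  }
  // Per the proposal, e.buffer is handed back via the Transferable mechanism
  // rather than copied when the handler returns.
};

The transfer-based buffer hand-off is what enables the GC and scheduling guarantees described in the two paragraphs above.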