WebAudio / web-audio-api

The Web Audio API v1.0, developed by the W3C Audio WG
https://webaudio.github.io/web-audio-api/

Audio Workers #113

Closed olivierthereaux closed 7 years ago

olivierthereaux commented 10 years ago

Originally reported on W3C Bugzilla ISSUE-17415 Tue, 05 Jun 2012 12:43:20 GMT Reported by Michael[tm] Smith Assigned to

Audio-ISSUE-107 (JSWorkers): JavaScriptAudioNode processing in workers [Web Audio API]

http://www.w3.org/2011/audio/track/issues/107

Raised by: Marcus Geelnard On product: Web Audio API

https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#JavaScriptAudioNode

It has been discussed before (see [1] and [2], for instance), but I could not find an issue for it, so here goes:

The JavaScriptAudioNode should do its processing in a separate context (e.g. a worker) rather than in the main thread/context. It could potentially mean very low overhead for JavaScript-based audio processing, and seems to be a fundamental requirement for making the JavaScriptAudioNode really useful.

[1] http://lists.w3.org/Archives/Public/public-audio/2012JanMar/0225.html [2] http://lists.w3.org/Archives/Public/public-audio/2012JanMar/0245.html
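For context, here is a minimal sketch of the main-thread JavaScriptAudioNode being discussed, using the WebKit-era API of the time (the node and its factory were later renamed ScriptProcessorNode / createScriptProcessor); the sine tone is just a placeholder:

```js
// Main-thread JavaScriptAudioNode sketch (2012-era WebKit API).
var ctx = new webkitAudioContext();
var node = ctx.createJavaScriptNode(2048, 1, 1);   // bufferSize, inputs, outputs
var phase = 0;

node.onaudioprocess = function (event) {
  // This callback competes with layout, events and GC on the main thread,
  // which is exactly the problem raised in this issue.
  var out = event.outputBuffer.getChannelData(0);
  for (var i = 0; i < out.length; i++) {
    out[i] = 0.2 * Math.sin(phase);                // placeholder: 440 Hz sine
    phase += 2 * Math.PI * 440 / ctx.sampleRate;
  }
};
node.connect(ctx.destination);
```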

olivierthereaux commented 10 years ago

Original comment by Olivier Thereaux on W3C Bugzilla. Thu, 07 Jun 2012 15:40:47 GMT

Work in progress. See http://www.w3.org/2011/audio/track/actions/26

olivierthereaux commented 10 years ago

Original comment by Philip Jägenstedt on W3C Bugzilla. Fri, 08 Jun 2012 09:28:57 GMT

We haven't been clear enough on this. What we want is for JavaScript processing to happen only in workers. Doing anything on the same context as mouse and keyboard events are processed and where scripts can easily be blocked for 100s of milliseconds by layout reflows is simply a no-go.

olivierthereaux commented 10 years ago

Original comment by Chris Rogers on W3C Bugzilla. Fri, 08 Jun 2012 19:19:15 GMT

I completely agree that workers are a better approach for lower latency and smaller buffer sizes. But there is a cost to the developer in being required to use a web worker, because the JavaScript state is completely isolated from the main JS thread. It will thus require more complex code, and some applications might not even be practical.

Some developers have expressed concerns about JavaScriptAudioNode processing happening only in workers.

olivierthereaux commented 10 years ago

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Mon, 11 Jun 2012 15:00:14 GMT

While I agree that workers are slightly more cumbersome to work with than regular callbacks, I think that there are some risks with supporting both methods:

1) It's generally confusing to have two options for doing the same thing. It's quite likely that developers will pick the "wrong" solution just because they copied from an example or tutorial that used the alternative that wasn't optimal for the actual use case at hand.

2) I suspect that the callback-based approach will be more sensitive to browser/system variations (e.g. different painting/event/animation architectures), which means it's more likely that someone will design an app on his/her preferred system+browser combination and then have it suffer from latency problems on another system+browser combination. This is less likely to be the case if people are always forced to use workers, where the problem should be less pronounced.

3) Since the non-worker option is generally simpler to grasp, it's more likely to be used in more apps than it should.

olivierthereaux commented 10 years ago

Original comment by Jussi Kalliokoski on W3C Bugzilla. Mon, 11 Jun 2012 15:22:55 GMT

While I agree that it's not a good idea to do time-critical heavy lifting like audio on the main thread, sometimes there isn't much choice: emulators and other virtual machines, ports of existing code (more feasible, but potentially very difficult, as the original code may share a lot of state with the audio code), and such.

These kinds of programs are quite challenging to write as it is; I don't think a few bad eggs should make those developers' lives even harder by forcing them into expensive tricks like sending the audio data to the worker with postMessage and maintaining the callback system themselves.

I always find these discussions about preventing bad practices a bit frustrating: people will make bad choices no matter how well we design things. That doesn't mean we shouldn't try to design things so that developers aren't tempted to do bad things, but actively making real use cases harder just to keep some people from making stupid decisions is counter-productive, IMHO.

You give them the rope, some will hang themselves, some will build a bridge.

olivierthereaux commented 10 years ago

Original comment by Philip Jägenstedt on W3C Bugzilla. Tue, 12 Jun 2012 14:59:53 GMT

(In reply to comment #5)

expensive tricks like sending the audio data to the worker with postMessage and maintaining the callback system themselves

That's not how a worker-based AudioNode would work: it would be a callback in the worker that can read directly from the input and write directly to the output.

There are things on the main thread that are not interruptible (layout and event handlers being the most obvious) so it's only luck if one is able to run the callback often enough. I can't speak for any other implementors, but I'm fairly certain it would fail horribly in Opera, as other pages running in the same process can't be expected to write code to avoid long-running scripts or expensive re-layouts.
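To make that concrete, a purely hypothetical sketch of such a worker-side callback; no worker-based node existed in the spec at this point (the idea later evolved into AudioWorklet), so every name below is an assumption:

```js
// audio-worker.js: hypothetical worker-based processing node.
// The callback reads the input buffer and writes the output buffer directly,
// without ever touching the main thread.
onaudioprocess = function (event) {
  var input  = event.inputBuffer.getChannelData(0);
  var output = event.outputBuffer.getChannelData(0);
  for (var i = 0; i < output.length; i++) {
    output[i] = input[i] * 0.5;   // trivial gain, just to show the shape
  }
};
```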

olivierthereaux commented 10 years ago

Original comment by Jussi Kalliokoski on W3C Bugzilla. Tue, 12 Jun 2012 15:36:12 GMT

That's not how a worker-based AudioNode would work: it would be a callback in the worker that can read directly from the input and write directly to the output.

Exactly, and if the audio processing takes place in the main thread, you have no way of knowing when the callbacks in the worker occur. Hence you have to devise your own callback system to sync with the one going on in the worker and send data over to the worker using postMessage, which is a very inefficient solution for a case that's already very vulnerable. Not to mention that it's difficult to implement without ending up with weird edge-case race conditions.
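To make the cost concrete, a rough sketch of the shuttle being objected to here: the main thread renders audio and ships every block to a hypothetical worker-based sink via postMessage (the worker script name and its API are assumptions):

```js
// main.js: render on the main thread, then copy each block to the worker.
// Every block crosses the thread boundary, and the main thread has to guess
// when the worker will next need data.
var worker = new Worker('audio-sink-worker.js');   // hypothetical sink
var BLOCK = 2048;
var SAMPLE_RATE = 44100;

function renderBlock() {
  var samples = new Float32Array(BLOCK);
  // ... fill `samples` from the app's main-thread state ...
  worker.postMessage(samples);                     // structured clone copies the data
}

// Fragile: timers are neither sample-accurate nor guaranteed to fire on time.
setInterval(renderBlock, BLOCK / SAMPLE_RATE * 1000);
```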

I'm fairly certain it would fail horribly in Opera, as other pages running in the same process can't be expected to write code to avoid long-running scripts or expensive re-layouts.

Of course, it's impossible to predict what's going on in other pages, but that applies to drawing and other things as well; to achieve the best results, users have to close other tabs unless the browser has some multi-threading going on with different tabs.

But I beg to differ that Opera would fail horribly. In my sink.js [1] (a library to allow raw cross-browser access to audio), I have a fallback using the audio tag; you can take a look at an example of how it runs on Opera here [2] (a demo I made for last Christmas). The result is bearable, even though the wav conversion and data URI conversion suck the CPU dry. There are glitches every 0.5 s or so due to switching the audio tag, but that's only because the onended event triggering the next clip fires a significant time after the audio has actually finished playing.

[1] https://github.com/jussi-kalliokoski/sink.js [2] http://audiolibjs.org/examples/bells.html
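For reference, the kind of fallback sink.js relies on: pack generated samples into a mono 16-bit PCM WAV, wrap it in a data URI, and hand it to an audio element. A compressed sketch (sink.js itself is considerably more robust):

```js
// Audio-tag fallback sketch: Float32 samples -> mono 16-bit WAV -> data URI.
function playViaAudioTag(samples, sampleRate) {
  var buf = new ArrayBuffer(44 + samples.length * 2);
  var v = new DataView(buf);
  var writeStr = function (off, s) {
    for (var i = 0; i < s.length; i++) v.setUint8(off + i, s.charCodeAt(i));
  };
  writeStr(0, 'RIFF');  v.setUint32(4, 36 + samples.length * 2, true);
  writeStr(8, 'WAVE');  writeStr(12, 'fmt ');
  v.setUint32(16, 16, true);  v.setUint16(20, 1, true);        // PCM
  v.setUint16(22, 1, true);   v.setUint32(24, sampleRate, true);
  v.setUint32(28, sampleRate * 2, true);  v.setUint16(32, 2, true);
  v.setUint16(34, 16, true);
  writeStr(36, 'data'); v.setUint32(40, samples.length * 2, true);
  for (var i = 0; i < samples.length; i++) {
    var s = Math.max(-1, Math.min(1, samples[i]));
    v.setInt16(44 + i * 2, s * 0x7FFF, true);
  }
  var bytes = new Uint8Array(buf), bin = '';
  for (var j = 0; j < bytes.length; j++) bin += String.fromCharCode(bytes[j]);
  var tag = new Audio('data:audio/wav;base64,' + btoa(bin));
  tag.play();   // chaining clips back-to-back is where the ~0.5 s glitches come from
  return tag;
}
```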

olivierthereaux commented 10 years ago

Original comment by Philip Jägenstedt on W3C Bugzilla. Wed, 13 Jun 2012 09:59:16 GMT

(In reply to comment #7)

That's not how a worker-based AudioNode would work: it would be a callback in the worker that can read directly from the input and write directly to the output.

Exactly, and if the audio processing takes place in the main thread, you have no way of knowing when the callbacks in the worker occur. Hence you have to devise your own callback system to sync with the one going on in the worker and send data over to the worker using postMessage, which is a very inefficient solution for a case that's already very vulnerable. Not to mention that it's difficult to implement without ending up with weird edge-case race conditions.

The solution is to not do audio processing in the main thread and to post the state needed to do it in the worker instead. This seems trivial to me; do you have a real-world example where it is not?
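A small sketch of that pattern: the main thread posts only compact state (here, hypothetical note-on events), and all rendering stays in the worker:

```js
// main.js: UI events post compact state; no sample data crosses the boundary.
var synthWorker = new Worker('synth-worker.js');   // hypothetical worker
document.addEventListener('keydown', function (e) {
  synthWorker.postMessage({ type: 'noteOn', note: e.keyCode });
});

// synth-worker.js: keep the state locally and render from it inside the
// (assumed) worker-side audio callback, with no per-block postMessage traffic.
var activeNotes = {};
onmessage = function (e) {
  if (e.data.type === 'noteOn') activeNotes[e.data.note] = true;
};
```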

I'm fairly certain it would fail horribly in Opera, as other pages running in the same process can't be expected to write code to avoid long-running scripts or expensive re-layouts.

Of course, it's impossible to predict what's going on in other pages, but that applies to drawing and other things as well; to achieve the best results, users have to close other tabs unless the browser has some multi-threading going on with different tabs.

But I beg to differ that Opera would fail horribly. In my sink.js [1] (a library to allow raw cross-browser access to audio), I have a fallback using the audio tag; you can take a look at an example of how it runs on Opera here [2] (a demo I made for last Christmas). The result is bearable, even though the wav conversion and data URI conversion suck the CPU dry. There are glitches every 0.5 s or so due to switching the audio tag, but that's only because the onended event triggering the next clip fires a significant time after the audio has actually finished playing.

[1] https://github.com/jussi-kalliokoski/sink.js [2] http://audiolibjs.org/examples/bells.html

That's really cool, but not at all the same. If you generate 500 ms chunks of audio, blocking the main thread with layout for 100 ms is not a problem. With the current JavaScriptAudioNode the block size can go as small as 256 frames, which is only about 5 ms at 48 kHz. To never block the main thread for more than 5 ms is not a guarantee we can make.

olivierthereaux commented 10 years ago

Original comment by Philip Jägenstedt on W3C Bugzilla. Wed, 13 Jun 2012 10:16:37 GMT

To get an idea of how long a layout reflow can take, visit http://www.whatwg.org/specs/web-apps/current-work/, wait for it to load, and then run javascript:alert(opera.reflowCount + ' reflows in ' + opera.reflowTime + ' ms')

On my very fast developer machine, the results are:

22 reflows in 19165.749624967575 ms

That means that in the best case (all reflows took the same amount of time) the longest reflow was 871 ms.

With https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html I got these results:

7 reflows in 79.2522840499878 ms

Unfortunately the longest reflow isn't exposed, but even if it's "only" 12 ms that means that it's not hard to go way beyond that on a slower machine, like a smartphone.

olivierthereaux commented 10 years ago

Original comment by Jussi Kalliokoski on W3C Bugzilla. Wed, 13 Jun 2012 15:42:55 GMT

(In reply to comment #8)

(In reply to comment #7)

That's not how a worker-based AudioNode would work: it would be a callback in the worker that can read directly from the input and write directly to the output.

Exactly, and if the audio processing takes place in the main thread, you have no way of knowing when the callbacks in the worker occur. Hence you have to devise your own callback system to sync with the one going on in the worker and send data over to the worker using postMessage, which is a very inefficient solution for a case that's already very vulnerable. Not to mention that it's difficult to implement without ending up with weird edge-case race conditions.

The solution is to not do audio processing in the main thread and to post the state needed to do it in the worker instead. This seems trivial to me; do you have a real-world example where it is not?

No, I'm afraid I don't, and there probably aren't too many around (anymore; developers know better these days). But the emulator case still stands: there are already emulator environments written in JS. [1] [2] [3] [4]

That's really cool, but not at all the same. If you generate 500 ms chunks of audio, blocking the main thread with layout for 100 ms is not a problem. With the current JavaScriptAudioNode the block size can go as small as 256 frames, which is only about 5 ms at 48 kHz. To never block the main thread for more than 5 ms is not a guarantee we can make.

Obviously, developers need to adjust their buffer sizes to work in the main thread. Buffer sizes of 256 samples in the main thread with JS are (for now) a bit unrealistic, given the complexity and threading that comes with this API, and they currently fail in Chrome as well (although I'm not sure why; the CPU usage is only 2% or so, but I suppose this is a thread communication latency issue). In fact, any buffer size under 2048 makes the JSNode glitch horribly. If the developer expects her application to work on a mobile phone as well, she'll have to adjust that buffer size further. Indeed, I once proposed that the buffer size argument of the JSNode be made optional, so that the browser could make a best approximation of what kind of buffer size can be handled on a given setup. [5]

It's not like we can prevent people from doing their audio processing in the main thread. What we can do, however, is give them proper tools to do that in a minimally disruptive way for the user experience.

[1] http://fir.sh/projects/jsnes/ [2] http://gamecenter.grantgalitz.org/ [3] http://www.kingsquare.nl/jsc64 [4] http://bellard.org/jslinux/ (has no use case for audio - yet) [5] http://lists.w3.org/Archives/Public/public-audio/2012AprJun/0106.html

olivierthereaux commented 10 years ago

Original comment by Olli Pettay on W3C Bugzilla. Wed, 13 Jun 2012 16:06:02 GMT

Not quite the same thing as audio processing, but we're trying to limit all the new XHR features in main thread to async only. Sync is occasionally easier to use, but it will just fail (cause bad user experience) in the main thread. (And yes, the change to limit certain sync behavior to workers only broke various libraries.)

Similarly, JS audio processing is guaranteed to fail on the main thread in reasonably common cases. So, IMHO, all the JS audio processing should happen in background threads.

olivierthereaux commented 10 years ago

Original comment by Jussi Kalliokoski on W3C Bugzilla. Wed, 13 Jun 2012 18:07:28 GMT

(In reply to comment #11)

Not quite the same thing as audio processing, but we're trying to limit all the new XHR features in main thread to async only. Sync is occasionally easier to use, but it will just fail (cause bad user experience) in the main thread. (And yes, the change to limit certain sync behavior to workers only broke various libraries.)

Similarly, JS audio processing is guaranteed to fail on the main thread in reasonably common cases. So, IMHO, all the JS audio processing should happen in background threads.

I agree it should, but I don't think it will. What should an emulator/VM developer do? Render off the main thread as well? The MSP API would have been a perfect fit for that use case, given its ability to process video as well... Analyzing the byte code of those existing games and other programs and isolating the audio code to another thread doesn't sound very feasible.

olivierthereaux commented 10 years ago

Original comment by Olli Pettay on W3C Bugzilla. Wed, 13 Jun 2012 18:14:29 GMT

(In reply to comment #12)

I agree it should, but I don't think it will. What should an emulator/VM developer do? Render off the main thread as well?

Probably.

The MSP API would have been a perfect fit for that use case, given its ability to process video as well... Analyzing the byte code of those existing games and other programs and isolating the audio code to another thread doesn't sound very feasible.

That is a limitation in the Web Audio API which should be fixed.

olivierthereaux commented 10 years ago

Original comment by Robert O'Callahan (Mozilla) on W3C Bugzilla. Thu, 14 Jun 2012 00:49:52 GMT

There has been a lot of discussion about a Worker-accessible canvas API and I expect one to be created fairly soon. Then we'll have a solution for VMs and emulators that really works. I agree that providing main-thread JS audio processing is just setting authors up to fail.

olivierthereaux commented 10 years ago

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Fri, 15 Jun 2012 12:58:46 GMT

(In reply to comment #12)

I agree it should, but I don't think it will. What should an emulator/VM developer do? Render off main-thread as well?

I guess it depends a lot on what kind of system you want to emulate, and to what extent you need CPU cycle exact state coherence (e.g. do you want to emulate a few popular games, or do you want to implement a fully functional virtualization of a machine?).

For instance for implementing a SID chip for a C=64 emulator, I'd imagine that you can simply post time-stamped messages from the main thread to an audio worker (possibly batched as a time-stamped command buffer per frame or whatever), where the worker implements all SID logic. A similar solution should work for the NES sound HW too, I guess.

For emulating the Paula chip on the Amiga, you'd have more problems since it uses DMA for accessing CPU-shared memory. On the other hand, I think you should be able to come quite far by setting up a node graph that effectively emulates the Paula chip using automation for timing etc, eliminating a lot of the problems that you would otherwise have with a 100% JS-based mixer.
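As a rough illustration of that node-graph idea, using the era's names (createGainNode and noteOn were later renamed createGain and start); paulaPeriodToHz is a made-up helper and the register mapping is only approximate:

```js
// One Paula-style channel as a node graph: a looping sample buffer whose
// playback rate and volume are driven by automation instead of a JS mixing loop.
var ctx = new webkitAudioContext();

function startPaulaChannel(sampleData, period, volume, when) {
  var buf = ctx.createBuffer(1, sampleData.length, ctx.sampleRate);
  buf.getChannelData(0).set(sampleData);

  var src = ctx.createBufferSource();
  src.buffer = buf;
  src.loop = true;
  src.playbackRate.value = paulaPeriodToHz(period) / ctx.sampleRate; // made-up helper

  var gain = ctx.createGainNode();              // later renamed createGain()
  gain.gain.setValueAtTime(volume / 64, when);  // Paula volume register is 0..64

  src.connect(gain);
  gain.connect(ctx.destination);
  src.noteOn(when);                             // later renamed start()
  return { source: src, gain: gain };
}
```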

In any event, this does strike me as a show-stopper for having audio processing in workers. Especially given that machine emulators are usually quite odd, both in terms of architecture and actual use cases.

olivierthereaux commented 10 years ago

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Fri, 15 Jun 2012 13:06:12 GMT

(In reply to comment #15)

In any event, this does strike me as a show-stopper for having audio processing in workers.

As Philip pointed out: should read "this does NOT strike me as a show-stopper" ;)

olivierthereaux commented 10 years ago

Original comment by Jussi Kalliokoski on W3C Bugzilla. Fri, 15 Jun 2012 13:23:44 GMT

(In reply to comment #16)

(In reply to comment #15)

In any event, this does strike me as a show-stopper for having audio processing in workers.

As Philip pointed out: should read "this does NOT strike me as a show-stopper" ;)

All right, I'm running out of points to defend my case, especially since I don't have a horse of my own in the race. :) And if it's one or the other, I prefer the audio processing is done only in workers instead of the main thread (obviously), but I still think it'd be wise to have both.

olivierthereaux commented 10 years ago

Original comment by Chris Rogers on W3C Bugzilla. Fri, 15 Jun 2012 21:49:06 GMT

(In reply to comment #17)

(In reply to comment #16)

(In reply to comment #15)

In any event, this does strike me as a show-stopper for having audio processing in workers.

As Philip pointed out: should read "this does NOT strike me as a show-stopper" ;)

All right, I'm running out of points to defend my case, especially since I don't have a horse of my own in the race. :) And if it's one or the other, I prefer the audio processing is done only in workers instead of the main thread (obviously), but I still think it'd be wise to have both.

I agree with Jussi. Quite honestly it's a lot simpler for developers to have access to the complete JS state while doing the processing. If people are willing to work with larger buffer sizes, then quite reasonable things can be done in the main thread.

olivierthereaux commented 10 years ago

Original comment by Philip Jägenstedt on W3C Bugzilla. Mon, 18 Jun 2012 11:23:13 GMT

If so, then "larger buffer sizes" should be a hard requirement. On my fairly powerful desktop computer a layout could block for at least 871 ms; at 48 kHz that is roughly 42,000 samples, so the closest power-of-two buffer size above it is 65536, i.e. over a second. With that amount of latency it doesn't seem very useful.

olivierthereaux commented 10 years ago

Original comment by Jussi Kalliokoski on W3C Bugzilla. Mon, 18 Jun 2012 18:56:02 GMT

(In reply to comment #19)

If so, then "larger buffer sizes" should be a hard requirement. On my fairly powerful desktop computer a layout could block for at least 871 ms; at 48 kHz that is roughly 42,000 samples, so the closest power-of-two buffer size above it is 65536, i.e. over a second. With that amount of latency it doesn't seem very useful.

What? Why would it be a hard limit? Hard limits aren't very future-friendly. Should setTimeout have a minimum timeout limit of 871ms as well? Or requestAnimationFrame?

Developers have to be conscious about performance and avoiding layout reflows anyway, why should this API be any different?

olivierthereaux commented 10 years ago

Original comment by Robert O'Callahan (Mozilla) on W3C Bugzilla. Mon, 18 Jun 2012 23:39:06 GMT

(In reply to comment #20)

Developers have to be conscious about performance and avoiding layout reflows anyway, why should this API be any different?

One problem is that developers don't control all the pages that might possibly be sharing the same thread. No browser puts every page on its own thread. So even if you write your page perfectly, you're still vulnerable to latency caused by poorly-written pages sharing your main thread.

olivierthereaux commented 10 years ago

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Tue, 19 Jun 2012 07:11:37 GMT

(In reply to comment #20)

Developers have to be conscious about performance and avoiding layout reflows anyway, why should this API be any different?

I'd also like to add to this discussion that you can't really compare glitches in graphics/animation to glitches in audio. In general we (humans) are much more sensitive to glitches in audio than to frame drops in animation. You can usually get away with a 100 ms loss in an animation every now and then, but you can't as easily get away with a 1 ms glitch in your audio.

Most systems (DVD, DVB etc) prioritize audio over video. This can be seen when switching channels on some TV boxes for instance, where video stutters into sync with the continuous audio - it's hardly noticeable, but it would be horrible if it was the other way around (stuttering audio).

In other words, an audio API should provide continuous operation even under conditions where a graphics API fails to do so.

olivierthereaux commented 10 years ago

Original comment by Jussi Kalliokoski on W3C Bugzilla. Tue, 19 Jun 2012 14:41:03 GMT

(In reply to comment #21)

(In reply to comment #20)

Developers have to be conscious about performance and avoiding layout reflows anyway, why should this API be any different?

One problem is that developers don't control all the pages that might possibly be sharing the same thread. No browser puts every page on its own thread. So even if you write your page perfectly, you're still vulnerable to latency caused by poorly-written pages sharing your main thread.

Of course, but this argument is just as valid against having an audio API at all; after all, the developer can't anticipate what else is running on the user's computer aside from the browser. For all the developer knows, the API might be running in a mobile browser with all cores (or maybe just one) busy. Throwing more threads at it doesn't necessarily solve the problem of not being able to anticipate all situations.

olivierthereaux commented 10 years ago

Original comment by Jussi Kalliokoski on W3C Bugzilla. Tue, 19 Jun 2012 14:43:58 GMT

(In reply to comment #22)

(In reply to comment #20)

Developers have to be conscious about performance and avoiding layout reflows anyway, why should this API be any different?

I'd also like to add to this discussion that you can't really compare glitches in graphics/animation to glitches in audio. In general we (humans) are much more sensitive to glitches in audio than to frame drops in animation. You can usually get away with a 100 ms loss in an animation every now and then, but you can't as easily get away with a 1 ms glitch in your audio.

Most systems (DVD, DVB etc) prioritize audio over video. This can be seen when switching channels on some TV boxes for instance, where video stutters into sync with the continuous audio - it's hardly noticeable, but it would be horrible if it was the other way around (stuttering audio).

In other words, an audio API should provide continuous operation even under conditions where a graphics API fails to do so.

Yes, this is why it is preferable to run audio in a real time / priority thread where possible, but it's not always possible, maybe due to the system or the nature of the application.

olivierthereaux commented 10 years ago

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Tue, 19 Jun 2012 15:00:28 GMT

(In reply to comment #23)

(In reply to comment #21)

Of course, but this argument is just as valid against having an audio API at all; after all, the developer can't anticipate what else is running on the user's computer aside from the browser. For all the developer knows, the API might be running in a mobile browser with all cores (or maybe just one) busy. Throwing more threads at it doesn't necessarily solve the problem of not being able to anticipate all situations.

True, but there is a significant difference between running several threads on a single core (preemptive scheduling should give any thread CPU quite often), and running several pages in a single thread (a callback may have to wait for seconds).

olivierthereaux commented 10 years ago

Original comment by Jussi Kalliokoski on W3C Bugzilla. Tue, 19 Jun 2012 15:08:43 GMT

(In reply to comment #25)

(In reply to comment #23)

(In reply to comment #21)

Of course, but this argument is just as valid against having an audio API at all; after all, the developer can't anticipate what else is running on the user's computer aside from the browser. For all the developer knows, the API might be running in a mobile browser with all cores (or maybe just one) busy. Throwing more threads at it doesn't necessarily solve the problem of not being able to anticipate all situations.

True, but there is a significant difference between running several threads on a single core (preemptive scheduling should give any thread CPU quite often), and running several pages in a single thread (a callback may have to wait for seconds).

Still, no matter what we do, in some cases audio will not work as expected, will miss refills, and there's nothing we can do about it. What we can do, however, is give the proper tools to handle these situations, including main thread audio processing that doesn't have to resort to manually transferring audio to a worker (or graphics to the main thread either), because that's even more expensive and likely to fail.

olivierthereaux commented 10 years ago

Original comment by Olli Pettay on W3C Bugzilla. Tue, 19 Jun 2012 15:10:21 GMT

(In reply to comment #26)

Still, no matter what we do, in some cases audio will not work as expected, will miss refills, and there's nothing we can do about it. What we can do, however, is give the proper tools to handle these situations,

Yes.

including main thread audio processing

Why? This is what we should probably explicitly prevent, to increase the likelihood of well-designed apps.

olivierthereaux commented 10 years ago

Original comment by Tony Ross [MSFT] on W3C Bugzilla. Tue, 19 Jun 2012 22:51:35 GMT

I definitely favor helping developers "do the right thing", so I also prefer focusing exclusively on JavaScriptAudioNode processing in workers (instead of the main thread). Regardless, we all seem to agree that support for doing the processing in workers needs to be added.

If we do decide to support both, I suggest requiring some sort of explicit opt-in for running on the main thread. This way developers would at least be less likely to gravitate to it by default.

olivierthereaux commented 10 years ago

Original comment by Robert O'Callahan (Mozilla) on W3C Bugzilla. Tue, 19 Jun 2012 23:15:33 GMT

(In reply to comment #26)

Still, no matter what we do, in some cases audio will not work as expected, will miss refills, and there's nothing we can do about it.

With JS audio produced in Workers, the browser should be able to make audio work reliably in any situation short of complete overload of the device.

With JS audio on the main thread, audio will start failing as soon as a badly-behaving page happens to share the same main thread as the audio page.

That is a huge difference.

What we can do, however, is give the proper tools to handle these situations, including main thread audio processing that doesn't have to resort to manually transferring audio to a worker (or graphics to the main thread either), because that's even more expensive and likely to fail.

For use-cases such as "run an emulator producing sound and graphics", the best solution is to provide Worker access to canvas as well as audio. Then you can have reliable audio and a reliable frame rate as well.

Are there any other use-cases that are problematic for audio production in Workers?

olivierthereaux commented 10 years ago

Original comment by Jussi Kalliokoski on W3C Bugzilla. Wed, 20 Jun 2012 01:14:21 GMT

(In reply to comment #29)

(In reply to comment #26)

Still, no matter what we do, in some cases audio will not work as expected, will miss refills, and there's nothing we can do about it.

With JS audio produced in Workers, the browser should be able to make audio work reliably in any situation short of complete overload of the device.

With JS audio on the main thread, audio will start failing as soon as a badly-behaving page happens to share the same main thread as the audio page.

That is a huge difference.

What we can do, however, is give the proper tools to handle these situations, including main thread audio processing that doesn't have to resort to manually transferring audio to a worker (or graphics to the main thread either), because that's even more expensive and likely to fail.

For use-cases such as "run an emulator producing sound and graphics", the best solution is to provide Worker access to canvas as well as audio. Then you can have reliable audio and a reliable frame rate as well.

Are there any other use-cases that are problematic for audio production in Workers?

If Workers get access to canvas, at least I can't think of a (valid) reason why anyone would want to process audio in the main thread. :)

olivierthereaux commented 10 years ago

Original comment by Wei James on W3C Bugzilla. Wed, 20 Jun 2012 01:53:58 GMT

(In reply to comment #28)

I definitely favor helping developers "do the right thing", so I also prefer focusing exclusively on JavaScriptAudioNode processing in workers (instead of the main thread). Regardless, we all seem to agree that support for doing the processing in workers needs to be added. If we do decide to support both, I suggest requiring some sort of explicit opt-in for running on the main thread. This way developers would at least be less likely to gravitate to it by default.

+1

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Sun, 08 Jul 2012 06:24:36 GMT

Yeah, I can tell this talk about an emulator producing audio on the main thread and sending off the audio data to a worker is related to JS GBC using XAudioJS with the MediaStream Processing API in use. :P

I personally have to use the main thread for compatibility reasons with legacy browsers lacking web worker support. Essentially creating a second version of the emulator for web-worker-capable browsers seems like a big hack. Sending audio off from the main thread to the worker is very easy to do, but the i/o lag is off the charts (almost half a second in some cases), as seen with experimentation with the MediaStream API.

To see what it would look like to sync audio from the main thread to the worker: main thread: https://github.com/grantgalitz/XAudioJS/blob/master/XAudioServer.js#L142 and https://github.com/grantgalitz/XAudioJS/blob/master/XAudioServer.js#L406

worker: https://github.com/grantgalitz/XAudioJS/blob/master/XAudioServerMediaStreamWorker.js

The question to be asked: Why not allow the js developer to select either a web worker or main thread usage for outputting audio? Locking audio to one or the other seems to only limit options here.

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Sun, 08 Jul 2012 06:30:44 GMT

The whole discussion about the main thread being very poor for audio streaming seems to be somewhat overplayed. Proper web applications that use the main thread should ration their event queue properly. We already have better audio stream continuity than some native apps: http://www.youtube.com/watch?v=H7vt5svSJiE

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Sun, 08 Jul 2012 06:46:59 GMT

For use-cases such as "run an emulator producing sound and graphics", the best solution is to provide Worker access to canvas as well as audio. Then you can have reliable audio and a reliable frame rate as well.

What about lack of hardware acceleration? That'll block audio as long as gfx is on the same thread as audio, which in my case is a must for low lag sync.

olivierthereaux commented 10 years ago

Original comment by Robert O'Callahan (Mozilla) on W3C Bugzilla. Sun, 08 Jul 2012 22:14:53 GMT

(In reply to comment #33)

The whole discussion about the main thread being very poor for audio streaming seems to be somewhat overplayed. Proper web applications that use the main thread should ration their event queue properly.

See comment #21.

(In reply to comment #34)

For use-cases such as "run an emulator producing sound and graphics", the best solution is to provide Worker access to canvas as well as audio. Then you can have reliable audio and a reliable frame rate as well.

What about lack of hardware acceleration? That'll block audio as long as gfx is on the same thread as audio, which in my case is a must for low lag sync.

If you must have audio on the same thread as graphics, and that causes glitches because the device is simply overloaded, then so be it; that is unavoidable. See comment #29.

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Mon, 09 Jul 2012 02:36:50 GMT

An even bigger blocker for me actually is the keyboard events. I cooked up a web-worker-based version of JS GBC long ago and noticed at the time that I had to pass keyboard event notifications to the worker, which made input laggy as well. It's being forced to pipe so much from the UI to the worker that makes web worker usage entirely pointless for my case at the moment. A web worker is going to need access to almost as many APIs as the main thread if we're going to perform real-time processing with low latency that involves various forms of dynamic input. Adding thousands of lines of code to make a single-threaded application perform worse in a web worker versus the UI thread seems silly to me.

Also, the UI thread being blocked by other tabs is experienced on a per-browser basis for me at least, with Chrome in the clear actually. I do experience things like JS timers dropping from 4 ms to 500 ms intervals when I do things like hover over the Mac OS X dock or change the volume, so forcing multi-tasking within the JS environment via workers seems like the "fix" for Firefox at least.

olivierthereaux commented 10 years ago

Original comment by Robert O'Callahan (Mozilla) on W3C Bugzilla. Mon, 09 Jul 2012 03:14:26 GMT

(In reply to comment #36)

Also the UI thread being blocked from other tabs is experienced on a per-browser basis for me at least, with Chrome in the clear actually.

It may depend on how many tabs you have open and what they contain, but sooner or later Chrome will put multiple tabs onto the same main thread.

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Mon, 09 Jul 2012 05:16:37 GMT

(In reply to comment #37)

(In reply to comment #36)

Also the UI thread being blocked from other tabs is experienced on a per-browser basis for me at least, with Chrome in the clear actually.

It may depend on how many tabs you have open and what they contain, but sooner or later Chrome will put multiple tabs onto the same main thread.

True

To sum up my argument: We are currently not provided all the resources inside a webworker to attain full independence from the main thread yet, and dependence on the main thread kills us with i/o lag.

olivierthereaux commented 10 years ago

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Mon, 09 Jul 2012 09:25:04 GMT

(In reply to comment #38)

To sum up my argument: We are currently not provided all the resources inside a webworker to attain full independence from the main thread yet, and dependence on the main thread kills us with i/o lag.

A possible solution for emulators like the JS GBC emulator (which I really like!): move the audio emulation part to a Web worker (making it independent from the main thread), and post time-stamped audio HW register writes from the main thread to the audio Web worker (should be quite compact data?). That way you would be glitch free even in cases of i/o lag. This assumes that you can do the audio HW emulation independently from the rest of the machine, but for simple "8-bit" sound HW, I think it can easily be done.

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Mon, 09 Jul 2012 17:27:20 GMT

(In reply to comment #39)

(In reply to comment #38)

To sum up my argument: We are currently not provided all the resources inside a webworker to attain full independence from the main thread yet, and dependence on the main thread kills us with i/o lag.

A possible solution for emulators like the JS GBC emulator (which I really like!): move the audio emulation part to a Web worker (making it independent from the main thread), and post time-stamped audio HW register writes from the main thread to the audio Web worker (should be quite compact data?). That way you would be glitch free even in cases of i/o lag. This assumes that you can do the audio HW emulation independently from the rest of the machine, but for simple "8-bit" sound HW, I think it can easily be done.

One possible problem with that is we would need to synchronize up to once every two clocks at a sample rate of 4194304 hertz (One sample per clock cycle for LLE emulation). We also would need to synchronize back part of the audio state, as we expose some of the state back in the emulated hardware registers.

olivierthereaux commented 10 years ago

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Tue, 10 Jul 2012 07:12:12 GMT

(In reply to comment #40)

One possible problem with that is we would need to synchronize up to once every two clocks at a sample rate of 4194304 hertz (One sample per clock cycle for LLE emulation).

My idea here would be to use time-stamped commands (such as CYCLE:REGISTER=VALUE), and batch up the commands in a typed array buffer that is flushed (sent to the worker) once per frame for example. The worker could run at its own clock with a small latency w.r.t the main thread (I doubt that it would be noticeable).
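A rough sketch of that batching scheme: CYCLE:REGISTER=VALUE triples packed into a typed array and flushed to the worker once per frame (the worker script and its replay function are assumptions; the transfer list avoids copying the buffer):

```js
// main.js: record time-stamped register writes, flush them once per frame.
var audioWorker = new Worker('apu-worker.js');   // hypothetical audio-emulation worker
var commands = [];                               // flat list: cycle, register, value, ...

function writeAudioRegister(cycle, register, value) {
  commands.push(cycle, register, value);         // recorded here, executed in the worker
}

function flushFrame() {
  var batch = new Uint32Array(commands);
  audioWorker.postMessage(batch.buffer, [batch.buffer]);  // transfer, don't copy
  commands.length = 0;
}

// apu-worker.js (assumed): replay the writes against the emulated sound
// hardware, running on its own clock slightly behind the main thread.
// onmessage = function (e) { replay(new Uint32Array(e.data)); };
```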

We also would need to synchronize back part of the audio state, as we expose some of the state back in the emulated hardware registers.

Read-back is trickier of course. I know very little about the GB/GBC hardware. My experience is mostly from the SID chip (from the C=64), which allows you to read back values from one of the oscillators. Here, I think I would just emulate a very small sub-set of the SID chip in the main thread (those readable registers were almost never used, and if they were used, it would typically be for random number generation).

Anyway, these were just some ideas. Perhaps worth trying, perhaps not...

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Tue, 10 Jul 2012 09:32:21 GMT

(In reply to comment #41)

(In reply to comment #40)

One possible problem with that is we would need to synchronize up to once every two clocks at a sample rate of 4194304 hertz (One sample per clock cycle for LLE emulation).

My idea here would be to use time-stamped commands (such as CYCLE:REGISTER=VALUE), and batch up the commands in a typed array buffer that is flushed (sent to the worker) once per frame for example. The worker could run at its own clock with a small latency w.r.t the main thread (I doubt that it would be noticeable).

We also would need to synchronize back part of the audio state, as we expose some of the state back in the emulated hardware registers.

Read-back is trickier of course. I know very little about the GB/GBC hardware. My experience is mostly from the SID chip (from the C=64), which allows you to read back values from one of the oscillators. Here, I think I would just emulate a very small sub-set of the SID chip in the main thread (those readable registers were almost never used, and if they were used, it would typically be for random number generation).

Anyway, these were just some ideas. Perhaps worth trying, perhaps not...

Except latency for reading back would quite literally kill the emulator. Also, the emulator checks in real time how many samples it's underrunning by, so it can run more clocks over the base clocking amount to make sure we actually run at 100% speed; without this speed adjustment via listening to the sample count, browsers would be underrunning the audio like crazy, as all browsers seem to cheat setInterval timing (aka never call it enough even with extra CPU time free). This was a major reason for the single-threaded design, as async buffering from the main thread causes us lag in the speed determination portion, which is unacceptable. It's this measurement that also allows us to prioritize CPU over graphics when hitting the 100% CPU usage wall. We do frame skip by holding off blitting until the end of iteration.

olivierthereaux commented 10 years ago

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Tue, 10 Jul 2012 09:48:29 GMT

(In reply to comment #42)

...

Fair enough.

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Wed, 11 Jul 2012 00:15:23 GMT

A bit off topic, but could we add nodes that generate specific waveforms? Like a sine/triangle/square/sawtooth/LSFR White Noise generator node?

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Wed, 11 Jul 2012 00:17:49 GMT

Typo: meant to say LFSR instead ( http://en.wikipedia.org/wiki/Linear_feedback_shift_register )

Reducing the bit width of the LFSR results in interesting effects some sound devs might like. As a bonus, programmable audio has a very low download bandwidth cost, so it's optimal for instantly loading apps.
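For illustration, a minimal LFSR noise source of the kind being described; the 15-bit width and the bit-0/bit-1 taps mirror classic 8-bit sound hardware, and shrinking the width makes the sequence repeat audibly sooner:

```js
// Minimal Fibonacci LFSR noise generator (15-bit, taps at bits 0 and 1).
function makeLfsrNoise(bits) {
  var lfsr = (1 << bits) - 1;                        // any non-zero seed works
  return function next() {
    var bit = (lfsr ^ (lfsr >> 1)) & 1;              // XOR the two lowest bits
    lfsr = (lfsr >> 1) | (bit << (bits - 1));
    return (lfsr & 1) ? 1.0 : -1.0;                  // map to the audio range
  };
}

// Usage inside any processing callback (main thread or worker):
var noise = makeLfsrNoise(15);
// for (var i = 0; i < output.length; i++) output[i] = noise() * 0.25;
```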

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Wed, 11 Jul 2012 00:21:14 GMT

And obviously an arbitrary length waveform buffer bank node would be awesome as well (So the dev can create custom waveforms themselves to repeat over and over).

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Wed, 11 Jul 2012 00:22:26 GMT

I know we have normal nodes for audio buffers, but can we do direct access on them?

olivierthereaux commented 10 years ago

Original comment by Olivier Thereaux on W3C Bugzilla. Wed, 11 Jul 2012 07:21:37 GMT

Hi Grant,

(In reply to comment #44)

A bit off topic, but could we add nodes that generate specific waveforms? Like a sine/triangle/square/sawtooth/LSFR White Noise generator node?

I think you are looking for https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#Oscillator

Ideally, if you have any comment not related to a specific issue, discussion on public-audio@w3.org is a better idea than hijacking the comment thread ;)
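For completeness, a minimal use of that Oscillator node with the draft API of the time (the numeric type constants and noteOn were later replaced by string types and start()):

```js
// Minimal Oscillator sketch, 2012-era draft API.
var ctx = new webkitAudioContext();
var osc = ctx.createOscillator();
osc.type = osc.SQUARE;          // numeric constant in the draft of the time
osc.frequency.value = 440;      // Hz
osc.connect(ctx.destination);
osc.noteOn(0);                  // play immediately; later renamed start()

// There is no built-in white-noise oscillator; a looping AudioBuffer filled
// with random (or LFSR) samples is the usual substitute.
```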

olivierthereaux commented 10 years ago

Original comment by Grant Galitz on W3C Bugzilla. Wed, 11 Jul 2012 15:39:09 GMT

(In reply to comment #48)

Hi Grant,

(In reply to comment #44)

A bit off topic, but could we add nodes that generate specific waveforms? Like a sine/triangle/square/sawtooth/LSFR White Noise generator node?

I think you are looking for https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#Oscillator

Ideally, if you have any comment not related to a specific issue, discussion on public-audio@w3.org is a better idea than hijacking the comment thread ;)

Oh wow, I never noticed you had that implemented already. I'm wondering where the white noise osc is?

olivierthereaux commented 10 years ago

Original comment by Philip Jägenstedt on W3C Bugzilla. Thu, 26 Jul 2012 12:57:57 GMT

Grant, it seems to me that there are at least two options for main-thread audio generation even if there's no JavaScriptAudioNode.

  1. Generate your audio into AudioBuffers and schedule these to play back-to-back with AudioBufferSourceNodes. (I haven't tried whether the WebKit implementation handles this gaplessly, but I don't see why we shouldn't support this in the spec.)
  2. Generate your audio into AudioBuffers and postMessage these to a WorkerAudioNode. If ownership of the buffer is transferred it should be cheap and there's no reason why this should incur a large delay, particularly not half a second like you've seen. That sounds like a browser bug to be fixed.

In both cases one will have one new object per buffer to GC: in the first case it's an AudioBufferSourceNode, and in the second case it's the event object on the worker side.
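A sketch of option 1 with era-appropriate names (noteOn was later renamed start); whether playback is truly gapless is exactly the open question Philip mentions:

```js
// Option 1: schedule generated AudioBuffers back-to-back on the context clock.
var ctx = new webkitAudioContext();
var BLOCK = 4096;
var nextTime = ctx.currentTime + 0.1;     // small scheduling headroom

function scheduleNextBlock(fillBlock) {
  var buf = ctx.createBuffer(1, BLOCK, ctx.sampleRate);
  fillBlock(buf.getChannelData(0));       // app-provided generator callback
  var src = ctx.createBufferSource();
  src.buffer = buf;
  src.connect(ctx.destination);
  src.noteOn(nextTime);                   // butt this block against the previous one
  nextTime += BLOCK / ctx.sampleRate;
}

// Drive scheduleNextBlock() from a timer that stays comfortably ahead of nextTime.
```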