WebAudio / web-audio-api

The Web Audio API v1.0, developed by the W3C Audio WG
https://webaudio.github.io/web-audio-api/
Other
1.04k stars 166 forks source link

Allow setting the Convolution buffer asynchronously #2449

Open padenot opened 5 years ago

padenot commented 5 years ago

Implementations implement the buffer setter of the ConvolverNode in two ways:

I've tried to make this computation off-main-thread in BMO#1572878, but it's not satisfactory, it can cause a delay in the processing, and the API looks synchronous. Nothing in the spec says it needs to be synchronous (and the buffer is acquired, an operation that is usually performed only if the buffer is sent to another thread), but my patch causes web-platform-test failures, and Karl demonstrated that it can cause more discontinuities in the output than the current setup.

Adding a member "solves" this:

partial interface ConvolverNode {
    Promise<void> setBuffer(AudioBuffer buffer);
}

This allows to compute the impulse off-main-thread, and allows authors to know when the ConvolverNode is ready to process with the new impulse.

On my (very powerful, Intel(R) Core(TM) i9-7940X CPU @ 3.10GHz) machine, running Firefox Nightly on Linux, the rate at which the impulse is prepared is around 15 milliseconds of computation for a second of impulse (35ms for Chrome). This means that:

Trying to find real data about the size of convolution impulses, I looked at the large number of impulses that are by default in the "Convolution Reverb Pro" Max4Live patch, and it's clear that a good chunk of them are in the high single digits. A dozen or so are longer than 10 seconds, some are 30 seconds (and it's a real place, in this case the Taj Mahal), so this is not a theoretical problem.

Test case at https://paul.cx/public/conv.html, also below for posterity:

<meta charset=utf-8>
<h1>Benchmark: how long does it take to prepare a convolution impulse in the Web
  Audio API</h1>
<input type=number value=5></input>s
<button>compute</button>
<pre></pre>
<script>
  var ac = new AudioContext();
  var convolver = new ConvolverNode(ac);

  var $ = document.querySelector.bind(document);

  $("button").onclick = function() {
    var v = parseInt($("input").value);
    var impulseFrames = v * ac.sampleRate;
    var buffer = ac.createBuffer(4, impulseFrames, ac.sampleRate);
    for (var c = 0; c < 4; c++) {
      var b = buffer.getChannelData(c);
      for (var i = 0; i < impulseFrames; i++) {
        b[i] = Math.sin(i); // arbitrary signal
      }
    }
    var start = performance.now();
    convolver.buffer = buffer;
    var time_to_compute = performance.now() - start;
    $("pre").textContent = time_to_compute +
      "ms for an impulse of " + v + "s, " + time_to_compute / v + "ms per seconds of impulse.";
  }
</script>
rtoy commented 5 years ago

Heh. @hoch and I were discussing this issue last week since you sent us (privately) a similar test case. The nice thing about being synchronous is that you know the convolver is ready to process audio with the desired impulse.

We also thought about just "magically" doing it on a separate thread, but then you have no way of knowing when the convolver was ready with the impulse response. Your proposed scheme takes care of that, at the expense of an API change.

We'll see if we can add some new metrics in Chrome about the actual lengths of the impulse response.

Note also that the PannerNode has the same problem (at least in Chrome) when you change from "equalpower" to "HRTF". There's a noticeable delay before the panner changes to HRTF. This is much, much more noticeable on Android or other lower powered devices.

If we go down this path, we should look over all of the setters and maybe change them to use the promise pattern you proposed. That's a pretty big change, though.

padenot commented 5 years ago

Note also that the PannerNode has the same problem (at least in Chrome) when you change from "equalpower" to "HRTF". There's a noticeable delay before the panner changes to HRTF. This is much, much more noticeable on Android or other lower powered devices.

True, we have the same issue of course.

If we go down this path, we should look over all of the setters and maybe change them to use the promise pattern you proposed. That's a pretty big change, though.

There aren't a lot of setters that are that expensive. The PeriodicWave ctor allows unbounded arrays to be passed, but partials culling allows to keep the cost of the computation under control. I think it's just the two you listed, having looked at all the setters.

rtoy commented 5 years ago

That was my guess. Thanks for looking.

For Chrome, the construction of a PeriodicWave is somewhat expensive because it construct 1/3 octave tables from about 1 Hz to Nyquist. That's quite a few (inverse) FFTs that are computed synchronously during the construction of the wave.

karlt commented 5 years ago

I suspect PeriodicWaves of reasonable size and HRTF panning also may be implementation issues. Our implementations initialize tables up front, but only some of their elements are actually used. Lazy initialization of the elements used would make a big difference.

karlt commented 5 years ago

it would be best to not have to be able to initialize a ConvolverNode without having to pay it on the render thread

It's only initialization for the first "~278msec @ 44.1KHz" of the impulse that might actually be performed on a realtime render thread because later stages process on another thread. But perhaps some of that initialization for the first 278msec could be tolerated on the main thread.

padenot commented 5 years ago

AudioWG call:

rtoy commented 4 years ago

TPAC discussion summary:

This makes sense when you set the buffer, after the ConvolverNode has been constructed. But when using the constructor that specifies the buffer, we probably need to block because that will break many things if we don't.

TheRealDannyyy commented 4 years ago

Hello everyone!

I'm a web developer, creating HTML5 games using the Web Audio API's positional audio feature. A few months ago I reported the mentioned delay, that happens when you change the panning model from "Equalpower" to "HRTF". It turned out to be intentional, since the initialization process has been made a-sync to fix performance issues.

Several HTML5 game developers including myself are currently struggling with this initial delay, since there seems to be no way of knowing, when HRTF is actually ready for use. A fairly popular but "hacky" workaround used by most developers right now, is to play a silent audio file and show a fake loading screen for about 10 seconds during the game's startup. As you can imagine, most users don't enjoy waiting that long before playing their games.

A Chromium developer linked me to this issue and from what I could gather, there seems to be work being done towards implementing a return promise when HRTF is ready for use. I'm not aware of how this whole spec update process works but I'd appreciate any approximate date for when this feature will be implemented, if possible.

Cheers!

padenot commented 4 years ago

This is related but slightly different, but I think a similar pattern could be used, in terms of API (setter -> method that takes the value returns a promise, value being either the panning mode or the buffer).

There are no approximate date to give, it depends on too much things:

That said it's not a particularly complicated change, but we have a lot of them in addition to this one, so it's more or less or prioritization problem.

Also because this is different from the convolver, and we have agreement only on the Convolver part, we'll have to discuss this.

padenot commented 3 years ago

TPAC 2020 resolution:

rtoy commented 3 years ago

Virtual F2F 2021: Add the proposed IDL in https://github.com/WebAudio/web-audio-api-v2/issues/28#issuecomment-713714517 and add appropriate text for this.