Optimizations for AudioRenderQuantum when all channels are equal

orottier commented 2 years ago

AudioRenderQuantum has a method modify_channels that modifies each channel in the buffer with the same function. This allows for an optimization when all channels in the buffer are identical: in that case, the map function only needs to be applied once. This is a useful optimization because buffers with equal channels occur very often inside the audio graph, e.g. OscillatorNode -> GainNode -> DestinationNode. The GainNode will typically already upmix the oscillator output to two channels. The gain is then applied twice, and that duplicate work is what we can optimize out.
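A minimal sketch of the idea, assuming a toy buffer type rather than the crate's actual internals: `Quantum`, `Channel`, `QUANTUM_SIZE` and `all_channels_shared` are hypothetical names for illustration, and "identical" is taken to mean that the channels share the same `Arc` allocation, not that their sample values happen to match.

```rust
use std::sync::Arc;

/// Hypothetical render quantum length used for this sketch.
const QUANTUM_SIZE: usize = 128;

/// Hypothetical channel type: shared, cheaply clonable sample storage.
type Channel = Arc<[f32; QUANTUM_SIZE]>;

/// Toy stand-in for a multi-channel render quantum (not the crate's real type).
pub struct Quantum {
    pub channels: Vec<Channel>,
}

impl Quantum {
    /// True when every channel points at the same allocation as the first one.
    pub fn all_channels_shared(&self) -> bool {
        match self.channels.split_first() {
            Some((first, rest)) => rest.iter().all(|c| Arc::ptr_eq(first, c)),
            None => true,
        }
    }

    /// Apply `f` to every channel, but run it only once when all channels share storage.
    pub fn modify_channels<F: Fn(&mut [f32; QUANTUM_SIZE])>(&mut self, f: F) {
        if self.channels.len() > 1 && self.all_channels_shared() {
            // Process a single copy, then point every channel at the result.
            let mut first = self.channels[0].clone();
            f(Arc::make_mut(&mut first));
            for c in self.channels.iter_mut() {
                *c = first.clone();
            }
        } else {
            // General case: each channel is processed on its own.
            for c in self.channels.iter_mut() {
                f(Arc::make_mut(c));
            }
        }
    }
}
```

With a stereo quantum built from two clones of one channel, the closure passed to `modify_channels` runs once and the output channels still share storage, so a following node can take the same shortcut.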

The same optimization could theoretically apply to AudioBuffer (e.g. the resample method), but in practice the buffer will never contain channels that are the same Arc. There is no user-space function that could get it into that situation.

orottier commented 2 years ago

Some initial work at https://github.com/orottier/web-audio-api-rs/commit/92ba3f03d741a95ce2013929c8d3f7548cda1fbc - showing it might not always be worthwhile

b-ma commented 2 years ago

Yup, maybe a bit convoluted if the improvement is small.

One thing I've thought about in the same vein would be to bypass processing when the input is silent. I don't think it would work for all nodes, but for many of them it could be quite a simple improvement. I wanted to start a proof of concept on the GainNode but failed to reproduce your flamegraph results in #129 (I think we should see the 7% falling down here). The graph I obtain is very different, both in terms of results and layout, when I run only the granular bench. Could you share your code and the command you used, so that we are on the same page here?

orottier commented 2 years ago

Yup, maybe a bit convoluted if the improvement is small.

We should rewrite the biquad, iir, stream_destination and many other nodes to use this optimization and see if they benefit from it. My theory is that applying gain is so efficient (32 SIMD instructions on already cached data) that we're only seeing noise right now.

bypass processing when input is silent.

Dealing with silent input would be exciting, definitely. The spec has a whole section on it that we currently ignore! https://www.w3.org/TR/webaudio/#AudioNode-actively-processing I created #147 for this now.

share your code and the command

I used https://github.com/flamegraph-rs/flamegraph on a CPU-optimized (dedicated) 2-CPU droplet on DigitalOcean. Change Cargo.toml:

[profile.release]
debug = true

Then run cargo flamegraph --example some_example

The code I used was the minimal runnable version of the "Granular synthesis" example. I don't have the code anymore, maybe we can start a branch and check our numbers and graphs again?

b-ma commented 2 years ago

Dealing with silent input would be exciting, definitely. The spec has a whole section on it that we currently ignore! https://www.w3.org/TR/webaudio/#AudioNode-actively-processing I created https://github.com/orottier/web-audio-api-rs/issues/147 for this now.

Nice, didn't see that one before

Thanks for the info on the flamegraph. I didn't change the Cargo.toml on my side, maybe that's the reason the output is so different. Will have another shot.

maybe we can start a branch and check our numbers and graphs again?

Yup let's do that!

b-ma commented 2 years ago

Maybe an idea that is worth digging into: number_of_channels() could somehow return 1 when all channels are equal. This would "trick" all the nodes without much (maybe even any?) change in their current implementation and without relying on the use of a specific method:

1 - Oscillator outputs 1 channel
2 - Graph upmixes to 2 channels, but the channels are equal (i.e. just a cheap clone())
3 - Gain: input.number_of_channels() just pretends there is only 1 channel, so the processing only occurs on 1 channel and thus outputs 1 channel
4 - Graph upmixes to 2 channels, but the channels are equal (i.e. just a cheap clone())
5 - BiquadFilter (or any other node which may not use modify_channels): input.number_of_channels() pretends there is only 1 channel, so the processing only occurs on 1 channel and it outputs 1 channel
6 - etc.

Even when adding inputs, if the 2 added AudioRenderQuantum only contain clones of their first channel, we know that the result can keep pretending it has 1 channel.

Not sure I'm really clear but let me know what you think

Edit: that could also be a new method are_channels_equal(), so that in nodes we can just do something like:

let num_channels = if input.are_channels_equal() { 1 } else { input.number_of_channels() };

which is a really simple change, and we don't lose information
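To make the proposal concrete, here is a rough sketch of how a node might use such a method. It assumes a toy `Quantum` type in which "equal" means the channels share one `Arc` allocation; `are_channels_equal`, `apply_gain` and the type itself are illustrative assumptions here, not the crate's existing API.

```rust
use std::sync::Arc;

/// Hypothetical channel type: shared, cheaply clonable sample storage.
type Channel = Arc<Vec<f32>>;

/// Toy stand-in for a render quantum (not the crate's real type).
pub struct Quantum {
    pub channels: Vec<Channel>,
}

impl Quantum {
    pub fn number_of_channels(&self) -> usize {
        self.channels.len()
    }

    /// Proposed helper: true when all channels point at the same allocation.
    pub fn are_channels_equal(&self) -> bool {
        self.channels.windows(2).all(|w| Arc::ptr_eq(&w[0], &w[1]))
    }
}

/// Gain-like processing that only does the per-sample work once when possible.
pub fn apply_gain(input: &Quantum, gain: f32) -> Quantum {
    let total = input.number_of_channels();
    // Process a single channel when all of them are known to be equal.
    let num_channels = if input.are_channels_equal() { total.min(1) } else { total };

    let mut processed: Vec<Channel> = input.channels[..num_channels]
        .iter()
        .map(|c| Arc::new(c.iter().map(|s| s * gain).collect::<Vec<f32>>()))
        .collect();

    // Restore the original channel count with cheap pointer clones.
    while processed.len() < total {
        let first = processed[0].clone();
        processed.push(first);
    }
    Quantum { channels: processed }
}
```

In this sketch the output keeps the input's channel count, and the channels stay shared, so the information is still available to whatever node comes next.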

orottier commented 2 years ago

Hey, the idea is nice, but it will break the channel-splitter and merger. And I'm not sure if I like the magic around it. What if an AudioWorkletNode specifies an input channel count of 2, but using this trick, we would only serve 1? It might break. I'll let the idea simmer for a bit, we might be able to work something out. The alternative is still to expose that attribute are_channels_equal and let gainNode, biquadNode and co. just optimize for themselves.

b-ma commented 2 years ago

but it will break the channel-splitter and merger.

Yup, didn't think about these two. I guess the magic trick could create other weird edge cases. (edit: another example is the StereoPanner, which wouldn't have the same behavior with 2 channels even if they are equal)

Actually thinking a bit more about it since yesterday, I'm also convinced the magic solution is "too magic" and therefore not really good. I would also go for exposing are_channels_equal and let the nodes do their own job with that information when possible.

Also, I think we could improve the AudioRenderQuantum::add method so that, when possible according to channel_config (I guess that's only the mono to stereo mix in Discrete mode, but also probably more than 99% of the use cases), we get something like:

assert!(a.are_channels_equal());
assert!(b.are_channels_equal());
a.add(b);
assert!(a.are_channels_equal());

This way we could propagate the information down the graph
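One way this propagation could look, as a sketch under simplifying assumptions: both quanta already have the same channel count (so no channel_config mixing rules are modelled), "equal" means sharing one `Arc` allocation, and `Quantum`, `sum` and this version of `add` are hypothetical stand-ins rather than the crate's implementation.

```rust
use std::sync::Arc;

/// Hypothetical channel type: shared, cheaply clonable sample storage.
type Channel = Arc<Vec<f32>>;

/// Toy stand-in for a render quantum (not the crate's real type).
pub struct Quantum {
    pub channels: Vec<Channel>,
}

/// Element-wise sum of two channels of the same length.
fn sum(a: &[f32], b: &[f32]) -> Vec<f32> {
    a.iter().zip(b).map(|(x, y)| x + y).collect()
}

impl Quantum {
    /// True when all channels point at the same allocation.
    pub fn are_channels_equal(&self) -> bool {
        self.channels.windows(2).all(|w| Arc::ptr_eq(&w[0], &w[1]))
    }

    /// Sum `other` into `self`. Assumes equal channel counts (no up/down-mixing here).
    pub fn add(&mut self, other: &Quantum) {
        assert_eq!(self.channels.len(), other.channels.len());
        if self.channels.is_empty() {
            return;
        }
        if self.are_channels_equal() && other.are_channels_equal() {
            // Sum once and share the result, so the output channels stay equal.
            let s = sum(self.channels[0].as_slice(), other.channels[0].as_slice());
            let shared = Arc::new(s);
            for c in self.channels.iter_mut() {
                *c = shared.clone();
            }
        } else {
            // General case: sum channel by channel.
            for (a, b) in self.channels.iter_mut().zip(&other.channels) {
                let s = sum(a.as_slice(), b.as_slice());
                *a = Arc::new(s);
            }
        }
    }
}
```

Summing two quanta whose channels are shared yields a quantum whose channels are shared again, which is the property the assertions above describe; as soon as one input has distinct channels, the result falls back to distinct channels.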

b-ma commented 2 years ago

For the record, I just ran a quick test here: https://github.com/b-ma/web-audio-api-rs/tree/feature/equal-channels (also implementing #195). I only modified the GainNode and BiquadFilter, which were the most straightforward and the simplest to track with our current bench, but... unfortunately no visible change either...

orottier / web-audio-api-rs

Optimizations for AudioRenderQuantum when all channels are equal #117