Closed orottier closed 1 year ago
I have a branch ready for item 1 at feature/intmap-for-graph, but I will have a look at the other items first before deciding whether this is worth the hassle of an exotic dependency.
I took a stab at "Avoid the remove and insert calls of the currently processing node", @b-ma, but it's very tricky. We would need a lot of unsafe code to work around it, and I'm not sure that would be beneficial for the project. I tried a safe, intermediate solution but it had no benefits: 084d8187f7307ab4ca
I will have another look at "Use a specialized container type for the nodes" now.
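For readers who don't have graph.rs open, the remove/insert pattern under discussion looks roughly like the sketch below. This is a simplified illustration, not the actual project code: the `Node` and `Graph` types, their fields, and the "sum inputs and add one" processing step are all made up for the example, and the order is assumed to be topologically sorted already.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-in for the real graph node type.
struct Node {
    output: f32,
    inputs: Vec<u64>, // ids of the nodes feeding this one
}

struct Graph {
    nodes: HashMap<u64, Node>,
    order: Vec<u64>, // assumed to be topologically sorted already
}

impl Graph {
    fn render(&mut self) {
        for i in 0..self.order.len() {
            let id = self.order[i];
            // Take the node out of the map so it can be mutated
            // while the other nodes in `self.nodes` are read.
            let mut node = self.nodes.remove(&id).expect("ordered node must exist");
            let sum: f32 = node.inputs.iter().map(|input| self.nodes[input].output).sum();
            node.output = sum + 1.0; // placeholder for real processing
            // Put it back: this remove/insert pair per node and per
            // render quantum is the overhead being discussed.
            self.nodes.insert(id, node);
        }
    }
}
```

The remove is what keeps the borrow checker happy here: without it, the node would be borrowed mutably from the same map that is being read for its inputs.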
I tried a safe, intermediate solution but it had no benefits: https://github.com/orottier/web-audio-api-rs/commit/084d8187f7307ab4ca96d5ca95749454aa380eb2
Just tested, no sign of improvement either, sorry...
I just wonder if we could also try to bypass the `HashMap` altogether and use some kind of `Vec<Option<Node>>` for the nodes, delegating to something like https://docs.rs/index-pool/latest/index_pool/ to manage the indexes. Then in the graph parsing we could just retrieve/reinsert the nodes like this: `let node = self.nodes.swap(index, None)`, or is it silly?
Edit: Actually, I just misread the `swap` method; we would probably need something similar to what you did...
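To make the `Vec<Option<Node>>` idea concrete: `slice::swap` takes two indexes and exchanges the elements at them, so `swap(index, None)` would not compile, but `Option::take` (or `std::mem::replace`) gives the "pull the node out, leave `None` behind" behaviour described above. The sketch below is only an illustration under assumptions: the `Node` and `Graph` types are hypothetical, and index allocation (the part index-pool would handle) is left out entirely.

```rust
// Hypothetical, simplified node type.
struct Node {
    output: f32,
    inputs: Vec<usize>, // slot indexes of the nodes feeding this one
}

struct Graph {
    // The slot index doubles as the node id; `None` marks a free
    // (or temporarily taken) slot.
    nodes: Vec<Option<Node>>,
    order: Vec<usize>, // assumed to be topologically sorted already
}

impl Graph {
    fn render(&mut self) {
        for i in 0..self.order.len() {
            let index = self.order[i];
            // `Option::take` leaves `None` in the slot and hands back the node.
            let mut node = self.nodes[index].take().expect("slot must be occupied");
            let sum: f32 = node
                .inputs
                .iter()
                .filter_map(|&input| self.nodes[input].as_ref())
                .map(|n| n.output)
                .sum();
            node.output = sum + 1.0; // placeholder for real processing
            // Writing back is a plain indexed store, no hashing involved.
            self.nodes[index] = Some(node);
        }
    }
}
```

This still takes the node out and puts it back, so it only removes the hashing cost, not the move of the node itself.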
I took a new look at intmap: 6b7f11f80a4fe4. A very slight performance increase, but maybe it's just noise. Also, "Granular synthesis" seems to regress. I would say: no merge.
Next up, create more flamecharts to look for other optimizations before spending time on this again.
I just re-tested using `RefCell<Node>` in the `HashMap` and managed to get it working this time (slowly understanding some stuff :). It's here: https://github.com/orottier/web-audio-api-rs/compare/main...b-ma:web-audio-api-rs:test/graph-render and the perf improvements are quite good (better than Chrome on several test cases :)
before
+ id | name | duration (ms) | Speedup vs. realtime | buffer.duration (s)
- 1 | Baseline (silence) | 26 | 4615.4x | 120
- 2 | Simple source test without resampling (Mono) | 41 | 2926.8x | 120
- 3 | Simple source test without resampling (Stereo) | 55 | 2181.8x | 120
- 4 | Simple source test without resampling (Stereo and positionnal) | 173 | 693.6x | 120
- 5 | Simple source test with resampling (Mono) | 82 | 1463.4x | 120
- 6 | Simple source test with resampling (Stereo) | 116 | 1034.5x | 120
- 7 | Simple source test with resampling (Stereo and positionnal) | 232 | 517.2x | 120
- 8 | Upmix without resampling (Mono -> Stereo) | 46 | 2608.7x | 120
- 9 | Downmix without resampling (Stereo -> Mono) | 44 | 2727.3x | 120
- 10 | Simple mixing (100x same buffer) - be careful w/ volume here! | 1755 | 17.1x | 30
- 11 | Simple mixing (100 different buffers) - be careful w/ volume here! | 1733 | 17.3x | 30
- 12 | Simple mixing with gains | 340 | 352.9x | 120
- 13 | Granular synthesis | 2662 | 2.8x | 7.5
- 14 | Synth (Sawtooth with Envelope) | 3442 | 34.9x | 120
- 15 | Synth (Sawtooth with gain - no automation) | 2778 | 43.2x | 120
- 16 | Synth (Sawtooth without gain) | 1681 | 71.4x | 120
- 17 | Substractive Synth | 423 | 283.7x | 120
- 18 | Stereo panning | 82 | 1463.4x | 120
- 19 | Stereo panning with automation | 82 | 1463.4x | 120
- 20 | Sawtooth with automation | 75 | 1600.0x | 120
- 21 | Stereo source with delay | 210 | 571.4x | 120
after
+ id | name | duration (ms) | Speedup vs. realtime | buffer.duration (s)
- 1 | Baseline (silence) | 21 | 5714.3x | 120
- 2 | Simple source test without resampling (Mono) | 30 | 4000.0x | 120
- 3 | Simple source test without resampling (Stereo) | 44 | 2727.3x | 120
- 4 | Simple source test without resampling (Stereo and positionnal) | 158 | 759.5x | 120
- 5 | Simple source test with resampling (Mono) | 75 | 1600.0x | 120
- 6 | Simple source test with resampling (Stereo) | 106 | 1132.1x | 120
- 7 | Simple source test with resampling (Stereo and positionnal) | 209 | 574.2x | 120
- 8 | Upmix without resampling (Mono -> Stereo) | 39 | 3076.9x | 120
- 9 | Downmix without resampling (Stereo -> Mono) | 35 | 3428.6x | 120
- 10 | Simple mixing (100x same buffer) - be careful w/ volume here! | 1599 | 18.8x | 30
- 11 | Simple mixing (100 different buffers) - be careful w/ volume here! | 1604 | 18.7x | 30
- 12 | Simple mixing with gains | 300 | 400.0x | 120
- 13 | Granular synthesis | 2347 | 3.2x | 7.5
- 14 | Synth (Sawtooth with Envelope) | 2899 | 41.4x | 120
- 15 | Synth (Sawtooth with gain - no automation) | 2212 | 54.2x | 120
- 16 | Synth (Sawtooth without gain) | 1332 | 90.1x | 120
- 17 | Substractive Synth | 414 | 289.9x | 120
- 18 | Stereo panning | 71 | 1690.1x | 120
- 19 | Stereo panning with automation | 73 | 1643.8x | 120
- 20 | Sawtooth with automation | 62 | 1935.5x | 120
- 21 | Stereo source with delay | 201 | 597.0x | 120
The downside is that I didn't manage to get rid of `unsafe` code in 2 places. It is very localized and seems to be the same problem each time (i.e. returning a reference to the buffer in `Graph::render()` and `AudioParamValues::get()`), so maybe you would have an idea of how to handle that?
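For context, here is a minimal sketch of the `RefCell`-in-`HashMap` approach and of one possible safe way around the "return a reference to the buffer" problem. It is not the code from the linked branch: the `Node` and `Graph` types and the processing step are hypothetical, the graph is assumed to be acyclic (a node borrowing itself as an input would panic at runtime), and returning a `Ref` guard instead of a plain reference may not fit the real signatures of `Graph::render()` or `AudioParamValues::get()`.

```rust
use std::cell::{Ref, RefCell};
use std::collections::HashMap;

// Hypothetical, simplified node type holding one render buffer.
struct Node {
    buffer: Vec<f32>,
    inputs: Vec<u64>, // ids of the nodes feeding this one
}

struct Graph {
    nodes: HashMap<u64, RefCell<Node>>,
    order: Vec<u64>, // assumed topologically sorted and acyclic
}

impl Graph {
    // Returns a `Ref` guard rather than `&[f32]`: a plain reference
    // cannot outlive the `Ref` it was derived from, which is what
    // tends to push towards `unsafe`.
    fn render(&self) -> Option<Ref<'_, [f32]>> {
        for id in &self.order {
            // Only the current node is borrowed mutably; the map itself
            // stays shared, so no remove/insert is needed.
            let mut guard = self.nodes[id].borrow_mut();
            let node = &mut *guard; // allows disjoint field borrows below
            for sample in node.buffer.iter_mut() {
                *sample = 1.0; // placeholder for real processing
            }
            for input in &node.inputs {
                let input = self.nodes[input].borrow();
                for (out, inp) in node.buffer.iter_mut().zip(&input.buffer) {
                    *out += *inp;
                }
            }
        }
        let last = self.order.last()?;
        // `Ref::map` narrows the guard from the node to its buffer.
        Some(Ref::map(self.nodes[last].borrow(), |n| n.buffer.as_slice()))
    }
}
```

The caller then dereferences the guard to read the samples, and the runtime borrow is released when the guard is dropped. Whether the extra borrow-flag checks are acceptable on the render thread would need to be measured.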
Amazing, I did not realize there was this much to gain still from the Graph::insert/remove stuff. Let's continue discussing at https://github.com/orottier/web-audio-api-rs/pull/199
I'm closing this issue because I think the leftover points are no longer really interesting, given the current implementation.
Some improvements could be made in graph.rs when rendering an audio quantum:
`Cell` or equivalent
When a Node has only one outgoing connection in the graph, its outputs can be moved instead of copied to that Node's inputs.
This is not useful because the inputs are immutable anyway and need to be copied/mutated to outputs.