rust-av / speexdsp-rs

Speexdsp bindings and pure-rust implementation
MIT License

replacing loops with iterators, unrolling float loop #79

Closed khodzha closed 4 years ago

khodzha commented 4 years ago

not really, bench gives [10.452 ms 10.509 ms 10.618 ms] :man_shrugging:

Luni-4 commented 4 years ago

Looking at the benchmark results, the situation has gotten worse :(

lu-zero commented 4 years ago

Could you write benchmarks for the single function so we can look at what happens in it? (cargo-asm might also shed some light)

khodzha commented 4 years ago

i added a simpler benchmark which doesn't change the output rate

i also took a look at C implementation of speexdsp and

results with simpler bench:

resampler_simple_c      time:   [2.0033 ms 2.0225 ms 2.0387 ms]
resampler_simple_rust   time:   [2.7497 ms 2.7794 ms 2.7974 ms] (without unroll and hsum with hadd)
resampler_simple_rust   time:   [2.1840 ms 2.2375 ms 2.3400 ms] (with unroll and hsum with movehl/shuffle)
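
For readers unfamiliar with the two horizontal-sum variants named in the results, here is a minimal sketch of what "hsum with hadd" and "hsum with movehl/shuffle" typically look like with `std::arch` intrinsics. The function names `hsum_hadd`, `hsum_movehl` and `hsum_both` are hypothetical and this is not the code from the PR; it only assumes an x86_64 target.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Horizontal sum of the four f32 lanes using two SSE3 `haddps` steps.
/// Hypothetical helper for illustration, not taken from the PR.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse3")]
unsafe fn hsum_hadd(v: __m128) -> f32 {
    let s = _mm_hadd_ps(v, v); // [a+b, c+d, a+b, c+d]
    let s = _mm_hadd_ps(s, s); // every lane now holds a+b+c+d
    _mm_cvtss_f32(s)
}

/// Horizontal sum of the four f32 lanes using `movehl` + `shuffle` (plain SSE).
/// Hypothetical helper for illustration, not taken from the PR.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse")]
unsafe fn hsum_movehl(v: __m128) -> f32 {
    let hi = _mm_movehl_ps(v, v); // [c, d, c, d]
    let pairs = _mm_add_ps(v, hi); // [a+c, b+d, ..]
    let odd = _mm_shuffle_ps::<0x55>(pairs, pairs); // broadcast lane 1 (b+d)
    _mm_cvtss_f32(_mm_add_ss(pairs, odd)) // lane 0: (a+c) + (b+d)
}

/// Safe wrapper: returns both sums, or `None` when the CPU or target
/// does not provide the required instructions.
pub fn hsum_both(lanes: [f32; 4]) -> Option<(f32, f32)> {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("sse3") {
            unsafe {
                let v = _mm_loadu_ps(lanes.as_ptr());
                return Some((hsum_hadd(v), hsum_movehl(v)));
            }
        }
    }
    None
}
```

Calling `hsum_both([1.0, 2.0, 3.0, 4.0])` on an SSE3-capable machine should give the same total from both variants. The two differ only in the instruction sequence, which is presumably why the thread benchmarks them against each other.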

lu-zero commented 4 years ago

It looks really nice :) there is still some overhead that should go away but it is a fairly good improvement :)

Luni-4 commented 4 years ago

Great @khodzha! Thanks a lot! :)

khodzha commented 4 years ago

results for doubles right now:

resampler_simple_rust_dbl   [7.7673 ms 7.9603 ms 8.1017 ms]
resampler_simple_c_dbl      [8.0655 ms 8.3503 ms 8.6117 ms]
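
The thread does not show the double-precision code itself. As a rough sketch of what "replacing loops with iterators" plus manual unrolling can look like for an `f64` inner product, the following uses `chunks_exact` with four independent accumulators. The name `inner_product_f64` and the unroll factor of four are assumptions for illustration, not the PR's implementation.

```rust
/// Dot product of two f64 slices, unrolled by four with independent
/// accumulators. Hypothetical helper for illustration, not from the PR.
pub fn inner_product_f64(a: &[f64], b: &[f64]) -> f64 {
    // Only the common prefix of both slices is used.
    let n = a.len().min(b.len());
    let (a, b) = (&a[..n], &b[..n]);

    let ca = a.chunks_exact(4);
    let cb = b.chunks_exact(4);

    // Elements left over after the last full chunk of four.
    let tail: f64 = ca
        .remainder()
        .iter()
        .zip(cb.remainder())
        .map(|(x, y)| x * y)
        .sum();

    // Four separate accumulators break the dependency chain between
    // consecutive additions, which gives the compiler room to vectorize.
    let acc = ca.zip(cb).fold([0.0f64; 4], |mut acc, (x, y)| {
        acc[0] += x[0] * y[0];
        acc[1] += x[1] * y[1];
        acc[2] += x[2] * y[2];
        acc[3] += x[3] * y[3];
        acc
    });

    (acc[0] + acc[1]) + (acc[2] + acc[3]) + tail
}
```

Because the additions are reassociated, the result can differ from a plain sequential loop in the last bits for general inputs. That trade-off is inherent to this style of unrolling.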

khodzha commented 4 years ago

i rebased and squashed the commits and marked the PR as ready for review, if you want to merge it

lu-zero commented 4 years ago

It still seems to have conflicts. Great result :)