ar1st0crat / NWaves

.NET DSP library with a lot of audio processing functions
MIT License

Updated online fir filter code for faster performance #17

Status: Closed (bobasaurus closed 5 years ago)

bobasaurus commented 5 years ago

I copied the online FIR filtering code from Math.NET Filtering and converted it to use single-precision floats. It runs about 50% faster when optimization is enabled and gives the same results in my limited testing. I can't actually compile NWaves on my system, so I'm unable to fully test these changes.

ar1st0crat commented 5 years ago

Thank you! I see the trick with the layout of the filter coefficients in memory. It's indeed about twice as fast, and I'm really surprised! It's basically the same code... Yes, it's more cache-friendly, but why would that have such a significant impact on performance? I guess I'll apply the same trick to IIR filters )). I'll keep my own names for the arrays, though.
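
For readers who have not seen this kind of layout, here is a minimal sketch. It assumes the trick is to store the kernel twice, back to back, so that the inner loop over the circular delay line needs no modulo or wrap-around branch. It is an illustration only, not the code merged into NWaves; `DoubledKernelFir` and its member names are hypothetical.

```csharp
using System;

// Sketch of an online FIR filter with a doubled coefficient array.
// Class and member names are illustrative, not the NWaves API.
public sealed class DoubledKernelFir
{
    private readonly int _size;
    private readonly float[] _coeffs;   // kernel stored twice: h[0..N-1], h[0..N-1]
    private readonly float[] _buffer;   // circular delay line with the last N inputs
    private int _offset;                // index of the most recent sample

    public DoubledKernelFir(float[] kernel)
    {
        _size = kernel.Length;
        _buffer = new float[_size];
        _coeffs = new float[2 * _size];
        for (int i = 0; i < _size; i++)
        {
            _coeffs[i] = _coeffs[_size + i] = kernel[i];
        }
    }

    public float Process(float sample)
    {
        // Move the write position one step back, wrapping at the start.
        _offset = (_offset != 0) ? _offset - 1 : _size - 1;
        _buffer[_offset] = sample;

        // _buffer[i] holds x[n - k] with k = (i - _offset) mod N.
        // Starting j at N - _offset makes _coeffs[j] equal h[k] for every i,
        // so both arrays are read sequentially with no modulo in the loop.
        float acc = 0;
        for (int i = 0, j = _size - _offset; i < _size; i++, j++)
        {
            acc += _buffer[i] * _coeffs[j];
        }
        return acc;
    }
}
```

Feeding a unit impulse through `Process` returns the kernel values one by one, which is a quick way to check the indexing. The cost of the layout is one extra copy of the kernel in memory.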

By the way, for long kernels (length 64 and higher) block convolvers should be used, as they outperform direct convolution (again by about 40-50%): OlaBlockConvolver / OlsBlockConvolver. They also provide a Process() method for online filtering (read more in the tutorial). I did some tests as well:

// requires: using System.Diagnostics; using System.Linq;
var kernel = DesignFilter.FirWinLp(111, 0.2);
var fir1 = new OlsBlockConvolver(kernel, 512);  // or 1024, 2048, ...
var fir2 = new FirFilter2(kernel);              // improved FIR filter

// time sample-by-sample (online) filtering with the block convolver
var sw = Stopwatch.StartNew();
var s1 = new DiscreteSignal(_signal.SamplingRate, _signal.Samples.Select(x => fir1.Process(x)));
sw.Stop();
var t1 = sw.ElapsedTicks;

// same measurement for the improved direct-form FIR filter
sw.Restart();
var s2 = new DiscreteSignal(_signal.SamplingRate, _signal.Samples.Select(x => fir2.Process(x)));
sw.Stop();
var t2 = sw.ElapsedTicks;

MessageBox.Show(t1 + " vs. " + t2);

bobasaurus commented 5 years ago

Wow, you got that done quickly. I'm also not sure why it works so well. I noticed it when testing my old FIR filter code against Math.NET Filtering and saw how much quicker theirs ran. Another interesting quirk is that the double-precision version they use runs at almost the same speed as the single-precision one I derived from their code. You might consider implementing some double-precision FIR filters too.

I couldn't figure out how to elegantly insert this update into your code, so I'm glad you implemented it properly. I'll get rid of my fork and use your update instead. Will you publish a new NuGet package?

ar1st0crat commented 5 years ago

At the moment I've only merged your commit without any changes. Tomorrow (it's midnight right now in my timezone) I'll clean things up, run the final tests and make a new commit. Do you need a new NuGet package as soon as possible? (I wanted to add some more planned features first, so it'll take some time, but if necessary I'll publish a 0.9.2.1 sub-version.)

As for the "float/double" thing, it's a long story )) This lib used to have both versions. In terms of speed there wasn't a big difference in NWaves either (modern compilers and CPUs actually handle double precision quite well). The main reason I kept only the float version was memory consumption. But if there are tasks where precision is crucial, then I'll think about bringing the "64-bit filters" back.

And once again - for long kernels block-convolvers (overlap-save/overlap-add) will run even faster! So consider them as well.

bobasaurus commented 5 years ago

I'm not in a hurry. I'm just experimenting with your library as an alternative to other DSP code. I've been looking for a way to do complex filtering, but I haven't found any easy-to-use libraries that support it.

ar1st0crat commented 5 years ago

I've updated the NuGet package (ver. 0.9.3).

I did more tests. The results are given below (absolute values are specific to my audio data, but they illustrate the differences). Note that in .NET Core the performance gain is much less significant (as I expected in the first place). Moreover, the .NET Core builds are much faster than the .NET Framework ones, for both versions of NWaves.

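.NET Framework 4.7:

| Ver. \ Order | N=5 | N=9 | N=15 | N=21 | N=35 | N=47 |
|---|---|---|---|---|---|---|
| 0.9.2 | 850 ms | 1050 ms | 1370 ms | 1740 ms | 2800 ms | 3540 ms |
| 0.9.3 | 685 ms | 800 ms | 930 ms | 1050 ms | 1355 ms | 1665 ms |

.NET Core 2.1:

| Ver. \ Order | N=5 | N=9 | N=15 | N=21 | N=35 | N=47 |
|---|---|---|---|---|---|---|
| 0.9.2 | 350 ms | 468 ms | 683 ms | 914 ms | 1490 ms | 1940 ms |
| 0.9.3 | 350 ms | 455 ms | 630 ms | 760 ms | 1230 ms | 1525 ms |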
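
To put numbers on "much less significant", the timings above can be turned into speedup ratios (0.9.2 time divided by 0.9.3 time). The snippet below is only a small helper for that arithmetic; the class and member names are made up for illustration, and the values are copied from the two tables.

```csharp
using System;
using System.Linq;

public static class SpeedupTable
{
    // Timings in ms copied from the tables above (orders N = 5, 9, 15, 21, 35, 47).
    public static readonly int[] Orders = { 5, 9, 15, 21, 35, 47 };
    public static readonly double[] Framework092 = { 850, 1050, 1370, 1740, 2800, 3540 };
    public static readonly double[] Framework093 = { 685, 800, 930, 1050, 1355, 1665 };
    public static readonly double[] Core092 = { 350, 468, 683, 914, 1490, 1940 };
    public static readonly double[] Core093 = { 350, 455, 630, 760, 1230, 1525 };

    // Speedup = old time / new time, element by element.
    public static double[] Speedup(double[] before, double[] after) =>
        before.Zip(after, (b, a) => b / a).ToArray();

    public static void Print()
    {
        var fw = Speedup(Framework092, Framework093);
        var core = Speedup(Core092, Core093);
        for (int i = 0; i < Orders.Length; i++)
        {
            Console.WriteLine($"N={Orders[i]}: Framework x{fw[i]:F2}, Core x{core[i]:F2}");
        }
    }
}
```

For N=47 this gives roughly 2.13x on .NET Framework against roughly 1.27x on .NET Core, and for N=5 the .NET Core timings are identical (1.00x).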

I've also added 64-bit versions of the filters to the code base. Their performance is quite close to that of the 32-bit filters.

bobasaurus commented 5 years ago

Thanks a lot for the update. It's interesting that .NET Core is so much faster. If I had fewer users depending on old systems, I would make the switch.