bobasaurus closed this issue 5 years ago
Thank you! I got the trick with the layout of the filter coefficients in memory. It's indeed twice as fast, and I'm really surprised! It's basically the same code... Yes, it's more cache-friendly, but why would it have such a significant impact on performance? I guess I'll do the same trick for IIR filters )). I'll keep my own names for the arrays, though.
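
The thread does not show the changed code, so the following is only a minimal sketch of one well-known coefficient layout for online FIR filtering that may be close to what is being discussed: the kernel is stored twice, back to back, so the inner loop over a circular delay line is a plain linear dot product with no wrap-around check or modulo. The class name `DoubledKernelFir` and its field names are hypothetical and are not part of NWaves or Math.NET Filtering.

```csharp
using System;

// Hypothetical illustration; not the NWaves or Math.NET Filtering source.
public sealed class DoubledKernelFir
{
    private readonly float[] _kernel2; // kernel stored twice: [h0..hN-1, h0..hN-1]
    private readonly float[] _delay;   // circular delay line of the last N inputs
    private readonly int _size;        // kernel length N
    private int _offset;               // index of the newest sample in _delay

    public DoubledKernelFir(float[] kernel)
    {
        _size = kernel.Length;
        _delay = new float[_size];
        _kernel2 = new float[_size * 2];
        for (int i = 0; i < _size; i++)
        {
            _kernel2[i] = _kernel2[_size + i] = kernel[i];
        }
    }

    // Filters one sample (online processing).
    public float Process(float sample)
    {
        // Step the write position backwards through the circular buffer.
        _offset = (_offset != 0) ? _offset - 1 : _size - 1;
        _delay[_offset] = sample;

        // Because the kernel is doubled, j never needs to wrap:
        // _delay[_offset] (newest sample) always lines up with h0.
        float acc = 0f;
        for (int i = 0, j = _size - _offset; i < _size; i++, j++)
        {
            acc += _delay[i] * _kernel2[j];
        }
        return acc;
    }
}
```

Feeding a unit impulse through such a filter returns the kernel itself, which is a quick sanity check that the index arithmetic is right. The cost is twice the memory for the coefficients, in exchange for a branch-free inner loop that walks both arrays sequentially.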
Btw, for long kernels (64 and higher), block convolvers should be used, as they'll outperform direct convolution (also about a 40-50% gain): `OlaBlockConvolver` / `OlsBlockConvolver`. They also provide a `Process()` method for online filtering (read more in the tutorial). I did some tests as well:
```csharp
// requires: using System.Diagnostics; using System.Linq;
var kernel = DesignFilter.FirWinLp(111, 0.2);

var fir1 = new OlsBlockConvolver(kernel, 512);  // or 1024, 2048, ...
var fir2 = new FirFilter2(kernel);              // improved FIR filter

// time the block convolver, sample by sample
var sw = Stopwatch.StartNew();
var s1 = new DiscreteSignal(_signal.SamplingRate, _signal.Samples.Select(x => fir1.Process(x)));
sw.Stop();
var t1 = sw.ElapsedTicks;

// time the improved FIR filter on the same signal
sw.Restart();
var s2 = new DiscreteSignal(_signal.SamplingRate, _signal.Samples.Select(x => fir2.Process(x)));
sw.Stop();
var t2 = sw.ElapsedTicks;

MessageBox.Show(t1 + " vs. " + t2);
```
Wow, you got that done quickly. I'm also not sure why it works so well. I noticed it when testing my old FIR filter code against Math.NET Filtering and saw how much quicker theirs ran. Another interesting quirk is that the double-precision version they use runs at almost the same speed as the single-precision one I derived from their code. You might consider implementing some double-precision FIR filters too.
I couldn't figure out how to elegantly insert this update into your code, so I'm glad you implemented it properly. I'll get rid of my fork and use your update instead. Will you publish a new NuGet package?
At the moment I've only merged your commit without any changes. Tomorrow (it's midnight right now in my timezone) I'll clean things up, run the final tests and make a new commit. Do you need a new NuGet package as soon as possible? (I just wanted to add some more planned features first, so it'll take some time, but if necessary I'll make a 0.9.2.1 sub-version.)
As for the "float/double" thing, it's a long story )) This lib used to have both versions. In terms of speed, there wasn't a big difference in NWaves either (modern compilers and CPUs handle double precision quite well). The main reason I kept only the float version was memory consumption. But if there are tasks where precision is crucial, I'll think about bringing the "64-bit filters" back.
And once again: for long kernels, block convolvers (overlap-save/overlap-add) will run even faster! So consider them as well.
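
As a rough illustration of why block convolution wins for long kernels, here is a back-of-the-envelope cost estimate for a textbook overlap-save scheme. It is not a description of how NWaves implements its block convolvers: $N$ is the kernel length, $M$ is the FFT size (for example 512), the kernel spectrum is assumed to be precomputed, and $c$ is an implementation-dependent constant for the cost of one FFT.

$$
C_{\text{direct}} \approx N \ \text{multiply-adds per output sample}
$$

$$
C_{\text{block}} \approx \frac{2\,c\,M \log_2 M + M}{M - N + 1} \ \text{operations per output sample}
$$

Each block needs one forward FFT, $M$ spectral multiplications and one inverse FFT, and yields $M - N + 1$ new output samples. For a fixed ratio $M/N$ the per-sample cost therefore grows like $\log_2 M$ instead of $N$, which is why the advantage only shows up once the kernel is long enough to amortize the FFT overhead.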
I'm not in a hurry, I'm just experimenting with your library as an alternative to other DSP code. I've been looking for a way to do complex filtering, but haven't found any easy-to-use libraries that support it.
I've updated the NuGet package (ver. 0.9.3).
I did more tests. The results are given below (the absolute values are specific to my audio data, but they illustrate the differences). Note that in .NET Core the performance gain is much less significant (as I expected in the first place). Moreover, everything runs much faster on .NET Core than on .NET Framework, for both versions of NWaves.
.NET Framework 4.7:

| Ver. \ Order | N=5 | N=9 | N=15 | N=21 | N=35 | N=47 |
|---|---|---|---|---|---|---|
| 0.9.2 | 850ms | 1050ms | 1370ms | 1740ms | 2800ms | 3540ms |
| 0.9.3 | 685ms | 800ms | 930ms | 1050ms | 1355ms | 1665ms |

.NET Core 2.1:

| Ver. \ Order | N=5 | N=9 | N=15 | N=21 | N=35 | N=47 |
|---|---|---|---|---|---|---|
| 0.9.2 | 350ms | 468ms | 683ms | 914ms | 1490ms | 1940ms |
| 0.9.3 | 350ms | 455ms | 630ms | 760ms | 1230ms | 1525ms |

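
To make the comparison in the tables above easier to read, the speedup of 0.9.3 over 0.9.2 can be computed per filter order. The snippet below is only a small helper sketch: the timings are copied from the tables, while the class and member names (`SpeedupTable`, `Speedup`, and so on) are made up for this illustration.

```csharp
using System;

// Hypothetical helper; the numbers are the timings (ms) reported in the tables above.
public static class SpeedupTable
{
    public static readonly int[] Orders = { 5, 9, 15, 21, 35, 47 };

    public static readonly double[] Framework092 = { 850, 1050, 1370, 1740, 2800, 3540 };
    public static readonly double[] Framework093 = { 685, 800, 930, 1050, 1355, 1665 };
    public static readonly double[] Core092 = { 350, 468, 683, 914, 1490, 1940 };
    public static readonly double[] Core093 = { 350, 455, 630, 760, 1230, 1525 };

    // Element-wise ratio before/after: values above 1 mean the newer version is faster.
    public static double[] Speedup(double[] before, double[] after)
    {
        if (before.Length != after.Length)
        {
            throw new ArgumentException("Both arrays must have the same length.");
        }

        var ratios = new double[before.Length];
        for (int i = 0; i < before.Length; i++)
        {
            ratios[i] = before[i] / after[i];
        }
        return ratios;
    }

    public static void Main()
    {
        var framework = Speedup(Framework092, Framework093);
        var core = Speedup(Core092, Core093);

        for (int i = 0; i < Orders.Length; i++)
        {
            Console.WriteLine(
                $"N={Orders[i]}: .NET Framework x{framework[i]:F2}, .NET Core x{core[i]:F2}");
        }
    }
}
```

For N=47 this works out to roughly a 2.1x speedup on .NET Framework and about 1.3x on .NET Core, while for N=5 on .NET Core the two versions take the same time.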
I've also added 64-bit versions of filters to the code base. Their performance is quite close to 32-bit filters.
Thanks a lot for the update. It's interesting that .NET Core is so much better. If I had fewer users depending on old systems, I would make the switch.
I copied the online FIR filtering code from Math.NET Filtering and converted it to use single-precision floats. It runs about 50% faster when optimization is enabled and gives the same results in my limited testing. I can't actually compile NWaves on my system, so I'm unable to fully test these changes.