romeric / Fastor

A lightweight high performance tensor algebra framework for modern C++
MIT License

Question about FIR using inner product #144

Open ghost opened 3 years ago

ghost commented 3 years ago

Hi Roman, sorry to ask something that may be naïve, but I have added a small class to the FIR benchmark from jatinchowdhury18's repo. You can find the issue here: New promising benchmark using Fastor C++

Fastor outperforms the other inner_product implementations except at small kernel sizes, and I'm sure that I'm not using Fastor correctly. In the main processing loop (over the sample buffer), I can't call Fastor::inner directly with a subview like this:

buffer[n] = Fastor::inner(z(Fastor::seq(zPtr, zPtr + N)), h);

where N is the templated FIR order, h is the tensor of FIR coefficients (the impulse response), z is a double-buffered state tensor holding the delayed samples (the z^-N terms of the FIR equation), and buffer is the sample buffer that receives the discrete convolution result.

To get it to compile, I need to convert the subview into a tensor first, like this:

Fastor::Tensor<float, N> zn = z(Fastor::seq(zPtr, zPtr + N, 1));
buffer[n] = Fastor::inner(zn, h);

Even though this method outperforms the others for kernel sizes > 32 (the benchmark uses power-of-2 sizes), I'm pretty sure that the assignment in the main loop is a bottleneck for smaller kernel sizes.
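For context, here is a minimal sketch of the double-buffered delay line described above, written in plain standard C++ with std::inner_product instead of Fastor. The names DoubleBufferFir and processSample are hypothetical and are not taken from the benchmark; the sketch assumes the state is stored twice in an array of 2N floats so that the N most recent samples are always contiguous and can be read in place, without the intermediate copy.

```cpp
#include <array>
#include <cstddef>
#include <numeric>

// Hypothetical illustration only; not the benchmark's actual class.
template <std::size_t N>
struct DoubleBufferFir {
    std::array<float, N> h{};      // FIR coefficients (impulse response)
    std::array<float, 2 * N> z{};  // state stored twice so each window is contiguous
    std::size_t zPtr = 0;          // start of the current N-sample window

    float processSample(float x) {
        // Write the new sample into both halves of the state.
        z[zPtr] = x;
        z[zPtr + N] = x;

        // Inner product read in place over z[zPtr .. zPtr + N):
        // z[zPtr] is the newest sample, z[zPtr + N - 1] the oldest.
        const float y = std::inner_product(z.begin() + zPtr,
                                           z.begin() + zPtr + N,
                                           h.begin(), 0.0f);

        // Step backwards through the first half, wrapping at zero.
        zPtr = (zPtr == 0) ? N - 1 : zPtr - 1;
        return y;
    }
};
```

Feeding a unit impulse followed by zeros through this sketch returns the coefficients h[0], h[1], ... in order, which is a quick way to check the indexing.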

Why can't I call Fastor::inner directly with the subview? What is wrong with my code?

Thank you very much for your answer and your time!