garko1 closed this issue 4 years ago
Hi Garko, this is the correct behaviour. CT_DIF_FFT_4way does not reorder the elements after calculating the Fourier transform, a step that is normally required when the Cooley-Tukey FFT algorithm is used. However, for convolutions computed in the frequency domain (using the convolution theorem) we can leave out the reordering step, because the wrong element order after the forward FFT is cancelled out by the inverse FFT (the element order produced by the forward decimation-in-frequency FFT is undone by the inverse decimation-in-time FFT). Defining TESTING enables the reordering step for all FFT calculations, which degrades performance; that is the reason why we have left it out. Thanks for your interest in this code. Karel
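The cancellation described above can be checked on the CPU with a small sketch. This is not the CT_DIF_FFT_4way kernel: it is a plain radix-2 host-side illustration, and all function names in it (fft_dif_no_reorder, ifft_dit_from_bit_reversed, fft_circular_convolution and so on) are made up for this example. The forward transform leaves its output in bit-reversed order, the inverse transform expects exactly that order, and the pointwise product in between does not care about the order as long as both spectra share it.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cpx = std::complex<double>;
const double PI = std::acos(-1.0);

// Reverse the lowest `bits` bits of i (maps a natural index to its DIF output slot).
std::size_t bit_reverse(std::size_t i, unsigned bits) {
    std::size_t r = 0;
    for (unsigned b = 0; b < bits; ++b) { r = (r << 1) | (i & 1); i >>= 1; }
    return r;
}

// Forward radix-2 decimation-in-frequency FFT, in place.
// Input in natural order, output left in bit-reversed order (no reordering pass).
void fft_dif_no_reorder(std::vector<cpx>& a) {
    const std::size_t n = a.size();
    assert(n != 0 && (n & (n - 1)) == 0);  // power-of-two length only
    for (std::size_t len = n; len >= 2; len /= 2) {
        const std::size_t half = len / 2;
        for (std::size_t start = 0; start < n; start += len)
            for (std::size_t k = 0; k < half; ++k) {
                const cpx w = std::polar(1.0, -2.0 * PI * double(k) / double(len));
                const cpx u = a[start + k], v = a[start + k + half];
                a[start + k] = u + v;
                a[start + k + half] = (u - v) * w;
            }
    }
}

// Inverse radix-2 decimation-in-time FFT, in place.
// Input in bit-reversed order, output in natural order, scaled by 1/n.
void ifft_dit_from_bit_reversed(std::vector<cpx>& a) {
    const std::size_t n = a.size();
    assert(n != 0 && (n & (n - 1)) == 0);
    for (std::size_t len = 2; len <= n; len *= 2) {
        const std::size_t half = len / 2;
        for (std::size_t start = 0; start < n; start += len)
            for (std::size_t k = 0; k < half; ++k) {
                const cpx w = std::polar(1.0, 2.0 * PI * double(k) / double(len));
                const cpx u = a[start + k], v = a[start + k + half] * w;
                a[start + k] = u + v;
                a[start + k + half] = u - v;
            }
    }
    for (cpx& z : a) z /= double(n);
}

// Circular convolution via the convolution theorem; neither spectrum is reordered.
std::vector<cpx> fft_circular_convolution(std::vector<cpx> x, std::vector<cpx> h) {
    assert(x.size() == h.size());
    fft_dif_no_reorder(x);
    fft_dif_no_reorder(h);
    for (std::size_t i = 0; i < x.size(); ++i) x[i] *= h[i];  // same scrambled order
    ifft_dit_from_bit_reversed(x);
    return x;
}

// O(n^2) reference: y[i] = sum_m x[m] * h[(i - m) mod n].
std::vector<cpx> direct_circular_convolution(const std::vector<cpx>& x,
                                             const std::vector<cpx>& h) {
    const std::size_t n = x.size();
    std::vector<cpx> y(n);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t m = 0; m < n; ++m) y[i] += x[m] * h[(i + n - m) % n];
    return y;
}

// Largest element-wise absolute difference between two equally sized vectors.
double max_abs_diff(const std::vector<cpx>& a, const std::vector<cpx>& b) {
    double d = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) d = std::max(d, std::abs(a[i] - b[i]));
    return d;
}
```

Comparing the raw output of fft_dif_no_reorder against a natural-order FFT element by element will show a mismatch, since bin k sits at index bit_reverse(k, log2(n)). The convolution result from fft_circular_convolution, however, should agree with direct_circular_convolution up to rounding error.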
You can try with these 65 (causal) filter coefficients (given in the time domain; pad them with zeros to a length of 4096):
0.000190, -0.000280, 0.000364, -0.000425, 0.000441, -0.000384, 0.000228, 0.000058, -0.000498, 0.001114, -0.001918, 0.002908, -0.004071, 0.005373, -0.006762, 0.008164, -0.009485, 0.010611, -0.011408, 0.011724, -0.011395, 0.010238, -0.008055, 0.004624, 0.000320, -0.007119, 0.016273, -0.028599, 0.045628, -0.070668, 0.112425, -0.203098, 0.633555, 0.633555, -0.203098, 0.112425, -0.070668, 0.045628, -0.028599, 0.016273, -0.007119, 0.000320, 0.004624, -0.008055, 0.010238, -0.011395, 0.011724, -0.011408, 0.010611, -0.009485, 0.008164, -0.006762, 0.005373, -0.004071, 0.002908, -0.001918, 0.001114, -0.000498, 0.000058, 0.000228, -0.000384, 0.000441, -0.000425, 0.000364, -0.000280
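A minimal sketch of the zero-padding step, assuming the taps are held in a host-side std::vector<float> and the FFT input is a complex buffer with zero imaginary parts. The helper name zero_pad_filter is hypothetical, and how the padded buffer is then copied to the device is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Place the real-valued taps at the start of a complex buffer of length
// fft_size; the remaining samples and all imaginary parts stay zero.
std::vector<std::complex<float>> zero_pad_filter(const std::vector<float>& taps,
                                                 std::size_t fft_size) {
    assert(taps.size() <= fft_size);
    std::vector<std::complex<float>> padded(fft_size);  // value-initialized to 0
    std::copy(taps.begin(), taps.end(), padded.begin());
    return padded;
}
```

Calling zero_pad_filter(taps, 4096) with the coefficients listed above would give the 4096-sample filter to feed into the forward FFT.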
cuFFT and CT_DIF_FFT_4way<4096> do not yield the same results. I have not included the code guarded by #ifdef TESTING.