Add FFT functionality in CUDA

The FFT version of the matrix multiplier also needs to be implemented for CUDA. Steps:
- Implement FFT and IFFT on CUDA (in C).
- Define an interface for passing complex (32-bit precision) arrays between Fortran and C (CUDA).
- Verify against the CPU solution implemented in Fortran.
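
For the interface step, a minimal sketch of what the C side could look like, assuming each complex value is a pair of 32-bit floats stored as interleaved (re, im). That is the layout of C99 `float complex`, of Fortran's `complex(c_float_complex)` from `iso_c_binding`, and of cuFFT's `cufftComplex`. The function name `conj_inplace_c32` and the Fortran interface in the comment are hypothetical and only illustrate the calling convention; conjugation is used because it makes a swapped real/imaginary order easy to spot from the Fortran side.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* A single-precision complex value must be exactly two packed floats,
   otherwise the buffer cannot be shared with Fortran or cuFFT as-is. */
_Static_assert(sizeof(float complex) == 2 * sizeof(float),
               "float complex is not two packed 32-bit floats");

/* Hypothetical entry point, callable from Fortran through:
 *
 *   interface
 *     subroutine conj_inplace_c32(data, n) bind(C, name="conj_inplace_c32")
 *       use iso_c_binding
 *       complex(c_float_complex), intent(inout) :: data(*)
 *       integer(c_int), value :: n
 *     end subroutine
 *   end interface
 *
 * Conjugates n complex values in place by flipping the sign of every
 * second float, which relies on the interleaved (re, im) layout. */
void conj_inplace_c32(float complex *data, int n)
{
    assert(data != NULL && n >= 0);
    float *raw = (float *)data;      /* view as re0, im0, re1, im1, ... */
    for (int i = 0; i < n; ++i)
        raw[2 * i + 1] = -raw[2 * i + 1];
}
```

Calling this from Fortran on a small known array and checking the result there is a cheap way to confirm the layout before any cuFFT call is wired in.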

Math: The FFT can be used to discard a portion of the matrix entries in Fourier space, which can improve performance: H = K M => H = IFFT( FFT(K) FFT(M) ). FFT(K) only needs to be computed once and can then be stored as a sparse matrix by removing entries below a chosen threshold. At each computational step we then need to Fourier-transform M, do the matrix multiplication in Fourier space, and apply the inverse Fourier transform to get the result. The intention is that K is close to diagonal in Fourier space, so the thresholded matrix keeps few entries and the multiplication becomes cheap.

cmt-dtu-energy / MagTense

Add FFT functionality in CUDA #8