You won't in general get linear speedup from FFTs. For that matter, for any algorithm, there will be a point of diminishing returns where adding more threads will not make things faster (or will even make things slower).
That being said, try an FFTW_MEASURE plan instead of FFTW_ESTIMATE.
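To make the suggestion concrete, here is a minimal sketch of planning with FFTW_MEASURE through the legacy Fortran interface. The grid size, program name, and array names are placeholders and not taken from the attached module, and it assumes fftw3.f is on the include path. Note that FFTW_MEASURE overwrites the arrays while planning, so the input is filled only after the plan exists. The sketch uses dfftw_execute_dft, which passes the arrays explicitly, rather than dfftw_execute.

```fortran
program measure_plan_sketch
  implicit none
  include 'fftw3.f'            ! defines FFTW_FORWARD, FFTW_MEASURE, FFTW_ESTIMATE

  integer, parameter :: nx = 256, ny = 256   ! hypothetical grid size
  integer*8 :: plan                          ! plan handle (holds a C pointer)
  double complex, allocatable :: cval(:,:), kval(:,:)

  allocate(cval(nx, ny), kval(nx, ny))

  ! Create the plan once. FFTW_MEASURE times several candidate algorithms,
  ! so planning is slower than with FFTW_ESTIMATE and overwrites the arrays.
  call dfftw_plan_dft_2d(plan, nx, ny, cval, kval, FFTW_FORWARD, FFTW_MEASURE)

  ! Fill the input only after planning, because planning clobbered it.
  cval = (1.0d0, 0.0d0)

  ! Reuse the same plan for every transform of these arrays.
  call dfftw_execute_dft(plan, cval, kval)

  ! The transform is unnormalized, so for constant input of ones
  ! the zero-frequency entry equals nx*ny.
  write(*,*) 'kval(1,1) = ', kval(1,1)

  call dfftw_destroy_plan(plan)
  deallocate(cval, kval)
end program measure_plan_sketch
```

The extra planning cost only pays off when the same plan is executed many times, as it would be in a time-stepping spectral code.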
Thanks for your kind reply! As I understand it, the only change needed to use the OpenMP version of FFTW is to add
call dfftw_init_threads(ierr)
if(ierr==0) then
write(*,*) "Error in Parallel FFT Initialization!"
stop
end if
nthreads=omp_get_max_threads()
call dfftw_plan_with_nthreads(nthreads)
to the original code and the remaining
call dfftw_plan_dft_2d_(fft_plan_forward , fft_nx_extent,fft_ny_extent, fft_cval, fft_kval, FFTW_FORWARD , FFTW_ESTIMATE )
and
call dfftw_execute_(fft_plan_forward)
are unchanged. Is this modification correct and sufficient to enable the parallel FFTW?
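For reference, the sequence described above can be assembled into a small self-contained timing sketch. The program name, grid size, and repetition count are hypothetical, and it assumes fftw3.f is on the include path and that the code is built with OpenMP enabled and linked against both the threaded and the base FFTW libraries. Only the execute calls are timed, so planning cost does not distort the comparison between thread counts.

```fortran
program threaded_fft_sketch
  use omp_lib
  implicit none
  include 'fftw3.f'

  integer, parameter :: nx = 1024, ny = 1024   ! hypothetical grid size
  integer, parameter :: nrep = 20              ! hypothetical repetition count
  integer*8 :: plan
  integer :: ierr, nthreads, i
  double precision :: t0, t1
  double complex, allocatable :: cval(:,:), kval(:,:)

  ! Threading must be initialized before any plan is created.
  call dfftw_init_threads(ierr)
  if (ierr == 0) stop 'Error in parallel FFT initialization'

  ! All plans created after this call use up to nthreads threads.
  nthreads = omp_get_max_threads()
  call dfftw_plan_with_nthreads(nthreads)

  allocate(cval(nx, ny), kval(nx, ny))
  call dfftw_plan_dft_2d(plan, nx, ny, cval, kval, FFTW_FORWARD, FFTW_ESTIMATE)
  cval = (1.0d0, 0.0d0)

  ! Time only the transforms, averaged over nrep executions.
  t0 = omp_get_wtime()
  do i = 1, nrep
    call dfftw_execute_dft(plan, cval, kval)
  end do
  t1 = omp_get_wtime()
  write(*,*) nthreads, ' threads: ', (t1 - t0) / nrep, ' s per transform'

  call dfftw_destroy_plan(plan)
  call dfftw_cleanup_threads()
  deallocate(cval, kval)
end program threaded_fft_sketch
```

Running it with different values of OMP_NUM_THREADS gives a rough picture of where adding threads stops helping for a given grid size.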
Another question: to use FFTW on an HPC cluster, I copied my desktop-compiled libfftw3.a and libfftw3_omp.a to the remote cluster and linked them with
gfortran -fopenmp FFT/libfftw3.a FFT/libfftw3_omp.a ....
Is that enough, or must I recompile FFTW on the cluster?
Thank you for your help!
Longfei
Hi,
I am writing a spectral-method code to resolve free-surface flow and plan to use FFTW for the FFTs. To improve efficiency, I looked into its OpenMP version. Following the manual, I configured it as shown in the attachment, but with 8 threads the FFT time is only reduced to 1/2. I am confused by this and am here for some help.
The shell script and the FFT module are attached. To continue my research, I would really appreciate help with any misunderstanding I may have about OpenMP FFTW.
Thank you in advance.
Longfei
files.zip