Closed brandy125 closed 3 weeks ago
Can anybody explain or improve that?
explain ? yes ! improve ? no !
for me the situation is limpid : 16087 is a "large" prime number, then it cannot be splitted into a product of small prime numbers 2$^n$ 3$^m$ ...
you can read in the doc of FFTw Besides the automatic performance adaptation performed by the planner, it is also possible for advanced users to customize FFTW for their special needs. As distributed, FFTW works most efficiently for arrays whose size can be factored into small primes (2, 3, 5, and 7), and uses a slower general-purpose routine for other factors.
In the IDL doc. this point is also discussed (decomposition in prime factors versus speed)
Years ago, processing very long TOI (time series) for Planck HFI, we decided to cut into the longest 2$^n$ 3$^m$ 5$^o$ just below what we want to process ... I may still have a code around ... Maybe in the idlastro too !
understand
Shouldn't we try to catch that up? If we know that prime numbers will take 300 times longer in that case than the next round number, isn't there any way to do the fft on bigger arrays and transfer the result back to the shorter array?
I would be reluctant to depart for what the reference is providing to us ! No change by default to FFT() for me !
In order to make a warning, I don't know if it is easy to know whether a number is prime or not, or provide a suggestion of adequate numbers. We may consider to add an extra procedure to check that before. In some part of the TOI processing for Planck HFI it was done "automatically". It was the "responsibility" of the team to manage that and provide clear information to third users (take care of side effect, aliasing, ...)
Concerning warnings, we tried to add some in some places in GDL but it was not good idea : in case of pipeline or multiple calls, too much messages.
During long time we (wrongly) repeated that FFT is fast only for power of two and most of my colleagues never realized that a combination of 2, 3, 5 ... till 11 is fine. (clearly written in some docs !) We have to let know to students & news users they have to consider if they are in a good situation or not !
Last point : it is really super easy to superseded a function in GDL, without loose of performance you can call FFT() through something like MYFFT() or FFT2() and put the checks in that code, without changing the c++ code of FFT
Understand. On the other hand how can IDL do it so fast?
bash$ fl
Fawlty Language 0.79.54 (linux aarch64 m64) Copyright (c) 2000-2024, Lajos Foldy
https://www.flxpert.hu/fl/
*** THIS SOFTWARE COMES WITH ABSOLUTELY NO WARRANTY! USE AT YOUR OWN RISK! ***
Type 'help, /lib' for system routines info.
FL> a=complexarr(400,16087)
FL> b=fft(a,dim=2)
FL> tic & b=fft(a,dim=2) & toc
% Time elapsed: 0.328576 seconds.
FL>
Ah, I'm afraid that DIM=2 is the problem. The FFTs in GDL use fftw only when DIM is not present. For some reason the use of fftw was introduced only in the basic case. With DIM=xxx it reverts to and old slow fourier transform code.
GDL> a=complexarr(400,16087)
GDL> tic & b=fft(a) & toc
% Time elapsed: 0.34985018 seconds.
In retrospect, I'm surprised nobody ever complained --- probably people do 'whole' FFTs, not subsets. Or on too small arays to make a difference. In retrospect too, FFTW did not provide directly that functionality at the time (FFT of an n-dim array's slice). It does now, but the code is awkward and as undocumented as can be written by the guy who did it. I look at it.
BTW, the slow code uses the GSL fft and I'm surprised the GSL can be so slow---probably our fault.
GSL was not the problem here.
GDL> a=complexarr(400,16087) &tic & b=fft(a,dim=2) & toc % Time elapsed: 0.37771702 seconds. PR coming.
should work with #1897
The fft for an complex array of size (400,16087) takes 97 seconds while a slightly bigger array of size (400,16095) takes only 0.3 seconds
Can anybody explain or improve that?
GDL - GNU Data Language, Version v1.0.6-25-g67b3abca