fhchl opened this issue 4 years ago
Thanks for the request. This would be great to have; the reason it hasn't been done is that it would rely upon `scipy.linalg.cython_blas` and `scipy.linalg.cython_lapack` as a means of obtaining the addresses of BLAS and LAPACK functions so as to implement `numpy.linalg.*`. Something similar would be needed for FFT functions.

Hi @stuartarchibald, is there any progress or a workaround? I want to release the GIL while applying an STFT to lots of audio files, thanks.
@vzxxbacq There's no update I'm afraid. It may be worth checking whether the Python wrappers around the FFT libraries you want to use are holding the GIL, and raising issues with them?
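On the GIL point: `scipy.fft` (the pocketfft-based module) releases the GIL during the transform and can also spread batched transforms over threads via its `workers` argument, which may already cover the STFT use case outside of compiled code. A minimal sketch:

```python
import numpy as np
import scipy.fft

# scipy.fft releases the GIL while computing, and `workers` lets a
# batch of 1-D transforms run on multiple threads.
x = np.random.default_rng(0).standard_normal((64, 1024))
spec = scipy.fft.rfft(x, axis=1, workers=4)  # shape (64, 513)
```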
Is there a workaround or any update on this? Just want to chime in and say I'd find fft support very valuable.
I stumbled upon this thread having the same problem everyone else has: I want to use `fft` routines in my numba functions.

What I found is the following (maybe useful for others). If the `fft` is only a small part of the function, object mode can be used as a good workaround.
A sketch of the code would look like:

```python
import numpy as np
from numba import njit, objmode

@njit
def myfunc(x):
    # ... computations ...
    with objmode(out='complex128[:]'):
        out = np.fft.fft(x)
        # Maybe also: out = np.fft.fftshift(out)
    # ... heavy computations ...
    return out
```
In my specific case, this is still significantly faster than a pure implementation. Of course, it would be great to have first-class support for these functions.
Why not build an implementation of fft/ifft within numba itself, so that the issues with outside dependencies are gone?
> Why not build an implementation of fft/ifft within numba itself, so that the issues with outside dependencies are gone?
It'd be very hard to write something that's competitive with highly tuned and optimised FFT libraries, e.g. MKL FFT or FFTW.
@stuartarchibald That makes sense. But, batch fft is a common operation. What is the recommended way to use ffts with numba?
> @stuartarchibald That makes sense. But, batch fft is a common operation. What is the recommended way to use ffts with numba?
Probably like this: https://github.com/numba/numba/issues/5864#issuecomment-690838747 ? The chances are your FFT operations are expensive and run multithreaded, the cost of jumping back into the interpreter for doing this is probably small in comparison to the amount of work.
@stuartarchibald I am doing a lot of small FFTs on overlapped time windows. I have implemented the entire code in CUDA but am trying to create a "teaching" code in python. I suppose there is another way to write the code to batch this out but I'd prefer to keep the code as clean and readable as possible.
Re-posting from gitter https://gitter.im/numba/numba?at=6347db962a06f4566b341e00
It's worth looking at how JAX provides fft.
> It'd be very hard to write something that's competitive with highly tuned and optimised FFT libraries, e.g. MKL FFT or FFTW.
I don't think that is true anymore. FFTPACK, which had very poor performance, is gone. pypocketfft (written in C++) has performance that's competitive. It's used in `scipy.fft`; `numpy.fft` uses an older and slightly slower C version of the same package. Now that NumPy has C++ code in it, `numpy.fft` could be upgraded to the same version as SciPy uses.

So I guess what remains is that you need a `cython_fft`?
> > It'd be very hard to write something that's competitive with highly tuned and optimised FFT libraries, e.g. MKL FFT or FFTW.
>
> I don't think that is true anymore. FFTPACK, which had very poor performance, is gone. pypocketfft (written in C++) has performance that's competitive. It's used in `scipy.fft`; `numpy.fft` uses an older and slightly slower C version of the same package. Now that NumPy has C++ code in it, `numpy.fft` could be upgraded to the same version as SciPy uses.
Thanks for the update @rgommers, seems like parts of the original issue may not be a problem any more.
> So I guess what remains is that you need a `cython_fft`?
Yes, I think `cython` or `ctypes` bindings to the same C/C++ functions that NumPy calls would likely be most/all of what is needed to implement this.
I'd probably want to treat that the same way as `linalg`: have `scipy.fft` provide a superset of functionality compared to `numpy.fft`, using the same pypocketfft C++ code under the hood. And then have a `scipy.fft.cython_fft` rather than do that in NumPy.
> I'd probably want to treat that the same way as `linalg`: have `scipy.fft` provide a superset of functionality compared to `numpy.fft`, using the same pypocketfft C++ code under the hood. And then have a `scipy.fft.cython_fft` rather than do that in NumPy.
This seems like a good approach in terms of what would be needed with respect to the different functionality in NumPy vs. SciPy, and is consistent with the existing `cython` exports (BLAS/LAPACK and `scipy.special`). The only immediate issue I can foresee with adding this to SciPy is that it increases Numba's reliance on SciPy. Numba currently uses the SciPy BLAS/LAPACK `cython` bindings to provide the routines needed for `numpy.linalg`; however, recent experiments have indicated that this could be replaced by finding the same functions in-process from NumPy's libraries. IIRC most of the reason for exploring reducing Numba's reliance on SciPy is that there are (somewhat rare) occasions when SciPy is not available, i.e. packages not built yet, or not available for some arch/OS etc. I'll raise this at the next Numba public meeting.
I implemented `numpy.fft` and `scipy.fft` for Numba using the C++ PocketFFT library some time ago (https://github.com/styfenschaer/rocket-fft). As of version 0.2.0 it also implements lesser-known functions like `scipy.fft.fht`, and I am confident in saying that it runs stably and is well tested.
I am trying to make this thread active again.
I am currently writing an implementation for Particle Image Velocimetry that uses @styfenschaer's wrappers. They work very well and, for larger problems in my application, result in a 5x speedup.
It would make sense to integrate these solutions into the main `numba` codebase for maintenance reasons. Is this possible?
Feature request
It would be amazing if numba would support the FFT pack of numpy. Personally, I would be interested in
Obviously, there are many applications in signal processing that could benefit from this.
Are there any plans for that?