Closed IeiuniumLux closed 4 years ago
If I run the attached test.ipynb notebook file (remove .txt) on a Google Colab system, or the counterpart test.py on a local AMD/Intel system with a GeForce GTX 1660 installed, then all calls to the tikhonov_filter function complete in less than a second. However, if I run the same test.py on a Jetson TX2, then the first time tikhonov_filter is called, it takes more than 90 seconds for the _Xfftn function to return. Surprisingly, subsequent calls complete in under a second. Does anyone know why this only happens on an arm64 architecture? I have built pyFFTW 0.12.0 from source as well as installed it via pip3, with the same result.
Presumably the long delay is due to the FFTW wisdom being computed on the first call and then the cached wisdom being used on subsequent calls. It's strange, though, that this is so much slower on the TX2. I would suggest submitting an issue with PyFFTW.
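One way to check this hypothesis is to time repeated calls of the same transform within a single process. The following is only a minimal sketch: time_calls is a hypothetical helper, the array shape is arbitrary, and it calls pyfftw.interfaces.numpy_fft.rfftn directly rather than going through sporco. The imports are guarded so the helper still loads where pyFFTW is not installed.

```python
import time

try:
    import numpy as np
    import pyfftw.interfaces.numpy_fft as fftw_fft
except ImportError:  # numpy/pyFFTW are optional for this sketch
    fftw_fft = None


def time_calls(fn, n_calls=3):
    """Call fn() n_calls times and return the wall-clock duration of each call."""
    durations = []
    for _ in range(n_calls):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    if fftw_fft is None:
        print("pyFFTW not available; nothing to time")
    else:
        x = np.random.randn(512, 512)  # arbitrary test shape (assumption)
        # FFTW_MEASURE asks the planner to benchmark candidate plans,
        # which is where a slow first call would show up.
        timings = time_calls(
            lambda: fftw_fft.rfftn(x, planner_effort="FFTW_MEASURE")
        )
        for i, t in enumerate(timings):
            print(f"call {i}: {t:.3f} s")
```

If the first duration is much larger than the rest, that is consistent with the planning cost being paid once per process.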
As a workaround, can tikhonov_filter use the wisdom feature instead of the _Xfftn function? It allows optimized transforms to be stored and recalled.
tikhonov_filter uses the FFT functions in sporco.linalg, which currently use the pyfftw numpy interface. I've considered replacing this interface with the more general pyfftw interface that allows access to the wisdom, but I'm afraid it's pretty far down the ToDo list at this point.
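For reference, pyFFTW itself provides pyfftw.export_wisdom() and pyfftw.import_wisdom() for storing and recalling wisdom. The sketch below shows one way they might be wrapped from user code. save_wisdom, load_wisdom, and the file name are hypothetical, and the thread does not confirm whether this removes the first-call delay on the TX2.

```python
import pickle
from pathlib import Path

try:
    import pyfftw
except ImportError:  # keep the sketch importable without pyFFTW
    pyfftw = None


def save_wisdom(path, wisdom):
    """Pickle a wisdom object (pyfftw.export_wisdom() returns a tuple of bytes)."""
    Path(path).write_bytes(pickle.dumps(wisdom))


def load_wisdom(path):
    """Return the pickled wisdom, or None if the file does not exist yet."""
    p = Path(path)
    if not p.exists():
        return None
    return pickle.loads(p.read_bytes())


if __name__ == "__main__" and pyfftw is not None:
    wisdom_file = "fftw_wisdom.pkl"  # hypothetical location

    # Before any transforms are planned: restore wisdom from an earlier run.
    saved = load_wisdom(wisdom_file)
    if saved is not None:
        pyfftw.import_wisdom(saved)

    # ... run the code that triggers FFT planning here ...

    # After planning: store whatever wisdom FFTW has accumulated.
    save_wisdom(wisdom_file, pyfftw.export_wisdom())
```

Wisdom is tied to the machine and FFTW build that produced it, so a file written on one system should not be assumed to help on another.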
Then, I'll close this here and open an issue with PyFFTW as suggested. Thanks @bwohlberg.
profile_trace.txt test.ipynb.txt test.py.txt