FFTW / fftw3

DO NOT CHECK OUT THESE FILES FROM GITHUB UNLESS YOU KNOW WHAT YOU ARE DOING. (See below.)
GNU General Public License v2.0

Disable SIMD at runtime? #31

Open dfarns opened 9 years ago

dfarns commented 9 years ago

Is there a way to disable, e.g., AVX, during plan creation, i.e., have the planner ignore specific SIMD codelets during plan generation?

matteo-frigo commented 9 years ago

You can pass the FFTW_UNALIGNED flag to the planner. This will disable all SIMD extensions, however. There is no way to disable, e.g., AVX while keeping SSE2.

dfarns commented 9 years ago

Please consider this an RFE :) It seems that the planner's micro-benchmarks may not translate to optimal performance during full problem execution, and performance with certain SIMD codelets may be worse (perhaps due to CPU throttling, turbo, etc.?). This is important for recent and future Intel CPUs.

stevengj commented 9 years ago

Actually, FFTW_UNALIGNED plans could potentially still use SIMD for operations on buffers, although I don't recall if this actually happens. Instead, you can pass the undocumented FFTW_NO_SIMD flag to disable SIMD reliably. (We use this for benchmarking.)

dfarns commented 9 years ago

What is desired is the ability to disable (or, conversely, limit to) specific SIMD codelets during the planning stage, e.g., FFTW_SIMD_DISABLE_AVX2 or something to that effect, leaving other SIMD codelets like SSE2 or AVX in the mix. One might chain them together as with other planner flags, e.g., FFTW_ESTIMATE | FFTW_SIMD_DISABLE_AVX2. Alternatively, one might target specific SIMD with something like FFTW_ESTIMATE | FFTW_SIMD_ONLY_SSE2.
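
To make the request concrete, here is a purely hypothetical sketch of how such per-extension flags could compose as a bitmask. None of the `DEMO_*` names or the `codelet_allowed` function exist in FFTW; they only model the proposal above.

```c
/* Hypothetical: none of these names exist in fftw3.h. */
enum simd_kind { SIMD_NONE, SIMD_SSE2, SIMD_AVX, SIMD_AVX2 };

#define DEMO_ESTIMATE          (1u << 0) /* stand-in for an ordinary planner flag */
#define DEMO_SIMD_DISABLE_SSE2 (1u << 1)
#define DEMO_SIMD_DISABLE_AVX  (1u << 2)
#define DEMO_SIMD_DISABLE_AVX2 (1u << 3)

/* "Only SSE2" is expressed as "disable every other SIMD kind". */
#define DEMO_SIMD_ONLY_SSE2 (DEMO_SIMD_DISABLE_AVX | DEMO_SIMD_DISABLE_AVX2)

/* Returns 1 if a codelet of the given kind may be considered by the
   planner under `flags`, 0 if the flags exclude it. */
static int codelet_allowed(enum simd_kind kind, unsigned flags)
{
    switch (kind) {
    case SIMD_SSE2: return !(flags & DEMO_SIMD_DISABLE_SSE2);
    case SIMD_AVX:  return !(flags & DEMO_SIMD_DISABLE_AVX);
    case SIMD_AVX2: return !(flags & DEMO_SIMD_DISABLE_AVX2);
    default:        return 1; /* scalar codelets are always eligible */
    }
}
```

With this layout, `DEMO_ESTIMATE | DEMO_SIMD_DISABLE_AVX2` would leave SSE2 and AVX codelets in the mix, while `DEMO_ESTIMATE | DEMO_SIMD_ONLY_SSE2` would exclude both AVX variants.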

stevengj commented 9 years ago

You could create a custom conf.c file (much like the output of fftw-wisdom-to-conf) that only includes the codelets you want.