FFTW / fftw3

DO NOT CHECK OUT THESE FILES FROM GITHUB UNLESS YOU KNOW WHAT YOU ARE DOING. (See below.)
GNU General Public License v2.0
2.72k stars 661 forks source link

Recover lost SSE2 support on MSVC 64-bit builds; FFTW_VERSION #292

Open olafk1 opened 2 years ago

olafk1 commented 2 years ago

For MSVC, the HAVE_SSE2 preprocessor definition will only be enabled in CMakeLists.txt if the compiler successfully accepts /arch:SSE2 as a command line parameter. However, this applies to MSVC 32-bit compilers only. MSVC 64-bit compilers have the switch /arch:SSE2 removed, since this is already the default minimum architecture supported on all 64-bit CPUs (see Microsoft Docs here).

As a result, fftw DLLs lose all SSE2 support when the common configuration -DENABLE_SSE2=1 -DENABLE_AVX=1 is passed to cmake. This leads to an illegal instruction and a crash during runtime on non-AVX-CPUs.

The following patch should at least re-enable SSE2 support for MSVC 64-bit compilers, although /arch:SSE2 has been removed as the default minimum architecture.

diff --git a/CMakeLists.txt b/CMakeLists.txt
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -194,2 +194,7 @@

+# Activate SSE2 support even if the corresponding compiler flag for 64-bit MSVC has been removed
+if (CMAKE_CL_64 AND ENABLE_SSE2)
+  set (HAVE_SSE2 TRUE)
+endif ()
+
 if (HAVE_SSE2 OR HAVE_AVX)

Side note: the version number in CMakeLists.txt version 3.3.10 wrongly indicates FFTW_VERSION 3.3.9 and needs a brush up.

olafk1 commented 2 years ago

The fix above addresses a significant performance flaw in MSVC non-AVX builds for 64-bit systems. It turned out that very common SSE2/AVX builds (like that from conda-forge) still crash on some older 64-bit CPUs. This is caused by a bigger conceptual bug in CMakeList.txt, which should be discussed in another ticket, because upcoming CPU architectures are now also affected.

egpbos commented 1 year ago

This makes sense. Have the maintainers seen this? @matteo-frigo