Open dfarns opened 6 years ago
Some printf statements in mkapiplan, mkplan, etc. suggest that the planner considers the generated wisdom "bogus" in mkplan() [i.e., (plnr->wisdom_state == WISDOM_IS_BOGUS) is true] and therefore rejects it. This would explain both the empty wisdom string returned by fftw_export_wisdom and the planner's reversion to FFTW_ESTIMATE rigor.
Any reason why this might happen only when avx512 codelets are enabled and nthreads>1? Possible race condition / wisdom table corruption?
Probably fixed in ebde7c4e4607afb6bbba7e6609fae56ff0fda01b, can you verify?
I can verify that the simple reproducer I posted no longer exhibits the deviant behavior. Full validation will take some time.
When planning for multiple threads with AVX512 enabled (e.g., skylake-avx512 or knl), fftw_export_wisdom_to_string returns only the wisdom header and footer. This occurs when FFTW_MEASURE or FFTW_PATIENT is used; FFTW_ESTIMATE returns wisdom as expected.
This behavior is not observed for single thread planning with AVX512, nor for threaded planning on skylake-avx512 cpus when AVX512 is omitted from the build (e.g., configured using --enable-avx2 and --enable-openmp only).
The library build was configured with minimal options (e.g., --enable-avx512 --enable-openmp) and built with gcc 6.1.0 and 8.1.0. Adding the recommended --enable-avx2 does not help. This occurs with both the libfftw3_omp and the libfftw3_threads (built with --enable-threads) libs. Built and tested on CentOS 7 and SLES 12, with the same behavior on both.
** NOTE: The plans returned with FFTW_MEASURE and FFTW_PATIENT are the same as the plan returned with FFTW_ESTIMATE, suggesting that the planner is finding no applicable wisdom (or no wisdom at all) and reverting to the FFTW_ESTIMATE behavior.
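For anyone who wants to check for the "header and footer only" symptom programmatically, here is a minimal sketch in plain C. It assumes the exported wisdom string is a single top-level parenthesized list with one nested parenthesized list per entry; that layout is an assumption about the text format, not something the FFTW API guarantees. The helper name count_wisdom_entries is hypothetical and is not part of FFTW.

```c
#include <stddef.h>

/* Hypothetical helper, not part of FFTW.
 * Counts the lists nested directly inside the top-level list of a
 * wisdom string. Assumes one nested "( ... )" per wisdom entry, so a
 * string holding only the header and footer yields 0. */
size_t count_wisdom_entries(const char *wisdom)
{
    size_t entries = 0;
    int depth = 0;

    if (wisdom == NULL)
        return 0;

    for (const char *p = wisdom; *p != '\0'; ++p) {
        if (*p == '(') {
            if (depth == 1)
                ++entries;      /* a list opening directly inside the top-level list */
            ++depth;
        } else if (*p == ')') {
            if (depth > 0)
                --depth;        /* tolerate stray closing parentheses */
        }
    }
    return entries;
}
```

In a real test the argument would be the string returned by fftw_export_wisdom_to_string() after creating the threaded FFTW_MEASURE plan; a result of 0 would then correspond to the behavior described above.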
Sample output for a N=1024 C2C in-place transform (using FFTW_MEASURE):
Build script:
Reproducer:
Here is the same test on a skylake-avx512 cpu with the library configured with avx2 only and run with FFTW_MEASURE (demonstrating that it's unlikely to be a cpu issue):
Ideas?