FFTW / fftw3


`--enable-generic-simd256` causes memory error on `fftw_plan_many_dft_r2c` and `fftw_plan_many_dft_c2` #328

Open maxmarsc opened 1 year ago

maxmarsc commented 1 year ago

First of all I think this issue might be related to :

I compiled both fftw (double precision) and fftwf (single precision) 3.3.10 for x86_64 with GCC 9, using the following configure flags:

    --enable-avx
    --enable-avx2
    --enable-avx512
    --enable-avx-128-fma
    --enable-generic-simd128
    --enable-generic-simd256

The issue only occurs with fftw, not with fftwf.

The following code reproduces the bug:

    #include "fftw3.h"

    int main() {
      const int fft_size = 256;
      const int channels = 1;
      /* an r2c transform of length N yields N/2 + 1 complex bins */
      const int transform_size = fft_size / 2 + 1;

      /* in-place: the real input is padded to 2 * (N/2 + 1) doubles per channel */
      double* inplace_work_buffer = fftw_alloc_real(channels * transform_size * 2);

      int rank     = 1;          /* we are computing 1d transforms */
      int n[]      = {fft_size}; /* 1d transforms of length fft_size */
      int howmany  = channels;   /* how many transforms to compute */
      int idist    = transform_size * 2; /* distance between inputs, in doubles */
      int odist    = transform_size;     /* distance between outputs, in fftw_complex */
      int istride  = 1;
      int ostride  = 1;
      int* inembed = nullptr;
      int* onembed = nullptr;

      /* the memory error is reported during planning, inside this call */
      auto* plan = fftw_plan_many_dft_r2c(rank, n, howmany, inplace_work_buffer, inembed,
                             istride, idist,
                             reinterpret_cast<fftw_complex*>(inplace_work_buffer),
                             onembed, ostride, odist, FFTW_MEASURE);
      if (plan != nullptr) {
        fftw_destroy_plan(plan);
      }
      fftw_free(inplace_work_buffer);
    }

Running it under ASan gives the following output:

=================================================================
==1185224==ERROR: AddressSanitizer: unknown-crash on address 0x612000000430 at pc 0x5629290259c4 bp 0x7ffdc0967a50 sp 0x7ffdc0967a40
READ of size 32 at 0x612000000430 thread T0
    #0 0x5629290259c3 in LDA /foo/bar/build/source/stft/fftwf/src/fftwf/simd-support/simd-generic256.h:60
    #1 0x562929026df1 in n2fv_16 /foo/bar/build/source/stft/fftwf/src/fftwf/dft/simd/generic-simd256/../common/n2fv_16.c:284
    #2 0x56292936dcd6 in apply_extra_iter /foo/bar/build/source/stft/fftwf/src/fftwf/dft/direct.c:111
    #3 0x562927eef746 in fftw_dft_solve /foo/bar/build/source/stft/fftwf/src/fftwf/dft/solve.c:29
    #4 0x562927edeb8c in measure /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/timer.c:136
    #5 0x562927eded07 in fftw_measure_execution_time /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/timer.c:159
    #6 0x562927ed9376 in evaluate_plan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:460
    #7 0x562927ed9cd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
    #8 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
    #9 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
    #10 0x562927edd50c in fftw_mkplan_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:970
    #11 0x562927edd7bf in fftw_mkplan_f_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:986
    #12 0x562927eeb443 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/dft/indirect.c:206
    #13 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
    #14 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
    #15 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
    #16 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
    #17 0x562927edd50c in fftw_mkplan_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:970
    #18 0x562927edd7bf in fftw_mkplan_f_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:986
    #19 0x562929355373 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/dft/buffered.c:199
    #20 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
    #21 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
    #22 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
    #23 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
    #24 0x562927edd50c in fftw_mkplan_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:970
    #25 0x5629293746c1 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/rdft/ct-hc2c.c:198
    #26 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
    #27 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
    #28 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
    #29 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
    #30 0x562927edd50c in fftw_mkplan_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:970
    #31 0x5629293727b5 in mkcldw /foo/bar/build/source/stft/fftwf/src/fftwf/rdft/ct-hc2c-direct.c:334
    #32 0x56292937409c in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/rdft/ct-hc2c.c:173
    #33 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
    #34 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
    #35 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
    #36 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
    #37 0x562927ed358c in mkplan0 /foo/bar/build/source/stft/fftwf/src/fftwf/api/apiplan.c:42
    #38 0x562927ed35db in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/api/apiplan.c:56
    #39 0x562927ed39ca in fftw_mkapiplan /foo/bar/build/source/stft/fftwf/src/fftwf/api/apiplan.c:124
    #40 0x562927ed60a9 in fftw_plan_many_dft_r2c /foo/bar/build/source/stft/fftwf/src/fftwf/api/plan-many-dft-r2c.c:41
    #41 0x5629267f1666 in CATCH2_INTERNAL_TEST_4 /foo/bar/tests/fft_tests.cc:55
    #42 0x56292688a6bd in Catch::TestInvokerAsFunction::invoke() const src/catch2/internal/catch_test_case_registry_impl.cpp:149
    #43 0x56292687e866 in Catch::TestCaseHandle::invoke() const (/foo/bar/build/tests/libstft_tests+0x269866)
    #44 0x56292687d9bb in Catch::RunContext::invokeActiveTestCase() src/catch2/internal/catch_run_context.cpp:508
    #45 0x56292687d6f5 in Catch::RunContext::runCurrentTest(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) src/catch2/internal/catch_run_context.cpp:473
    #46 0x56292687bfde in Catch::RunContext::runTest(Catch::TestCaseHandle const&) src/catch2/internal/catch_run_context.cpp:238
    #47 0x562926828373 in execute src/catch2/catch_session.cpp:110
    #48 0x5629268297b3 in Catch::Session::runInternal() src/catch2/catch_session.cpp:332
    #49 0x5629268292cc in Catch::Session::run() src/catch2/catch_session.cpp:263
    #50 0x5629268211e6 in int Catch::Session::run<char>(int, char const* const*) src/catch2/../catch2/catch_session.hpp:41
    #51 0x5629268210d4 in main src/catch2/internal/catch_main.cpp:36
    #52 0x7fe9cf443082 in __libc_start_main ../csu/libc-start.c:308
    #53 0x5629267f02bd in _start (/foo/bar/build/tests/libstft_tests+0x1db2bd)

0x612000000440 is located 0 bytes to the right of 256-byte region [0x612000000340,0x612000000440)
allocated by thread T0 here:
    #0 0x7fe9cfa6b005 in __interceptor_memalign ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:169
    #1 0x562927ed67ea in fftw_kernel_malloc /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/kalloc.c:91
    #2 0x562927ed6548 in fftw_malloc_plain /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/alloc.c:28
    #3 0x5629293550b9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/dft/buffered.c:196
    #4 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
    #5 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
    #6 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
    #7 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
    #8 0x562927edd50c in fftw_mkplan_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:970
    #9 0x5629293746c1 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/rdft/ct-hc2c.c:198
    #10 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
    #11 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
    #12 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
    #13 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
    #14 0x562927edd50c in fftw_mkplan_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:970
    #15 0x5629293727b5 in mkcldw /foo/bar/build/source/stft/fftwf/src/fftwf/rdft/ct-hc2c-direct.c:334
    #16 0x56292937409c in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/rdft/ct-hc2c.c:173
    #17 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
    #18 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
    #19 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
    #20 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
    #21 0x562927ed358c in mkplan0 /foo/bar/build/source/stft/fftwf/src/fftwf/api/apiplan.c:42
    #22 0x562927ed35db in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/api/apiplan.c:56
    #23 0x562927ed39ca in fftw_mkapiplan /foo/bar/build/source/stft/fftwf/src/fftwf/api/apiplan.c:124
    #24 0x562927ed60a9 in fftw_plan_many_dft_r2c /foo/bar/build/source/stft/fftwf/src/fftwf/api/plan-many-dft-r2c.c:41
    #25 0x5629267f1666 in CATCH2_INTERNAL_TEST_4 /foo/bar/tests/fft_tests.cc:55
    #26 0x56292688a6bd in Catch::TestInvokerAsFunction::invoke() const src/catch2/internal/catch_test_case_registry_impl.cpp:149
    #27 0x56292687e866 in Catch::TestCaseHandle::invoke() const (/foo/bar/build/tests/libstft_tests+0x269866)
    #28 0x56292687d9bb in Catch::RunContext::invokeActiveTestCase() src/catch2/internal/catch_run_context.cpp:508
    #29 0x56292687d6f5 in Catch::RunContext::runCurrentTest(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) src/catch2/internal/catch_run_context.cpp:473

SUMMARY: AddressSanitizer: unknown-crash /foo/bar/build/source/stft/fftwf/src/fftwf/simd-support/simd-generic256.h:60 in LDA
Shadow bytes around the buggy address:
  0x0c247fff8030: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c247fff8040: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c247fff8050: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
  0x0c247fff8060: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x0c247fff8070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c247fff8080: 00 00 00 00 00 00[00]00 fa fa fa fa fa fa fa fa
  0x0c247fff8090: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c247fff80a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c247fff80b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c247fff80c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c247fff80d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==1185224==ABORTING

And `valgrind --leak-check=full` reports:

==1280516== Memcheck, a memory error detector
==1280516== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1280516== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==1280516== Command: ./build/tests/libstft_tests bug\ report
==1280516== 
==1280516== Invalid read of size 8
==1280516==    at 0x21B279A: LDA (simd-generic256.h:60)
==1280516==    by 0x21B36C4: n2fv_16 (n2fv_16.c:284)
==1280516==    by 0x24920C3: apply_extra_iter (direct.c:111)
==1280516==    by 0x13B8A3E: fftw_dft_solve (solve.c:29)
==1280516==    by 0x13B13B6: measure (timer.c:136)
==1280516==    by 0x13B1468: fftw_measure_execution_time (timer.c:159)
==1280516==    by 0x13AF1DA: evaluate_plan (planner.c:460)
==1280516==    by 0x13AF4E3: search0 (planner.c:529)
==1280516==    by 0x13AF695: search (planner.c:600)
==1280516==    by 0x13AFAB3: mkplan (planner.c:711)
==1280516==    by 0x13B073E: fftw_mkplan_d (planner.c:970)
==1280516==    by 0x13B088B: fftw_mkplan_f_d (planner.c:986)
==1280516==  Address 0x4fcc900 is 0 bytes after a block of size 256 alloc'd
==1280516==    at 0x483E340: memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1280516==    by 0x13AE134: fftw_kernel_malloc (kalloc.c:91)
==1280516==    by 0x13ADFFB: fftw_malloc_plain (alloc.c:28)
==1280516==    by 0x24858DC: mkplan (buffered.c:196)
==1280516==    by 0x13AF2C2: invoke_solver (planner.c:486)
==1280516==    by 0x13AF45B: search0 (planner.c:529)
==1280516==    by 0x13AF695: search (planner.c:600)
==1280516==    by 0x13AFAB3: mkplan (planner.c:711)
==1280516==    by 0x13B073E: fftw_mkplan_d (planner.c:970)
==1280516==    by 0x2494DE4: mkplan (ct-hc2c.c:198)
==1280516==    by 0x13AF2C2: invoke_solver (planner.c:486)
==1280516==    by 0x13AF45B: search0 (planner.c:529)
==1280516== 
==1280516== Invalid read of size 8
==1280516==    at 0x21B279E: LDA (simd-generic256.h:60)
==1280516==    by 0x21B36C4: n2fv_16 (n2fv_16.c:284)
==1280516==    by 0x24920C3: apply_extra_iter (direct.c:111)
==1280516==    by 0x13B8A3E: fftw_dft_solve (solve.c:29)
==1280516==    by 0x13B13B6: measure (timer.c:136)
==1280516==    by 0x13B1468: fftw_measure_execution_time (timer.c:159)
==1280516==    by 0x13AF1DA: evaluate_plan (planner.c:460)
==1280516==    by 0x13AF4E3: search0 (planner.c:529)
==1280516==    by 0x13AF695: search (planner.c:600)
==1280516==    by 0x13AFAB3: mkplan (planner.c:711)
==1280516==    by 0x13B073E: fftw_mkplan_d (planner.c:970)
==1280516==    by 0x13B088B: fftw_mkplan_f_d (planner.c:986)
==1280516==  Address 0x4fcc908 is 8 bytes after a block of size 256 alloc'd
==1280516==    at 0x483E340: memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1280516==    by 0x13AE134: fftw_kernel_malloc (kalloc.c:91)
==1280516==    by 0x13ADFFB: fftw_malloc_plain (alloc.c:28)
==1280516==    by 0x24858DC: mkplan (buffered.c:196)
==1280516==    by 0x13AF2C2: invoke_solver (planner.c:486)
==1280516==    by 0x13AF45B: search0 (planner.c:529)
==1280516==    by 0x13AF695: search (planner.c:600)
==1280516==    by 0x13AFAB3: mkplan (planner.c:711)
==1280516==    by 0x13B073E: fftw_mkplan_d (planner.c:970)
==1280516==    by 0x2494DE4: mkplan (ct-hc2c.c:198)
==1280516==    by 0x13AF2C2: invoke_solver (planner.c:486)
==1280516==    by 0x13AF45B: search0 (planner.c:529)
==1280516== 
==1280516== 
==1280516== HEAP SUMMARY:
==1280516==     in use at exit: 226,376 bytes in 2,457 blocks
==1280516==   total heap usage: 58,871 allocs, 56,414 frees, 34,196,978 bytes allocated
==1280516== 
==1280516== LEAK SUMMARY:
==1280516==    definitely lost: 0 bytes in 0 blocks
==1280516==    indirectly lost: 0 bytes in 0 blocks
==1280516==      possibly lost: 0 bytes in 0 blocks
==1280516==    still reachable: 226,376 bytes in 2,457 blocks
==1280516==         suppressed: 0 bytes in 0 blocks
==1280516== Reachable blocks (those to which a pointer was found) are not shown.
==1280516== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==1280516== 
==1280516== For lists of detected and suppressed errors, rerun with: -s
==1280516== ERROR SUMMARY: 16 errors from 2 contexts (suppressed: 0 from 0)

Note: the stack traces show Catch2 frames because I ran the snippet from a Catch2 test case rather than from a main function, but running it from a main function would reproduce the issue as well.

Some more details I gathered