ludvigak / FINUFFT.jl

Julia interface to the nonuniform FFT library FINUFFT

FFTW crash after using FINUFFT #61

Open AaronGhost opened 1 month ago

AaronGhost commented 1 month ago

Hi, thanks for putting together the bindings for FINUFFT in Julia.

I get a crash when calling FFTW functions after some FINUFFT ones on Windows when using multi-threading. I can't trigger the same bug on Linux or in single-threaded mode, and I can't trigger it without running FINUFFT functions first.

I reduced the code to the best of my ability:

using FINUFFT
using FFTW

function nufftstep(data)
    image = FINUFFT.nufft2d1(
        rand(Float32, size(data)),
        rand(Float32, size(data)),
        data,
        1,
        1f-3,
        360,
        360;
        debug=true
    )
    return image
end

function fftstep(data)
    for i in axes(data, 1)
        ifft(data[i, :])
    end
end

nufftstep(rand(ComplexF32, (360,)))
fftstep(rand(ComplexF32, 2000, 360))

The crash report looks like this:

[finufftf_makeplan] new plan: FINUFFT version 2.2.0 .................
[finufftf_makeplan] 2d1: (ms,mt,mu)=(360,360,1) (nf1,nf2,nf3)=(720,720,1)
               ntrans=1 nthr=16 batchSize=1
[finufftf_makeplan] kernel fser (ns=4):         0.00161 s
[finufftf_makeplan] fwBatch 0.00GB alloc:       1.41e-05 s
[finufftf_makeplan] FFTW plan (mode 64, nthr=16):       0.00377 s
[finufftf_setpts] sort (didSort=1):             2.11e-05 s
[finufftf_execute] start ntrans=1 (1 batches, bsize=1)...
[finufftf_execute] done. tot spread:            0.00209 s
               tot FFT:                         0.00717 s
               tot deconvolve:                  0.000363 s

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x6397551c -- .text at C:\Users\anon\.julia\artifacts\b7dd1809d0626eac3bf6f97ba8ccfbb6cc63c509\bin\libfftw3f-3.dll (unknown line)
in expression starting at path\to\reproducer.jl:25
.text at C:\Users\anon\.julia\artifacts\b7dd1809d0626eac3bf6f97ba8ccfbb6cc63c509\bin\libfftw3f-3.dll (unknown line)
#2 at C:\Users\anon\.julia\packages\FFTW\6nZei\src\providers.jl:58
unknown function (ip: 00000171f4fc2d6c)
jl_apply at C:/workdir/src\julia.h:2156 [inlined]
start_task at C:/workdir/src\task.c:1202
Allocations: 3547075 (Pool: 3546936; Big: 139); GC: 11

I have installed both FFTW (1.8.0) and FINUFFT (3.2.0) through Pkg. Version info (I can reproduce under 1.10 and 1.11)

Commit 34c3a63147 (2024-07-29 06:24 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 16 × 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, rocketlake)
Threads: 8 default, 0 interactive, 4 GC (on 16 virtual cores)

Happy to help further if possible! Thanks

DiamonDinoia commented 1 month ago

Hi @AaronGhost,

@ahbarnett I think the issue is here: https://github.com/flatironinstitute/finufft/blob/cdcccbec2585ae2ab5bf2e0be27bac1c526fa6c3/src/finufft.cpp#L764

Can we require FFTW 3.3.9 instead of 3.3.6? 3.3.9 is quite old now.

Thanks, Marco

AaronGhost commented 1 month ago

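I did some more tests setting both nthreads and FFTW.set_num_threads():

  • FINUFFT.jl uses the maximum number of threads on the machine even when Julia is started with a single thread
  • Setting the number of threads to 1 on both libraries avoids the crash
  • Matching the number of threads on both libraries still triggers the crash

Also unrelated, but setting the fftw option to something >= 3 makes the code hang forever.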
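
For readers who want to try the single-threaded configuration mentioned in this comment, here is a minimal sketch. It assumes that `nthreads` can be passed to `FINUFFT.nufft2d1` as a keyword option in the same way `debug` is passed in the reproducer; the variable names `x`, `y`, `c` and `data` are illustrative only.

```julia
using FINUFFT
using FFTW

# Restrict FFTW.jl (the transforms called directly from Julia) to one thread.
FFTW.set_num_threads(1)

n = 360
x = rand(Float32, n)
y = rand(Float32, n)
c = rand(ComplexF32, n)

# Assumption: `nthreads` is forwarded to FINUFFT's options the same way
# `debug` is in the reproducer, limiting FINUFFT to a single thread.
image = FINUFFT.nufft2d1(x, y, c, 1, 1f-3, 360, 360; nthreads=1)

# The FFTW loop from the reproducer, run after the FINUFFT call.
data = rand(ComplexF32, 2000, 360)
for i in axes(data, 1)
    ifft(data[i, :])
end
```

This only sidesteps the crash by giving up threading in both libraries; it does not address the underlying cause.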

ahbarnett commented 1 month ago

Interesting. Have you tried doing a "dummy" FFTW call before the first call to FINUFFT? (I know this was needed in MATLAB.) We are releasing v2.3, which allows avoiding FFTW altogether, but the Julia wrapper may take a little while to catch up.
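
A rough sketch of what such a "dummy" call could look like in Julia is given here. The transform size of 8 is arbitrary, and the FINUFFT arguments are copied from the reproducer. Results reported later in the thread show crashes still occurring in some configurations that include a dummy call, so treat this as an experiment rather than a fix.

```julia
using FINUFFT
using FFTW

# Dummy transform: makes FFTW.jl load and use the single-precision FFTW
# library once before FINUFFT touches it. The length 8 is arbitrary.
ifft(rand(ComplexF32, 8))

n = 360
image = FINUFFT.nufft2d1(
    rand(Float32, n),
    rand(Float32, n),
    rand(ComplexF32, n),
    1,
    1f-3,
    360,
    360,
)
```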

Re fftw option, you know it only has certain valid values, viz docs:

int fftw; // plan flags to FFTW (FFTW_ESTIMATE=64, FFTW_MEASURE=0,...)
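
To make the meaning of the `fftw` integer concrete, the sketch below decodes a value into FFTW planner flag names. Only the values 64 and 0 come from the docs line quoted here; the remaining bit values are taken from FFTW's `fftw3.h` header, and the helper name `describe_fftw_flags` is hypothetical. Reporting `FFTW_MEASURE` whenever no other planning-rigor bit is set is a convention of this sketch, reflecting that `FFTW_MEASURE` is the zero (default) value.

```julia
# Flag bits as defined in fftw3.h (FFTW 3.3.x); FFTW_MEASURE is 0.
const FFTW_FLAG_BITS = [
    "FFTW_DESTROY_INPUT"   => 1 << 0,
    "FFTW_UNALIGNED"       => 1 << 1,
    "FFTW_CONSERVE_MEMORY" => 1 << 2,
    "FFTW_EXHAUSTIVE"      => 1 << 3,
    "FFTW_PRESERVE_INPUT"  => 1 << 4,
    "FFTW_PATIENT"         => 1 << 5,
    "FFTW_ESTIMATE"        => 1 << 6,
]

# Bits that select a planning rigor other than the default FFTW_MEASURE.
const RIGOR_MASK = (1 << 3) | (1 << 5) | (1 << 6)

# Hypothetical helper: list the flag names encoded in an integer.
function describe_fftw_flags(flags::Integer)
    set = String[name for (name, bit) in FFTW_FLAG_BITS if flags & bit != 0]
    if flags & RIGOR_MASK == 0
        pushfirst!(set, "FFTW_MEASURE")
    end
    return set
end

println(describe_fftw_flags(64))
println(describe_fftw_flags(3))
```

Under this decoding, a value such as 3 carries no `FFTW_ESTIMATE` bit, so FFTW would plan in its measuring mode with extra flags set; whether that accounts for the hang reported above is not established in this thread.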

On Tue, Aug 13, 2024 at 6:36 PM AaronGhost @.***> wrote:

I did some more tests setting both nthreads and FFTW.set_num_threads():

  • FINUFFT.jl uses the maximum number of threads on the machine even when Julia is started with a single thread
  • Setting the number of threads to 1 on both library avoid the crash
  • Matching the number of threads on both libraries still trigger the crash

Also unrelated but setting the fftw option to something >= 3 makes the code hangs forever.


ludvigak commented 1 month ago

Can we require FFTW 3.3.9 instead of 3.3.6? 3.3.9 is quite old now.

This should be fine from the Julia standpoint since the latest FFTW.jl (v1.8.0) uses FFTW v3.3.9.

Can this be reproduced in C++ code or is it somehow linked to the Julia interface?

AaronGhost commented 1 month ago

I did some additional tests:

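Julia threads | NUFFT threads | FFTW threads set explicitly | Dummy FFTW | FFTW threads | Crash observed
1             | 8             | -                           | -          | 1            | No
1             | 8             | Yes                         | -          | 1            | No
8             | 8             | -                           | -          | 1            | Yes
8             | 8             | Yes                         | -          | 1            | No
8             | 8             | Yes                         | -          | 8            | Yes
8             | 8             | Yes                         | -          | 4            | No
8             | 8             | -                           | 1          | 1            | Yes
8             | 8             | Yes                         | 1          | 8            | Yes*
8             | 8             | Yes                         | 8          | 8            | Yes*
8             | 8             | Yes                         | 4          | 4            | No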

I can trigger the last two crashes (marked with *) without calling FINUFFT at all, so there may be something going on with the FFTW library in Julia. The crash exception happens at a different place:

Exception: EXCEPTION_ACCESS_VIOLATION at 0x639126e4 -- .text at C:\Users\anon\.julia\artifacts\b7dd1809d0626eac3bf6f97ba8ccfbb6cc63c509\bin\libfftw3f-3.dll (unknown line)
in expression starting at path\to\reproducer.jl:33
.text at C:\Users\anon\.julia\artifacts\b7dd1809d0626eac3bf6f97ba8ccfbb6cc63c509\bin\libfftw3f-3.dll (unknown line)
apply_extra_iter at C:\Users\anon\.julia\artifacts\b7dd1809d0626eac3bf6f97ba8ccfbb6cc63c509\bin\libfftw3f-3.dll (unknown line)
.text at C:\Users\anon\.julia\artifacts\b7dd1809d0626eac3bf6f97ba8ccfbb6cc63c509\bin\libfftw3f-3.dll (unknown line)
.text at C:\Users\anon\.julia\artifacts\b7dd1809d0626eac3bf6f97ba8ccfbb6cc63c509\bin\libfftw3f-3.dll (unknown line)
#2 at C:\Users\anon\.julia\packages\FFTW\6nZei\src\providers.jl:58
unknown function (ip: 0000029b46301abc)
jl_apply at C:/workdir/src\julia.h:2156 [inlined]
start_task at C:/workdir/src\task.c:1202
Allocations: 1397043 (Pool: 1396955; Big: 88); GC: 4

I will open an issue with FFTW.jl too, I guess.

I don't know if some of the issues would disappear if FINUFFT_jll built FFTW independently from FFTW.jl (#55)?

DiamonDinoia commented 1 month ago

The number of threads used in FFTW by FINUFFT is not exactly the same as the number of threads FINUFFT uses, but it is less than or equal to it. Hence assigning more threads to FINUFFT and fewer to FFTW cannot trigger the crash.

DiamonDinoia commented 1 month ago

The problem we see here is (yet another) issue with FFTW's use of global state. Bundling FFTW may fix the issue, but I think it is better to just require FFTW 3.3.9, which is shipped by FFTW.jl anyway. Using FFTW.set_provider!("mkl") might possibly fix the issue for now.

ahbarnett commented 1 month ago

PS Steven G Johnson is intimately involved with FFTW.jl so might be the right person to ask (on Julia Discourse) about the FFTW-only crash...


ludvigak commented 3 weeks ago

I submitted an issue on FFTW.jl with the smallest FINUFFT-free example that I could get to crash: https://github.com/JuliaMath/FFTW.jl/issues/306