Closed jkrimmer closed 1 week ago
I just ran your example on 5700U ryzen 2, 8-core (16 logical threads, which julia chooses), ubuntu 22.04, Julia 1.10.0, FINUFFT v3.2.0, finufft_jll v2.2.0+2, FFTW v1.8.0, FFTW_jll v3.3.10+0, and, yes, only when I uncomment the fft(x) line, it crashes. Thanks for the MWE. The implication that nthreads=1 julia nufft calls are not thread-safe.
Output:
julia> for i=1:100; issue(threaded=true); end
double free or corruption (fasttop)
[48264] signal (6.-6): Aborted
in expression starting at REPL[22]:1
pthread_kill at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
raise at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x77568f689675)
unknown function (ip: 0x77568f6a0cfb)
unknown function (ip: 0x77568f6a29c9)
free at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
forget at
/home/alex/.julia/artifacts/e95ca94c82899616429924e9fdc7eccda275aa38/lib/libfftw3.so
(unknown line)
mkplan at
/home/alex/.julia/artifacts/e95ca94c82899616429924e9fdc7eccda275aa38/lib/libfftw3.so
(unknown line)
fftw_mkapiplan at
/home/alex/.julia/artifacts/e95ca94c82899616429924e9fdc7eccda275aa38/lib/libfftw3.so
(unknown line)
fftw_plan_guru64_dft at
/home/alex/.julia/artifacts/e95ca94c82899616429924e9fdc7eccda275aa38/lib/libfftw3.so
(unknown line)
macro expansion at /home/alex/.julia/packages/FFTW/6nZei/src/fft.jl:652
[inlined]
cFFTWPlan at /home/alex/.julia/packages/FFTW/6nZei/src/FFTW.jl:49
#plan_fft#10 at /home/alex/.julia/packages/FFTW/6nZei/src/fft.jl:782
[inlined]
plan_fft at /home/alex/.julia/packages/FFTW/6nZei/src/fft.jl:772 [inlined]
fft at /home/alex/.julia/packages/AbstractFFTs/4iQz5/src/definitions.jl:67
fft at /home/alex/.julia/packages/AbstractFFTs/4iQz5/src/definitions.jl:214
[inlined]
fft at /home/alex/.julia/packages/AbstractFFTs/4iQz5/src/definitions.jl:66
[inlined]
_inner at ./REPL[20]:5
macro expansion at ./REPL[4]:4 [inlined]
#2#threadsfor_fun#2 at ./threadingconstructs.jl:214
#2#threadsfor_fun at ./threadingconstructs.jl:181 [inlined]
#1 at ./threadingconstructs.jl:153
unknown function (ip: 0x77568e51c8b2)
_jl_invoke at
/cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/gf.c:2894
[inlined]
ijl_apply_generic at
/cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at
/cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/julia.h:1982
[inlined]
start_task at
/cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/task.c:1238
Allocations: 13790939 (Pool: 12665074; Big: 1125865); GC: 273
Aborted (core dumped)
Of course, you could try the DUCC option in v2.3.0-rc1, but I don't know if Ludvig has that yet.
The PR #354 is supposed to prevent rather than cause segfaults, of course, so this is rather upsetting - I hope Ludvig has some insight.
Thanks, Alex
On Thu, Aug 22, 2024 at 7:51 AM Jonas Krimmer @.***> wrote:
Unfortunately, I stumbled another segmentation fault caused by multithreading in julia: Calling both finufft and fftw in a threaded loop yields a segmentation fault (Windows and Linux). Here, the order in which finufft and fftw are called does not matter. MWE
Make sure to initialize julia with multiple threads, e.g., run julia --threads=auto.
using FFTW using FINUFFT
function _inner(N) y = randn(ComplexF64, N) x = randn(N) nufft1d1(x, y, 1, 1e-8, N, nthreads=1)
fft(x)
end
function issue(iters=1000, N=1024; threaded=false) if threaded @.*** for k in 1:iters _inner(N) end else for k in 1:iters _inner(N) end end end
@info "single-threaded loop" issue() @info "multi-threaded loop" issue(threaded=true)
julia v1.10.4 FINUFFT v3.2.0 finufft_jll v2.2.0+2 Potential but suboptimal solution
The only mitigation approaches I have found so far is going back to libfinufft_jll v2.1.0 (where flatironinstitute/finufft#354 https://github.com/flatironinstitute/finufft/pull/354 has not been merged) or building a more recent version where fftw_make_planner_thread_safe is re-introduced which is probably not desired.
— Reply to this email directly, view it on GitHub https://github.com/ludvigak/FINUFFT.jl/issues/62, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACNZRSQEVXWR6FW2A6KL6L3ZSXGBVAVCNFSM6AAAAABM57CGWCVHI2DSMVQWIX3LMV43ASLTON2WKOZSGQ4DANJYHE3DSMY . You are receiving this because you are subscribed to this thread.Message ID: @.***>
-- *-------------------------------------------------------------------~^`^~._.~' |\ Alex Barnett Center for Computational Mathematics, Flatiron Institute | \ http://users.flatironinstitute.org/~ahb 646-876-5942
Hi Alex,
Thank you for looking into this issue. Fortunately, I have already been able to test the DUCC option which resolves this issue. Maybe there is some clash between the FFTW-planner calls originating from julia and the ones from within finufft?
Thanks Jonas
Well, I guess the good news is that thanks to @jkrimmer we now have working crash examples for Windows, Linux, and MacOS!
Not sure if I have any insight though, seems to me like the fundamental issue is the FFTW/FINUFFT interplay with threading active, but then we should be able to reproduce this directly in C++ code.
Finally, I have been able to reproduce the issue in C++. It seems like, the finufft-internal fftw-lock does not prevent other calls to fftw from building an FFTW-plan. Please find an MWE (on Ubuntu, this MWE builds with g++ test_finufft_crash.cpp -o test_finufft_crash -I[include] -L[lib] -lfinufft -lfftw3_threads -lfftw3 -fopenmp -Wl,-rpath=[lib] -std=c++11
) in the collapsed section below. Commenting out the lock around the finufft-planner-stage prevents the segfault.
Similarly, we observe the same behavior in julia. Running the following code with multiple threads yields a segmentation fault on my machine
using FFTW
using FINUFFT
tmp = randn(ComplexF64, 1024)
Threads.@threads for k = 1:1000
plan_fft(tmp)
finufft_makeplan(1, [1024], 1, 1, 1e-6)
end
However, adding an additional lock to the finufft-planning avoids the segfault:
using FFTW
using FINUFFT
tmp = randn(ComplexF64, 1024)
lk = ReentrantLock()
Threads.@threads for k = 1:1000
plan_fft(tmp)
lock(lk)
try
finufft_makeplan(1, [1024], 1, 1, 1e-6)
finally
unlock(lk)
end
end
Robert @blackwer since you are our expert on fftw global state and thread-safety, would you be able to take a look at the above C++ MWE which causes crash apparently? (I have only verified the julia-version crash) Thanks, Alex
I don't think there is a finufft
side solution to this. There's no way for the finufft
side to detect what the application side is doing. This is actually in the discussion at https://www.fftw.org/doc/Thread-safety.html, a document I have had to look at way too many times.
One proper solution is for the application to call fftw_make_planner_thread_safe()
rather than finufft
. finufft
can do this, and in practice it's usually sufficient, but there are cases where it could still crash even with the older version that did have this call in it. E.g. if you call finufft_make_plan
before you make fftw_plan_dft_1d
in your MWE, this solution works. It doesn't when you call fftw_plan_dft_1d
directly first. It might work, but it's technically a race condition.
Another solution would be to supply a mechanism to acquire a pointer to the finufft
lock on the application side, but I don't think this is adequate for the julia
case, since that'd have to end up in the fftw
code, which doesn't make much sense.
Is it possible on the julia
side to have the FINUFFT
import make the fftw_make_planner_thread_safe()
call? Otherwise I think the only solution is a custom build either with an alternative FFT implementation or with a patch to the FFTW_INIT
routine (though this does have the race condition).
Thank you very much for the detailed explanation! I assume that making the call to fftw_make_planner_thread_safe
during the FINUFFT initialization in julia should work and provide a quick fix.
Lately, I have been considering another approach: The FFTW-interface FFTW.jl
uses julia's ReentrantLock()
to protect the FFTW-planning stage from race conditions. If we could pass an application defined lock/locking function to the finufft library, e.g., based on this ReentrantLock
, the race conditions would also be avoided in my understanding. Do you think this approach could work as well?
Lately, I have been considering another approach: The FFTW-interface
FFTW.jl
uses julia'sReentrantLock()
to protect the FFTW-planning stage from race conditions. If we could pass an application defined lock/locking function to the finufft library, e.g., based on thisReentrantLock
, the race conditions would also be avoided in my understanding. Do you think this approach could work as well?
This definitely works, and can be done straightforwardly without having to extend the API (by having two void (func*)(void *)
function pointers on the plan and an optional void *
for the lock data). This was very straightforward to implement on the finufft
side, so could feasibly make it into 2.3
(maybe?). I've written an example implementation at https://github.com/blackwer/finufft/tree/user_fftw_lock with new driver code
@ahbarnett thoughts?
This is interesting, having a global lock passed in c++ is not an issue. But what about having a Julia lock passed to finufft?
Alternatively, one has to acquire the lock in Julia and then do the entire finufft_makeplan in a critical section. Which feels wasteful as the only non thread safe code is fftw_plan.
As for 2.3.0, I would not add this to it. I would do a 2.3.1 or put it in 2.4.0 if we release this soon. 2.3.0 has been on the working for too long we should release it asap :)
This is interesting, having a global lock passed in c++ is not an issue. But what about having a Julia lock passed to finufft?
You should be able to pass from any language that lets you make a standard C-style callback function, which julia definitely lets you do, even with closures. Python does too, if you could make this be a problem there with its GIL limited threading. https://julialang.org/blog/2013/05/callback/ has details on how do do this in julia.
As for 2.3.0, I would not add this to it. I would do a 2.3.1 or put it in 2.4.0 if we release this soon. 2.3.0 has been on the working for too long we should release it asap :)
I'm in no rush, and if Jonas is happy with the stop-gap fftw_make_planner_thread_safe
option, then I am. But the provided solution does already work (I think I just fixed the last issue) and is about as language agnostic as possible with a very small code footprint.
Hi Robert,
I think this is a great idea, and should go in 2.3.1 or 2.4, since it changes the opts struct (which requires changing ctypes for py, fortran module, docs, etc). Thanks for the demonstration! It will only be used by power users, but there is no harm in having it, and gives flexibility in the future. I opened it as a PR to FINUFFT master. The jl wrapper could then provide the power jl used with ability to pass in a lock from FFTW.jl
It's amazing what a mess having multiple packages call the same library with non-thread-safe functions, is.
While DUCC gets rid of this problem, there will always be users who want FFTW due to faster in certain cases.
On Wed, Sep 4, 2024 at 3:47 PM Robert Blackwell @.***> wrote:
This is interesting, having a global lock passed in c++ is not an issue. But what about having a Julia lock passed to finufft?
You should be able to pass from any language that lets you make a standard C-style callback function, which julia definitely lets you do, even with closures. Python does too, if you could make this be a problem there with its GIL limited threading. https://julialang.org/blog/2013/05/callback/ has details on how do do this in julia.
As for 2.3.0, I would not add this to it. I would do a 2.3.1 or put it in 2.4.0 if we release this soon. 2.3.0 has been on the working for too long we should release it asap :)
I'm in no rush, and if Jonas is happy with the stop-gap fftw_make_planner_thread_safe option, then I am. But the provided solution does already work (I think I just fixed the last issue) and is about as language agnostic as possible with a very small code footprint.
— Reply to this email directly, view it on GitHub https://github.com/ludvigak/FINUFFT.jl/issues/62#issuecomment-2329845704, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACNZRSQTRYS73P3KCZIQJ4DZU5PUDAVCNFSM6AAAAABM57CGWCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMRZHA2DKNZQGQ . You are receiving this because you were mentioned.Message ID: @.***>
-- *-------------------------------------------------------------------~^`^~._.~' |\ Alex Barnett Center for Computational Mathematics, Flatiron Institute | \ http://users.flatironinstitute.org/~ahb 646-876-5942
Dear all,
Thanks a lot for your work on this issue! I have successfully tested the user-locking approach by @blackwer in julia over at https://github.com/jkrimmer/FINUFFT.jl/tree/user_fftw_lock. However, until the required library adaptations are merged, we can use fftw_make_planner_thread_safe
(see #66 ) to resolve this issue.
Cheers
Unfortunately, I stumbled another segmentation fault caused by multithreading in julia: Calling both finufft and fftw in a threaded loop yields a segmentation fault (Windows and Linux). Here, the order in which finufft and fftw are called does not matter.
MWE
Make sure to initialize julia with multiple threads, e.g., run
julia --threads=auto
.Note: Commenting the
fft
-call or thenufft1d1
-call makes the issue disappear.General information: julia v1.10.4 FINUFFT v3.2.0 finufft_jll v2.2.0+2
Potential but suboptimal solution
The only mitigation approaches I have found so far is going back to libfinufft_jll v2.1.0 (where https://github.com/flatironinstitute/finufft/pull/354 has not been merged) or building a more recent version where
fftw_make_planner_thread_safe
is re-introduced which is probably not desired. Alternatively, building finufft with DUCC instead of FFTW works as well.