ludvigak / FINUFFT.jl

Julia interface to the nonuniform FFT library FINUFFT
Other
33 stars 9 forks source link

Multiple plans simultaneously #39

Closed tknopp closed 2 years ago

tknopp commented 2 years ago

Hi,

in the FINUFFT wrapper, I tried to create two plans, one for type 1 and one for type 2:

https://github.com/JuliaMath/NFFT.jl/blob/master/Wrappers/FINUFFT.jl

However, it seems the plans are not independent. Can it be that they access some global state? For now I moved the code into the nfft! and nfft_adjoint! functions and with that it works without problems. When I move both plans into the constructor of the type, I get either a wrong result, or a segfault.

Cheers,

Tobias

ahbarnett commented 2 years ago

Dear Tobias, Oh dear, that's a test we haven't done from the Julia side. This is a complex issue. I believe any global state comes from that fact that FFTW does have internal global state, which has been annoying for us (but it allows them to reuse FFTW plans). Other than that, finufft has one global state var to track whether FFTW has ben initialized, which should be changed in a thread-safe manner since v2.0.3, see: https://github.com/flatironinstitute/finufft/blob/3f0a12b06807508e571b25a10ca2e9332fb9a7d3/src/finufft.cpp#L638

But we did not test, even on the C++ side, creation of simultaneous indep plans, for nthreads=1 or more. I don't think anyone has needed that yet :)

If you look in finufft CHANGELOG:

V 2.0.3 (4/22/21)

* finufft (plan) now thread-safe via OMP lock (if nthr=1 and
-DFFTW_PLAN_SAFE)
  + new example/threadsafe*.cpp demos. Needs FFTW>=3.3.6 (Issues #72 #180
#183)

This shows if you compile with that -D flag then the .so lib is thread-safe when using nthreads=1. The C++ tests examples/threadsafe1d1.cpp etc. test separate threads each calling finufft with nthr=1.

But we did not build that compile flag into the julia build, I believe, but don't know how to check this ... @ludvigak ?

FFTW_jll is at 3.3.10, so we could always use this flag? But I remember there were CI issues needing older FFTW, hence it was not baked into our FINUFFT makefile.

This is a tricky issue and we will get back to you ... Libin @lu1and10 maybe you have thoughts on this? For now, can you avoid creating multiple simultaneous plans?

Meanwhile if you have a minimally complete code that breaks, please post it. Thanks so much ! Alex

On Wed, Jan 26, 2022 at 10:40 AM Tobias Knopp @.***> wrote:

Hi,

in the FINUFFT wrapper, I tried to create two plans, one for type 1 and one for type 2:

https://github.com/JuliaMath/NFFT.jl/blob/master/Wrappers/FINUFFT.jl

However, it seems the plans are not independent. Can it be that they access some global state? For now I moved the code into the nfft! and nfft_adjoint! functions and with that it works without problems. When I move both plans into the constructor of the type, I get either a wrong result, or a segfault.

Cheers,

Tobias

— Reply to this email directly, view it on GitHub https://github.com/ludvigak/FINUFFT.jl/issues/39, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACNZRSX5ZQJD5ONSIUIUVKLUYAIVZANCNFSM5M3LZQBA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you are subscribed to this thread.Message ID: @.***>

-- *---------------------------------------------------------------------~^`^~._.~' |\ Alex H. Barnett Center for Computational Mathematics, Flatiron Institute | \ http://users.flatironinstitute.org/~ahb 646-876-5942

tknopp commented 2 years ago

Ok thanks for the feedback. Please don't worry about issues that I find. I just want to document them directly, so that I don't forget.

For now, can you avoid creating multiple simultaneous plans?

Sure, I moved plan creation into the plan execution phase for now. My only concern is that this makes benchmarking a little bit unfair for FINUFFT, since it does more work. But we can fix that at a later time point.

ludvigak commented 2 years ago

@ahbarnett it looks like we are using -DFFTW_PLAN_SAFE, you can see the build flags here: https://github.com/JuliaPackaging/Yggdrasil/blob/6d296a128a849e1e2202d2a4ee33ded6147be33f/F/finufft/build_tarballs.jl

tknopp commented 2 years ago

What would be interesting for me if that is a limitation of the Julia wrapper or of FINUFFT in general. So is it possible to create two plans in C and execute them one after each other? So it is all about the ability cache two plans for a series of type 1 and 2 NFFTs.

ahbarnett commented 2 years ago

Dear Tobias, I just created examples/simulplans1d1.cpp for you, in master. Please try it out. It works fine for me. Please expand on this demo to try and figure out what's crashing for you...

Best, Alex

On Tue, Feb 8, 2022 at 7:09 AM Tobias Knopp @.***> wrote:

What would be interesting for me if that is a limitation of the Julia wrapper or of FINUFFT in general. So is it possible to create two plans in C and execute them one after each other? So it is all about the ability cache two plans for a series of type 1 and 2 NFFTs.

— Reply to this email directly, view it on GitHub https://github.com/ludvigak/FINUFFT.jl/issues/39#issuecomment-1032538660, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACNZRSXAAMESE2HO5EK327LU2EBV3ANCNFSM5M3LZQBA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were mentioned.Message ID: @.***>

-- *---------------------------------------------------------------------~^`^~._.~' |\ Alex H. Barnett Center for Computational Mathematics, Flatiron Institute | \ http://users.flatironinstitute.org/~ahb 646-876-5942

tknopp commented 2 years ago

Thanks for verifying that the C side looks good. I have been playing around with this as well and the issue just arises if I put a finufft_plan into a struct. Otherwise FINUFFT.jl seems to work fine. This indicates that there is some issue with garbage collection within FINUFFT.jl.

tknopp commented 2 years ago

Here is an example that just uses FINUFFT.jl and segfaults:

using FINUFFT, Statistics, BenchmarkTools

function test()
  N = (1024,1024) 
  M = prod(N)

  x = 2π .* (rand(length(N),M) .- 0.5)

  fHat = randn(Complex{Float64}, M)
  f = randn(Complex{Float64}, N)

  reltol = 1e-9

  plan1 = FINUFFT.finufft_makeplan(1,collect(N),-1,1,reltol;
                        upsampfac=2.0, debug=0)

  nodes = ntuple(d->vec(x[d,:]), 2)
  FINUFFT.finufft_setpts!(plan1, nodes...)

  plan2 = FINUFFT.finufft_makeplan(2,collect(N),-1,1,reltol;
                        upsampfac=2.0, debug=0)

  nodes = ntuple(d->vec(x[d,:]), 2)
  FINUFFT.finufft_setpts!(plan2, nodes...)

  @belapsed FINUFFT.finufft_exec!($plan1, $fHat, $f)
  @belapsed FINUFFT.finufft_exec!($plan2, $f, $fHat)

  FINUFFT.finufft_destroy!(plan1)
  FINUFFT.finufft_destroy!(plan2)
end
lu1and10 commented 2 years ago

Dear Tobias,

Thanks for the test function, I will investigate it and get back to you later this week.

On Tue, Feb 15, 2022 at 3:19 PM Tobias Knopp @.***> wrote:

Here is an example that just uses FINUFFT.jl and segfaults:

using FINUFFT, Statistics, BenchmarkTools

function test()

N = (1024,1024)

M = prod(N)

x = 2π .* (rand(length(N),M) .- 0.5)

fHat = randn(Complex{Float64}, M)

f = randn(Complex{Float64}, N)

reltol = 1e-9

plan1 = FINUFFT.finufft_makeplan(1,collect(N),-1,1,reltol;

                    upsampfac=2.0, debug=0)

nodes = ntuple(d->vec(x[d,:]), 2)

FINUFFT.finufft_setpts!(plan1, nodes...)

plan2 = FINUFFT.finufft_makeplan(2,collect(N),-1,1,reltol;

                    upsampfac=2.0, debug=0)

nodes = ntuple(d->vec(x[d,:]), 2)

FINUFFT.finufft_setpts!(plan2, nodes...)

@belapsed FINUFFT.finufft_exec!($plan1, $fHat, $f)

@belapsed FINUFFT.finufft_exec!($plan2, $f, $fHat)

FINUFFT.finufft_destroy!(plan1)

FINUFFT.finufft_destroy!(plan2)

end

— Reply to this email directly, view it on GitHub https://github.com/ludvigak/FINUFFT.jl/issues/39#issuecomment-1040751467, or unsubscribe https://github.com/notifications/unsubscribe-auth/AMXDUKENNCCBFKEMJK33ULTU3KYL7ANCNFSM5M3LZQBA . You are receiving this because you were mentioned.Message ID: @.***>

tknopp commented 2 years ago

I think the issue is that you cannot simply store a C pointer in Julia without indicating to the garbage collector that you hold a reference. This is right now just a guess since I have not used ccall for quite some time.

ahbarnett commented 2 years ago

You should ask Steven G Johnson at Julia discourse since he wrote ccall, and is very responsive and amazing.

On Tue, Feb 15, 2022 at 3:28 PM Tobias Knopp @.***> wrote:

I think the issue is that you cannot simply store a C pointer in Julia without indicating to the garbage collector that you hold a reference. This is right now just a guess since I have not used ccall for quite some time.

— Reply to this email directly, view it on GitHub https://github.com/ludvigak/FINUFFT.jl/issues/39#issuecomment-1040760590, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACNZRSXOLKJPLJRI4SKNSXTU3KZPBANCNFSM5M3LZQBA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were mentioned.Message ID: @.***>

-- *---------------------------------------------------------------------~^`^~._.~' |\ Alex H. Barnett Center for Computational Mathematics, Flatiron Institute | \ http://users.flatironinstitute.org/~ahb 646-876-5942

ludvigak commented 2 years ago

I haven't tried this, but I'm pretty sure this is because finufft_setpts! only stores a pointer to nodes. That data then gets garbage collected when you redefine nodes, causing the whole thing to later segfault.

This is a bit tricky, but actually works just the way the C interface does. But maybe we should consider storing a pointer in the Julia plan struct, the way you do in the Python interface.

tknopp commented 2 years ago

No that is not the issue. nodes... will create copies.

The issue is much more fundamental. The C library requires that you create memory for a finufftf_plan. Then one passes a pointer to this memory chunk and finufft_makeplan will fill the members of that struct.

The Julia bindings, however, do not create memory and just pass Ref{Ptr{Cvoid}} to finufft_makeplan. But how should Julia/ccall know how large finufftf_plan is? It is interesting that this does not crash always.

One solution to this problem is to replicate the C struct, which allows one to create the memory on the Julia side. One other might be to make a sizeof on your C struct and just allocate that amount of bytes in a Julia array (Vector{UInt8}).

ludvigak commented 2 years ago

No, the finufft_plan is just a pointer to plan struct, which is allocated by the library. See https://github.com/flatironinstitute/finufft/blob/b897e4fbfe14fd72e9de3498c16fbce2293f8800/include/finufft_plan_eitherprec.h#L36 All the Julia interface needs to do is to keep track of that pointer, which is precisely what it does.

I still believe that the issue here is that we don't have control over Julias garbage collection. As soon as Julia can no longer see a reference to variable, such as the first nodes in your example, it is free to overwrite that piece of memory whenever it wants to.

The following modification to your example makes it run (at least on my machine)

    nodes1 = ntuple(d->vec(x[d,:]), 2)
    FINUFFT.finufft_setpts!(plan1, nodes1...)

    plan2 = FINUFFT.finufft_makeplan(2,collect(N),-1,1,reltol;
                                     upsampfac=2.0, debug=0)

    nodes2 = ntuple(d->vec(x[d,:]), 2)
    FINUFFT.finufft_setpts!(plan2, nodes2...)

    GC.@preserve nodes1 @belapsed FINUFFT.finufft_exec!($plan1, $fHat, $f)
    GC.@preserve nodes2 @belapsed FINUFFT.finufft_exec!($plan2, $f, $fHat)

I kind of think that this is a bug. Sure, we can tell users that they need to make sure that the data used in setpts still exists when the plan is executed, but the details are quite subtle (as obvious from the example).

tknopp commented 2 years ago

Ah, I missed that line, sorry. So FINUFFT_PLAN is an opaque pointer and my comment is thus of course wrong.

tknopp commented 2 years ago

You can, by the way also replace the belapsed with

  GC.gc() 

  FINUFFT.finufft_exec!(plan1, fHat, f)
  FINUFFT.finufft_exec!(plan2, f, fHat)

But I now agree with you, the issue is that nodes... is never rooted and GC therefore frees that memory.

My proposed fix would be that you extend finufft_plan{T} with references to the nodes. They can be initialized with zero length arrays in FINUFFT.finufft_makeplan and need to be replaced in FINUFFT.finufft_setpts!. This hopefully solves the issue and does not change the interface.

ludvigak commented 2 years ago

Already on it =)

ahbarnett commented 2 years ago

Great discussion. Yes, the point of our interface was not to duplicate the user's node arrays, to save RAM in large problems. But then until the exec stage these user arrays must be left alone. (Ie, setpts does not create internal copies of the user arrays, which I hope you agree would be wasteful and slow things down in large-M large-tol problems.) I hope the julia interface can persuade GC of this need. Also maybe the docs need to emphasise this more? Feel free to add to README on the Jl side, or docs/* for that matter.. Thanks, Alex

On Wed, Feb 16, 2022 at 3:02 AM Ludvig af Klinteberg < @.***> wrote:

Already on it =)

— Reply to this email directly, view it on GitHub https://github.com/ludvigak/FINUFFT.jl/issues/39#issuecomment-1041216087, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACNZRSXSSCJ7L7V4TZ5NVOTU3NK2TANCNFSM5M3LZQBA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were mentioned.Message ID: @.***>

-- *---------------------------------------------------------------------~^`^~._.~' |\ Alex H. Barnett Center for Computational Mathematics, Flatiron Institute | \ http://users.flatironinstitute.org/~ahb 646-876-5942