MagneticResonanceImaging / MRIReco.jl

Julia Package for MRI Reconstruction
https://magneticresonanceimaging.github.io/MRIReco.jl/latest/

Segmentation fault #142

Open monreal93 opened 1 year ago

monreal93 commented 1 year ago

Hi,

I have been using MRIReco.jl to reconstruct high-resolution 3D spiral data for some time. Recently I started getting a "Segmentation fault" error. The error is not reliably reproducible: sometimes it crashes after reconstructing several volumes, and sometimes while reconstructing the first one. This is the complete error:

signal (11): Segmentation fault
in expression starting at /usr/share/sosp_vaso/recon/reconstructions.jl:50
n1bv_8 at /root/.julia/artifacts/e95ca94c82899616429924e9fdc7eccda275aa38/lib/libfftw3f.so (unknown line)
apply_extra_iter at /root/.julia/artifacts/e95ca94c82899616429924e9fdc7eccda275aa38/lib/libfftw3f.so (unknown line)
apply_dit at /root/.julia/artifacts/e95ca94c82899616429924e9fdc7eccda275aa38/lib/libfftw3f.so (unknown line)
spawn_apply at /root/.julia/artifacts/e95ca94c82899616429924e9fdc7eccda275aa38/lib/libfftw3f.so (unknown line)
#2 at ./threadingconstructs.jl:258
unknown function (ip: 0x7f113e364c2f)
_jl_invoke at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/gf.c:2377 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/gf.c:2559
jl_apply at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/julia.h:1843 [inlined]
start_task at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/task.c:931
Allocations: 536018221 (Pool: 535422636; Big: 595585); GC: 94

It seems to be a problem with NFFT.jl/FFTW.jl. Some changes have recently been made to NFFT.jl (Performance Status), though I am not sure whether they are related. I also found a very similar issue that hasn't been closed. I tried starting Julia with JULIA_COPY_STACKS set to both "yes" and "no", and neither fixed the problem.

I am running Julia inside a Docker container. This is the output of versioninfo():

Julia Version 1.8.5
Commit 17cfb8e65ea (2023-01-08 06:45 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 36 × Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake-avx512)
  Threads: 16 on 36 virtual cores
Environment:
  JULIA_GPG = 3673DF529D9049477F76B37566E3C7DC03D6E495
  JULIA_VERSION = 1.8.5
  JULIA_PATH = /usr/local/julia

Has anybody got a similar problem? Is there any advice on how to avoid it?

Thank you,

tknopp commented 1 year ago

This is hard to track down further. I think it would be good to find out which specific FFTW call is being made within NFFT and then try to develop a minimal example that reproduces the error. To do so, you can add some @info statements at this point in the code, https://github.com/JuliaMath/NFFT.jl/blob/master/src/implementation.jl#L87, to log the exact parameters that are passed to FFTW.
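As a starting point for such a reproducer, here is a minimal sketch that exercises only FFTW.jl. It is an assumption-laden guess, not the actual call NFFT.jl makes: the helper name `stress_fftw`, the array size, the iteration count, and the default planner flags are all hypothetical and should be replaced with the values logged by @info. The single-precision element type is chosen only because the backtrace points at libfftw3f.

```julia
using FFTW
using LinearAlgebra: mul!

# Hypothetical helper, not part of NFFT.jl or MRIReco.jl.
# All default values are placeholders; use the parameters logged from NFFT.jl.
function stress_fftw(; dims=(64, 64, 64), nthreads=Threads.nthreads(), iterations=100)
    # The FFTW thread count has to be set before the plan is created.
    FFTW.set_num_threads(nthreads)

    x = rand(ComplexF32, dims...)  # ComplexF32 is handled by libfftw3f
    y = similar(x)
    plan = plan_fft(x)             # default planner flags; an assumption

    @info "FFTW call parameters" dims T=eltype(x) nthreads iterations

    for _ in 1:iterations
        mul!(y, plan, x)           # out-of-place transform; x is left untouched
    end
    return all(isfinite, y)
end
```

Since the crash is intermittent, it probably makes sense to start Julia with the same thread count as in the failing setup (for example `julia --threads 16`) and call `stress_fftw(dims=..., iterations=...)` with a large iteration count. If this never crashes, the trigger may lie in how NFFT.jl invokes FFTW rather than in the transform alone.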

JakobAsslaender commented 1 year ago

I think this is an issue in FFTW -- and I don't think a solution is in sight. I circumvent the problem by using MKL instead.

using FFTW
FFTW.set_provider!("mkl")

You will have to restart Julia after changing the FFT provider.
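To confirm that the switch took effect after the restart, something like the following should work. This is a small sketch that assumes a FFTW.jl version providing `FFTW.get_provider()`; older releases may not have it.

```julia
using FFTW

# Run in a fresh Julia session, after set_provider! and a restart.
provider = FFTW.get_provider()
@info "Active FFT provider" provider
```

If this still reports the default FFTW provider, the preference was most likely not picked up by the environment that is active in the session (for example inside the Docker container).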

tknopp commented 1 year ago

Yes, if that fixes the issue for you locally, it seems to be the best option, and there do not seem to be any major downsides.

But in the long run it would still be good to have a reproducer that involves only FFTW. That would make it much easier for the FFTW people to track this down.

JakobAsslaender commented 1 year ago

I already filed an issue.