JuliaParallel / MPI.jl

MPI wrappers for Julia
https://juliaparallel.org/MPI.jl/
The Unlicense

Mysterious serialization crash during bcast #881

Open alexandrebouchard opened 1 month ago

alexandrebouchard commented 1 month ago

Hi there! We hit the cryptic error pasted below in MPI.jl's bcast(..) during a large MPI computation (bcast is called by Pigeons.jl, which was running an exoplanet posterior inference simulation from @sefffal's Octofitter.jl).

When we launched the simulation a second time, the error did not appear, but since we will be running more large-scale simulations on this model, we would like to figure out potential causes of this error.

Have you guys seen this before? Could a network glitch cause it? Thank you for your time.

ERROR: LoadError: UndefRefError: access to undefined reference
Stacktrace:
  [1] getindex
    @ ./essentials.jl:13 [inlined]
  [2] getindex
    @ ./abstractarray.jl:1291 [inlined]
  [3] desertag
    @ /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/julia/1.10.0/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:99 [inlined]
  [4] handle_deserialize(s::Serializer{IOBuffer}, b::Int32)
    @ Serialization /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/julia/1.10.0/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:852
  [5] deserialize
    @ /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/julia/1.10.0/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:814 [inlined]
  [6] deserialize_dict(s::Serializer{IOBuffer}, T::Type{Dict{Pair{Int64, Int64}, Any}})
    @ Serialization /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/julia/1.10.0/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:1528
  [7] deserialize(s::Serializer{IOBuffer}, T::Type{Dict{Pair{Int64, Int64}, Any}})
    @ Serialization /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/julia/1.10.0/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:1536
  [8] handle_deserialize(s::Serializer{IOBuffer}, b::Int32)
    @ Serialization /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/julia/1.10.0/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:883
  [9] deserialize(s::Serializer{IOBuffer}, t::DataType)
    @ Serialization /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/julia/1.10.0/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:1513
 [10] handle_deserialize(s::Serializer{IOBuffer}, b::Int32)
    @ Serialization /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/julia/1.10.0/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:878
 [11] deserialize(s::Serializer{IOBuffer})
    @ Serialization /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/julia/1.10.0/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:814
 [12] handle_deserialize(s::Serializer{IOBuffer}, b::Int32)
    @ Serialization /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/julia/1.10.0/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:920
 [13] deserialize
    @ /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/julia/1.10.0/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:814 [inlined]
 [14] deserialize
    @ /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/julia/1.10.0/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:801 [inlined]
 [15] deserialize
    @ ~/.julia/packages/MPI/is7GN/src/MPI.jl:17 [inlined]
 [16] bcast(obj::Nothing, root::Int32, comm::MPI.Comm)
    @ MPI ~/.julia/packages/MPI/is7GN/src/collective.jl:101
 [17] bcast