beacon-biosignals / Ray.jl

Julia API for Ray

`signal (11.1): Segmentation fault in expression starting at /home/runner/work/Ray.jl/Ray.jl/test/object_store.jl:1` #119

Closed by omus 1 year ago

omus commented 1 year ago
[ Info: Connecting function manager to GCS at 10.0.0.23:6379...

[3842] signal (11.1): Segmentation fault
in expression starting at /home/runner/work/Ray.jl/Ray.jl/test/object_store.jl:1
unknown function (ip: 0x6)
_ZNK3ray9RayObject7GetDataEv at /home/runner/work/Ray.jl/Ray.jl/ray_julia_jll/deps/bazel-bin/julia_core_worker_lib.so (unknown line)
Allocations: 19732306 (Pool: 19715448; Big: 16858); GC: 27
ERROR: LoadError: Package Ray errored during testing (received signal: 11)
Stacktrace:
 [1] pkgerror(msg::String)
   @ Pkg.Types /opt/hostedtoolcache/julia/1.9.3/x64/share/julia/stdlib/v1.9/Pkg/src/Types.jl:69
 [2] test(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; coverage::Bool, julia_args::Cmd, test_args::Cmd, test_fn::Nothing, force_latest_compatible_version::Bool, allow_earlier_backwards_compatible_versions::Bool, allow_reresolve::Bool)
   @ Pkg.Operations /opt/hostedtoolcache/julia/1.9.3/x64/share/julia/stdlib/v1.9/Pkg/src/Operations.jl:2021
 [3] test
   @ /opt/hostedtoolcache/julia/1.9.3/x64/share/julia/stdlib/v1.9/Pkg/src/Operations.jl:1902 [inlined]
 [4] test(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; coverage::Bool, test_fn::Nothing, julia_args::Vector{String}, test_args::Cmd, force_latest_compatible_version::Bool, allow_earlier_backwards_compatible_versions::Bool, allow_reresolve::Bool, kwargs::Base.Pairs{Symbol, IOContext{Base.PipeEndpoint}, Tuple{Symbol}, NamedTuple{(:io,), Tuple{IOContext{Base.PipeEndpoint}}}})
   @ Pkg.API /opt/hostedtoolcache/julia/1.9.3/x64/share/julia/stdlib/v1.9/Pkg/src/API.jl:441
 [5] test(pkgs::Vector{Pkg.Types.PackageSpec}; io::IOContext{Base.PipeEndpoint}, kwargs::Base.Pairs{Symbol, Any, Tuple{Symbol, Symbol, Symbol}, NamedTuple{(:coverage, :julia_args, :force_latest_compatible_version), Tuple{Bool, Vector{String}, Bool}}})
   @ Pkg.API /opt/hostedtoolcache/julia/1.9.3/x64/share/julia/stdlib/v1.9/Pkg/src/API.jl:156
 [6] test(; name::Nothing, uuid::Nothing, version::Nothing, url::Nothing, rev::Nothing, path::Nothing, mode::Pkg.Types.PackageMode, subdir::Nothing, kwargs::Base.Pairs{Symbol, Any, Tuple{Symbol, Symbol, Symbol}, NamedTuple{(:coverage, :julia_args, :force_latest_compatible_version), Tuple{Bool, Vector{String}, Bool}}})
   @ Pkg.API /opt/hostedtoolcache/julia/1.9.3/x64/share/julia/stdlib/v1.9/Pkg/src/API.jl:171
 [7] top-level scope
   @ ~/work/_actions/julia-actions/julia-runtest/v1/test_harness.jl:15
 [8] include(fname::String)
   @ Base.MainInclude ./client.jl:478
 [9] top-level scope
   @ none:1
in expression starting at /home/runner/work/_actions/julia-actions/julia-runtest/v1/test_harness.jl:7
Error: Process completed with exit code 1.

https://github.com/beacon-biosignals/Ray.jl/actions/runs/6173261266/job/16755187732?pr=108

Originally posted by @omus in https://github.com/beacon-biosignals/Ray.jl/issues/108#issuecomment-1717835935

First noticed in #108. We decided to merge that PR anyway: the failure couldn't be reproduced, so it was attributed to a stale cache of the shared library. The failure appears to occur in the "object_store.jl" tests, specifically in a `GetData` call.
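
For readers following along, the mangled frame in the backtrace can be turned back into a readable C++ name. The snippet below is only a convenience sketch: `demangle` is a hypothetical helper, not part of Ray.jl, and it assumes the GNU binutils `c++filt` tool is available on `PATH` (as it typically is on Linux CI runners).

```julia
# Hypothetical helper (not part of Ray.jl): demangle an Itanium C++ ABI symbol
# by shelling out to `c++filt`. Assumes `c++filt` is installed and on PATH;
# if it is not, the symbol is returned unchanged.
function demangle(symbol::AbstractString)
    Sys.which("c++filt") === nothing && return String(symbol)
    return readchomp(`c++filt $symbol`)
end

# The frame reported just before the first segfault in this issue.
println(demangle("_ZNK3ray9RayObject7GetDataEv"))
```

With GNU `c++filt` this should print `ray::RayObject::GetData() const`, which matches the `GetData` call mentioned above.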

omus commented 1 year ago

Noticed another failure in #118:

[52149] signal (11.1): Segmentation fault
in expression starting at /home/runner/work/Ray.jl/Ray.jl/test/object_store.jl:1
_ZNSt17_Function_handlerIFPhRKN3ray6BufferEEZN5jlcxx11TypeWrapperIS2_E6methodIS0_S2_JEEERS8_RKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEMT0_KFT_DpT1_EEUlS4_E_E9_M_invokeERKSt9_Any_dataS4_ at /home/runner/work/Ray.jl/Ray.jl/ray_julia_jll/deps/bazel-bin/julia_core_worker_lib.so (unknown line)
_ZN5jlcxx6detail11CallFunctorIPhJRKN3ray6BufferEEE5applyEPKvNS_13WrappedCppPtrE at /home/runner/work/Ray.jl/Ray.jl/ray_julia_jll/deps/bazel-bin/julia_core_worker_lib.so (unknown line)
Data at /home/runner/.julia/packages/CxxWrap/aXNBY/src/CxxWrap.jl:624
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
do_call at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:126
eval_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:226
eval_stmt_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:177 [inlined]
eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:610
jl_fptr_interpret_call at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:698
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
get at /home/runner/work/Ray.jl/Ray.jl/src/object_store.jl:30
macro expansion at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/Test/src/Test.jl:478 [inlined]
macro expansion at /home/runner/work/Ray.jl/Ray.jl/test/object_store.jl:7 [inlined]
macro expansion at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/Test/src/Test.jl:1586 [inlined]
macro expansion at /home/runner/work/Ray.jl/Ray.jl/test/object_store.jl:2 [inlined]
macro expansion at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/Test/src/Test.jl:1498 [inlined]
top-level scope at /home/runner/work/Ray.jl/Ray.jl/test/object_store.jl:2
jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:903
jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:856
ijl_toplevel_eval_in at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:971
eval at ./boot.jl:370 [inlined]
include_string at ./loading.jl:1903
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
_include at ./loading.jl:1963
include at ./client.jl:478 [inlined]
#5 at /home/runner/work/Ray.jl/Ray.jl/test/runtests.jl:29 [inlined]
setup_core_worker at /home/runner/work/Ray.jl/Ray.jl/test/utils.jl:19
#4 at /home/runner/work/Ray.jl/Ray.jl/test/runtests.jl:25 [inlined]
setup_ray_head_node at /home/runner/work/Ray.jl/Ray.jl/test/utils.jl:10
unknown function (ip: 0x7f235e38661f)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
do_call at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:126
eval_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:226
eval_stmt_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:177 [inlined]
eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:624
eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:533
eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:533
jl_interpret_toplevel_thunk at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:762
jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:912
jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:856
ijl_toplevel_eval_in at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:971
eval at ./boot.jl:370 [inlined]
include_string at ./loading.jl:1903
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
_include at ./loading.jl:1963
include at ./client.jl:478
unknown function (ip: 0x7f235e3001a2)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
do_call at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:126
eval_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:226
eval_stmt_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:177 [inlined]
eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:624
jl_interpret_toplevel_thunk at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:762
jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:912
jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:856
ijl_toplevel_eval_in at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:971
eval at ./boot.jl:370 [inlined]
exec_options at ./client.jl:280
_start at ./client.jl:522
jfptr__start_40034.clone_1 at /opt/hostedtoolcache/julia/1.9.3/x64/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
true_main at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:573
jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:717
main at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/cli/loader_exe.c:59
unknown function (ip: 0x7f2376038d8f)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 19732932 (Pool: 19716083; Big: 16849); GC: 30

https://github.com/beacon-biosignals/Ray.jl/actions/runs/6188934150/job/16801953157?pr=118

kleinschmidt commented 1 year ago

Haven't seen any more of these, so I'm going to close this. We fixed some stochastic segfaults related to finalization of object refs etc., which may explain it.
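
To illustrate the general class of bug being referred to (not the actual Ray.jl fix, which isn't shown in this thread), here is a minimal sketch of how a finalizer can invalidate a raw pointer, and how `GC.@preserve` guards against it. `FakeBuffer` and `copy_bytes` are hypothetical names made up for this example; nothing here calls Ray.jl or CxxWrap.

```julia
# Hypothetical stand-in for a buffer whose memory lives outside Julia's GC.
# This is NOT a Ray.jl or CxxWrap type; it only mimics "raw pointer + finalizer".
mutable struct FakeBuffer
    ptr::Ptr{UInt8}
    len::Int

    function FakeBuffer(bytes::Vector{UInt8})
        ptr = convert(Ptr{UInt8}, Libc.malloc(length(bytes)))
        GC.@preserve bytes unsafe_copyto!(ptr, pointer(bytes), length(bytes))
        buf = new(ptr, length(bytes))
        # Once `buf` becomes unreachable, the GC may run this at any time,
        # after which any pointer previously read from `buf.ptr` is dangling.
        finalizer(buf) do b
            Libc.free(b.ptr)
            b.ptr = C_NULL
        end
        return buf
    end
end

# Copy the bytes into Julia-owned memory while `buf` is guaranteed to be alive.
# Without `GC.@preserve`, the compiler is free to treat `buf` as dead as soon
# as `buf.ptr` has been read, so the finalizer could run before the copy.
function copy_bytes(buf::FakeBuffer)
    GC.@preserve buf begin
        raw = unsafe_wrap(Vector{UInt8}, buf.ptr, buf.len)  # no ownership taken
        copy(raw)
    end
end

data = copy_bytes(FakeBuffer(UInt8[0x01, 0x02, 0x03]))
```

Because the outcome depends on when the GC happens to run, bugs of this shape tend to show up as intermittent segfaults, which would be consistent with the failures above not being reproducible on demand.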