beacon-biosignals / Ray.jl

Julia API for Ray

`C++ object of type N3ray6BufferE was deleted` #177

Closed omus closed 12 months ago

omus commented 1 year ago

Saw this failure while working on #176 for Julia 1.9.3:

Submit task: Error During Test at /home/runner/work/Ray.jl/Ray.jl/test/task.jl:40
  Test threw exception
  Expression: Ray.get(return_ref) == remote_ref
  C++ object of type N3ray6BufferE was deleted
  Stacktrace:
   [1] Data
     @ ~/.julia/packages/CxxWrap/aXNBY/src/CxxWrap.jl:624 [inlined]
   [2] take!(buffer::CxxWrap.StdLib.SharedPtrAllocated{Ray.ray_julia_jll.Buffer})
     @ Ray.ray_julia_jll ~/work/Ray.jl/Ray.jl/src/ray_julia_jll/common.jl:201
   [3] deserialize_from_ray_object(x::CxxWrap.StdLib.SharedPtrAllocated{RayObject}, outer_object_ref::ObjectRef)
     @ Ray ~/work/Ray.jl/Ray.jl/src/ray_serializer.jl:81
   [4] get(obj_ref::ObjectRef)
     @ Ray ~/work/Ray.jl/Ray.jl/src/object_store.jl:31
   [5] macro expansion
     @ /opt/hostedtoolcache/julia/1.9.3/x64/share/julia/stdlib/v1.9/Test/src/Test.jl:478 [inlined]
   [6] macro expansion
     @ ~/work/Ray.jl/Ray.jl/test/task.jl:40 [inlined]
   [7] macro expansion
     @ /opt/hostedtoolcache/julia/1.9.3/x64/share/julia/stdlib/v1.9/Test/src/Test.jl:1498 [inlined]
   [8] top-level scope
     @ ~/work/Ray.jl/Ray.jl/test/task.jl:3
Submit task: Error During Test at /home/runner/work/Ray.jl/Ray.jl/test/task.jl:41
  Test threw exception
  Expression: Ray.get(Ray.get(return_ref)) == 1
  C++ object of type N3ray6BufferE was deleted
  Stacktrace:
   [1] Data
     @ ~/.julia/packages/CxxWrap/aXNBY/src/CxxWrap.jl:624 [inlined]
   [2] take!(buffer::CxxWrap.StdLib.SharedPtrAllocated{Ray.ray_julia_jll.Buffer})
     @ Ray.ray_julia_jll ~/work/Ray.jl/Ray.jl/src/ray_julia_jll/common.jl:201
   [3] deserialize_from_ray_object(x::CxxWrap.StdLib.SharedPtrAllocated{RayObject}, outer_object_ref::ObjectRef)
     @ Ray ~/work/Ray.jl/Ray.jl/src/ray_serializer.jl:81
   [4] get(obj_ref::ObjectRef)
     @ Ray ~/work/Ray.jl/Ray.jl/src/object_store.jl:31
   [5] macro expansion
     @ /opt/hostedtoolcache/julia/1.9.3/x64/share/julia/stdlib/v1.9/Test/src/Test.jl:478 [inlined]
   [6] macro expansion
     @ ~/work/Ray.jl/Ray.jl/test/task.jl:41 [inlined]
   [7] macro expansion
     @ /opt/hostedtoolcache/julia/1.9.3/x64/share/julia/stdlib/v1.9/Test/src/Test.jl:1498 [inlined]
   [8] top-level scope
     @ ~/work/Ray.jl/Ray.jl/test/task.jl:3
[2023-09-29 17:11:17,887 C 3594 3594] id_def.h:25:  Check failed: binary.size() == Size() || binary.size() == 0 expected size is 28, but got data h���@c�6��c@�C�X of size 17
*** StackTrace Information ***
/home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3raylsERSoRKNS_10StackTraceE+0x57) [0x7f8e2bfe9f67] ray::operator<<()
/home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray13SpdLogMessage5FlushEv+0x373) [0x7f8e2bfec7f3] ray::SpdLogMessage::Flush()
/home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray6RayLogD1Ev+0x48) [0x7f8e2bfeca78] ray::RayLog::~RayLog()
/home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray6NodeIDC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x156) [0x7f8e2b7fc8d6] ray::NodeID::NodeID()
/home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray4core14FutureResolver21ProcessResolvedObjectERKNS_8ObjectIDERKNS_3rpc7AddressERKNS_6StatusERKNS5_20GetObjectStatusReplyE+0x14b) [0x7f8e2b8149fb] ray::core::FutureResolver::ProcessResolvedObject()
/home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray4core10CoreWorker37RegisterOwnershipInfoAndResolveFutureERKNS_8ObjectIDES4_RKNS_3rpc7AddressERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0xad) [0x7f8e2b7d2e6d] ray::core::CoreWorker::RegisterOwnershipInfoAndResolveFuture()
/home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN5jlcxx6detail11CallFunctorIvJRN3ray4core10CoreWorkerERKNS2_8ObjectIDES8_RKNS2_3rpc7AddressERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEE5applyEPKvNS_13WrappedCppPtrESO_SO_SO_SO_+0x76) [0x7f8e2b61ffe6] jlcxx::detail::CallFunctor<>::apply()
[0x7f8e33125ef5]
[0x7f8e33125fd7]
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_apply_generic+0x26e) [0x7f8e8c842f4e] ijl_apply_generic
[0x7f8e873f55e0]
[0x7f8e873fad37]
[0x7f8e873fb48e]
[0x7f8e873fb533]
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_apply_generic+0x26e) [0x7f8e8c842f4e] ijl_apply_generic
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5ed95) [0x7f8e8c85ed95] do_call
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5e7b8) [0x7f8e8c85e7b8] eval_value
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5f83b) [0x7f8e8c85f83b] eval_body
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5f950) [0x7f8e8c85f950] eval_body
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5f950) [0x7f8e8c85f950] eval_body
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5f950) [0x7f8e8c85f950] eval_body
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5f950) [0x7f8e8c85f950] eval_body
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x602e2) [0x7f8e8c8602e2] jl_interpret_toplevel_thunk
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x7b22c) [0x7f8e8c87b22c] jl_toplevel_eval_flex
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x7ba7a) [0x7f8e8c87ba7a] jl_toplevel_eval_flex
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_toplevel_eval_in+0xab) [0x7f8e8c87cdfb] ijl_toplevel_eval_in
/opt/hostedtoolcache/julia/1.9.3/x64/lib/julia/sys.so(+0x1726668) [0x7f8e77f26668] japi1_include_string_49015.clone_1.clone_2
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_apply_generic+0x26e) [0x7f8e8c842f4e] ijl_apply_generic
/opt/hostedtoolcache/julia/1.9.3/x64/lib/julia/sys.so(+0x155abfe) [0x7f8e77d5abfe] japi1__include_51048.clone_1.clone_2
[0x7f8e873ab6fb]
[0x7f8e873abc43]
[0x7f8e873abf50]
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_apply_generic+0x26e) [0x7f8e8c842f4e] ijl_apply_generic
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5ed95) [0x7f8e8c85ed95] do_call
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5e7b8) [0x7f8e8c85e7b8] eval_value
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5f406) [0x7f8e8c85f406] eval_body
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5f950) [0x7f8e8c85f950] eval_body
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5f950) [0x7f8e8c85f950] eval_body
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x602e2) [0x7f8e8c8602e2] jl_interpret_toplevel_thunk
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x7b22c) [0x7f8e8c87b22c] jl_toplevel_eval_flex
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x7ba7a) [0x7f8e8c87ba7a] jl_toplevel_eval_flex
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_toplevel_eval_in+0xab) [0x7f8e8c87cdfb] ijl_toplevel_eval_in
/opt/hostedtoolcache/julia/1.9.3/x64/lib/julia/sys.so(+0x1726668) [0x7f8e77f26668] japi1_include_string_49015.clone_1.clone_2
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_apply_generic+0x26e) [0x7f8e8c842f4e] ijl_apply_generic
/opt/hostedtoolcache/julia/1.9.3/x64/lib/julia/sys.so(+0x155abfe) [0x7f8e77d5abfe] japi1__include_51048.clone_1.clone_2
[0x7f8e87300187]
[0x7f8e873001a3]
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_apply_generic+0x26e) [0x7f8e8c842f4e] ijl_apply_generic
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5ed95) [0x7f8e8c85ed95] do_call
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5e7b8) [0x7f8e8c85e7b8] eval_value
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5f406) [0x7f8e8c85f406] eval_body
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x602e2) [0x7f8e8c8602e2] jl_interpret_toplevel_thunk
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x7b22c) [0x7f8e8c87b22c] jl_toplevel_eval_flex
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x7ba7a) [0x7f8e8c87ba7a] jl_toplevel_eval_flex
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_toplevel_eval_in+0xab) [0x7f8e8c87cdfb] ijl_toplevel_eval_in
/opt/hostedtoolcache/julia/1.9.3/x64/lib/julia/sys.so(+0x175afca) [0x7f8e77f5afca] julia_exec_options_47864.clone_1.clone_2
/opt/hostedtoolcache/julia/1.9.3/x64/lib/julia/sys.so(+0x1186d50) [0x7f8e77986d50] julia__start_40033.clone_1
/opt/hostedtoolcache/julia/1.9.3/x64/lib/julia/sys.so(+0x1186e59) [0x7f8e77986e59] jfptr__start_40034.clone_1
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_apply_generic+0x26e) [0x7f8e8c842f4e] ijl_apply_generic
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0xa7a16) [0x7f8e8c8a7a16] true_main
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(jl_repl_entrypoint+0x8f) [0x7f8e8c8a845f] jl_repl_entrypoint
/opt/hostedtoolcache/julia/1.9.3/x64/bin/julia(main+0x9) [0x401089] main
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f8e8d629d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7f8e8d629e40] __libc_start_main

terminate called after throwing an instance of 'nlohmann::detail::type_error'
  what():  [json.exception.type_error.316] invalid UTF-8 byte at index 206: 0xF2

[3594] signal (6.-6): Aborted
in expression starting at /home/runner/work/Ray.jl/Ray.jl/test/task.jl:83
pthread_kill at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
raise at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
__verbose_terminate_handler at /workspace/srcdir/gcc-12.1.0/libstdc++-v3/libsupc++/vterminate.cc:95
__terminate at /workspace/srcdir/gcc-12.1.0/libstdc++-v3/libsupc++/eh_terminate.cc:48
__cxa_call_terminate at /workspace/srcdir/gcc-12.1.0/libstdc++-v3/libsupc++/eh_call.cc:54
__gxx_personality_v0 at /workspace/srcdir/gcc-12.1.0/libstdc++-v3/libsupc++/eh_personality.cc:688
_Unwind_RaiseException_Phase2 at /workspace/srcdir/gcc-12.1.0/libgcc/unwind.inc:64
_Unwind_Resume at /workspace/srcdir/gcc-12.1.0/libgcc/unwind.inc:242
_ZN3ray8RayEvent11SendMessageERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE.cold at /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so (unknown line)
_ZN3ray8RayEventD1Ev at /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so (unknown line)
_ZN3ray8RayEvent11ReportEventERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES8_S8_PKci at /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so (unknown line)
_ZNSt17_Function_handlerIFvRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES7_EZN3ray12EventManagerC4EvEUlS7_S7_E_E9_M_invokeERKSt9_Any_dataS7_S7_ at /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so (unknown line)
_ZN3ray6RayLogD1Ev at /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so (unknown line)
_ZN3ray6NodeIDC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE at /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so (unknown line)
_ZN3ray4core14FutureResolver21ProcessResolvedObjectERKNS_8ObjectIDERKNS_3rpc7AddressERKNS_6StatusERKNS5_20GetObjectStatusReplyE at /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so (unknown line)
_ZN3ray4core10CoreWorker37RegisterOwnershipInfoAndResolveFutureERKNS_8ObjectIDES4_RKNS_3rpc7AddressERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE at /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so (unknown line)
_ZN5jlcxx6detail11CallFunctorIvJRN3ray4core10CoreWorkerERKNS2_8ObjectIDES8_RKNS2_3rpc7AddressERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEE5applyEPKvNS_13WrappedCppPtrESO_SO_SO_SO_ at /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so (unknown line)
RegisterOwnershipInfoAndResolveFuture at /home/runner/.julia/packages/CxxWrap/aXNBY/src/CxxWrap.jl:624
unknown function (ip: 0x7f8e33125fd6)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
_register_ownership at /home/runner/work/Ray.jl/Ray.jl/src/object_ref.jl:125
deserialize_from_ray_object at /home/runner/work/Ray.jl/Ray.jl/src/ray_serializer.jl:91
get at /home/runner/work/Ray.jl/Ray.jl/src/object_store.jl:31
unknown function (ip: 0x7f8e873fb532)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
do_call at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:126
eval_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:226
eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:478
eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:533
eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:533
eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:533
eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:533
jl_interpret_toplevel_thunk at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:762
jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:912
jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:856
ijl_toplevel_eval_in at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:971
eval at ./boot.jl:370 [inlined]
include_string at ./loading.jl:1903
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
_include at ./loading.jl:1963
include at ./client.jl:478 [inlined]
#5 at /home/runner/work/Ray.jl/Ray.jl/test/runtests.jl:36 [inlined]
setup_core_worker at /home/runner/work/Ray.jl/Ray.jl/test/utils.jl:19
#4 at /home/runner/work/Ray.jl/Ray.jl/test/runtests.jl:31 [inlined]
setup_ray_head_node at /home/runner/work/Ray.jl/Ray.jl/test/utils.jl:10
unknown function (ip: 0x7f8e873abf4f)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
do_call at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:126
eval_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:226
eval_stmt_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:177 [inlined]
eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:624
eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:533
eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:533
jl_interpret_toplevel_thunk at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:762
jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:912
jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:856
ijl_toplevel_eval_in at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:971
eval at ./boot.jl:370 [inlined]
include_string at ./loading.jl:1903
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
_include at ./loading.jl:1963
include at ./client.jl:478
unknown function (ip: 0x7f8e873001a2)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
do_call at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:126
eval_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:226
eval_stmt_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:177 [inlined]
eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:624
jl_interpret_toplevel_thunk at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:762
jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:912
jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:856
ijl_toplevel_eval_in at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:971
eval at ./boot.jl:370 [inlined]
exec_options at ./client.jl:280
_start at ./client.jl:522
jfptr__start_40034.clone_1 at /opt/hostedtoolcache/julia/1.9.3/x64/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
true_main at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:573
jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:717
main at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/cli/loader_exe.c:59
unknown function (ip: 0x7f8e8d629d8f)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 22672594 (Pool: 22654423; Big: 18171); GC: 32

https://github.com/beacon-biosignals/Ray.jl/actions/runs/6354776658/job/17261813047?pr=176
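
For anyone reading the log above: the fatal check at `id_def.h:25` only accepts an ID string that is either empty or exactly the expected length (28 bytes here, per the message), and this run handed it 17 bytes. A minimal Python sketch of that condition follows; `check_id_binary` and `ID_SIZE` are made-up names for illustration and are not part of Ray or Ray.jl.

```python
# Illustrative re-creation of the condition printed in the log:
#   binary.size() == Size() || binary.size() == 0
# ID_SIZE = 28 comes from the log message ("expected size is 28").
# Everything else is a hypothetical stand-in, not Ray's actual code.
ID_SIZE = 28


def check_id_binary(binary: bytes, expected_size: int = ID_SIZE) -> bool:
    """Return True when `binary` would pass the logged size check."""
    return len(binary) == expected_size or len(binary) == 0


if __name__ == "__main__":
    # An empty ID and a full-length ID pass; a 17-byte one (as in the log) does not.
    for size in (0, 28, 17):
        print(size, check_id_binary(bytes(size)))
```

A length other than 0 or 28 suggests the string passed in was not a well-formed ID in the first place, which would fit with the `Buffer` having been deleted before it was read, though that link is only a guess.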

omus commented 1 year ago

Happened again against main (6d019f5a5bc9c18497d25169a047009d00a8b82c):

omus commented 1 year ago

Random theory: maybe the Bazel cache is misbehaving and using the wrong version of the shared library?

omus commented 1 year ago
Issue is also present on M1

```
Submit task: Error During Test at /Users/runner/work/Ray.jl/Ray.jl/test/task.jl:40
  Test threw exception
  Expression: Ray.get(return_ref) == remote_ref
  C++ object of type N3ray6BufferE was deleted
  Stacktrace:
   [1] Data
     @ ~/.julia/packages/CxxWrap/aXNBY/src/CxxWrap.jl:624 [inlined]
   [2] take!(buffer::CxxWrap.StdLib.SharedPtrAllocated{Ray.ray_julia_jll.Buffer})
     @ Ray.ray_julia_jll ~/work/Ray.jl/Ray.jl/src/ray_julia_jll/common.jl:201
   [3] deserialize_from_ray_object(x::CxxWrap.StdLib.SharedPtrAllocated{RayObject}, outer_object_ref::ObjectRef)
     @ Ray ~/work/Ray.jl/Ray.jl/src/ray_serializer.jl:81
   [4] get(obj_ref::ObjectRef)
     @ Ray ~/work/Ray.jl/Ray.jl/src/object_store.jl:31
   [5] macro expansion
     @ ~/hostedtoolcache/julia/1.9.3/aarch64/share/julia/stdlib/v1.9/Test/src/Test.jl:478 [inlined]
   [6] macro expansion
     @ ~/work/Ray.jl/Ray.jl/test/task.jl:40 [inlined]
   [7] macro expansion
     @ ~/hostedtoolcache/julia/1.9.3/aarch64/share/julia/stdlib/v1.9/Test/src/Test.jl:1498 [inlined]
   [8] top-level scope
     @ ~/work/Ray.jl/Ray.jl/test/task.jl:3
Submit task: Error During Test at /Users/runner/work/Ray.jl/Ray.jl/test/task.jl:41
  Test threw exception
  Expression: Ray.get(Ray.get(return_ref)) == 1
  C++ object of type N3ray6BufferE was deleted
  Stacktrace:
   [1] Data
     @ ~/.julia/packages/CxxWrap/aXNBY/src/CxxWrap.jl:624 [inlined]
   [2] take!(buffer::CxxWrap.StdLib.SharedPtrAllocated{Ray.ray_julia_jll.Buffer})
     @ Ray.ray_julia_jll ~/work/Ray.jl/Ray.jl/src/ray_julia_jll/common.jl:201
   [3] deserialize_from_ray_object(x::CxxWrap.StdLib.SharedPtrAllocated{RayObject}, outer_object_ref::ObjectRef)
     @ Ray ~/work/Ray.jl/Ray.jl/src/ray_serializer.jl:81
   [4] get(obj_ref::ObjectRef)
     @ Ray ~/work/Ray.jl/Ray.jl/src/object_store.jl:31
   [5] macro expansion
     @ ~/hostedtoolcache/julia/1.9.3/aarch64/share/julia/stdlib/v1.9/Test/src/Test.jl:478 [inlined]
   [6] macro expansion
     @ ~/work/Ray.jl/Ray.jl/test/task.jl:41 [inlined]
   [7] macro expansion
     @ ~/hostedtoolcache/julia/1.9.3/aarch64/share/julia/stdlib/v1.9/Test/src/Test.jl:1498 [inlined]
   [8] top-level scope
     @ ~/work/Ray.jl/Ray.jl/test/task.jl:3
[2023-10-02 19:55:56,548 C 49802 2180856] id_def.h:25: Check failed: binary.size() == Size() || binary.size() == 0 expected size is 28, but got data |�a����PN�_hA�� of size 18
*** StackTrace Information ***
0 julia_core_worker_lib.so 0x0000000132c453b4 _ZN3raylsERNSt3__113basic_ostreamIcNS0_11char_traitsIcEEEERKNS_10StackTraceE + 84 ray::operator<<()
1 julia_core_worker_lib.so 0x0000000132c72904 _ZN3ray13SpdLogMessage5FlushEv + 220 ray::SpdLogMessage::Flush()
2 julia_core_worker_lib.so 0x0000000132c72784 _ZN3ray13SpdLogMessageD2Ev + 24 ray::SpdLogMessage::~SpdLogMessage()
3 julia_core_worker_lib.so 0x0000000132c48448 _ZN3ray6RayLogD2Ev + 52 ray::RayLog::~RayLog()
4 julia_core_worker_lib.so 0x000000013233a44c _ZN3ray6NodeIDC2ERKNSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEE + 872 ray::NodeID::NodeID()
5 julia_core_worker_lib.so 0x0000000132579314 _ZN3ray4core14FutureResolver21ProcessResolvedObjectERKNS_8ObjectIDERKNS_3rpc7AddressERKNS_6StatusERKNS5_20GetObjectStatusReplyE + 824 ray::core::FutureResolver::ProcessResolvedObject()
6 julia_core_worker_lib.so 0x00000001324c64bc _ZN3ray4core10CoreWorker37RegisterOwnershipInfoAndResolveFutureERKNS_8ObjectIDES4_RKNS_3rpc7AddressERKNSt3__112basic_stringIcNS9_11char_traitsIcEENS9_9allocatorIcEEEE + 208 ray::core::CoreWorker::RegisterOwnershipInfoAndResolveFuture()
7 julia_core_worker_lib.so 0x00000001324a5540 _ZN5jlcxx6detail11CallFunctorIvJRN3ray4core10CoreWorkerERKNS2_8ObjectIDES8_RKNS2_3rpc7AddressERKNSt3__112basic_stringIcNSD_11char_traitsIcEENSD_9allocatorIcEEEEEE5applyEPKvNS_13WrappedCppPtrESP_SP_SP_SP_ + 160 jlcxx::detail::CallFunctor<>::apply()
8 ??? 0x000000014f6f40e0 0x0 + 5627658464 0x0
9 ??? 0x000000014f6f41a0 0x0 + 5627658656 0x0
10 ??? 0x000000014f1688b8 0x0 + 5621844152 0x0
11 ??? 0x000000014f1784a4 0x0 + 5621908644 0x0
12 ??? 0x000000014f188168 0x0 + 5621973352 0x0
13 ??? 0x000000014f1881f8 0x0 + 5621973496 0x0
14 libjulia-internal.1.9.dylib 0x000000010275e4ec do_call + 188 do_call
15 libjulia-internal.1.9.dylib 0x000000010275ca6c eval_body + 800 eval_body
16 libjulia-internal.1.9.dylib 0x000000010275cf98 eval_body + 2124 eval_body
17 libjulia-internal.1.9.dylib 0x000000010275cf98 eval_body + 2124 eval_body
18 libjulia-internal.1.9.dylib 0x000000010275cf98 eval_body + 2124 eval_body
19 libjulia-internal.1.9.dylib 0x000000010275cf98 eval_body + 2124 eval_body
20 libjulia-internal.1.9.dylib 0x000000010275d35c jl_interpret_toplevel_thunk + 260 jl_interpret_toplevel_thunk
21 libjulia-internal.1.9.dylib 0x000000010277459c jl_toplevel_eval_flex + 4620 jl_toplevel_eval_flex
22 libjulia-internal.1.9.dylib 0x00000001027744c0 jl_toplevel_eval_flex + 4400 jl_toplevel_eval_flex
23 libjulia-internal.1.9.dylib 0x00000001027752e4 ijl_toplevel_eval_in + 156 ijl_toplevel_eval_in
24 sys.dylib 0x000000011aaf95f4 japi1_include_string_36282.clone_3 + 520 japi1_include_string_36282.clone_3
25 libjulia-internal.1.9.dylib 0x00000001027455c4 ijl_apply_generic + 1732 ijl_apply_generic
26 sys.dylib 0x000000011a95fe98 japi1__include_47977.clone_3 + 820 japi1__include_47977.clone_3
27 ??? 0x00000001345dc31c 0x0 + 5173527324 0x0
28 libjulia-internal.1.9.dylib 0x000000010275e4ec do_call + 188 do_call
29 libjulia-internal.1.9.dylib 0x000000010275cd10 eval_body + 1476 eval_body
30 libjulia-internal.1.9.dylib 0x000000010275cf98 eval_body + 2124 eval_body
31 libjulia-internal.1.9.dylib 0x000000010275cf98 eval_body + 2124 eval_body
32 libjulia-internal.1.9.dylib 0x000000010275d35c jl_interpret_toplevel_thunk + 260 jl_interpret_toplevel_thunk
33 libjulia-internal.1.9.dylib 0x000000010277459c jl_toplevel_eval_flex + 4620 jl_toplevel_eval_flex
34 libjulia-internal.1.9.dylib 0x00000001027744c0 jl_toplevel_eval_flex + 4400 jl_toplevel_eval_flex
35 libjulia-internal.1.9.dylib 0x00000001027752e4 ijl_toplevel_eval_in + 156 ijl_toplevel_eval_in
36 sys.dylib 0x000000011aaf95f4 japi1_include_string_36282.clone_3 + 520 japi1_include_string_36282.clone_3
37 libjulia-internal.1.9.dylib 0x00000001027455c4 ijl_apply_generic + 1732 ijl_apply_generic
38 sys.dylib 0x000000011a95fe98 japi1__include_47977.clone_3 + 820 japi1__include_47977.clone_3
39 libjulia-internal.1.9.dylib 0x000000010275e4ec do_call + 188 do_call
40 libjulia-internal.1.9.dylib 0x000000010275cd10 eval_body + 1476 eval_body
41 libjulia-internal.1.9.dylib 0x000000010275d35c jl_interpret_toplevel_thunk + 260 jl_interpret_toplevel_thunk
42 libjulia-internal.1.9.dylib 0x000000010277459c jl_toplevel_eval_flex + 4620 jl_toplevel_eval_flex
43 libjulia-internal.1.9.dylib 0x00000001027744c0 jl_toplevel_eval_flex + 4400 jl_toplevel_eval_flex
44 libjulia-internal.1.9.dylib 0x00000001027752e4 ijl_toplevel_eval_in + 156 ijl_toplevel_eval_in
45 sys.dylib 0x0000000119de2d08 jlplt_ijl_toplevel_eval_in_19174 + 92 jlplt_ijl_toplevel_eval_in_19174
46 libjulia-internal.1.9.dylib 0x000000010279c644 true_main + 192 true_main
47 libjulia-internal.1.9.dylib 0x000000010279c538 jl_repl_entrypoint + 180 jl_repl_entrypoint
48 julia 0x00000001021abf6c main + 12 main
49 dyld 0x0000000194d27f28 start + 2236 start
libc++abi: terminating due to uncaught exception of type nlohmann::detail::type_error: [json.exception.type_error.316] invalid UTF-8 byte at index 216: 0x9A

[49802] signal (6): Abort trap: 6
```

https://github.com/beacon-biosignals/Ray.jl/actions/runs/6384165745/job/17327409543
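
The secondary abort in these logs (`json.exception.type_error.316`, invalid UTF-8 byte) looks like a consequence of the first failure rather than a separate bug. Judging by the frames in the first log (`RayLog::~RayLog`, then `RayEvent::ReportEvent`, then `RayEvent::SendMessage`), the fatal message, which embeds the raw ID bytes, is being serialized to JSON. That is a reading of the stack trace, not something confirmed in this thread. The Python analogy below shows the general failure mode only; `to_json_message` and `to_json_message_safe` are hypothetical helpers, not Ray functions.

```python
import json


def to_json_message(raw: bytes) -> str:
    """Embed raw bytes in a JSON message, insisting on valid UTF-8.

    Raises UnicodeDecodeError when `raw` is not valid UTF-8, which is the
    Python counterpart of the JSON library rejecting the string.
    """
    return json.dumps({"message": raw.decode("utf-8")})


def to_json_message_safe(raw: bytes) -> str:
    """Embed raw bytes as a hex string, which is always valid JSON text."""
    return json.dumps({"message": raw.hex()})


if __name__ == "__main__":
    garbage = b"got data \x9a\xf2"  # arbitrary bytes, like a corrupted ID
    try:
        to_json_message(garbage)
    except UnicodeDecodeError as err:
        print("strict encoding failed:", err)
    print(to_json_message_safe(garbage))
```

If that reading is right, the JSON exception can be ignored while debugging: the thing to chase is why the ID bytes are wrong, not why they fail to serialize.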

omus commented 1 year ago
Another example

```
[ Info: Connecting function manager to GCS at 10.0.0.40:6379...
Submit task: Error During Test at /home/runner/work/Ray.jl/Ray.jl/test/task.jl:40
  Test threw exception
  Expression: Ray.get(return_ref) == remote_ref
  C++ object of type N3ray6BufferE was deleted
  Stacktrace:
   [1] Data
     @ ~/.julia/packages/CxxWrap/aXNBY/src/CxxWrap.jl:624 [inlined]
   [2] take!(buffer::CxxWrap.StdLib.SharedPtrAllocated{Ray.ray_julia_jll.Buffer})
     @ Ray.ray_julia_jll ~/work/Ray.jl/Ray.jl/src/ray_julia_jll/common.jl:201
   [3] deserialize_from_ray_object(x::CxxWrap.StdLib.SharedPtrAllocated{RayObject}, outer_object_ref::ObjectRef)
     @ Ray ~/work/Ray.jl/Ray.jl/src/ray_serializer.jl:81
   [4] get(obj_ref::ObjectRef)
     @ Ray ~/work/Ray.jl/Ray.jl/src/object_store.jl:31
   [5] macro expansion
     @ /opt/hostedtoolcache/julia/1.9.3/x64/share/julia/stdlib/v1.9/Test/src/Test.jl:478 [inlined]
   [6] macro expansion
     @ ~/work/Ray.jl/Ray.jl/test/task.jl:40 [inlined]
   [7] macro expansion
     @ /opt/hostedtoolcache/julia/1.9.3/x64/share/julia/stdlib/v1.9/Test/src/Test.jl:1498 [inlined]
   [8] top-level scope
     @ ~/work/Ray.jl/Ray.jl/test/task.jl:3
Submit task: Error During Test at /home/runner/work/Ray.jl/Ray.jl/test/task.jl:41
  Test threw exception
  Expression: Ray.get(Ray.get(return_ref)) == 1
  C++ object of type N3ray6BufferE was deleted
  Stacktrace:
   [1] Data
     @ ~/.julia/packages/CxxWrap/aXNBY/src/CxxWrap.jl:624 [inlined]
   [2] take!(buffer::CxxWrap.StdLib.SharedPtrAllocated{Ray.ray_julia_jll.Buffer})
     @ Ray.ray_julia_jll ~/work/Ray.jl/Ray.jl/src/ray_julia_jll/common.jl:201
   [3] deserialize_from_ray_object(x::CxxWrap.StdLib.SharedPtrAllocated{RayObject}, outer_object_ref::ObjectRef)
     @ Ray ~/work/Ray.jl/Ray.jl/src/ray_serializer.jl:81
   [4] get(obj_ref::ObjectRef)
     @ Ray ~/work/Ray.jl/Ray.jl/src/object_store.jl:31
   [5] macro expansion
     @ /opt/hostedtoolcache/julia/1.9.3/x64/share/julia/stdlib/v1.9/Test/src/Test.jl:478 [inlined]
   [6] macro expansion
     @ ~/work/Ray.jl/Ray.jl/test/task.jl:41 [inlined]
   [7] macro expansion
     @ /opt/hostedtoolcache/julia/1.9.3/x64/share/julia/stdlib/v1.9/Test/src/Test.jl:1498 [inlined]
   [8] top-level scope
     @ ~/work/Ray.jl/Ray.jl/test/task.jl:3
[2023-10-03 15:58:46,897 C 53904 53904] id_def.h:25: Check failed: binary.size() == Size() || binary.size() == 0 expected size is 28, but got data �Uiץh�0*��o�2*�.����;J� of size 27
*** StackTrace Information ***
/home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3raylsERSoRKNS_10StackTraceE+0x57) [0x7fd3ce9e9f67] ray::operator<<()
/home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray13SpdLogMessage5FlushEv+0x373) [0x7fd3ce9ec7f3] ray::SpdLogMessage::Flush()
/home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray6RayLogD1Ev+0x48) [0x7fd3ce9eca78] ray::RayLog::~RayLog()
/home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray6NodeIDC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x156) [0x7fd3ce1fc8d6] ray::NodeID::NodeID()
/home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray4core14FutureResolver21ProcessResolvedObjectERKNS_8ObjectIDERKNS_3rpc7AddressERKNS_6StatusERKNS5_20GetObjectStatusReplyE+0x14b) [0x7fd3ce2149fb] ray::core::FutureResolver::ProcessResolvedObject()
/home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray4core10CoreWorker37RegisterOwnershipInfoAndResolveFutureERKNS_8ObjectIDES4_RKNS_3rpc7AddressERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0xad) [0x7fd3ce1d2e6d] ray::core::CoreWorker::RegisterOwnershipInfoAndResolveFuture()
/home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN5jlcxx6detail11CallFunctorIvJRN3ray4core10CoreWorkerERKNS2_8ObjectIDES8_RKNS2_3rpc7AddressERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEE5applyEPKvNS_13WrappedCppPtrESO_SO_SO_SO_+0x76) [0x7fd3ce01ffe6] jlcxx::detail::CallFunctor<>::apply()
[0x7fd3d5e65335]
[0x7fd3d5e65417]
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_apply_generic+0x26e) [0x7fd42f242f4e] ijl_apply_generic
[0x7fd42edf9bb0]
[0x7fd42edfa917]
[0x7fd42edfb06e]
[0x7fd42edfb113]
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_apply_generic+0x26e) [0x7fd42f242f4e] ijl_apply_generic
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5ed95) [0x7fd42f25ed95] do_call
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5e7b8) [0x7fd42f25e7b8] eval_value
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5f83b) [0x7fd42f25f83b] eval_body
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5f950) [0x7fd42f25f950] eval_body
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5f950) [0x7fd42f25f950] eval_body
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5f950) [0x7fd42f25f950] eval_body
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5f950) [0x7fd42f25f950] eval_body
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x602e2) [0x7fd42f2602e2] jl_interpret_toplevel_thunk
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x7b22c) [0x7fd42f27b22c] jl_toplevel_eval_flex
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x7ba7a) [0x7fd42f27ba7a] jl_toplevel_eval_flex
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_toplevel_eval_in+0xab) [0x7fd42f27cdfb] ijl_toplevel_eval_in
/opt/hostedtoolcache/julia/1.9.3/x64/lib/julia/sys.so(+0x1726668) [0x7fd41a926668] japi1_include_string_49015.clone_1.clone_2
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_apply_generic+0x26e) [0x7fd42f242f4e] ijl_apply_generic
/opt/hostedtoolcache/julia/1.9.3/x64/lib/julia/sys.so(+0x155abfe) [0x7fd41a75abfe] japi1__include_51048.clone_1.clone_2
[0x7fd42edab71b]
[0x7fd42edabc63]
[0x7fd42edabf70]
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_apply_generic+0x26e) [0x7fd42f242f4e] ijl_apply_generic
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5ed95) [0x7fd42f25ed95] do_call
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5e7b8) [0x7fd42f25e7b8] eval_value
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5f406) [0x7fd42f25f406] eval_body
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5f950) [0x7fd42f25f950] eval_body
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5f950) [0x7fd42f25f950] eval_body
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x602e2) [0x7fd42f2602e2] jl_interpret_toplevel_thunk
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x7b22c) [0x7fd42f27b22c] jl_toplevel_eval_flex
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x7ba7a) [0x7fd42f27ba7a] jl_toplevel_eval_flex
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_toplevel_eval_in+0xab) [0x7fd42f27cdfb] ijl_toplevel_eval_in
/opt/hostedtoolcache/julia/1.9.3/x64/lib/julia/sys.so(+0x1726668) [0x7fd41a926668] japi1_include_string_49015.clone_1.clone_2
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_apply_generic+0x26e) [0x7fd42f242f4e] ijl_apply_generic
/opt/hostedtoolcache/julia/1.9.3/x64/lib/julia/sys.so(+0x155abfe) [0x7fd41a75abfe] japi1__include_51048.clone_1.clone_2
[0x7fd42ed00187]
[0x7fd42ed001a3]
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_apply_generic+0x26e) [0x7fd42f242f4e] ijl_apply_generic
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5ed95) [0x7fd42f25ed95] do_call
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5e7b8) [0x7fd42f25e7b8] eval_value
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x5f406) [0x7fd42f25f406] eval_body
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x602e2) [0x7fd42f2602e2] jl_interpret_toplevel_thunk
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x7b22c) [0x7fd42f27b22c] jl_toplevel_eval_flex
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0x7ba7a) [0x7fd42f27ba7a] jl_toplevel_eval_flex
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_toplevel_eval_in+0xab) [0x7fd42f27cdfb] ijl_toplevel_eval_in
/opt/hostedtoolcache/julia/1.9.3/x64/lib/julia/sys.so(+0x175afca) [0x7fd41a95afca] julia_exec_options_47864.clone_1.clone_2
/opt/hostedtoolcache/julia/1.9.3/x64/lib/julia/sys.so(+0x1186d50) [0x7fd41a386d50] julia__start_40033.clone_1
/opt/hostedtoolcache/julia/1.9.3/x64/lib/julia/sys.so(+0x1186e59) [0x7fd41a386e59] jfptr__start_40034.clone_1
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_apply_generic+0x26e) [0x7fd42f242f4e] ijl_apply_generic
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(+0xa7a16) [0x7fd42f2a7a16] true_main
/opt/hostedtoolcache/julia/1.9.3/x64/bin/../lib/julia/libjulia-internal.so.1(jl_repl_entrypoint+0x8f) [0x7fd42f2a845f] jl_repl_entrypoint
/opt/hostedtoolcache/julia/1.9.3/x64/bin/julia(main+0x9) [0x401089] main
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7fd430029d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7fd430029e40] __libc_start_main

terminate called after throwing an instance of 'nlohmann::detail::type_error'
  what(): [json.exception.type_error.316] invalid UTF-8 byte at index 204: 0x9A

[53904] signal (6.-6): Aborted
in expression starting at /home/runner/work/Ray.jl/Ray.jl/test/task.jl:83
pthread_kill at
/lib/x86_64-linux-gnu/libc.so.6 (unknown line) raise at /lib/x86_64-linux-gnu/libc.so.6 (unknown line) abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line) __verbose_terminate_handler at /workspace/srcdir/gcc-12.1.0/libstdc++-v3/libsupc++/vterminate.cc:95 __terminate at /workspace/srcdir/gcc-12.1.0/libstdc++-v3/libsupc++/eh_terminate.cc:48 __cxa_call_terminate at /workspace/srcdir/gcc-12.1.0/libstdc++-v3/libsupc++/eh_call.cc:54 __gxx_personality_v0 at /workspace/srcdir/gcc-12.1.0/libstdc++-v3/libsupc++/eh_personality.cc:688 _Unwind_RaiseException_Phase2 at /workspace/srcdir/gcc-12.1.0/libgcc/unwind.inc:64 _Unwind_Resume at /workspace/srcdir/gcc-12.1.0/libgcc/unwind.inc:242 _ZN3ray8RayEvent11SendMessageERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE.cold at /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so (unknown line) _ZN3ray8RayEventD1Ev at /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so (unknown line) _ZN3ray8RayEvent11ReportEventERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES8_S8_PKci at /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so (unknown line) _ZNSt17_Function_handlerIFvRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES7_EZN3ray12EventManagerC4EvEUlS7_S7_E_E9_M_invokeERKSt9_Any_dataS7_S7_ at /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so (unknown line) _ZN3ray6RayLogD1Ev at /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so (unknown line) _ZN3ray6NodeIDC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE at /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so (unknown line) _ZN3ray4core14FutureResolver21ProcessResolvedObjectERKNS_8ObjectIDERKNS_3rpc7AddressERKNS_6StatusERKNS5_20GetObjectStatusReplyE at /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so (unknown line) 
_ZN3ray4core10CoreWorker37RegisterOwnershipInfoAndResolveFutureERKNS_8ObjectIDES4_RKNS_3rpc7AddressERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE at /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so (unknown line) _ZN5jlcxx6detail11CallFunctorIvJRN3ray4core10CoreWorkerERKNS2_8ObjectIDES8_RKNS2_3rpc7AddressERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEE5applyEPKvNS_13WrappedCppPtrESO_SO_SO_SO_ at /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so (unknown line) RegisterOwnershipInfoAndResolveFuture at /home/runner/.julia/packages/CxxWrap/aXNBY/src/CxxWrap.jl:624 unknown function (ip: 0x7fd3d5e65416) _jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined] ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940 _register_ownership at /home/runner/work/Ray.jl/Ray.jl/src/object_ref.jl:125 deserialize_from_ray_object at /home/runner/work/Ray.jl/Ray.jl/src/ray_serializer.jl:91 get at /home/runner/work/Ray.jl/Ray.jl/src/object_store.jl:31 unknown function (ip: 0x7fd42edfb112) _jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined] ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940 jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined] do_call at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:126 eval_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:226 eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:478 eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:533 eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:533 eval_body at 
/cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:533 eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:533 jl_interpret_toplevel_thunk at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:762 jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:912 jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:856 ijl_toplevel_eval_in at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:971 eval at ./boot.jl:370 [inlined] include_string at ./loading.jl:1903 _jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined] ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940 _include at ./loading.jl:1963 include at ./client.jl:478 [inlined] #5 at /home/runner/work/Ray.jl/Ray.jl/test/runtests.jl:36 [inlined] setup_core_worker at /home/runner/work/Ray.jl/Ray.jl/test/utils.jl:19 #4 at /home/runner/work/Ray.jl/Ray.jl/test/runtests.jl:31 [inlined] setup_ray_head_node at /home/runner/work/Ray.jl/Ray.jl/test/utils.jl:10 unknown function (ip: 0x7fd42edabf6f) _jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined] ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940 jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined] do_call at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:126 eval_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:226 eval_stmt_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:177 [inlined] eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:624 eval_body at 
/cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:533 eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:533 jl_interpret_toplevel_thunk at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:762 jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:912 jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:856 ijl_toplevel_eval_in at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:971 eval at ./boot.jl:370 [inlined] include_string at ./loading.jl:1903 _jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined] ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940 _include at ./loading.jl:1963 include at ./client.jl:478 unknown function (ip: 0x7fd42ed001a2) _jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined] ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940 jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined] do_call at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:126 eval_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:226 eval_stmt_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:177 [inlined] eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:624 jl_interpret_toplevel_thunk at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:762 jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:912 jl_toplevel_eval_flex at 
/cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:856 ijl_toplevel_eval_in at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:971 eval at ./boot.jl:370 [inlined] exec_options at ./client.jl:280 _start at ./client.jl:522 jfptr__start_40034.clone_1 at /opt/hostedtoolcache/julia/1.9.3/x64/lib/julia/sys.so (unknown line) _jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined] ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940 jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined] true_main at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:573 jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:717 main at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/cli/loader_exe.c:59 unknown function (ip: 0x7fd430029d8f) __libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line) unknown function (ip: 0x4010b8) Allocations: 22671710 (Pool: 22653538; Big: 18172); GC: 29 ``` – https://github.com/beacon-biosignals/Ray.jl/actions/runs/6395254236/job/17358479946?pr=179
omus commented 1 year ago

https://github.com/beacon-biosignals/Ray.jl/actions/runs/6396789493/job/17363677925?pr=179

kleinschmidt commented 1 year ago

suspicious that our old friend RegisterOwnershipInfoAndResolveFuture shows up in the stack trace, I'll look at the others as well

kleinschmidt commented 1 year ago

this one seems to be mangling the owner address somehow (27 bytes instead of 28, maybe a string conversion issue?) https://github.com/beacon-biosignals/Ray.jl/issues/177#issuecomment-1745342806

kleinschmidt commented 1 year ago

Random theory: maybe the Bazel cache is behaving badly in that it uses the incorrect version of the shared library?

I don't remember seeing anything like this before the sub-package/sub-module shuffle, but then again the source hasn't changed since then so I don't really know how that could be responsible. I kinda find it hard to believe that we just were getting lucky though...

omus commented 1 year ago

https://github.com/beacon-biosignals/Ray.jl/actions/runs/6436720204/job/17480614493?pr=186

omus commented 1 year ago

> Random theory: maybe the Bazel cache is behaving badly in that it uses the incorrect version of the shared library?

In #179 I changed the cache names and also fixed an issue where the hashFiles check wasn't being used. If this was just a bad Bazel cache issue I wouldn't expect it to have persisted past that PR. Also, since re-running the jobs passes, and those re-runs would be based on the same cache information, I'm doubtful this theory is correct anymore.

kleinschmidt commented 1 year ago

https://github.com/beacon-biosignals/Ray.jl/actions/runs/6436720204/job/17480614493?pr=186#step:12:210 suggests that the worker is dying for some reason (error code 0 is WORKER_DIED)

...which makes sense because we see a RAY_CHECK failure right after:

[2023-10-06 21:43:21,492 C 8415 8415] id_def.h:25:  Check failed: binary.size() == Size() || binary.size() == 0 expected size is 28, but got data ��"��mr,�+�|��y~ of size 20

There are two things I can think of off the top of my head that might be causing this:

  1. null bytes in the string that cause it somehow to be truncated (would explain the "too short" NodeID field in address, but I have no idea why this is only cropping up NOW)
  2. memory is getting mangled somewhere (although if it was actually released/deallocated, I'd expect a segfault, not this...)

So maybe we should at least temporarily add some more println debugging to print out the address string before we try to register ownership?
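
The truncation hypothesis and the proposed println debugging can be sketched in Julia. This is only an illustration under stated assumptions, not Ray.jl code: `describe_binary` and `EXPECTED_ID_SIZE` are hypothetical names, and the value 28 is taken from the check message above. It shows how a NUL-terminated read shortens a binary string while a length-aware read keeps every byte.

```julia
# Hypothetical constant: the ID size (in bytes) that the failing check expects.
const EXPECTED_ID_SIZE = 28

# Hypothetical debugging helper (not part of Ray.jl): summarize a binary
# string as its byte count and hex encoding so truncation shows up in logs.
function describe_binary(label::AbstractString, data::AbstractString)
    bytes = codeunits(data)
    return string(label, ": ", length(bytes), " bytes, hex=", bytes2hex(bytes))
end

# Build a 28-byte buffer with a NUL byte at position 21 (1-based).
raw = fill(0x61, EXPECTED_ID_SIZE)
raw[21] = 0x00

# Length-aware read: keeps every byte, including the embedded NUL.
full = GC.@preserve raw unsafe_string(pointer(raw), length(raw))

# NUL-terminated read: stops at the first NUL and drops the tail.
truncated = GC.@preserve raw unsafe_string(pointer(raw))

println(describe_binary("full", full))
println(describe_binary("truncated", truncated))
println("full has expected size: ", ncodeunits(full) == EXPECTED_ID_SIZE)
println("truncated has expected size: ", ncodeunits(truncated) == EXPECTED_ID_SIZE)
```

Printing the byte count and hex form, rather than the raw string, keeps the debug output readable even when the address contains non-printable bytes. Whether an embedded NUL is actually what shortens the owner address here is not established in this thread.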

omus commented 1 year ago

Had a breakthrough with #189:

Error During Test at /home/runner/work/Ray.jl/Ray.jl/test/task.jl:40
  Test threw exception
  Expression: Ray.get(return_ref) == remote_ref
  Encountered unhandled metadata from `ObjectRef("f4402ec78d3a2607ffffffffffffffffffffffff0100000001000000")`: 0

https://github.com/beacon-biosignals/Ray.jl/actions/runs/6437112800/job/17481699838?pr=189

The 0 indicates an ErrorType of WORKER_DIED.
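
A minimal sketch of making that metadata readable in an error message. It assumes the metadata arrives as the decimal text of the error code, which this thread does not confirm; `KNOWN_ERROR_TYPES` and `describe_error_metadata` are hypothetical names, and only the `0 => WORKER_DIED` mapping comes from the discussion above.

```julia
# Only 0 => WORKER_DIED is confirmed in this thread; other codes are
# deliberately left unmapped rather than guessed.
const KNOWN_ERROR_TYPES = Dict(0 => "WORKER_DIED")

# Hypothetical helper: turn raw metadata text into a readable description.
function describe_error_metadata(metadata::AbstractString)
    code = tryparse(Int, strip(metadata))
    code === nothing && return "non-numeric metadata: " * repr(metadata)
    name = get(KNOWN_ERROR_TYPES, code, "unmapped ErrorType")
    return string(name, " (", code, ")")
end

println(describe_error_metadata("0"))
println(describe_error_metadata("7"))
```

Falling back to an explicit "unmapped" label avoids hiding codes that have not been looked up yet.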

omus commented 1 year ago

It occurred to me that with the latest discovery we should definitely upload the /tmp/ray/session_latest CI logs to an Artifact for further analysis. I'll work on a PR to put that in place.

omus commented 1 year ago

Finally got a failure again after #192 was merged, which now lets us inspect the server-side logs: https://github.com/beacon-biosignals/Ray.jl/actions/runs/6474941536

omus commented 1 year ago

Interesting parts from the backend logs. The specific worker that failed (PID 6021) didn't report anything interesting.

raylet.out ``` [2023-10-10 21:22:40,094 D 5673 5673] (raylet) ray_syncer-inl.h:274: [BidiReactor] Sending message to 00000000000000000000000000000000000000000000000000000000 about node 5bae06d165e28a8777bb3a3a7487448d0090fa963a2d77d02c086cca with flush 1 [2023-10-10 21:22:40,289 D 5673 5687] (raylet) store.cc:467: Disconnecting client on fd 44 [2023-10-10 21:22:40,289 D 5673 5687] (raylet) store.cc:331: Disconnecting client on fd -1 [2023-10-10 21:22:40,289 D 5673 5673] (raylet) node_manager.cc:1213: [Worker] Message DisconnectClient(7) from worker with PID 6021 [2023-10-10 21:22:40,289 I 5673 5673] (raylet) node_manager.cc:1449: NodeManager::DisconnectClient, disconnect_type=0, has creation task exception = true [2023-10-10 21:22:40,289 D 5673 5673] (raylet) dependency_manager.cc:149: Canceling get request for worker 322df0ba57d2af54475c0fb6b6bea69b3484e190e96fcd4d60889416 [2023-10-10 21:22:40,289 D 5673 5673] (raylet) dependency_manager.cc:94: Canceling wait request for worker 322df0ba57d2af54475c0fb6b6bea69b3484e190e96fcd4d60889416 [2023-10-10 21:22:40,289 D 5673 5673] (raylet) accessor.cc:855: Reporting worker failure, worker_id: "2-\360\272W\322\257TG\\\017\266\266\276\246\2334\204\341\220\3 51o\315M`\210\224\026" [2023-10-10 21:22:40,289 D 5673 5673] (raylet) cluster_resource_manager.cc:63: Update node info, node_id: 5792679963842218173, node_resources: {node:10.0.0.121: 10000/10000, memory: 385742241800000/385742241800000, object_store_memory: 192868048600000/192871120890000, CPU: 160000/160000} [2023-10-10 21:22:40,289 D 5673 5673] (raylet) accessor.cc:815: Publishing job error, job id = 01000000 [2023-10-10 21:22:40,289 D 5673 5673] (raylet) worker_pool.cc:1629: DeleteRuntimeEnvIfPossible {"julia_command":["/opt/hostedtoolcache/julia/1.8.5/x64/bin/julia","-Cnative","-J/opt/hostedtoolcache/julia/1.8.5/x64/lib/julia/sys.so","--depwarn=yes","--check-bounds=yes","-g1","--code-coverage=user","--color=yes","--startup-file=no","-e","using Ray; 
start_worker()"],"env_vars":{"JULIA_PROJECT":"/tmp/jl_duBsYG"}} [2023-10-10 21:22:40,289 D 5673 5673] (raylet) node_manager.cc:817: Received a HandleGetTaskFailureCause request for task d2b678ce394bc49619123cec5da8f5caa0eb2dfe01000000 [2023-10-10 21:22:40,289 I 5673 5673] (raylet) node_manager.cc:827: didn't find failure cause for task d2b678ce394bc49619123cec5da8f5caa0eb2dfe01000000 [2023-10-10 21:22:40,290 D 5673 5673] (raylet) accessor.cc:824: Finished publishing job error, job id = 01000000 [2023-10-10 21:22:40,290 D 5673 5673] (raylet) accessor.cc:865: Finished reporting worker failure, worker_id: "2-\360\272W\322\257TG\\\017\266\266\276\246\2334\204\ 341\220\351o\315M`\210\224\026" , status = OK ```
raylet.err ``` [2023-10-10 21:22:40,279 C 6021 6021] id_def.h:25: Check failed: binary.size() == Size() || binary.size() == 0 expected size is 28, but got data [^Fe⊇w ::t<87>D<8D> of size 16 *** StackTrace Information *** /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3raylsERSoRKNS_10StackTraceE+0x57) [0x7f312cfece27] ray::operator<<() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray13SpdLogMessage5FlushEv+0x373) [0x7f312cfef6b3] ray::SpdLogMessage::Flush() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray6RayLogD1Ev+0x48) [0x7f312cfef938] ray::RayLog::~RayLog() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray6NodeIDC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x156) [0x7f312c7ff 7f6] ray::NodeID::NodeID() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray4core14FutureResolver21ProcessResolvedObjectERKNS_8ObjectIDERKNS_3rpc7AddressERKNS_6 StatusERKNS5_20GetObjectStatusReplyE+0x14b) [0x7f312c81791b] ray::core::FutureResolver::ProcessResolvedObject() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray4core10CoreWorker37RegisterOwnershipInfoAndResolveFutureERKNS_8ObjectIDES4_RKNS_3rpc 7AddressERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0xad) [0x7f312c7d5d8d] ray::core::CoreWorker::RegisterOwnershipInfoAndResolveFuture() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN5jlcxx6detail11CallFunctorIvJRN3ray4core10CoreWorkerERKNS2_8ObjectIDES8_RKNS2_3rpc7Addre ssERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEE5applyEPKvNS_13WrappedCppPtrESO_SO_SO_SO_+0x76) [0x7f312c621e36] jlcxx::detail::CallFunctor<>::apply() [0x7f31d8564a6f] [0x7f31d8564b82] /opt/hostedtoolcache/julia/1.8.5/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_apply_generic+0x30e) [0x7f31d8e49dbe] ijl_apply_generic [0x7f31d85320d1] [0x7f31d8533bae] [0x7f31d85344b5] 
[0x7f31d8536ba6] [0x7f31d8538de5] /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(+0xb1db62) [0x7f312c71db62] std::_Function_handler<>::_M_invoke() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray4core10CoreWorker11ExecuteTaskERKNS_17TaskSpecificationERKSt10shared_ptrISt13unorder ed_mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt6vectorISt4pairIldESaISF_EESt4hashISC_ESt8equal_toISC_ESaISE_IKSC_SH_EEEEPSD_ISE_INS_8ObjectIDES5_INS_9 RayObjectEEESaISW_EESZ_PN6google8protobuf16RepeatedPtrFieldINS_3rpc20ObjectReferenceCountEEEPbPSC_+0xc80) [0x7f312c80c140] ray::core::CoreWorker::ExecuteTask() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZNSt17_Function_handlerIFN3ray6StatusERKNS0_17TaskSpecificationESt10shared_ptrISt13unorder ed_mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt6vectorISt4pairIldESaISF_EESt4hashISC_ESt8equal_toISC_ESaISE_IKSC_SH_EEEEPSD_ISE_INS0_8ObjectIDES5_INS0 _9RayObjectEEESaISU_EESX_PN6google8protobuf16RepeatedPtrFieldINS0_3rpc20ObjectReferenceCountEEEPbPSC_ESt5_BindIFMNS0_4core10CoreWorkerEFS1_S4_RKSQ_SX_SX_S14_S15_S16 _EPS1A_St12_PlaceholderILi1EES1G_ILi2EES1G_ILi3EES1G_ILi4EES1G_ILi5EES1G_ILi6EES1G_ILi7EEEEE9_M_invokeERKSt9_Any_dataS4_OSQ_OSX_S1V_OS14_OS15_OS16_+0x58) [0x7f312c7 3d948] std::_Function_handler<>::_M_invoke() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(+0xc33406) [0x7f312c833406] ray::core::CoreWorkerDirectTaskReceiver::HandleTask()::{lambda( )#1}::operator()() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(+0xc3462b) [0x7f312c83462b] std::_Function_handler<>::_M_invoke() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray4core14InboundRequest6AcceptEv+0x6e) [0x7f312c847e5e] ray::core::InboundRequest::Acc ept() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray4core21NormalSchedulingQueue16ScheduleRequestsEv+0x3a9) 
[0x7f312c819e69] ray::core:: NormalSchedulingQueue::ScheduleRequests() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN12EventTracker15RecordExecutionERKSt8functionIFvvEESt10shared_ptrI11StatsHandleE+0x69) [ 0x7f312caf16a9] EventTracker::RecordExecution() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(+0xe8caa0) [0x7f312ca8caa0] std::_Function_handler<>::_M_invoke() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN5boost4asio6detail18completion_handlerISt8functionIFvvEENS0_10io_context19basic_executor _typeISaIvELm0EEEE11do_completeEPvPNS1_19scheduler_operationERKNS_6system10error_codeEm+0x9b) [0x7f312ca8cf7b] boost::asio::detail::completion_handler<>::do_complet e() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN5boost4asio6detail9scheduler10do_run_oneERNS1_27conditionally_enabled_mutex11scoped_lock ERNS1_21scheduler_thread_infoERKNS_6system10error_codeE+0x3a3) [0x7f312d04da43] boost::asio::detail::scheduler::do_run_one() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN5boost4asio6detail9scheduler3runERNS_6system10error_codeE+0x119) [0x7f312d052c09] boost: :asio::detail::scheduler::run() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN5boost4asio10io_context3runEv+0x46) [0x7f312d054496] boost::asio::io_context::run() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray4core10CoreWorker20RunTaskExecutionLoopEv+0x20) [0x7f312c7ce8a0] ray::core::CoreWork er::RunTaskExecutionLoop() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray4core21CoreWorkerProcessImpl26RunWorkerTaskExecutionLoopEv+0x90) [0x7f312c814d40] ra y::core::CoreWorkerProcessImpl::RunWorkerTaskExecutionLoop() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN3ray4core17CoreWorkerProcess20RunTaskExecutionLoopEv+0x21) [0x7f312c814ef1] ray::core::C 
oreWorkerProcess::RunTaskExecutionLoop() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_Z17initialize_workerNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES4_S4_S4_illPv+0x3 25) [0x7f312c72a1e5] initialize_worker() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZNSt17_Function_handlerIFvNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES5_S5_S5_ill PvEPS7_E9_M_invokeERKSt9_Any_dataOS5_SD_SD_SD_OiOlSF_OS6_+0x18a) [0x7f312c612a6a] std::_Function_handler<>::_M_invoke() /home/runner/work/Ray.jl/Ray.jl/build/bazel-bin/julia_core_worker_lib.so(_ZN5jlcxx6detail11CallFunctorIvJNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES7_S7_S 7_illPvEE5applyEPKvNS_13WrappedCppPtrESC_SC_SC_illS8_+0x1a4) [0x7f312c6254b4] jlcxx::detail::CallFunctor<>::apply() [0x7f31d8527715] [0x7f31d8527cd9] [0x7f31d8527f9b] /opt/hostedtoolcache/julia/1.8.5/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_apply_generic+0x30e) [0x7f31d8e49dbe] ijl_apply_generic [0x7f31d8514f65] [0x7f31d851584a] [0x7f31d8515860] /opt/hostedtoolcache/julia/1.8.5/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_apply_generic+0x30e) [0x7f31d8e49dbe] ijl_apply_generic /opt/hostedtoolcache/julia/1.8.5/x64/bin/../lib/julia/libjulia-internal.so.1(+0x67085) [0x7f31d8e67085] do_call /opt/hostedtoolcache/julia/1.8.5/x64/bin/../lib/julia/libjulia-internal.so.1(+0x66ac8) [0x7f31d8e66ac8] eval_value /opt/hostedtoolcache/julia/1.8.5/x64/bin/../lib/julia/libjulia-internal.so.1(+0x67576) [0x7f31d8e67576] eval_body /opt/hostedtoolcache/julia/1.8.5/x64/bin/../lib/julia/libjulia-internal.so.1(+0x68412) [0x7f31d8e68412] jl_interpret_toplevel_thunk /opt/hostedtoolcache/julia/1.8.5/x64/bin/../lib/julia/libjulia-internal.so.1(+0x88dcb) [0x7f31d8e88dcb] jl_toplevel_eval_flex /opt/hostedtoolcache/julia/1.8.5/x64/bin/../lib/julia/libjulia-internal.so.1(+0x897e9) [0x7f31d8e897e9] jl_toplevel_eval_flex 
/opt/hostedtoolcache/julia/1.8.5/x64/bin/../lib/julia/libjulia-internal.so.1(+0x897e9) [0x7f31d8e897e9] jl_toplevel_eval_flex /opt/hostedtoolcache/julia/1.8.5/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_toplevel_eval_in+0xab) [0x7f31d8e8ac2b] ijl_toplevel_eval_in /opt/hostedtoolcache/julia/1.8.5/x64/lib/julia/sys.so(+0x1462f0f) [0x7f31c5262f0f] julia_exec_options_51760.clone_1.clone_2 /opt/hostedtoolcache/julia/1.8.5/x64/lib/julia/sys.so(+0xf6ab38) [0x7f31c4d6ab38] julia__start_38040.clone_1 /opt/hostedtoolcache/julia/1.8.5/x64/lib/julia/sys.so(+0xf6ac69) [0x7f31c4d6ac69] jfptr__start_38041.clone_1 /opt/hostedtoolcache/julia/1.8.5/x64/bin/../lib/julia/libjulia-internal.so.1(ijl_apply_generic+0x30e) [0x7f31d8e49dbe] ijl_apply_generic /opt/hostedtoolcache/julia/1.8.5/x64/bin/../lib/julia/libjulia-internal.so.1(+0xb1067) [0x7f31d8eb1067] true_main /opt/hostedtoolcache/julia/1.8.5/x64/bin/../lib/julia/libjulia-internal.so.1(jl_repl_entrypoint+0x8f) [0x7f31d8eb1aaf] jl_repl_entrypoint /opt/hostedtoolcache/julia/1.8.5/x64/bin/julia(main+0x9) [0x401069] main /lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f31d9c29d90] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7f31d9c29e40] __libc_start_main /opt/hostedtoolcache/julia/1.8.5/x64/bin/julia() [0x401099] ```
kleinschmidt commented 1 year ago

yeah, this is the kinda message I've been seeing in the other failed jobs' logs:

[2023-10-10 21:22:40,279 C 6021 6021] id_def.h:25:  Check failed: binary.size() == Size() || binary.size() == 0 expected size is 28, but got data [<AE>^F<D1>e⊇w<BB>
::t<87>D<8D> of size 16

somehow the owner address is still getting mangled...

omus commented 1 year ago

Another failure: https://github.com/beacon-biosignals/Ray.jl/actions/runs/6474941536/job/17580938756?pr=191

omus commented 1 year ago

https://github.com/beacon-biosignals/Ray.jl/actions/runs/6500341448/job/17655529034?pr=200

omus commented 12 months ago

Calling out that it is very likely that #203 fixes this issue; however, we were unable to reliably reproduce the original CI problem. If this problem is noticed again, feel free to re-open this issue.