JuliaInterop / libcxxwrap-julia

C++ library for backing CxxWrap.jl

Crash in Julia 1.5 and later #75

Open fingolfin opened 3 years ago

fingolfin commented 3 years ago

(This comes from comments on https://github.com/JuliaPackaging/Yggdrasil/issues/2160 but I think it deserves its own full issue):

There are at least two issues with libcxxwrap_julia_jll v0.8.5 in Julia 1.5 and later:

  1. It fails to load on macOS with the error `Symbol not found: __Unwind_Resume`; this is addressed in https://github.com/JuliaPackaging/Yggdrasil/pull/2190; see also https://github.com/JuliaPackaging/Yggdrasil/pull/2199.

  2. On Linux there is a segfault. It can also be seen in this repo's nightly CI tests, which fail with the same error. (The CI tests here only test with Julia 1.4 and nightly; it might be useful to also test 1.5 there.) This is a backtrace:

    
    ...
    Running tests from containers.jl...

    signal (11): Segmentation fault
    in expression starting at /home/mhorn/.julia/packages/CxxWrap/ZOkSN/test/containers.jl:21
    _ZN5jlcxx6detail11CallFunctorINS_10ConstArrayIdLl1EEEJEE5applyEPKv at /home/mhorn/.julia/artifacts/860a8b2216bd059600ed7c44cdaa3bb81b23ff1c/lib/libjlcxx_containers.so (unknown line)
    const_vector at /home/mhorn/.julia/packages/CxxWrap/ZOkSN/src/CxxWrap.jl:590
    macro expansion at /home/mhorn/.julia/packages/CxxWrap/ZOkSN/test/containers.jl:29 [inlined]
    macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115 [inlined]
    top-level scope at /home/mhorn/.julia/packages/CxxWrap/ZOkSN/test/containers.jl:23
    jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:834
    jl_parse_eval_all at /buildworker/worker/package_linux64/build/src/ast.c:913
    jl_load_rewrite at /buildworker/worker/package_linux64/build/src/toplevel.c:914
    include at ./client.jl:457
    _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2214 [inlined]
    jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2398
    jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1690 [inlined]
    do_call at /buildworker/worker/package_linux64/build/src/interpreter.c:117
    eval_value at /buildworker/worker/package_linux64/build/src/interpreter.c:206
    eval_stmt_value at /buildworker/worker/package_linux64/build/src/interpreter.c:157 [inlined]
    eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:566
    eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:492
    eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:492
    jl_interpret_toplevel_thunk at /buildworker/worker/package_linux64/build/src/interpreter.c:660
    jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:840
    jl_parse_eval_all at /buildworker/worker/package_linux64/build/src/ast.c:913
    jl_load_rewrite at /buildworker/worker/package_linux64/build/src/toplevel.c:914
    include at ./client.jl:457
    _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2231 [inlined]
    jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2398
    jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1690 [inlined]
    do_call at /buildworker/worker/package_linux64/build/src/interpreter.c:117
    eval_value at /buildworker/worker/package_linux64/build/src/interpreter.c:206
    eval_stmt_value at /buildworker/worker/package_linux64/build/src/interpreter.c:157 [inlined]
    eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:566
    jl_interpret_toplevel_thunk at /buildworker/worker/package_linux64/build/src/interpreter.c:660
    jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:840
    jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:790
    jl_toplevel_eval_in at /buildworker/worker/package_linux64/build/src/toplevel.c:883
    eval at ./boot.jl:331
    _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2214 [inlined]
    jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2398
    exec_options at ./client.jl:272
    _start at ./client.jl:506
    jfptr__start_53898.clone_1 at /home/mhorn/julia-1.5.3/lib/julia/sys.so (unknown line)
    _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2214 [inlined]
    jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2398
    jl_apply at /buildworker/worker/package_linux64/build/ui/../src/julia.h:1690 [inlined]
    true_main at /buildworker/worker/package_linux64/build/ui/repl.c:106
    main at /buildworker/worker/package_linux64/build/ui/repl.c:227
    __libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
    _start at /home/mhorn/julia-1.5.3/bin/julia (unknown line)
    Allocations: 16112006 (Pool: 16107521; Big: 4485); GC: 15


Applying `c++filt` to `_ZN5jlcxx6detail11CallFunctorINS_10ConstArrayIdLl1EEEJEE5applyEPKv` gives `jlcxx::detail::CallFunctor<jlcxx::ConstArray<double, 1l>>::apply(void const*)`.
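
For anyone who wants to do the same demangling from inside a program rather than with `c++filt`, here is a minimal sketch using `abi::__cxa_demangle` from `<cxxabi.h>` (available with GCC and Clang). The helper name `demangle_symbol` is made up for illustration and is not part of libcxxwrap-julia; the exact spacing of the output (e.g. `>>` versus `> >`) may differ between toolchain versions.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (Itanium C++ ABI, GCC/Clang)
#include <cstdlib>   // std::free
#include <string>

// Hypothetical helper, for illustration only: returns the demangled form of
// an Itanium-ABI symbol name, or the input unchanged if demangling fails.
std::string demangle_symbol(const char* mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with
    // malloc, so the caller has to release it with free.
    char* raw = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || raw == nullptr) {
        return std::string(mangled);
    }
    std::string result(raw);
    std::free(raw);
    return result;
}
```

Calling `demangle_symbol` on the mangled name from the backtrace should yield the same `jlcxx::detail::CallFunctor<...>::apply(void const*)` name that `c++filt` reports.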

It might be helpful to attack this with `rr`, but my early attempts failed due to a lack of instructions, and then I had to work on other stuff.

benlorenz commented 3 years ago

I did some debugging. I still don't know why this happens, but the results do suggest some workarounds, such as setting `-DCMAKE_CXX_FLAGS_RELEASE="-O2"`.

I tried several configurations to get a better backtrace and could reproduce that crash only in some rather specific cases:

And with that -O3 Debug option I got a slightly better backtrace:

#0  0x00007fffda2768cf in jlcxx::ConvertToJulia<jlcxx::ConstArray<double, 1l>, jlcxx::ConstArrayTrait>::operator() (arr=..., this=<optimized out>)
    at /workspace/srcdir/libcxxwrap-julia/include/jlcxx/const_array.hpp:95
#1  jlcxx::convert_to_julia<jlcxx::ConstArray<double, 1l> > (cpp_val=...) at /workspace/srcdir/libcxxwrap-julia/include/jlcxx/type_conversion.hpp:745
#2  jlcxx::detail::ReturnTypeAdapter<jlcxx::ConstArray<double, 1l>>::operator()(void const*) (this=<optimized out>, functor=<optimized out>)
    at /workspace/srcdir/libcxxwrap-julia/include/jlcxx/module.hpp:47
#3  jlcxx::detail::CallFunctor<jlcxx::ConstArray<double, 1l>>::apply(void const*) (functor=<optimized out>) at /workspace/srcdir/libcxxwrap-julia/include/jlcxx/module.hpp:72

`const_array.hpp:95` is a `JL_GC_POP()` call, but the fact that the crash only happens with `-O3` points to something rather annoying to debug.

That's it for today; maybe someone else wants to have another look.
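
As background on what a `JL_GC_POP()` does: Julia's `JL_GC_PUSH*` macros link a stack-allocated frame of rooted values into a per-thread list, and `JL_GC_POP()` unlinks the top frame again. The following is a toy model of that idea only. All names (`ToyGcFrame`, `toy_gcstack`, `ToyGcRoot`, `toy_depth`) are invented for illustration, the layout is not Julia's real `jl_gcframe_t`, and nothing here is meant to explain the actual cause of the crash.

```cpp
#include <cstddef>

// Toy stand-in for a GC frame; NOT Julia's real jl_gcframe_t layout.
struct ToyGcFrame {
    ToyGcFrame* prev;  // frame that was on top before this one was pushed
    void** root;       // address of the single pointer this frame protects
};

// Toy stand-in for the per-thread "top of the GC stack" pointer.
thread_local ToyGcFrame* toy_gcstack = nullptr;

// RAII wrapper: the constructor models a PUSH, the destructor models a POP.
class ToyGcRoot {
public:
    explicit ToyGcRoot(void** slot) : frame_{toy_gcstack, slot} {
        toy_gcstack = &frame_;  // link the new frame on top
    }
    ~ToyGcRoot() {
        toy_gcstack = frame_.prev;  // unlink: restore the previous top
    }
    ToyGcRoot(const ToyGcRoot&) = delete;
    ToyGcRoot& operator=(const ToyGcRoot&) = delete;

private:
    ToyGcFrame frame_;
};

// Counts the frames currently linked on this thread's toy GC stack.
std::size_t toy_depth() {
    std::size_t n = 0;
    for (ToyGcFrame* f = toy_gcstack; f != nullptr; f = f->prev) {
        ++n;
    }
    return n;
}
```

In this model a pop is just a pointer restore on a thread-local variable, which is one reason a crash reported exactly at that line is surprising and hints at something subtle, such as optimizer-dependent behaviour.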

fingolfin commented 3 years ago

@benlorenz thanks for that, that helps a lot. So perhaps we can just rebuild this with GCC 8 or 9. It may "just" be a compiler bug, after all.

fingolfin commented 3 years ago

I made https://github.com/JuliaPackaging/Yggdrasil/pull/2236; let's see if that helps.

fingolfin commented 3 years ago

That indeed seems to have fixed it, great! Now I guess the CI in this repository should be switched to use GCC 8+, too?

barche commented 3 years ago

Alright, great news that it works with the newer GCC, but I'm still quite nervous that this is a bug in libcxxwrap-julia. We'll see if it resurfaces elsewhere somehow; for now this seems like a good solution. Thanks for the help!