Clemapfel / jluna

Julia Wrapper for C++ with Focus on Safety, Elegance, and Ease of Use
https://clemens-cords.com/jluna
MIT License

ctest --verbose fails when built with clang++-14 #25

Closed paulerikf closed 1 year ago

paulerikf commented 2 years ago

Note: This issue occurs for me when compiling with clang++-14, but not with g++-11.

I'm working my way through the install instructions, but ctest --verbose fails after make install.

The test seems to segfault in the unsafe: resize_array: reshape test, at the call unsafe::resize_array(arr, 5, 5);

signal (11): Segmentation fault
in expression starting at none:0
...

"Segmentation fault in expression starting at none:0" is mentioned in the troubleshooting guide, but only in a multithreading context.

Additional details: Ubuntu 20.04, built with -DCMAKE_CXX_COMPILER=clang++-14. (Note: the ctests pass when built with g++-11.)

Full ctest --verbose output:

UpdateCTestConfiguration  from :/home/frivold/Code/jluna/build/DartConfiguration.tcl
Parse Config file:/home/frivold/Code/jluna/build/DartConfiguration.tcl
UpdateCTestConfiguration  from :/home/frivold/Code/jluna/build/DartConfiguration.tcl
Parse Config file:/home/frivold/Code/jluna/build/DartConfiguration.tcl
Test project /home/frivold/Code/jluna/build
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 1
    Start 1: jluna_test

1: Test command: /home/frivold/Code/jluna/build/jluna_test
1: Test timeout computed to be: 1500
1: [JULIA][LOG] initialization successful (1 thread(s)).
1: starting test...
1:
1: c_adapter found: [OK]
1: unsafe: gc_push / gc_pop: [OK]
1: unsafe: gc: [OK]
1: unsafe: _sym: [OK]
1: as_julia_pointer: [OK]
1: unsafe: get_function: [OK]
1: unsafe: Expr & eval: [OK]
1: unsafe: get/set value: [OK]
1: unsafe: get_field: [OK]
1: unsafe: set_field: [OK]
1: unsafe: call: [OK]
1: unsafe: new_array: [OK]
1: unsafe: new_array_from_data: [OK]
1: unsafe: override_array: [OK]
1: unsafe: swap_array_data: [OK]
1: unsafe: set_array_data: [OK]
1:
1: signal (11): Segmentation fault
1: in expression starting at none:0
1: set_nth_field at /buildworker/worker/package_linux64/build/src/datatype.c:1498 [inlined]
1: jl_new_struct at /buildworker/worker/package_linux64/build/src/datatype.c:1251
1: _ZN5jluna6unsafe12resize_arrayEP10jl_array_tmm at /home/frivold/Code/jluna/install/libjluna.so.0.9.1 (unknown line)
1: _ZZ4mainENK4$_19clEv at /home/frivold/Code/jluna/build/jluna_test (unknown line)
1: _ZN5jluna6detail4Test4testIZ4mainE4$_19EEvNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEOT_ at /home/frivold/Code/jluna/build/jluna_test (unknown line)
1: main at /home/frivold/Code/jluna/build/jluna_test (unknown line)
1: __libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
1: _start at /home/frivold/Code/jluna/build/jluna_test (unknown line)
1: Allocations: 1724877 (Pool: 1723947; Big: 930); GC: 3
1/1 Test #1: jluna_test .......................***Exception: SegFault  0.92 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) =   0.92 sec

The following tests FAILED:
          1 - jluna_test (SEGFAULT)
Errors while running CTest
Output from these tests are in: /home/frivold/Code/jluna/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.
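The jluna frames in the backtrace above are printed as mangled symbols. As a minimal sketch, assuming a GCC or Clang toolchain on Linux (the Itanium C++ ABI), they can be decoded with `abi::__cxa_demangle`; the helper name `demangle` is made up for this illustration and is not part of jluna.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Decode an Itanium-ABI mangled symbol (the scheme GCC and Clang use on Linux).
// If demangling fails, the input is returned unchanged.
std::string demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    // On success (status == 0) the result is a malloc'ed buffer we must free.
    char* buffer = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || buffer == nullptr) {
        std::free(buffer);
        return mangled;
    }
    std::string result(buffer);
    std::free(buffer);
    return result;
}
```

Calling `demangle("_ZN5jluna6unsafe12resize_arrayEP10jl_array_tmm")` yields `jluna::unsafe::resize_array(jl_array_t*, unsigned long, unsigned long)`, i.e. the overload taking two sizes, which matches the `unsafe::resize_array(arr, 5, 5)` call quoted in the report.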

Clemapfel commented 2 years ago

I have a hunch this is unrelated to the compiler, as I cannot reproduce the crash on clang-14. However, if we force a full garbage collection at this line:

jl_array_t * arr = (jl_array_t*) jl_eval_string("return [i for i in 1:(5*5*5)]");
jl_gc_collect(JL_GC_FULL); // new
unsafe::resize_array(arr, 5, 5, 5);
Test::assert_that(jl_array_ndims(arr) == 3);

It crashes with the same error, regardless of the compiler.

I'm pretty sure what's happening is that, during that test, there is a small (random?) chance for the Julia GC to trigger collection of the array between these lines:

jl_array_t * arr = (jl_array_t*) jl_eval_string("return [i for i in 1:(5*5*5)]");
// here
unsafe::resize_array(arr, 5, 5, 5);

since, for the duration of resize_array and the assert, it makes sure that the GC cannot touch arr.

This means it's unrelated to any library function and more a problem in the test itself. I'll put in manual safeguards to make sure there's a 0% chance for arr to get sniped. I don't think this is a bug, though.
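The hazard described here can be illustrated without Julia at all. The following is a toy model only, not Julia's collector and not jluna code: `ToyHeap`, `RootGuard` and `survives_collection` are hypothetical names, and the only rule modelled is "a full collection frees everything that is not rooted". In the real embedding API, the analogous protection is the `JL_GC_PUSH1` / `JL_GC_POP` macro pair from `julia.h`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_set>
#include <utility>
#include <vector>

// Toy stand-in for a garbage-collected heap (NOT Julia's GC).
class ToyHeap {
public:
    using Handle = std::size_t;

    // Allocate an object; it stays unrooted until add_root() is called.
    Handle allocate(std::vector<int> data) {
        objects_.push_back(std::make_unique<std::vector<int>>(std::move(data)));
        return objects_.size() - 1;
    }

    void add_root(Handle h) { roots_.insert(h); }
    void remove_root(Handle h) { roots_.erase(h); }

    // Full collection: free every object that is not rooted.
    void collect() {
        for (Handle h = 0; h < objects_.size(); ++h)
            if (roots_.count(h) == 0) objects_[h].reset();
    }

    bool is_alive(Handle h) const { return objects_[h] != nullptr; }

private:
    std::vector<std::unique_ptr<std::vector<int>>> objects_;
    std::unordered_set<Handle> roots_;
};

// RAII guard: roots a handle on construction, unroots it on destruction.
class RootGuard {
public:
    RootGuard(ToyHeap& heap, ToyHeap::Handle h) : heap_(heap), handle_(h) {
        heap_.add_root(handle_);
    }
    ~RootGuard() { heap_.remove_root(handle_); }
    RootGuard(const RootGuard&) = delete;
    RootGuard& operator=(const RootGuard&) = delete;

private:
    ToyHeap& heap_;
    ToyHeap::Handle handle_;
};

// Allocate an array, force a collection, and report whether it survived.
bool survives_collection(bool rooted) {
    ToyHeap heap;
    ToyHeap::Handle arr = heap.allocate(std::vector<int>(5 * 5 * 5, 0));
    if (rooted) {
        RootGuard guard(heap, arr);  // protected before any collection can run
        heap.collect();
        return heap.is_alive(arr);
    }
    heap.collect();                  // the unprotected window: arr is reclaimed
    return heap.is_alive(arr);
}
```

In this model `survives_collection(false)` returns `false` and `survives_collection(true)` returns `true`, which mirrors why forcing a full collection right after the allocation makes a later use of the array crash.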

Could you try running the same test (compiled with clang-14) multiple times and see if it crashes at the same point every time? Also, what Julia version are you on?

Thank you for bringing this up, though!

paulerikf commented 1 year ago

Thanks for looking into this!

Hmm... I'm not sure what's going on then. Reran the test a bunch of times, and it is consistently crashing at the same point every single time.
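For anyone wanting to repeat this check, a small helper along the following lines could tally the outcomes of repeated runs. It is only a sketch for POSIX systems; `RunTally` and `run_repeatedly` are names invented for this example, and the command string is whatever launches the test binary on your machine.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <sys/wait.h>

// Outcome counts over a series of runs of the same command.
struct RunTally {
    int clean = 0;     // exited with status 0
    int failed = 0;    // exited with a non-zero status, or could not be started
    int signaled = 0;  // terminated by a signal such as SIGSEGV
};

// Run `command` through the shell `repetitions` times and classify each result.
RunTally run_repeatedly(const std::string& command, int repetitions) {
    assert(repetitions >= 0);
    RunTally tally;
    for (int i = 0; i < repetitions; ++i) {
        int status = std::system(command.c_str());
        if (status == -1) {
            ++tally.failed;
        } else if (WIFSIGNALED(status)) {
            ++tally.signaled;
        } else if (WIFEXITED(status) && WEXITSTATUS(status) > 128) {
            // Shells commonly report a child killed by signal N as exit code 128 + N.
            ++tally.signaled;
        } else if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            ++tally.clean;
        } else {
            ++tally.failed;
        }
    }
    return tally;
}
```

Run from the build directory, something like `run_repeatedly("./jluna_test", 20)` would show whether every run ends in a signal or only some of them do, which helps separate a deterministic crash from a timing-dependent one.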

I'm running Julia version 1.7.1. Not sure if this is relevant, but I ended up uninstalling gcc-11/g++-11 because it was causing issues in other places, so clang is using gcc-9 internally.

I have been using jluna compiled with clang-14 without problems since then, although I haven't messed around with the unsafe functions yet.

Final note: If I comment out the specific unsafe: resize_array: reshape test that fails, all the other tests pass every time.

Anyway, weird! Thanks again for looking into this. Please let me know if there's any other info you need or anything I can try to help debug this.

Clemapfel commented 1 year ago

Let me know if this still occurs and I'll reopen the issue.

paulerikf commented 1 year ago

Still happening, unfortunately. Seems to be breaking at the same spot too (resize_array).

Full output:

UpdateCTestConfiguration  from :/home/frivold/Code/jluna/build/DartConfiguration.tcl
Parse Config file:/home/frivold/Code/jluna/build/DartConfiguration.tcl
UpdateCTestConfiguration  from :/home/frivold/Code/jluna/build/DartConfiguration.tcl
Parse Config file:/home/frivold/Code/jluna/build/DartConfiguration.tcl
Test project /home/frivold/Code/jluna/build
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 1
    Start 1: jluna_test

    1: Test command: /home/frivold/Code/jluna/build/jluna_test
    1: Test timeout computed to be: 1500
    1: [JULIA][LOG] initialization successful (1 thread(s)).
    1: starting test...
    1:
    1: c_adapter found: [OK]
    1: unsafe: gc_push / gc_pop: [OK]
    1: unsafe: gc: [OK]
    1: unsafe: _sym: [OK]
    1: as_julia_pointer: [OK]
    1: unsafe: get_function: [OK]
    1: unsafe: Expr & eval: [OK]
    1: unsafe: get/set value: [OK]
    1: unsafe: get_field: [OK]
    1: unsafe: set_field: [OK]
    1: unsafe: call: [OK]
    1: unsafe: new_array: [OK]
    1: unsafe: new_array_from_data: [OK]
    1: unsafe: override_array: [OK]
    1: unsafe: swap_array_data: [OK]
    1: unsafe: set_array_data: [OK]
    1:
    1: signal (11): Segmentation fault
    1: in expression starting at none:0
    1: set_nth_field at /buildworker/worker/package_linux64/build/src/datatype.c:1498 [inlined]
    1: jl_new_struct at /buildworker/worker/package_linux64/build/src/datatype.c:1251
    1: _ZN5jluna6unsafe12resize_arrayEP10jl_array_tmm at /home/frivold/Code/jluna/install/libjluna.so.0.9.1 (unknown line)
    1: _ZZ4mainENK4$_19clEv at /home/frivold/Code/jluna/build/jluna_test (unknown line)
    1: _ZN5jluna6detail4Test4testIZ4mainE4$_19EEvNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEOT_ at /home/frivold/Code/jluna/build/jluna_test (unknown line)
    1: main at /home/frivold/Code/jluna/build/jluna_test (unknown line)
    1: __libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
    1: _start at /home/frivold/Code/jluna/build/jluna_test (unknown line)
    1: Allocations: 1724878 (Pool: 1723949; Big: 929); GC: 3
    1/1 Test #1: jluna_test .......................***Exception: SegFault  0.89 sec

    0% tests passed, 1 tests failed out of 1

    Total Test time (real) =   0.89 sec

    The following tests FAILED:
              1 - jluna_test (SEGFAULT)
    Errors while running CTest
    Output from these tests are in: /home/frivold/Code/jluna/build/Testing/Temporary/LastTest.log
    Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.