oscar-system / GAP.jl

GAP packages for Julia integration
https://oscar-system.github.io/GAP.jl/
GNU Lesser General Public License v3.0

`using GAP` fails when julia is using multiple threads #960

Open mjrodgers opened 6 months ago

mjrodgers commented 6 months ago

Edit (@thoma): As a workaround, start julia with --gcthreads=1.

When starting julia using multiple threads (julia -t 4), using GAP can fail. This also seems to leave julia in a fragile state, and I get a segfault when quitting.

Interestingly, using GAP seems to work fine if I launch julia using 3 threads, but 4 is a problem.

I'm on an Intel Mac running macOS Sonoma 14.2.1, with Julia 1.10.0 and GAP.jl 0.10.1.

(base) ➜  ~ julia -t 4
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.0 (2023-12-25)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using GAP
 ┌───────┐   GAP 4.12.2 of 2022-12-18
 │  GAP  │   https://www.gap-system.org
 └───────┘   Architecture: x86_64-apple-darwin14-julia1.10-64-kv8
 Configuration:  gmp 6.2.1, Julia GC, Julia 1.10.0, readline
 Loading the library Error, IS_SUBSET_FLAGS: <flags1> must be a flags list (not a plain list) in
  IS_SUBSET_FLAGS( imp2[1], imp[2]
 ) at /Users/mrodgers/.julia/artifacts/b5c2f0f824457e5c391fb24916f94d5d91c62c4f/share/gap/lib/filter.g:151 called from
InstallTrueMethodNewFilter( tofilt, from
 ); at /Users/mrodgers/.julia/artifacts/b5c2f0f824457e5c391fb24916f94d5d91c62c4f/share/gap/lib/filter.g:298 called from
<function "InstallTrueMethod">( <arguments> )
 called from read-eval loop at /Users/mrodgers/.julia/artifacts/b5c2f0f824457e5c391fb24916f94d5d91c62c4f/share/gap/lib/pcgsspec.gd:253
ERROR: InitError: GAP variable _JULIAINTERFACE_ERROR_BUFFER not bound
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] getproperty(::GAP.GlobalsType, name::Symbol)
    @ GAP ~/.julia/packages/GAP/aJO9M/src/globals.jl:42
  [3] error_handler()
    @ GAP ~/.julia/packages/GAP/aJO9M/src/GAP.jl:72
  [4] initialize(argv::Vector{String})
    @ GAP ~/.julia/packages/GAP/aJO9M/src/GAP.jl:154
  [5] __init__()
    @ GAP ~/.julia/packages/GAP/aJO9M/src/GAP.jl:294
  [6] run_module_init(mod::Module, i::Int64)
    @ Base ./loading.jl:1128
  [7] register_restored_modules(sv::Core.SimpleVector, pkg::Base.PkgId, path::String)
    @ Base ./loading.jl:1116
  [8] _include_from_serialized(pkg::Base.PkgId, path::String, ocachepath::String, depmods::Vector{Any})
    @ Base ./loading.jl:1061
  [9] _require_search_from_serialized(pkg::Base.PkgId, sourcepath::String, build_id::UInt128)
    @ Base ./loading.jl:1575
 [10] _require(pkg::Base.PkgId, env::String)
    @ Base ./loading.jl:1932
 [11] __require_prelocked(uuidkey::Base.PkgId, env::String)
    @ Base ./loading.jl:1806
 [12] #invoke_in_world#3
    @ Base ./essentials.jl:921 [inlined]
 [13] invoke_in_world
    @ Base ./essentials.jl:918 [inlined]
 [14] _require_prelocked(uuidkey::Base.PkgId, env::String)
    @ Base ./loading.jl:1797
 [15] macro expansion
    @ Base ./loading.jl:1784 [inlined]
 [16] macro expansion
    @ Base ./lock.jl:267 [inlined]
 [17] __require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1747
 [18] #invoke_in_world#3
    @ Base ./essentials.jl:921 [inlined]
 [19] invoke_in_world
    @ Base ./essentials.jl:918 [inlined]
 [20] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1740
during initialization of module GAP

julia>

julia> exit()

[37561] signal (11): Segmentation fault: 11
in expression starting at REPL[2]:1
ExecProccall0args at /Users/mrodgers/.julia/artifacts/a7bcd955e05e9f268114b41a0606a2fc5e3dbb06/lib/libgap.8.dylib (unknown line)
ExecSeqStat at /Users/mrodgers/.julia/artifacts/a7bcd955e05e9f268114b41a0606a2fc5e3dbb06/lib/libgap.8.dylib (unknown line)
EXEC_CURR_FUNC at /Users/mrodgers/.julia/artifacts/a7bcd955e05e9f268114b41a0606a2fc5e3dbb06/lib/libgap.8.dylib (unknown line)
DoExecFunc1args at /Users/mrodgers/.julia/artifacts/a7bcd955e05e9f268114b41a0606a2fc5e3dbb06/lib/libgap.8.dylib (unknown line)
DoOperation1Args at /Users/mrodgers/.julia/artifacts/a7bcd955e05e9f268114b41a0606a2fc5e3dbb06/lib/libgap.8.dylib (unknown line)
DoProperty at /Users/mrodgers/.julia/artifacts/a7bcd955e05e9f268114b41a0606a2fc5e3dbb06/lib/libgap.8.dylib (unknown line)
EvalFunccall1args at /Users/mrodgers/.julia/artifacts/a7bcd955e05e9f268114b41a0606a2fc5e3dbb06/lib/libgap.8.dylib (unknown line)
EvalUnknownBool at /Users/mrodgers/.julia/artifacts/a7bcd955e05e9f268114b41a0606a2fc5e3dbb06/lib/libgap.8.dylib (unknown line)
EvalNot at /Users/mrodgers/.julia/artifacts/a7bcd955e05e9f268114b41a0606a2fc5e3dbb06/lib/libgap.8.dylib (unknown line)
ExecWhile2 at /Users/mrodgers/.julia/artifacts/a7bcd955e05e9f268114b41a0606a2fc5e3dbb06/lib/libgap.8.dylib (unknown line)
ExecSeqStat2 at /Users/mrodgers/.julia/artifacts/a7bcd955e05e9f268114b41a0606a2fc5e3dbb06/lib/libgap.8.dylib (unknown line)
EXEC_CURR_FUNC at /Users/mrodgers/.julia/artifacts/a7bcd955e05e9f268114b41a0606a2fc5e3dbb06/lib/libgap.8.dylib (unknown line)
DoExecFunc0args at /Users/mrodgers/.julia/artifacts/a7bcd955e05e9f268114b41a0606a2fc5e3dbb06/lib/libgap.8.dylib (unknown line)
_call_gap_func at /Users/mrodgers/.julia/packages/GAP/aJO9M/src/ccalls.jl:318 [inlined]
GapObj at /Users/mrodgers/.julia/packages/GAP/aJO9M/src/ccalls.jl:301
unknown function (ip: 0x10250361c)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/gf.c:3076
#1 at /Users/mrodgers/.julia/packages/GAP/aJO9M/src/GAP.jl:256
unknown function (ip: 0x1025032f1)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/gf.c:3076
_atexit at ./initdefs.jl:428
jfptr__atexit_79385.1 at /Applications/Julia-1.10.app/Contents/Resources/julia/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/./julia.h:1982 [inlined]
ijl_atexit_hook at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/init.c:280
ijl_exit at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/init.c:207
exit at ./initdefs.jl:28 [inlined]
exit at ./initdefs.jl:29
jfptr_exit_79210.1 at /Applications/Julia-1.10.app/Contents/Resources/julia/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/./julia.h:1982 [inlined]
do_call at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/interpreter.c:126
eval_body at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/interpreter.c:0
jl_interpret_toplevel_thunk at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/interpreter.c:775
jl_toplevel_eval_flex at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/toplevel.c:934
jl_toplevel_eval_flex at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/toplevel.c:877
jl_toplevel_eval_flex at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/toplevel.c:877
ijl_toplevel_eval at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/toplevel.c:943 [inlined]
ijl_toplevel_eval_in at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/toplevel.c:985
eval at ./boot.jl:385 [inlined]
eval_user_input at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:150
repl_backend_loop at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:246
#start_repl_backend#46 at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:231
start_repl_backend at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:228
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/gf.c:3076
#run_repl#59 at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:389
run_repl at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:375
jfptr_run_repl_91808.1 at /Applications/Julia-1.10.app/Contents/Resources/julia/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/gf.c:3076
#1013 at ./client.jl:432
jfptr_YY.1013_82797.1 at /Applications/Julia-1.10.app/Contents/Resources/julia/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/./julia.h:1982 [inlined]
jl_f__call_latest at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/builtins.c:812
#invokelatest#2 at ./essentials.jl:887 [inlined]
invokelatest at ./essentials.jl:884 [inlined]
run_main_repl at ./client.jl:416
exec_options at ./client.jl:333
_start at ./client.jl:552
jfptr__start_82823.1 at /Applications/Julia-1.10.app/Contents/Resources/julia/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/./julia.h:1982 [inlined]
true_main at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/jlapi.c:582
jl_repl_entrypoint at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-grannysmith-C07ZQ07RJYVY.0/build/default-grannysmith-C07ZQ07RJYVY-0/julialang/julia-release-1-dot-10/src/jlapi.c:731
Allocations: 848880 (Pool: 846702; Big: 2178); GC: 3
[1]    37561 segmentation fault  julia -t 4

ThomasBreuer commented 6 months ago

I cannot reproduce this problem under Ubuntu 20.04 with Julia 1.8.5 and 1.9.0, with various values for the -t command line option. Would it make sense to try other Julia versions with this setup?

mjrodgers commented 6 months ago

It works perfectly fine for me with 1.9.3; it seems to be an issue only with Julia 1.10.

benlorenz commented 6 months ago

A workaround to load GAP in a julia session with threads enabled is to disable GC threads with --gcthreads=1 (this corresponds to how the GC works in 1.9). GC threads are new in julia 1.10 and are enabled by default when threads are enabled; by default the number of GC threads is half the number of compute threads. This explains why the problem starts with 4 threads: that is the first setting that yields more than one GC thread.

The following seems to work fine with julia 1.10:

$ julia-1.10.0 --project=. -t 4 --gcthreads=1
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.0 (2023-12-25)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using GAP
 ┌───────┐   GAP 4.12.2 of 2022-12-18
 │  GAP  │   https://www.gap-system.org
 └───────┘   Architecture: x86_64-pc-linux-gnu-julia1.10-64-kv8
 Configuration:  gmp 6.2.1, Julia GC, Julia 1.10.0, readline
 Loading the library and packages ...
 Packages:   AClib 1.3.2, Alnuth 3.2.1, AtlasRep 2.1.6, AutPGrp 1.11, Browse 1.8.21, CRISP 1.4.6, 
             Cryst 4.1.25, CrystCat 1.1.10, CTblLib 1.3.4, FactInt 1.6.3, FGA 1.4.0, GAPDoc 1.6.6, 
             IRREDSOL 1.4.4, JuliaInterface 0.10.1, LAGUNA 3.9.5, Polenta 1.3.10, Polycyclic 2.16, 
             PrimGrp 3.4.3, RadiRoot 2.9, ResClasses 4.7.3, SmallGrp 1.5.1, Sophus 1.27, SpinSym 1.5.2, 
             TomLib 1.2.9, TransGrp 3.6.3, utils 0.81
 Try '??help' for help. See also '?copyright', '?cite' and '?authors'

Tests also work in this configuration: Pkg.test("GAP", julia_args=["-t 4", "--gcthreads=1"]).
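
The thread-count arithmetic described in this comment can be sketched as follows. This is only an illustration of the rule as stated above ("half the number of compute threads"), assuming the result is rounded down and never drops below one; the helper name default_gcthreads is made up for this example and is not a Julia API.

```julia
# Assumed default rule (taken from the description above, not from Julia's source):
# half the compute threads, rounded down, but never fewer than one GC thread.
default_gcthreads(nthreads::Integer) = max(1, nthreads ÷ 2)

for n in 1:4
    println("julia -t $n  =>  ", default_gcthreads(n), " GC thread(s)")
end

# On Julia 1.10 and newer, the value actually in effect can be queried directly.
if isdefined(Base.Threads, :ngcthreads)
    println("this session: ", Threads.ngcthreads(), " GC thread(s)")
end
```

Under this assumption, -t 3 still gives a single GC thread while -t 4 gives two, which matches the observation that loading works with 3 threads but fails with 4.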

fingolfin commented 6 months ago

Thanks for writing up a report, @mjrodgers. I can reproduce it with Julia 1.10.0 on macOS. I will look into this, but possibly only after February 1st, due to the book.

fingolfin commented 4 months ago

There are multiple issues here in the GAP-Julia GC integration. One is the now-incorrect use of the variable JuliaTLS in the GAP kernel, which is based on the assumption that there is a single GC job. JuliaTLS is mostly an optimization, and all its uses could simply be replaced by a call to jl_get_ptls_states(). Most uses could also be avoided by a change to the GAP kernel: basically, MarkBag and all related functions would need an extra argument, a void * ref pointer, which is set to the tls pointer when using the Julia GC in GAP and is otherwise ignored.

Then there is the use of a bunch of global variables (in GAP's src/julia_gc.c, to be precise), which is not thread-safe and needs some strategy to address it (be it locks, or making them thread local, or ...).

simonbrandhorst commented 3 months ago

I just hit the same bug when installing Oscar and got a segmentation fault. Since it will be a while until this is fixed, could we check the number of GC threads at startup and raise a useful error?
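
A minimal sketch of such a startup check, assuming it would be called from the package's __init__ before GAP is initialized. The function names check_gcthreads and current_gcthreads are hypothetical and not part of GAP.jl; Threads.ngcthreads is only available on Julia 1.10 and newer, so the sketch falls back to 1 on older versions.

```julia
# Hypothetical guard, not part of GAP.jl: refuse to continue with more than one GC thread.
function check_gcthreads(ngc::Integer)
    ngc <= 1 && return nothing
    error("GAP.jl does not currently support multiple GC threads (found $ngc). " *
          "Restart Julia with `--gcthreads=1`.")
end

# Julia versions before 1.10 have no GC threads setting, so treat them as using one.
current_gcthreads() =
    isdefined(Base.Threads, :ngcthreads) ? Threads.ngcthreads() : 1

println("GC threads in this session: ", current_gcthreads())
```

A call such as check_gcthreads(current_gcthreads()) at the top of __init__ would then replace the confusing GAP library error and the later segfault with a message that names the workaround.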

fingolfin commented 2 months ago

I made some progress on this today: it turns out I have to change the function ScanTaskStack in the GAP kernel from MarkFromList(jl_get_ptls_states(), stack); to MarkFromList(task->ptls, stack); -- apparently jl_get_ptls_states() is not safe to call from a GC thread.

I also added a mutex to protect access to task_stacks, and enabled REQUIRE_PRECISE_MARKING to completely eliminate a bunch of other global variables (which are problematic with multithreading). With all these changes, using GAP works with multiple GC threads (in contrast to stock GAP.jl 0.11.0 using GAP 4.13.0, which immediately hard crashes, i.e., is worse than GAP.jl 0.10.0 with GAP 4.12.2).
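
The mutex idea can be illustrated with a small Julia analogue. This is not the kernel change itself, which is C code in GAP's src/julia_gc.c; the names TASK_STACKS, TASK_STACKS_LOCK, register_stack! and registered_count are stand-ins invented for this sketch.

```julia
# Illustrative stand-ins for a shared global list and the lock that guards it.
const TASK_STACKS = Vector{UInt}()
const TASK_STACKS_LOCK = ReentrantLock()

# Every write to the shared list happens while holding the lock.
function register_stack!(addr::UInt)
    lock(TASK_STACKS_LOCK) do
        push!(TASK_STACKS, addr)
    end
    return nothing
end

# Reads take the same lock so they never observe a half-updated list.
function registered_count()
    lock(TASK_STACKS_LOCK) do
        length(TASK_STACKS)
    end
end

# Register entries from many tasks at once; with `julia -t 4` these run in parallel.
@sync for i in 1:1000
    Threads.@spawn register_stack!(UInt(i))
end
println(registered_count())
```

Without the lock, concurrent push! calls on the same vector would be a data race, which is the same kind of problem as unsynchronized access to a global from several GC threads.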

Alas, there are still issues. So the next thing I did was to add a global GAP list: any allocation is added to that list, and any GC marks that list -- this should prevent any GAP allocation from ever being garbage collected. (Clearly this is not useful for a production system, but it helps to exclude certain issues.)

But despite this, I am still seeing errors that are highly suggestive of GAP objects being GCed prematurely. Hrm.
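
The keep-everything-alive experiment described above can be mimicked in plain Julia. The sketch below is only an analogue of the idea, not the GAP kernel code; KEEP_ALIVE and tracked_alloc are hypothetical names, and the list is not protected against concurrent use.

```julia
# Global root list: anything reachable from it can never be collected.
const KEEP_ALIVE = Any[]

# Hypothetical allocation wrapper that registers every new object in the root list.
function tracked_alloc(nbytes::Integer)
    obj = Vector{UInt8}(undef, nbytes)
    push!(KEEP_ALIVE, obj)
    return obj
end

# A WeakRef does not keep its target alive, so it reveals whether the GC freed the object.
w = WeakRef(tracked_alloc(64))
GC.gc()
println(w.value === nothing ? "collected" : "still alive")
```

Because the object stays reachable from KEEP_ALIVE, the weak reference is still populated after a full collection. If objects appear to vanish even with such a root list in place, the cause is more likely in how marking is performed than in missing references.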