JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Segmentation fault with `vcat` on a `Vector{Any}` #45513

Open etiennedeg opened 2 years ago

etiennedeg commented 2 years ago

Context: https://github.com/JuliaGraphs/Graphs.jl/issues/138

MWE:

Edit: smaller reproducer:

l = Any[[1] for i in 1:6000]
for i in 1:2:6000
   l[i] = []
end
vcat(l...)

Original reproducer: download the file https://pastebin.com/hfv6Jrgv and convert its line endings from CRLF to LF if needed.

using Graphs, GraphIO, ParserCombinator
loadgraph("path-to-awk", "code", GraphIO.DOT.DOTFormat())

This results in a segmentation fault:

signal (11): Segmentation fault
in expression starting at none:1
subtype_unionall at /buildworker/worker/package_linux64/build/src/subtype.c:796
subtype at /buildworker/worker/package_linux64/build/src/subtype.c:1260
subtype_tuple_tail at /buildworker/worker/package_linux64/build/src/subtype.c:1079 [inlined]
subtype_tuple at /buildworker/worker/package_linux64/build/src/subtype.c:1158 [inlined]
subtype at /buildworker/worker/package_linux64/build/src/subtype.c:1299
exists_subtype at /buildworker/worker/package_linux64/build/src/subtype.c:1395 [inlined]
forall_exists_subtype at /buildworker/worker/package_linux64/build/src/subtype.c:1423
jl_subtype_env at /buildworker/worker/package_linux64/build/src/subtype.c:1878
subtype_tuple_tail at /buildworker/worker/package_linux64/build/src/subtype.c:1076 [inlined]
subtype_tuple at /buildworker/worker/package_linux64/build/src/subtype.c:1158 [inlined]
subtype at /buildworker/worker/package_linux64/build/src/subtype.c:1299
subtype_unionall at /buildworker/worker/package_linux64/build/src/subtype.c:774
subtype at /buildworker/worker/package_linux64/build/src/subtype.c:1260
exists_subtype at /buildworker/worker/package_linux64/build/src/subtype.c:1395 [inlined]
forall_exists_subtype at /buildworker/worker/package_linux64/build/src/subtype.c:1423
jl_subtype_env at /buildworker/worker/package_linux64/build/src/subtype.c:1878
jl_type_intersection_env_s at /buildworker/worker/package_linux64/build/src/subtype.c:3394
jl_typemap_intersection_node_visitor at /buildworker/worker/package_linux64/build/src/typemap.c:459
jl_typemap_intersection_visitor at /buildworker/worker/package_linux64/build/src/typemap.c:626
ml_matches at /buildworker/worker/package_linux64/build/src/gf.c:2761
_gf_invoke_lookup at /buildworker/worker/package_linux64/build/src/gf.c:2440 [inlined]
jl_mt_assoc_by_type at /buildworker/worker/package_linux64/build/src/gf.c:1195
jl_lookup_generic_ at /buildworker/worker/package_linux64/build/src/gf.c:2400 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2425
typed_vcat at ./abstractarray.jl:1619
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
do_apply at /buildworker/worker/package_linux64/build/src/builtins.c:713
vcat at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/SparseArrays/src/sparsevector.jl:1123
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
do_apply at /buildworker/worker/package_linux64/build/src/builtins.c:713
edges at /home/etienne/.julia/packages/ParserCombinator/35p4P/src/dot/DOT.jl:234
_dot_read_one_graph at /home/etienne/.julia/packages/GraphIO/x37Ru/src/DOT/Dot.jl:53
loaddot at /home/etienne/.julia/packages/GraphIO/x37Ru/src/DOT/Dot.jl:68
loadgraph at /home/etienne/.julia/packages/GraphIO/x37Ru/src/DOT/Dot.jl:83 [inlined]
#120 at /home/etienne/.julia/dev/Graphs/src/persistence/common.jl:15 [inlined]
#open#355 at ./io.jl:330
open at ./io.jl:328 [inlined]
loadgraph at /home/etienne/.julia/dev/Graphs/src/persistence/common.jl:14
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
do_call at /buildworker/worker/package_linux64/build/src/interpreter.c:126
eval_value at /buildworker/worker/package_linux64/build/src/interpreter.c:215
eval_stmt_value at /buildworker/worker/package_linux64/build/src/interpreter.c:166 [inlined]
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:587
jl_interpret_toplevel_thunk at /buildworker/worker/package_linux64/build/src/interpreter.c:731
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:885
jl_toplevel_eval_in at /buildworker/worker/package_linux64/build/src/toplevel.c:944
eval at ./boot.jl:373 [inlined]
repleval at /home/etienne/.julia/packages/Atom/bfwsW/src/repl.jl:198
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
#258 at /home/etienne/.julia/packages/Atom/bfwsW/src/repl.jl:228
unknown function (ip: 0x7fad4a9ed7bf)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
with_logstate at ./logging.jl:511
with_logger at ./logging.jl:623 [inlined]
evalrepl at /home/etienne/.julia/packages/Atom/bfwsW/src/repl.jl:216
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
do_call at /buildworker/worker/package_linux64/build/src/interpreter.c:126
eval_value at /buildworker/worker/package_linux64/build/src/interpreter.c:215
eval_stmt_value at /buildworker/worker/package_linux64/build/src/interpreter.c:166 [inlined]
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:587
jl_interpret_toplevel_thunk at /buildworker/worker/package_linux64/build/src/interpreter.c:731
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:885
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:830
jl_toplevel_eval_in at /buildworker/worker/package_linux64/build/src/toplevel.c:944
eval at ./boot.jl:373 [inlined]
eval_user_input at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:150
repl_backend_loop at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:246
start_repl_backend at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:231
#run_repl#47 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:364
run_repl at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:351
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
#936 at ./client.jl:394
jfptr_YY.936_35454 at /home/etienne/.julia/juliaup/julia-1.7.3+0~x64/lib/julia/sys.so (unknown line)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
jl_f__call_latest at /buildworker/worker/package_linux64/build/src/builtins.c:757
#invokelatest#2 at ./essentials.jl:716 [inlined]
invokelatest at ./essentials.jl:714 [inlined]
run_main_repl at ./client.jl:379
exec_options at ./client.jl:309
_start at ./client.jl:495
jfptr__start_22567 at /home/etienne/.julia/juliaup/julia-1.7.3+0~x64/lib/julia/sys.so (unknown line)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
true_main at /buildworker/worker/package_linux64/build/src/jlapi.c:559
jl_repl_entrypoint at /buildworker/worker/package_linux64/build/src/jlapi.c:701
main at /buildworker/worker/package_linux64/build/cli/loader_exe.c:42
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
_start at /home/etienne/.julia/juliaup/julia-1.7.3+0~x64/bin/julia (unknown line)
Allocations: 40888482 (Pool: 40880734; Big: 7748); GC: 36

Version info:

Julia Version 1.7.3
Commit 742b9abb4d (2022-05-06 12:58 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU           E5606  @ 2.13GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, westmere)
Environment:
  JULIA_EDITOR = atom  -a
  JULIA_NUM_THREADS = 4
  JULIA_PKG_PRECOMPILE_AUTO = 0

mcabbott commented 2 years ago

In case this helps, some variants. (Xref also https://github.com/JuliaLang/julia/issues/45454 recently about stack overflow from a splat.)

julia> let
        x = Any[[1] for _ in 1:10^4]
        # x[2] = 2  # without this, no problem
        vcat(x...)'
       end
1×10000 adjoint(::Vector{Int64}) with eltype Int64:
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  …  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1

julia> let
        x = Any[[1] for _ in 1:10^4]
        x[2] = 2  # change near the start
        vcat(x...)'
       end
ERROR: StackOverflowError:

julia> let
        x = Any[[1] for _ in 1:10^4]
        x[end-2] = 2  # change near the end
        vcat(x...)'
       end

signal (11): Segmentation fault: 11
in expression starting at REPL[9]:1
gc_read_stack at /Users/me/.julia/dev/julia/src/gc.c:1696 [inlined]
gc_mark_loop at /Users/me/.julia/dev/julia/src/gc.c:2373

KristofferC commented 2 years ago

This just seems like the old issue of splatting argument lists that are way too big. Use `reduce(vcat, l)` instead.

Dup of https://github.com/JuliaLang/julia/issues/42327, https://github.com/JuliaLang/julia/issues/38364
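
For reference, here is a minimal sketch of the `reduce(vcat, l)` workaround applied to the smaller reproducer from the first comment. It assumes the same input `l`. Because `l` is a `Vector{Any}`, this most likely goes through the generic pairwise reduction rather than a specialized method, so it may copy more than necessary, but it never builds a 6000-argument call.

```julia
# Input from the smaller reproducer: 6000 entries, every odd one replaced by an empty Vector{Any}.
l = Any[[1] for i in 1:6000]
for i in 1:2:6000
    l[i] = []
end

# vcat(l...) would pass all 6000 elements as positional arguments.
# reduce only ever calls vcat with two arguments at a time.
result = reduce(vcat, l)

println(length(result))  # number of concatenated elements
```

If the element type is known to be concrete (for example a `Vector{Vector{Int}}`), building `l` with that type instead of `Any` should let `reduce(vcat, l)` hit a faster path, though that is not needed to avoid the crash.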

jakobnissen commented 2 years ago

Until the issue is fixed, perhaps it would be best to find some conservative bound N on the number of arguments below which a call is guaranteed not to crash, and then add a check that throws an informative error when a function is called with more than N arguments.

The current situation is both confusing, because it's difficult to understand what happened, and brittle, because the fact that it sometimes works can lead developers to rely on it working and then introduce code that suddenly breaks in the user's hands. For example, there is no problem here:

julia> f(args...) = sum(args)
f (generic function with 1 method)

julia> f(1:10000...)
50005000

Python limited functions to 255 arguments until a few years ago, and I haven't heard of anyone complaining about that. Conversely, if people do begin complaining that Julia suddenly can't handle splatting 500 elements, which they are used to doing, then that's a pretty good sign that Julia's poor handling of large argument lists is already causing issues.
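
To make the suggested check concrete, here is a rough user-level sketch, not a proposal for how Base would implement it. The names `MAX_SPLAT_ARGS` and `splat_checked` are hypothetical, and the value 255 is simply borrowed from the Python comparison above; it is not a verified safe bound for Julia. The helper takes the collection itself, so the length check happens before any large argument list is built.

```julia
# Hypothetical limit, borrowed from Python's old cap; not a verified safe bound for Julia.
const MAX_SPLAT_ARGS = 255

# Hypothetical helper: splat `xs` into `f` only if it has at most `limit` elements.
function splat_checked(f, xs; limit::Integer = MAX_SPLAT_ARGS)
    n = length(xs)
    if n > limit
        throw(ArgumentError(
            "refusing to splat $n arguments into $f (limit is $limit); " *
            "consider reduce($f, xs) instead"))
    end
    return f(xs...)
end

# A small input is passed through unchanged.
small = Any[[1] for _ in 1:10]
ok = splat_checked(vcat, small)

# A large input is rejected with a readable error instead of crashing.
big = Any[[1] for _ in 1:6000]
rejected = try
    splat_checked(vcat, big)
    false
catch err
    err isa ArgumentError
end
```

The point of the design is that the failure mode becomes an `ArgumentError` naming the function and the argument count, which is much easier to act on than a segmentation fault deep inside subtyping.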

vtjnash commented 2 years ago

Note that all of these fail in subtyping with

julia: /data/vtjnash/julia/src/subtype.c:133: statestack_set: Assertion `i >= 0 && i < sizeof(st->stack) * 8' failed.