JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.54k stars 5.47k forks source link

Constructing a 67108864 TiB array succeeds incorrectly, corrupting memory #54328

Open LilithHafner opened 5 months ago

LilithHafner commented 5 months ago
julia> x = Array{UInt128}(undef, 2^3, 2^59)
8×576460752303423488 Matrix{UInt128}:
 0x0000ffff1c9ec7b00000000000000000  0x0000ffff27943b900000000000000000  …  0x0000ffff1fed77c00000000000000003  0x0000ffff1fed58d00000000000000019
 0x0000ffff0e46cc000000ffff2780f340  0x0000ffff0e7426400000ffff0e742650     0x0000ffff2690b1700000ffff27e00038  0x0000ffff273a49900000ffff273a49a0
 0x0000ffff1fed50a00000ffff268f82d0  0x0000ffff2791b3500000000000000000     0x0000ffff1fed60a00000ffff269bb260  0x0000ffff1fed59d00000000000000019
 0x0000ffff268194a80000000000000000  0x0000ffff2690b3100000ffff27f147c8     0x0000ffff2780f3100000ffff2780f320  0x0000ffff277590100000ffff27759020
 0x0000ffff1fed59500000000000000000  0x0000ffff1c47247000003030313a3120     0x0000ffff1fed77c00000000000000003  0x0000ffff1fed59d00000000000000004
 0x0000ffff1c31a0500000ffff1fee6170  0x0000ffff27f147c80000ffff1deb7d90  …  0x0000ffff2690b1b00000ffff27e00508  0x0000ffff277590500000ffff27759060
 0x0000ffff1c9ec7b00000000000000000  0x0000ffff1c59c3000000ffff2690b310     0x0000ffff1fed60a00000000000000000  0x0000ffff1ccde9f00000000000000004
 0x0000ffff0e46cc000000ffff2780f340  0x0000ffff2690b3500000ffff27dec3d0     0x0000ffff273ae1500000ffff273ae160  0x0000ffff2690b2804000000000000000

julia> x[1] = x[end] = 1
1

julia> rand(x)
0x00000000000000000000000000000001

julia> rand(x, 10)
10-element Vector{UInt128}:
 0x00000000000000000000000000000001
 0x00000000000000000000000000000001
 0x00000000000000000000000000000001
 0x00000000000000000000000000000001
 0x00000000000000000000000000000001
 0x00000000000000000000000000000001
 0x00000000000000000000000000000001
 0x00000000000000000000000000000001
 0x00000000000000000000000000000001
 0x00000000000000000000000000000001

julia> length(x
[13983] signal 11 (1): Segmentation fault
in expression starting at REPL[5]:1
gc_try_claim_and_push at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:2118
gc_mark_outrefs at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:2838 [inlined]
gc_mark_loop_serial_ at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:2963
gc_mark_loop_serial at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:2986
gc_mark_loop at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:3163 [inlined]
_jl_gc_collect at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:3552
ijl_gc_collect at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:3931
maybe_collect at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:922 [inlined]
jl_gc_pool_alloc_inner at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:1319
jl_gc_pool_alloc_noinline at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:1386 [inlined]
jl_gc_alloc_ at /cache/build/builder-armageddon-1/julialang/julia-master/src/julia_internal.h:507 [inlined]
jl_gc_alloc at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:3984
ijl_alloc_svec_uninit at /cache/build/builder-armageddon-1/julialang/julia-master/src/simplevector.c:63
inst_datatype_inner at /cache/build/builder-armageddon-1/julialang/julia-master/src/jltypes.c:2088
inst_datatype_env at /cache/build/builder-armageddon-1/julialang/julia-master/src/jltypes.c:1357 [inlined]
inst_datatype_env at /cache/build/builder-armageddon-1/julialang/julia-master/src/jltypes.c:1361 [inlined]
inst_datatype_env at /cache/build/builder-armageddon-1/julialang/julia-master/src/jltypes.c:1361 [inlined]
inst_datatype_env at /cache/build/builder-armageddon-1/julialang/julia-master/src/jltypes.c:1361 [inlined]
ijl_apply_type at /cache/build/builder-armageddon-1/julialang/julia-master/src/jltypes.c:1377
intersect at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3943
intersect_sub_datatype at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3558 [inlined]
intersect at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3962
intersect_sub_datatype at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3558 [inlined]
intersect at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3962
intersect_unionall_ at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3156
intersect_unionall at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3247
intersect at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3896
intersect_unionall_ at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3160
intersect_unionall at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3247
intersect at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3870
intersect_tuple at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3468
intersect at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3910
intersect_unionall_ at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3160
intersect_unionall at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3270
intersect at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3893
intersect_all at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:4118
intersect_types at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:4177
ijl_has_empty_intersection at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:4188
sort_mlmatches at /cache/build/builder-armageddon-1/julialang/julia-master/src/gf.c:3483
sort_mlmatches at /cache/build/builder-armageddon-1/julialang/julia-master/src/gf.c:3503
sort_mlmatches at /cache/build/builder-armageddon-1/julialang/julia-master/src/gf.c:3503
ml_matches at /cache/build/builder-armageddon-1/julialang/julia-master/src/gf.c:3911
ml_matches at /cache/build/builder-armageddon-1/julialang/julia-master/src/gf.c:3687 [inlined]
ijl_matching_methods at /cache/build/builder-armageddon-1/julialang/julia-master/src/gf.c:2309
_methods_by_ftype at ./reflection.jl:1196
complete_methods! at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/REPLCompletions.jl:856
complete_keyword_argument at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/REPLCompletions.jl:1016
completions at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/REPLCompletions.jl:1327
complete_line at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/REPL.jl:655
jfptr_complete_line_9919 at /home/x/.julia/juliaup/julia-nightly/share/julia/compiled/v1.12/REPL/u0gqU_1rXQa.so (unknown line)
check_for_hint at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/LineEdit.jl:383
#139 at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/LineEdit.jl:2519
jl_apply at /cache/build/builder-armageddon-1/julialang/julia-master/src/julia.h:2188 [inlined]
jl_f__call_latest at /cache/build/builder-armageddon-1/julialang/julia-master/src/builtins.c:875
#invokelatest#2 at ./essentials.jl:1032 [inlined]
invokelatest at ./essentials.jl:1029 [inlined]
#27 at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/LineEdit.jl:1703
jfptr_YY.27_10520 at /home/x/.julia/juliaup/julia-nightly/share/julia/compiled/v1.12/REPL/u0gqU_1rXQa.so (unknown line)
prompt! at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/LineEdit.jl:2840
run_interface at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/LineEdit.jl:2742
jfptr_run_interface_11025 at /home/x/.julia/juliaup/julia-nightly/share/julia/compiled/v1.12/REPL/u0gqU_1rXQa.so (unknown line)
run_frontend at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/REPL.jl:1446
#75 at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/REPL.jl:496
)jfptr_YY.75_12396 at /home/x/.julia/juliaup/julia-nightly/share/julia/compiled/v1.12/REPL/u0gqU_1rXQa.so (unknown line)
jl_apply at /cache/build/builder-armageddon-1/julialang/julia-master/src/julia.h:2188 [inlined]
start_task at /cache/build/builder-armageddon-1/julialang/julia-master/src/task.c:1202
Allocations: 1516744 (Pool: 1516567; Big: 177); GC: 1
Segmentation fault (core dumped)

Related, constructing an array of size 2^61 or 2^62 "works", but not 2^61-1 or 2^62-1:

julia> Vector{Int}(undef, 2^61);

julia> Vector{Int}(undef, 2^62);

julia> Vector{Int}(undef, 2^61-1);
ERROR: ArgumentError: invalid GenericMemory size: too large for system address width
Stacktrace:
 [1] GenericMemory
   @ ./boot.jl:531 [inlined]
 [2] Vector{Int64}(::UndefInitializer, m::Int64)
   @ Core ./boot.jl:596
 [3] top-level scope
   @ REPL[12]:1

julia> Vector{Int}(undef, 2^62-1);
ERROR: ArgumentError: invalid GenericMemory size: too large for system address width
Stacktrace:
 [1] GenericMemory
   @ ./boot.jl:531 [inlined]
 [2] Vector{Int64}(::UndefInitializer, m::Int64)
   @ Core ./boot.jl:596
 [3] top-level scope
   @ REPL[13]:1

I'm guessing all this is due to unchecked overflow when multiplying by sizeof(eltype).

Much if this behavior is also present on 1.10.

CC @oscardssmith who's been working on this recently.

All examples above are from nightly.

oscardssmith commented 5 months ago

That's suboptimal. We probably need one more checked mul on the C side.

oscardssmith commented 5 months ago

hmm, we already are doing the multiplication with wide arithmetic: https://github.com/JuliaLang/julia/blob/c88f4e275af866737491290ccb10d9088c0428ff/src/genericmemory.c#L42 I'm very unsure why this check isn't failing. Is wideint_t somehow being set to uint64?

KristofferC commented 5 months ago

What does "UB" mean here (the UB bug really bit people...). This is just a bug.

KlausC commented 5 months ago

That is how wideint_t is defined (it depends on the architecture and C-compiler in use)

https://github.com/JuliaLang/julia/blob/e637be11f6fad4e45501138090135e4748a6b60e/src/array.c#L19-L23

vtjnash commented 5 months ago

UINT128MAX seems to never exist. What is that supposed to be?

KlausC commented 5 months ago

It is possible to throw if ((size_t)(-1) >> 1 - 1) / max(elsz + (isunion != 0), 1) + 1 > nel or similar in array.c and 2 spots in genericmemory.c. Then size_t prod; prod = (elsz + (isunion != 0)) * nel, without need for wide_t and MAXINTVAL.

vtjnash commented 5 months ago

Ideally we don't need division there, as that is much slower than checking for multiplication overflow

giordano commented 5 months ago

UINT128MAX seems to never exist. What is that supposed to be?

It was introduced by e9be4448fd269bf5915918a2cd272fda03109217 (#5022), but it doesn't seem to be defined/used anywhere else in that revision.

vtjnash commented 5 months ago

Apparently there is no standards-defined way to avoid this UB in C for multiply. But most compilers have extensions (specifically __builtin_mul_overflow) to work around that.

StefanKarpinski commented 5 months ago

Could we move the checking into Julia? Seems we can do this more reliably than C.

oscardssmith commented 5 months ago

Moving the checks to Julia would be a bit dangerous in that anyoen that skips doing the checks could run into weird behavior, but if we give it a bad enough name, it's plausible that people won't try to use it...

LilithHafner commented 5 months ago

What does "UB" mean here

I mean that Array{UInt128}(undef, 2^3, 2^59)[10] semantically invokes LLVM's undefined behavior.

LilithHafner commented 5 months ago

I think it is unlikely for this to result in incorrect results in user code. It is much more likely to result in crashes/segfaults, and it is also unlikely for a user to allocate an array of this size in the first place.