JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Segmentation fault on reinterpret of struct #48491

Closed azreika closed 1 year ago

azreika commented 1 year ago

For the given script:

function foo(::Type{T1}, value::T2) where {T1<:Tuple,T2<:Tuple}
    v = Ref(value)
    GC.@preserve v begin
        ptr = pointer_from_objref(v)
        return Base.unsafe_load(reinterpret(Ptr{T1}, ptr))
    end
end

struct MyString
    str::String
    b::Bool
end

foo(Tuple{MyString}, (MyString("A", false),))

On master, we get a weird result on the first call to foo, and a segfault when the call is repeated:

julia> foo(Tuple{MyString}, (MyString("A", false),))
(MyString(Core.CodeInstance(MethodInstance for foo(::Type{Tuple{MyString}}, ::Tuple{MyString}), #undef, 0x0000000000001b81, 0xffffffffffffffff, Tuple{MyString}, #undef, UInt8[0x01, 0x00, 0x19, 0x00, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00  …  0x00, 0xb9, 0x3f, 0x07, 0x02, 0x0c, 0x0d, 0x0f, 0x04, 0x00], 0x00000d09, 0x00000d09, nothing, true, true, 0x00, Ptr{Nothing} @0x0000000111506f00, Ptr{Nothing} @0x0000000111506e80), false),)

julia> foo(Tuple{MyString}, (MyString("A", false),))
(MyString(
[96200] signal (11.1): Segmentation fault: 11
in expression starting at none:0
ijl_subtype_env at /Users/azreika/projects/rai/julia/src/subtype.c:1949
jl_tuple1_isa at /Users/azreika/projects/rai/julia/src/subtype.c:2121
jl_typemap_entry_assoc_exact at /Users/azreika/projects/rai/julia/src/typemap.c:974
jl_typemap_assoc_exact at /Users/azreika/projects/rai/julia/src/./julia_internal.h:1465 [inlined]
jl_lookup_generic_ at /Users/azreika/projects/rai/julia/src/gf.c:2813 [inlined]
ijl_apply_generic at /Users/azreika/projects/rai/julia/src/gf.c:2869
_show_default at ./show.jl:479
show_default at ./show.jl:462 [inlined]
show at ./show.jl:457 [inlined]
...

Version info:

julia> versioninfo()
Julia Version 1.10.0-DEV.471
Commit d918576b28 (2023-02-01 19:42 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin21.6.0)
  CPU: 12 × Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, skylake)
  Threads: 1 on 12 virtual cores

Running on Julia 1.8.2 we consistently get a segfault on the first call:

julia> foo(Tuple{MyString}, (MyString("A", false),))
(MyString(
signal (11): Segmentation fault: 11
in expression starting at none:0
ijl_isa at /Applications/Julia-1.8.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.8.dylib (unknown line)
jl_tuple1_isa at /Applications/Julia-1.8.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.8.dylib (unknown line)
jl_typemap_entry_assoc_exact at /Applications/Julia-1.8.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.8.dylib (unknown line)
ijl_apply_generic at /Applications/Julia-1.8.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.8.dylib (unknown line)
_show_default at ./show.jl:413
show_default at ./show.jl:396 [inlined]
show at ./show.jl:391 [inlined]
show_delim_array at ./show.jl:1244
show_delim_array at ./show.jl:1229 [inlined]
show at ./show.jl:1262 [inlined]
show at ./multimedia.jl:47
...

Version info:

julia> versioninfo()
Julia Version 1.8.2
Commit 36034abf260 (2022-09-29 15:21 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin21.4.0)
  CPU: 12 × Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 1 on 12 virtual cores

azreika commented 1 year ago

@MikeInnes noted that getting rid of the return in foo fixes the problem. This is fine on both master and Julia 1.8.2:

function foo(::Type{T1}, value::T2) where {T1<:Tuple,T2<:Tuple}
    v = Ref(value)
    GC.@preserve v begin
        ptr = pointer_from_objref(v)
        Base.unsafe_load(reinterpret(Ptr{T1}, ptr))
    end
end

struct MyString
    str::String
    b::Bool
end

foo(Tuple{MyString}, (MyString("A", false),))

julia> foo(Tuple{MyString}, (MyString("A", false),))
(MyString("A", false),)

vtjnash commented 1 year ago

unsafe_load of a Bool is probably undefined (it is only defined for C-compatible types), so this is probably not fixable.
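
One rough way to see which of the types involved are plain data is `isbitstype`. Treating "C-compatible" as "bits type" is an assumption made for this sketch, not something the comment states, and `PlainPair` is a hypothetical struct added only for comparison.

```julia
# MyString mirrors the struct from the report.
struct MyString
    str::String
    b::Bool
end

# Hypothetical pointer-free counterpart, used only for comparison.
struct PlainPair
    a::Int32
    b::Bool
end

# A String field is a reference to a GC-managed object,
# so MyString (and any tuple wrapping it) is not a bits type.
println(isbitstype(Bool))              # true
println(isbitstype(MyString))          # false
println(isbitstype(Tuple{MyString}))   # false
println(isbitstype(Tuple{PlainPair}))  # true
```

By this measure the `Bool` field itself is plain data; it is the `String` field that makes `Tuple{MyString}` carry a GC-managed pointer.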

MikeInnes commented 1 year ago

If the Bool is replaced by Int32 the issue remains. It does work with Int64, but only because the Ref is then heap allocated. I don't know if the bug is only present for stack-allocated Refs or if it only triggers reliably there.

Specifically, the issue is not that unsafe_load fails (as you might expect) but that it mangles the String pointer (while seemingly preserving the value of the bool/int field).

The use of return prevents the gc_preserve_end expression from making it into the code_typed output, which seems like it could be the root of the issue. (But I don't know if the compiler is supposed to handle this case, e.g., by inferring the preserve_end when the IR contains preserve_begin.)
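
A minimal sketch of how one might look for the preserve markers programmatically. It inspects lowered code (`code_lowered`) rather than `code_typed`, and `has_head`, `foo_noreturn`, and `foo_return` are names made up for this illustration. Nothing is executed; only IR is inspected.

```julia
# Variant without `return` inside the preserve block.
function foo_noreturn(::Type{T1}, value::T2) where {T1<:Tuple,T2<:Tuple}
    v = Ref(value)
    GC.@preserve v begin
        ptr = pointer_from_objref(v)
        unsafe_load(reinterpret(Ptr{T1}, ptr))
    end
end

# Variant with `return` inside the preserve block, as in the report.
function foo_return(::Type{T1}, value::T2) where {T1<:Tuple,T2<:Tuple}
    v = Ref(value)
    GC.@preserve v begin
        ptr = pointer_from_objref(v)
        return unsafe_load(reinterpret(Ptr{T1}, ptr))
    end
end

# Hypothetical helper: does `x` contain an Expr with the given head?
has_head(x, head::Symbol) = false
has_head(x::Expr, head::Symbol) =
    x.head === head || any(a -> has_head(a, head), x.args)

# Report which preserve markers appear in the lowered code of `f`.
function preserve_markers(f)
    ci = first(code_lowered(f, (Type{Tuple{Int32}}, Tuple{Int32})))
    return (has_begin = any(s -> has_head(s, :gc_preserve_begin), ci.code),
            has_end   = any(s -> has_head(s, :gc_preserve_end), ci.code))
end

println("no return:   ", preserve_markers(foo_noreturn))
println("with return: ", preserve_markers(foo_return))
```

The output for the `return` variant may differ between Julia versions, so it is best read as a diagnostic rather than a specification.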

Looking at the LLVM IR, the failing example will alloca the ref and then memcpy from the struct pointer; when the return statement is present the first memcpy is followed by an @llvm.lifetime.end for the stack-allocated ref. That goes away when return is removed.
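
To check for the lifetime marker without reading the IR by eye, something along these lines could work. It is a sketch: `foo_ret` is a renamed copy of the failing function, the IR is only generated and never run, and whether the marker shows up depends on the Julia and LLVM versions in use.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

struct MyString
    str::String
    b::Bool
end

# Renamed copy of the failing function (returns from inside the block).
function foo_ret(::Type{T1}, value::T2) where {T1<:Tuple,T2<:Tuple}
    v = Ref(value)
    GC.@preserve v begin
        ptr = pointer_from_objref(v)
        return unsafe_load(reinterpret(Ptr{T1}, ptr))
    end
end

# Capture the optimized LLVM IR as a string instead of printing it.
ir = sprint(io -> code_llvm(io, foo_ret, (Type{Tuple{MyString}}, Tuple{MyString})))

# Report whether the stack slot's lifetime is ended anywhere in the function.
println("contains llvm.lifetime.end: ", occursin("llvm.lifetime.end", ir))
```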

Another oddity: Julia will memcpy the ref pointer to an output pointer to store the result of unsafe_load, then memcpy that again to an output pointer, AFAICT. However, with the return present, the input to the final memcpy is the pointer for the original ref, not the output of unsafe_load as you'd expect, so that output is allocated and memcpy'd redundantly. This affects both the heap- and stack-allocated Ref cases.

Good LLVM IR (no return statement)

```llvm
julia> @code_llvm foo(MyString("A",false))
; @ test.jl:6 within `foo`
define void @julia_foo_4907({ {}*, i8 }* noalias nocapture sret({ {}*, i8 }) %0, [1 x {}*]* noalias nocapture %1, { {}*, i8 }* nocapture nonnull readonly align 8 dereferenceable(16) %2) #0 {
top:
  %3 = alloca [16 x i8], align 16
  %.sub = getelementptr inbounds [16 x i8], [16 x i8]* %3, i64 0, i64 0
  %4 = call {}*** inttoptr (i64 140703137123557 to {}*** (i64)*)(i64 260) #5
  call void @llvm.lifetime.start.p0i8(i64 16, i8* nonnull %.sub)
; @ test.jl:8 within `foo`
; ┌ @ refpointer.jl:134 within `Ref`
; │┌ @ refvalue.jl:10 within `RefValue` @ refvalue.jl:8
  %5 = bitcast { {}*, i8 }* %2 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 16 dereferenceable(16) %.sub, i8* noundef nonnull align 8 dereferenceable(16) %5, i64 16, i1 false)
; └└
; @ test.jl:15 within `foo`
; ┌ @ pointer.jl:105 within `unsafe_load` @ pointer.jl:105
  %ptls_field5 = getelementptr inbounds {}**, {}*** %4, i64 2
  %6 = bitcast {}*** %ptls_field5 to i8**
  %ptls_load67 = load i8*, i8** %6, align 8
  %7 = call noalias nonnull {}* @ijl_gc_pool_alloc(i8* %ptls_load67, i32 1440, i32 32) #6
  %8 = bitcast {}* %7 to i64*
  %9 = getelementptr inbounds i64, i64* %8, i64 -1
  store atomic i64 4689924016, i64* %9 unordered, align 8
  %10 = bitcast {}* %7 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 1 dereferenceable(16) %10, i8* noundef nonnull align 16 dereferenceable(16) %.sub, i64 16, i1 false)
; └
  %11 = bitcast {}* %7 to {}**
  %12 = load {}*, {}** %11, align 8
  %13 = getelementptr inbounds [1 x {}*], [1 x {}*]* %1, i64 0, i64 0
  store {}* %12, {}** %13, align 8
  %14 = bitcast { {}*, i8 }* %0 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 8 dereferenceable(16) %14, i8* noundef nonnull align 1 dereferenceable(16) %10, i64 16, i1 false)
  ret void
}
```

Bad LLVM IR (with return statement)

```llvm
julia> @code_llvm foo(MyString("A",false))
; @ test.jl:6 within `foo`
define void @julia_foo_4909({ {}*, i8 }* noalias nocapture sret({ {}*, i8 }) %0, [1 x {}*]* noalias nocapture %1, { {}*, i8 }* nocapture nonnull readonly align 8 dereferenceable(16) %2) #0 {
top:
  %3 = alloca [16 x i8], align 16
  %.sub = getelementptr inbounds [16 x i8], [16 x i8]* %3, i64 0, i64 0
  %4 = call {}*** inttoptr (i64 140703137123557 to {}*** (i64)*)(i64 260) #5
  call void @llvm.lifetime.start.p0i8(i64 16, i8* nonnull %.sub)
; @ test.jl:8 within `foo`
; ┌ @ refpointer.jl:134 within `Ref`
; │┌ @ refvalue.jl:10 within `RefValue` @ refvalue.jl:8
  %5 = bitcast { {}*, i8 }* %2 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 16 dereferenceable(16) %.sub, i8* noundef nonnull align 8 dereferenceable(16) %5, i64 16, i1 false)
  call void @llvm.lifetime.end.p0i8(i64 16, i8* nonnull %.sub)
; └└
; @ test.jl:11 within `foo`
; ┌ @ pointer.jl:105 within `unsafe_load` @ pointer.jl:105
  %ptls_field5 = getelementptr inbounds {}**, {}*** %4, i64 2
  %6 = bitcast {}*** %ptls_field5 to i8**
  %ptls_load67 = load i8*, i8** %6, align 8
  %7 = call noalias nonnull {}* @ijl_gc_pool_alloc(i8* %ptls_load67, i32 1440, i32 32) #6
  %8 = bitcast {}* %7 to i64*
  %9 = getelementptr inbounds i64, i64* %8, i64 -1
  store atomic i64 4689924016, i64* %9 unordered, align 8
  %10 = bitcast {}* %7 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 1 dereferenceable(16) %10, i8* noundef nonnull align 16 dereferenceable(16) %.sub, i64 16, i1 false)
; └
  %11 = bitcast {}* %7 to {}**
  %12 = load {}*, {}** %11, align 8
  %13 = getelementptr inbounds [1 x {}*], [1 x {}*]* %1, i64 0, i64 0
  store {}* %12, {}** %13, align 8
  %14 = bitcast { {}*, i8 }* %0 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 8 dereferenceable(16) %14, i8* noundef nonnull align 16 dereferenceable(16) %.sub, i64 16, i1 false)
  ret void
}
```

lucifer1702 commented 1 year ago

Hello, I would like to contribute to this issue, so please assign it to me. Could you point me to some resources I can use to work on it? Thanks for your time.

gbaraldi commented 1 year ago

The return seems to trigger the segfault because of printing, which I imagine happens because we are observing something bad. @vtjnash is it legal to return from a GC.@preserve block?

foo(Tuple{MyString}, (MyString("A", Int32(1)),))
(MyString(
[64706] signal (11.2): Segmentation fault: 11
in expression starting at none:0
sig_match_simple at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
jl_typemap_entry_assoc_exact at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
ijl_apply_generic at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
_show_default at ./show.jl:476
show_default at ./show.jl:459 [inlined]
show at ./show.jl:454 [inlined]
show_delim_array at ./show.jl:1325
show_delim_array at ./show.jl:1310 [inlined]
show at ./show.jl:1343 [inlined]
show at ./multimedia.jl:47
unknown function (ip: 0x1089b4047)
ijl_apply_generic at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
#55 at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-aarch64-2.0/build/default-macmini-aarch64-2-0/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/REPL/src/REPL.jl:273
jfptr_YY.55_60608 at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/sys.dylib (unknown line)
ijl_apply_generic at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
with_repl_linfo at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-aarch64-2.0/build/default-macmini-aarch64-2-0/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/REPL/src/REPL.jl:551
display at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-aarch64-2.0/build/default-macmini-aarch64-2-0/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/REPL/src/REPL.jl:261 [inlined]
display at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-aarch64-2.0/build/default-macmini-aarch64-2-0/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/REPL/src/REPL.jl:278 [inlined]
display at ./multimedia.jl:340
jfptr_display_40886 at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/sys.dylib (unknown line)
ijl_apply_generic at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
jl_f__call_latest at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
print_response at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-aarch64-2.0/build/default-macmini-aarch64-2-0/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/REPL/src/REPL.jl:0
#57 at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-aarch64-2.0/build/default-macmini-aarch64-2-0/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/REPL/src/REPL.jl:284
jfptr_YY.57_59645 at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/sys.dylib (unknown line)
ijl_apply_generic at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
with_repl_linfo at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-aarch64-2.0/build/default-macmini-aarch64-2-0/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/REPL/src/REPL.jl:551
print_response at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-aarch64-2.0/build/default-macmini-aarch64-2-0/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/REPL/src/REPL.jl:282
do_respond at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-aarch64-2.0/build/default-macmini-aarch64-2-0/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/REPL/src/REPL.jl:893
jfptr_do_respond_60526 at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/sys.dylib (unknown line)
ijl_apply_generic at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
jl_f__call_latest at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
#invokelatest#2 at ./essentials.jl:816 [inlined]
invokelatest at ./essentials.jl:813 [inlined]
run_interface at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-aarch64-2.0/build/default-macmini-aarch64-2-0/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/REPL/src/LineEdit.jl:2644
jfptr_run_interface_59491 at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/sys.dylib (unknown line)
ijl_apply_generic at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
run_frontend at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-aarch64-2.0/build/default-macmini-aarch64-2-0/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/REPL/src/REPL.jl:1293
#62 at ./task.jl:514
jfptr_YY.62_60256 at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/sys.dylib (unknown line)
ijl_apply_generic at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
start_task at /Users/gabrielbaraldi/.julia/juliaup/julia-1.9.0-beta3+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
Allocations: 2990 (Pool: 2979; Big: 11); GC: 0

vtjnash commented 1 year ago

It is legal, though the alloc-opt pass might be unaware of that, as it is seen to insert an llvm.lifetime.end after the last semantic use (the pointer_from_objref call).
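
For anyone hitting this in the meantime, the no-`return` variant reported earlier in the thread can be written so that the value leaves the block as the block's result. This is a sketch, not a fix for the pass; `foo_result` is a name made up here, and it is exercised only on a pointer-free tuple, since whether the load is well-defined for the `String`-bearing struct is the open question above.

```julia
# The value of the preserve block becomes `result`; the `return`
# happens only after the block (and its preserve region) has ended.
function foo_result(::Type{T1}, value::T2) where {T1<:Tuple,T2<:Tuple}
    v = Ref(value)
    result = GC.@preserve v begin
        ptr = pointer_from_objref(v)
        unsafe_load(reinterpret(Ptr{T1}, ptr))
    end
    return result
end

# Pointer-free payload, so the load reads plain bits only.
println(foo_result(Tuple{Int32,Bool}, (Int32(7), true)))  # (7, true)
```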

oscardssmith commented 1 year ago

@lucifer1702 you can try this, but it is probably a really bad issue for a first PR. It involves a lot of very low-level stuff.

lucifer1702 commented 1 year ago

@oscardssmith thanks for the advice. I will look for other issues to contribute to. I am particularly interested in graph neural networks and computer vision. Is there an active repo that I can contribute to? Thanks.

oscardssmith commented 1 year ago

You might get better advice on Slack/Discourse, but there's always stuff to do in Flux and the various autodiff packages.

lucifer1702 commented 1 year ago

Thanks for the suggestion

NHDaly commented 1 year ago

Thanks for the fix! ❤️