JuliaInterop / RCall.jl

Call R from Julia

Segmentation fault on closing julia #511

Closed schlichtanders closed 2 months ago

schlichtanders commented 10 months ago

Hello, I just want to report that I run into a segmentation fault when closing Julia (after using RCall).

[1847] signal (11.1): Segmentation fault
in expression starting at none:0
ijl_eh_restore_state at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/rtutils.c:265
_atexit at ./initdefs.jl:416
jfptr__atexit_46096.clone_1 at /usr/local/julia/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
ijl_atexit_hook at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/init.c:280
jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:718
main at julia (unknown line)
unknown function (ip: 0x7f95e9ce41c9)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 32734049 (Pool: 32701761; Big: 32288); GC: 42

I have no minimal example, but I guess it could have something to do with my using an R function inside an async Julia task, which is somehow not finalized correctly or prevents some other part from finalizing.

The above error occurs in a Docker container built on top of julia:1.9. When I run the same code on my local laptop, it does not throw an error but hangs indefinitely.

schlichtanders commented 10 months ago

A similar segmentation fault was already reported to JuliaLang: https://github.com/JuliaLang/julia/issues/43556

That issue is about task switching and calls to Base.iolock_end() not working well together, but I couldn't find iolock_end in my case. Maybe some related function is called nevertheless, or some similar unexpected task switching happens.

schlichtanders commented 10 months ago

I was able to replicate the segmentation fault. It is a combination of three components:

Everything is fine until the Julia session is closed; then the same segmentation fault is thrown.

julia> struct SingletonType end

julia> Singleton=SingletonType()
SingletonType()

julia> using RCall

R> library(JuliaCall)

R> r_singleton = julia_eval("Singleton")
┌ Warning: RCall.jl: Julia version 1.9.3 at location /nix/store/n2mf5wwcjasd5wlxinrz36y0g6l0w7q8-julia-bin-1.9.3/bin will be used.
│ Loading setup script for JuliaCall...
└ @ RCall ~/.julia/packages/RCall/gOwEW/src/io.jl:172
┌ Warning: RCall.jl: Finish loading setup script for JuliaCall.
└ @ RCall ~/.julia/packages/RCall/gOwEW/src/io.jl:172

julia> rf = reval("""function(){
               if (runif(1) > 0.9){
                       r_singleton
               } else {
                       rnorm(1)
               }
       }""")
RObject{ClosSxp}
function () 
{
    if (runif(1) > 0.9) {
        r_singleton
    }
    else {
        rnorm(1)
    }
}
julia> rf()
RObject{RealSxp}
[1] 1.055938

julia> 

[2389910] signal (11.1): Segmentation fault
in expression starting at none:0
ijl_eh_restore_state at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/rtutils.c:265
_atexit at ./initdefs.jl:416
jfptr__atexit_46096.clone_1 at /nix/store/n2mf5wwcjasd5wlxinrz36y0g6l0w7q8-julia-bin-1.9.3/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
ijl_atexit_hook at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/init.c:280
jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:718
main at julia (unknown line)
__libc_start_call_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
__libc_start_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 8519504 (Pool: 8511578; Big: 7926); GC: 13

[2389910] signal (11.1): Segmentation fault
in expression starting at none:0
ijl_eh_restore_state at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/rtutils.c:265
_atexit at ./initdefs.jl:416
jfptr__atexit_46096.clone_1 at /nix/store/n2mf5wwcjasd5wlxinrz36y0g6l0w7q8-julia-bin-1.9.3/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
ijl_atexit_hook at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/init.c:280
jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:718
main at julia (unknown line)
__libc_start_call_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
__libc_start_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 8519504 (Pool: 8511578; Big: 7926); GC: 13
[1]    2389910 segmentation fault (core dumped)  julia --project

palday commented 10 months ago

Does this also happen when you start R directly and not via RCall? IIRC JuliaCall works by creating a latent Julia session and then opening RCall within that nested session. I don't know what happens when that JuliaCall session is already nested in an RCall session...

schlichtanders commented 10 months ago

I will test soon whether I can circumvent this by starting it via R directly.

I further simplified the failing example: it is only about getting some Julia value into R. Boom.

julia> using RCall

R> library(JuliaCall)

R> ftype = julia_eval("Function")
┌ Warning: RCall.jl: Julia version 1.9.3 at location /nix/store/n2mf5wwcjasd5wlxinrz36y0g6l0w7q8-julia-bin-1.9.3/bin will be used.
│ Loading setup script for JuliaCall...
└ @ RCall ~/.julia/packages/RCall/gOwEW/src/io.jl:172
┌ Warning: RCall.jl: Finish loading setup script for JuliaCall.
└ @ RCall ~/.julia/packages/RCall/gOwEW/src/io.jl:172

julia> 

[2406346] signal (11.1): Segmentation fault
in expression starting at none:0
ijl_eh_restore_state at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/rtutils.c:265
_atexit at ./initdefs.jl:416
jfptr__atexit_46096.clone_1 at /nix/store/n2mf5wwcjasd5wlxinrz36y0g6l0w7q8-julia-bin-1.9.3/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
ijl_atexit_hook at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/init.c:280
jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:718
main at julia (unknown line)
__libc_start_call_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
__libc_start_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 7218572 (Pool: 7211588; Big: 6984); GC: 10

[2406346] signal (11.1): Segmentation fault
in expression starting at none:0
ijl_eh_restore_state at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/rtutils.c:265
_atexit at ./initdefs.jl:416
jfptr__atexit_46096.clone_1 at /nix/store/n2mf5wwcjasd5wlxinrz36y0g6l0w7q8-julia-bin-1.9.3/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
ijl_atexit_hook at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/init.c:280
jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:718
main at julia (unknown line)
__libc_start_call_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
__libc_start_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 7218572 (Pool: 7211588; Big: 6984); GC: 10

schlichtanders commented 10 months ago

> Does this also happen when you start R directly and not via RCall? IIRC JuliaCall works by creating a latent Julia session and then opening RCall within that nested session. I don't know what happens when that JuliaCall session is already nested in an RCall session...

A first try failed because I could not find out how to use a specific Julia environment via JuliaCall. When I start Julia first and then use RCall, JuliaCall picks up the same Julia session. For standalone use from R, I couldn't find any documentation about it.

EDIT: I found it. You need to set environment variable JULIA_PROJECT="..."
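
As a rough illustration of why setting the variable helps, here is a minimal sketch in plain Julia. It involves neither R nor JuliaCall: it only shows that a Julia process started with JULIA_PROJECT in its environment treats that directory as its active project. That the Julia session started by JuliaCall behaves the same way is an assumption based on the comment above, and `project_dir` is a hypothetical temporary directory.

```julia
# Hypothetical project directory, created only for this illustration.
project_dir = mktempdir()

# Child Julia process with JULIA_PROJECT set in its environment.
# addenv keeps the rest of the current environment intact (Julia >= 1.6).
cmd = addenv(
    `$(Base.julia_cmd()) --startup-file=no -e 'print(Base.active_project())'`,
    "JULIA_PROJECT" => project_dir,
)

# The child reports which project file it considers active.
active = read(cmd, String)
println(active)
```

When R is started directly, the same effect should be achievable by exporting JULIA_PROJECT in the shell before launching R, so that the variable is inherited by the Julia session that JuliaCall starts.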

schlichtanders commented 10 months ago

I tested the examples now, and they seem to work without a segfault when started directly via R. That looks like a good workaround for me.

Still, it is natural to expect that JuliaCall works inside RCall. In the Python world, PythonCall and JuliaCall also work together. It would be great if this segfault could be solved. It only affects the final exit of Julia; everything else already works.

palday commented 10 months ago

I know this has been my mantra lately ... but I'm wondering if JuliaCall needs to check to see whether there's an existing RCall session before creating a new one. (Why do I think it's JuliaCall's responsibility and not RCall's? Because JuliaCall depends on RCall but not vice versa. If there were a straightforward change we could make in RCall to make this easier, I would support it, but big changes in RCall tend to get stuck by very limited maintainer bandwidth.)
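
To make the idea of such a check concrete, here is a minimal sketch of what a Julia-side test could look like. It is not JuliaCall's or RCall's actual code: `package_loaded` is a hypothetical helper, it relies on `Base.loaded_modules`, which is an internal table rather than public API, and it assumes that RCall being loaded is a reasonable proxy for an R session already existing in the process.

```julia
# Hypothetical helper: true if a package with this name is loaded in the session.
# Base.loaded_modules is an internal table keyed by Base.PkgId, not public API.
package_loaded(name::AbstractString) =
    any(id -> id.name == name, keys(Base.loaded_modules))

if package_loaded("RCall")
    println("RCall is already loaded; an existing R session could be reused.")
else
    println("RCall is not loaded; a fresh setup would be needed.")
end
```

Whether reusing the existing session would actually avoid the crash at exit is an open question in this thread; the sketch only shows that the detection step itself is cheap.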