JuliaInterop / RCall.jl

Call R from Julia

Use with CondaPkg #480

Closed frankier closed 4 months ago

frankier commented 1 year ago

I am trying to use this together with CondaPkg.jl, which seems to provide a slightly nicer way of managing Conda environments with Julia. In order to use the CondaPkg R, I have tried the following:

using CondaPkg
using Pkg

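function condapkg_rcall()
    # Set R_HOME only while RCall is being built; withenv restores the previous value afterwards.
    withenv("R_HOME" => joinpath(CondaPkg.envdir(), "lib", "R")) do
        Pkg.build("RCall")
    end
end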

If I run this, RCall will use the R installed through CondaPkg. Yay! Unfortunately, this is essentially an extra build step. I tried putting it in my project's deps/build.jl, but Pkg isn't available there.
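As a side note on scoping the build-time variable: `withenv` only restores the variables that are passed to it as pairs, so anything assigned to `ENV` directly inside a no-argument `withenv()` block stays set afterwards. A minimal sketch of the scoped form follows; the variable name and path are made up for the demonstration and are not something RCall reads.

```julia
# Sketch: `withenv` sets a variable only for the duration of the block.
# RCALL_DEMO_R_HOME is a hypothetical name used purely for this demo.
const KEY = "RCALL_DEMO_R_HOME"

delete!(ENV, KEY)                      # start from a clean state
inside = withenv(KEY => "/tmp/demo-env/lib/R") do
    ENV[KEY]                           # the value is visible while the block runs
end
still_set = haskey(ENV, KEY)           # the previous (unset) state is restored afterwards

println("inside = ", inside)
println("still set afterwards = ", still_set)
```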

It would be nice if it were possible to supply a libR for RCall to use for its embedded R at load time, rather than having it hard-coded at build time. This would make things a lot more flexible, and it would make it possible to write a version of CondaPkg.withenv() that sets up RCall correctly.

frankier commented 1 year ago

One thing that will break with this approach: when the project directory is moved, the $R_HOME recorded at installation time will no longer be correct. In this case condapkg_rcall() will have to be called again.
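A rough sketch of how such a stale path could be detected, using only Base functions. `rhome_is_stale` and its two arguments are hypothetical names for illustration, not part of RCall or CondaPkg.

```julia
# Sketch: decide whether a recorded R home still matches the one the project expects.
# `recorded` is the path stored at build time, `target` the path expected now.
function rhome_is_stale(recorded::AbstractString, target::AbstractString)
    # A path that no longer exists (e.g. because the project was moved) counts as stale.
    (ispath(recorded) && ispath(target)) || return true
    # Compare resolved paths so that symlinks and "." segments do not matter.
    return realpath(recorded) != realpath(target)
end

dir = mktempdir()
same_dir = rhome_is_stale(dir, joinpath(dir, "."))      # same directory, spelled differently
moved = rhome_is_stale(joinpath(dir, "gone"), dir)      # recorded path does not exist
println("same directory stale? ", same_dir)
println("missing directory stale? ", moved)
```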

I'm also a bit unclear on whether it is possible for this RCall that points to one project's CondaPkg to accidentally get reused by another project. One thing that would be a bit more robust than the current environment variable approach would be to use Preferences.jl. I assume this would be an easier change than allowing super late binding of Rhome, which would allow for it to be set from CondaPkg automatically.
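To make the Preferences.jl idea concrete, here is a minimal sketch of what a per-project record could look like, written with the TOML standard library so that it runs without extra packages. The helper `write_rcall_prefs` is hypothetical, and the key names "Rhome" and "libR" are an assumption about what the package would read. Because the file lives next to the project, one project's R installation would not leak into another.

```julia
using TOML

# Sketch: record an R installation under an [RCall] table in LocalPreferences.toml.
# Existing preferences for other packages are kept.
function write_rcall_prefs(project_dir::AbstractString, rhome::AbstractString, libR::AbstractString)
    path = joinpath(project_dir, "LocalPreferences.toml")
    prefs = isfile(path) ? TOML.parsefile(path) : Dict{String,Any}()
    section = get!(prefs, "RCall", Dict{String,Any}())
    section["Rhome"] = rhome   # assumed key name
    section["libR"] = libR     # assumed key name
    open(path, "w") do io
        TOML.print(io, prefs)
    end
    return path
end

project = mktempdir()                                   # stand-in for a real project directory
path = write_rcall_prefs(project, "/opt/demo/lib/R", "/opt/demo/lib/R/lib/libR.so")
print(read(path, String))
```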

frankier commented 1 year ago

Here's what I've currently got as a workaround. Put this in src/FixRCall.jl and try to arrange for it to be called as a script after RCall is first installed. The workaround is now long enough that I should probably put together a PR that addresses the issue more directly.

module FixRCall

using CondaPkg
using Pkg

const restart_on_error = false
needs_restart = false

function find_rcall_path()
    for (uuid, info) in Pkg.dependencies()
        if info.name == "RCall"
            return info.source
        end
    end
    return nothing
end

function fix_rcall(explicit)
    global needs_restart
    if needs_restart
        throw("You need to restart your Julia session")
    end
    CondaPkg.resolve()
    target_rhome = "$(CondaPkg.envdir())/lib/R"
    rcall_path = find_rcall_path()
    if rcall_path === nothing
        if explicit
            println(stderr, "RCall not found in the current project!")
        else
            # Do nothing because the user will get a more familiar error as soon as RCall is imported
        end
        return
    end
    need_update = false
    try
        include(joinpath(rcall_path, "deps", "deps.jl"))
        need_update = realpath(target_rhome) != realpath(Rhome)
    catch err
        if err isa SystemError
            # RCall has not been built yet
            need_update = true
        else
            rethrow()
        end
    end
    rcall_has_been_loaded = false
    for mod in Base.loaded_modules_array()
        if string(mod) == "RCall"
            rcall_has_been_loaded = true
        end
    end
    if explicit
        if need_update
            println(stderr, "RCall will be updated.")
        else
            println(stderr, "RCall's R_HOME was already correctly configured. Leaving as is...")
        end
    elseif need_update
        println(stderr, "Looks like RCall is not pointing to the correct R_HOME. This can happen if e.g. you move your project.")
        println(stderr, "RCall will be updated.")
        if rcall_has_been_loaded
            if restart_on_error
                println(stderr, "Because RCall has already been loaded, this script will attempt to restart itself after RCall is updated.")
            else
                println(stderr, "Because RCall has already been loaded, you will need to restart your script when it is updated.")
            end
        end
    end
    if need_update
        ENV["R_HOME"] = target_rhome
        Pkg.build("RCall")
        if rcall_has_been_loaded && !explicit
            if restart_on_error
                argv = Base.julia_cmd().exec
                opts = Base.JLOptions()
                if opts.project != C_NULL
                    push!(argv, "--project=$(unsafe_string(opts.project))")
                end
                if opts.nthreads != 0
                    push!(argv, "--threads=$(opts.nthreads)")
                end
                @ccall execv(argv[1]::Cstring, argv::Ref{Cstring})::Cint
            else
                # Record the state before throwing; code placed after `throw` never runs.
                needs_restart = true
                throw("You need to restart your Julia session")
            end
        end
    end
end

if abspath(PROGRAM_FILE) == @__FILE__
    fix_rcall(true)
else
    fix_rcall(false)
end

end

frankier commented 1 year ago

Alas, this does not work in Julia 1.9. It looks like precompilation now happens in a "/tmp" directory, so including src/FixRCall.jl from the module will set R_HOME incorrectly.

I'm not sure what the best solution is. Probably reverting to using this as a manual script will work ok.
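One possible direction, sketched here without having been tested against RCall, is to skip the fix-up while Julia is generating a precompile cache and only run it in a normal session. The check below uses the `jl_generating_output` runtime function; `is_precompiling` and `maybe_fix_rcall` are hypothetical names.

```julia
# Sketch: detect whether Julia is currently generating a precompile cache.
is_precompiling() = ccall(:jl_generating_output, Cint, ()) == 1

# Run `fix` only outside precompilation, because the paths seen while
# precompiling may not be the ones used at run time.
function maybe_fix_rcall(fix::Function)
    is_precompiling() && return false
    fix()
    return true
end

ran = maybe_fix_rcall(() -> println("fix-up would run here"))
println("fix-up ran: ", ran)
```

Whether this is enough depends on where the fix-up is invoked from; running it as a manual script avoids the question entirely.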

frankier commented 1 year ago

After applying this PR (JuliaInterop/RCall.jl#496), all that's needed is to set things up in LocalPreferences.toml in the appropriate way. Here is a script to do it automatically:

using CondaPkg
using Preferences
using Libdl
using PreferenceTools

"""
    validate_libR(libR)

Checks that the R library `libR` can be loaded and satisfies version requirements.

"""
function validate_libR(libR)
    if !isfile(libR)
        error("Could not find library $libR. Make sure that R shared library exists.")
    end
    # Issue #143
    # On linux, sometimes libraries linked from libR (e.g. libRblas.so) won't open unless LD_LIBRARY_PATH is set correctly.
    libptr = try
        Libdl.dlopen(libR)
    catch er
        Base.with_output_color(:red, stderr) do io
            print(io, "ERROR: ")
            showerror(io, er)
            println(io)
        end
        @static if Sys.iswindows()
            error("Try adding $(dirname(libR)) to the \"PATH\" environmental variable and restarting Julia.")
        else
            error("Try adding $(dirname(libR)) to the \"LD_LIBRARY_PATH\" environmental variable and restarting Julia.")
        end
    end
    # R_tryCatchError is only available on v3.4.0 or later.
    if Libdl.dlsym_e(libptr, "R_tryCatchError") == C_NULL
        error("R library $libR appears to be too old. RCall.jl requires R 3.4.0 or later.")
    end
    Libdl.dlclose(libptr)
    return true
end

function locate_libR(Rhome)
    @static if Sys.iswindows()
        libR = joinpath(Rhome, "bin", Sys.WORD_SIZE==64 ? "x64" : "i386", "R.dll")
    else
        libR = joinpath(Rhome, "lib", "libR.$(Libdl.dlext)")
    end
    validate_libR(libR)
    return libR
end

CondaPkg.resolve()
target_rhome = "$(CondaPkg.envdir())/lib/R"
PreferenceTools.add(
    "RCall",
    "Rhome" => target_rhome,
    "libR" => locate_libR(target_rhome)
)

ParadaCarleton commented 1 year ago

Should this be added to the package so we can use CondaPkg automatically?

frankier commented 1 year ago

@ParadaCarleton I was thinking I would probably open another PR with it as an extension package once the first is merged. There might still need to be an extra step, but that can be figured out later. The current PR is a more generic one to enable configuration through LocalPreferences.toml, which could be nice even without CondaPkg for helping with issues like https://github.com/JuliaInterop/RCall.jl/issues/492

The full current process, copied from https://github.com/cjdoris/CondaPkg.jl/issues/100, is as follows:

The first step is to try this PR from RCall: JuliaInterop/RCall.jl#496

pkg> add "https://github.com/frankier/RCall.jl.git#preferences-r-installation"

The second is to run this (you will have to add Libdl and PreferenceTools to your project for now):

using CondaPkg
using Preferences
using Libdl
using PreferenceTools

function locate_libR(Rhome)
    @static if Sys.iswindows()
        libR = joinpath(Rhome, "bin", Sys.WORD_SIZE==64 ? "x64" : "i386", "R.dll")
    else
        libR = joinpath(Rhome, "lib", "libR.$(Libdl.dlext)")
    end
    return libR
end

CondaPkg.resolve()
target_rhome = "$(CondaPkg.envdir())/lib/R"
PreferenceTools.add(
    "RCall",
    "Rhome" => target_rhome,
    "libR" => locate_libR(target_rhome)
)

For example, put it in a file named setup_rcondapkg.jl and run it from your top-level project with julia --project=. setup_rcondapkg.jl. You should then be able to add this to CondaPkg.toml:

[deps]
r-mypkg = ""
r = ""

If you try this and it works for you, could you please mention in the PR that it's helpful? Some social proof might encourage the maintainers to merge it.

palday commented 4 months ago

@frankier I think your work in #496 (including the documentation) largely handles this, so I'm closing for now. If you can think of a good way to make CondaPkg even more convenient (and perhaps provide a better resolution to #509), then I'm happy to support a PR in any way I can!