Closed mbstadler closed 8 months ago
Another, simpler, solution might be clusterEvalQ()
, which evaluates an expression on the remote cluster. So
proc = makePSOCKcluster(1)
## local process id
Sys.getpid()
## remote process id
clusterEvalQ(proc, Sys.getpid())[[1]]
So just
clusterEvalQ(
proc,
library("reticulate", character.only=TRUE, lib.loc = file.path(R.home(), "library"))
)
There is a little subtlety involved here -- presumably R.home()
is the home directory of the spawned process, and not of the parent process.
I wonder what changed (in R?) given that it was working fine before.
At any rate: it seems that, to be completely safe, we would have to "sanitize" the environment for every function we send over to the fallback process. This isn't just limited to the library
call, e.g., all assigner
calls, the basilisk.utils:::.activate_condaenv
call, the no-op function here, and the final calls to basiliskStart
and .instantiate_store
. So we would do something like:
envstripper <- function(fun) {
environment(fun) <- .GlobalEnv
fun
}
clusterCall(proc, envstripper(function() {
library("reticulate", character.only=TRUE, lib.loc = file.path(R.home(), "library"))
}))
assigner <- envstripper(function(name, value) assign(name, value, envir=.GlobalEnv))
clusterCall(proc, assigner, name=".activate_condaenv", value=envstripper(basilisk.utils:::.activate_condaenv))
clusterCall(proc, assigner, name="isWindows", value=isWindows)
clusterCall(proc, envstripper(activateEnvironment), envpath=envpath, loc=getCondaDir())
clusterCall(proc, assigner, name="activateEnvironment", value=list) # no-op as we already ran it.
clusterCall(proc, envstripper(useBasiliskEnv), envpath=envpath)
clusterCall(proc, envstripper(.instantiate_store))
Ideally, I'd have something that would basically go into useBasiliskEnv
, resolve all calls to basilisk[.utils]
utilities to base-only functions, and effectively "compile" it into a dependency-free function body that could be shipped to the child. Then I wouldn't need hard-coded knowledge of the internals needed to get useBasiliskEnv
to work.
Until then, though, the environment stripping idea is probably good enough.
Yes, stripping the basilisk
environment from all of these functions seems an even safer approach. I think that only the library
call throws an error because the others do not require a package from a defined lib.loc
and will be happy with the namespace that comes with the functions.
I have adapted the PR according to your code above.
Alright, let's see how it goes on BioC-devel.
This is related to the bioc slack discussion starting with https://community-bioc.slack.com/archives/C056CEJTH5Z/p1698998329036819 from @csoneson
By loading
reticulate
on the fallback process earlier and using a function defined in.GlobalEnv
, the fallback R process does not inherit the parent'sreticulate
namespace anymore, which otherwise leads to an error in Bioconductor 3.18 and onwards.