paciorek / future-kubernetes

Instructions for setting up and using a Kubernetes cluster for running R in parallel using the future package.
39 stars 10 forks source link

paciorek/future: Argument 'rshpostopts' #1

Closed HenrikBengtsson closed 3 years ago

HenrikBengtsson commented 4 years ago

I'm looking at your fork https://github.com/paciorek/future, but it's not clear to me which branch I'm meant to look at. For instance, I cannot find get_kube_workers(). Anyway, looking at https://github.com/HenrikBengtsson/future/compare/develop...paciorek:develop, I see that you added a new argument rshpostopts to future::makeNodePSOCK() such that you could append it as:

rsh_call <- paste(paste(shQuote(rshcmd), collapse = " "), rshopts, worker, rshpostopts)

For instance, with an SSH connection that would allow you do launch the remote worker as:

ssh remote.server.org <additional options> "'/usr/lib/R/bin/Rscript' ..."

However, you might be able to achieve the same via the rscript argument. For example, from ?makeClusterPSOCK I use the following to launch Rscript inside a Singularity container:

cl <- makeClusterPSOCK(
  rep("localhost", times = 2L),
  ## Launch Rscript inside Docker container
  rscript = c(
    "singularity", "exec", "docker://rocker/r-parallel",
    "Rscript"
  ),
  dryrun = TRUE
)

Could you do the same for your needs? I guess I need to see the output of your:

makeClusterPSOCK(..., manual = TRUE)

to understand what the underlying system call looks like.

paciorek commented 4 years ago

@HenrikBengtsson Thanks for taking a look. I was starting to work yesterday on some simplifications that would start the workers via Kubernetes and not within makeNodePSOCK. I think that will simplify what needs to happen on the future side of things (and remove the need for rshpostopts, so I'll experiment some more and then follow up with you.

(Also get_kube_workers is set in the Docker container, but that may not be the best place for it.)

paciorek commented 4 years ago

Ok, so I've simplified things such that plan() (i.e., makeNodePSOCK()) does not need to start the R workers. Kubernetes will do that.

I believe that with this change the only things that would be needed to have a kubernetes strategy for future::plan() would be:

1) makeNodePSOCK should not try to start the R workers. All it needs to do is start the socket connection on the master side. In paciorek/future:develop I have a hack that avoids invoking system on local_cmd. 2) As a side note: rshcmd should not be required/used (I've currently set things up so rshcmd is being set to something innocuous, namely echo), as ssh is not available in the Kubernetes pods.

paciorek commented 4 years ago

@HenrikBengtsson I am getting back to this and was thinking about making a pull request such that future could play nicely with a Kubernetes cluster, cleaning up my current hacks. Please let me know your thoughts about the potential approach below, or if you have a different approach that you prefer.

I think the simplest thing to do is to modify makeNodePSOCK to allow it to avoid starting the R workers (as well as the various related steps of checking for a valid rshcmd). Whether to start the workers would be controlled by a new argument that the user would set when calling plan(cluster), perhaps startWorkers = FALSE or useKubernetes = TRUE. That new argument would need to get passed through to makeNodePSOCK.

A more involved alternative would be to add a kubernetes strategy, but conceptually the Kubernetes functionality is essentially the same as the cluster strategy, apart from how the workers get started. And modifying makeNodePSOCK as suggested above would be a much more limited change than creating a whole new strategy.

paciorek commented 4 years ago

@HenrikBengtsson and I discussed this in person in mid-July. He suggested a series of tests to see what other syntax would work.

This is to note that all of these tests succeed, as you believed they would.

====== text from email ======================================= this is what I think you said you did right now using your https://github.com/paciorek/future fork:

library(future)
Sys.setenv(USE_KUBERNETES = "TRUE")
Sys.setenv(R_FUTURE_MAKENODEPSOCK_RSHCMD="echo")
plan(cluster, workers=rep("", 4), revtunnel=FALSE)

As I mentioned in the call, I suspect, it will also work if you do:

library(future)
Sys.setenv(USE_KUBERNETES = "TRUE")
Sys.setenv(R_FUTURE_MAKENODEPSOCK_RSHCMD="echo")
plan(cluster, workers=rep("localhost", 4), revtunnel=FALSE)

If that works, then retry with:

library(future)
Sys.setenv(USE_KUBERNETES = "TRUE")
plan(cluster, workers=rep("localhost", 4))

which I think should work as well. And if this works, then:

library(future)
Sys.setenv(USE_KUBERNETES = "TRUE")
plan(cluster, workers=4)

should also do it. Does it?

To get rid of the need for the 'USE_KUBERNETES' workaround, I think the following will work as-is when using my version of the 'future' package by using:

library(future)
cl <- future::makeClusterPSOCK(4, rscript="false")
plan(cluster, workers=cl)

Does it? If it does, then I have something to work with for simplifying this in future::makeClusterPSOCK() and later plan(cluster, ...).

Also, which I never mentioned this in the call, if the above works, another alternative that also should work is:

library(future)
cl <- future::makeClusterPSOCK(4, manual=TRUE)
plan(cluster, workers=cl)

If you look at the code, using manual=TRUE will basically achieve what your 'USE_KUBERNETES' workaround currently does. Of course, manual=TRUE will output some irrelevant messages to stdout. But knowing that this one also works helps me further.

HenrikBengtsson commented 4 years ago

I've just push updates to the develop branch of future so you can do:

library(future)
cl <- makeClusterPSOCK(4, manual=TRUE, quiet = TRUE)
plan(cluster, workers=cl)

Please give it a try.

paciorek commented 4 years ago

It works.

HenrikBengtsson commented 4 years ago

Great. FYI, future 1.19.1 with this feature is rolling out on CRAN since early this morning.

I guess, the next step for me is figure out how to be extend plan() to support these arguments, e.g.

plan(cluster, workers=4, manual=TRUE, quiet = TRUE)

Given that the makeClusterPSOCK() now works, how "urgent" is this for you? There are other things I want to be able to bring into plan() as well, e.g. supporting "tee":ing output for certain types of backends. To support these needs, I need to figure out a standard internal API that can be used in all cases. Should be hard, but it takes some deep-focus thinking.

paciorek commented 4 years ago

Not urgent. Simply being able to use the CRAN future rather than my own fork in the recipe I provide is a very nice simplification. So I'm going to update my materials/documentation.

I have some thoughts about publicizing this functionality that I'll email you about.

paciorek commented 3 years ago

Simply using plan without makeClusterPSOCK now works, and the user is not required to specify the number of workers. So this issue is fully addressed.