gabora opened this issue 7 years ago
This is not (yet) possible. Sending commands to a remote system is no problem; `runOSCommand` already supports that. The IO is the bigger problem: every file must be copied to the remote system, or you must rely on something like sshfs. But even with sshfs, you would need to translate local paths to remote paths and vice versa, which is usually pretty error-prone. I'll keep this open and will probably implement it in a future version.
A short comment regarding sshfs: the few times I used it to provide a shared file system where there was none, I was surprised how well it worked. I had no problems; just mentioning this in case somebody needs a workaround.
@gabora: is the main reason you want to do this the editor? RStudio?
Years ago I was in the same situation, before I began using vim. For me the main hurdle was avoiding manually copying files from the local machine to the server.
If you use sshfs the other way around here, to mount your server's project code directory on your local machine, not many problems remain.
Does this help? I prefer using vim directly on the server now, but this approach was totally fine for me for quite a while.
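The reverse-mount workflow described above can be sketched roughly as follows; the host name and paths are placeholders, not taken from this thread:

```shell
# Mount the remote project directory locally (hypothetical host and paths).
mkdir -p ~/project
sshfs user@server:/home/user/project ~/project

# Edit files locally in your preferred editor, then run code on the server:
ssh user@server 'cd ~/project && Rscript analysis.R'

# Unmount when done.
fusermount -u ~/project
```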
A further side note: getting reliable exit codes and output for remote commands seems to be possible with the subprocess package. Unfortunately, it requires C++11.
Note that I implemented something like this on a ZeroMQ backend (clustermq) back when I was frustrated with BatchJobs not being able to handle my number of jobs on a shared file system (due to SQLite locking). The main difference to batchtools is that it doesn't store anything on network-mounted storage and does load balancing, but the implementation is a lot more naive. Nevertheless, it works well for me (and for a couple of other people, too).
It also supports sending remote jobs via SSH (so first SSH, then the job submission system). The downside is that it relies on the SSH forwarding not getting disconnected while the jobs run, which is good enough for my purposes but may not be for everyone.
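The SSH mode described above might look roughly like the following sketch, assuming clustermq's documented `clustermq.scheduler` and `clustermq.ssh.host` options; the host is a placeholder:

```r
library(clustermq)

# Route job submission through SSH to a remote scheduler (hypothetical host).
options(
  clustermq.scheduler = "ssh",
  clustermq.ssh.host  = "user@remote_cluster"
)

# Evaluate fx(x) for each x as jobs on the remote cluster.
fx <- function(x) x * 2
Q(fx, x = 1:3, n_jobs = 1)
```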
Thanks a lot for the replies @mllg, @berndbischl and @mschubert.
I used @mschubert's clustermq, which works really well, but we had problems with long runs. @mllg: thanks for considering a possible implementation! @berndbischl: I was looking for this option for similar reasons; I will try sshfs, thanks a lot for the suggestion!
Sorry if this is a naive question, but I'm still new to batchtools: how would I set up `makeClusterFunctionsSGE` and `makeClusterFunctionsSSH` for batching `qsub` jobs on a remote machine? Here, the same file system is shared across machines, so file copying shouldn't be an issue.

Background: I'm running rstudio-server on a Linux VM that runs on our department server. We have a large SGE compute cluster available, but the VM running rstudio-server is not a submit host. So, in order to submit jobs to the cluster via `qsub`, we have to `ssh` to another machine (or a submit-host VM) and run `qsub` from there.
This was already implemented for Slurm, and there is now a prototype for SGE. It is kind of buggy though; getting the quoting right is not straightforward.
NB: LSF's `bsub` needs the template provided via STDIN instead of a file. I currently see no way to accomplish this reliably over SSH.
I'm not sure exactly what you mean by "getting the quoting right". Are there any guidelines/docs to help users avoid problems?
> I'm not sure exactly what you mean by "getting the quoting right".
This was just a comment on the implementation.
> Are there any guidelines/docs to help users avoid problems?
Not yet. I still have to write it down in full detail, but here are the required steps, TLDR style:

1. Mount the remote file system on the client via sshfs, e.g. `sshfs user@remote_cluster:/home/user/experiments /home/user/experiments` (you need to create the directories first). You need the same file system layout on both client and remote, relative to your home directory. `~` will not be expanded, so you can have different login names; symlinks will not be resolved on the client.
2. Set up `batchtools` with the configuration for the cluster site, e.g. with `makeClusterFunctionsSGE("template-file-on-client", nodename = "remote_cluster")`.
3. Create the registry inside the mounted directory: `makeRegistry(file.dir = "~/experiments/reg")`.
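Put together, the steps above might look like the following sketch; the host, template path, and directories are the placeholders from the example, and the sshfs mount is assumed to already be in place:

```r
library(batchtools)

# Assumes ~/experiments is an sshfs mount of the same path on the remote,
# so client and remote see identical paths relative to the home directory.
reg <- makeRegistry(file.dir = "~/experiments/reg")

# Submit via SSH to the remote SGE cluster; the template file lives on the client.
reg$cluster.functions <- makeClusterFunctionsSGE(
  "template-file-on-client",
  nodename = "remote_cluster"
)

batchMap(function(x) x^2, x = 1:10, reg = reg)
submitJobs(reg = reg)
```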
I'm quite close to making this work, but it seems a path is still being expanded somewhere in the Slurm SSH submission. I get the following error:
```r
submitJobs(ids,
+   resources = list(walltime = "00:30:00",
+                    memory = "4gb",
+                    ncpus = 1))
Submitting 20 jobs in 3 chunks using cluster functions 'Slurm' ...
Error: Fatal error occurred: 101. Command 'sbatch' produced exit code 1. Output: 'sbatch: error: Unable to open file /home/rob/princescratch/apml/test/jobs/jobd7fe014f580d5d26c586d429b4ba7a10.job'
```
where `/home/rob/` is the home directory on the local machine I'm submitting from.
The setup is as follows:

```r
reg <- makeExperimentRegistry(file.dir = "~/princescratch/apml/test",
                              packages = c("data.table", "foreach",
                                           "doMC", "findSplit"),
                              conf.file = NA)
cf <- makeClusterFunctionsSlurm(template = "~/princescratch/apml/slurm-prince.tmpl",
                                nodename = "rjr@mycluster",
                                array.jobs = TRUE)
reg$cluster.functions <- cf
```
Any suggestions on where to look? The template seems to be generating a reasonable job file with a relative path for the Rscript call. Thanks.
Hi, I am curious whether I could use `batchtools` on my local machine and submit jobs through SSH (using `bsub` with the LSF queueing system) on a remote cluster. This would be a combination of `clusterFunctionsLSF` and `clusterFunctionsSSH`, but I haven't found such a thing implemented yet, right?

I really would like to work on my local machine (in RStudio) and do the computation on the cluster. `clusterFunctionsSSH` alone is not a solution, because I am not allowed to run computation/memory-heavy tasks on the front end (as far as I understood, `clusterFunctionsSSH` together with the `worker` does not use the queue system, but I might have missed something here).

Looking forward to your suggestions. Thanks and kind regards, Attila