Closed: ImNotaGit closed this issue 2 years ago
Can you explain a bit what the underlying problem is you are trying to solve?
If you want to have a remote-controlled worker on an HPC system, maybe something like rmote is a better fit than clustermq. I did not intend w$send_call() to be suitable for interactive work, but rather for developing packages that want to interact with workers (because the logic is a bit complicated, and the objects generated in a call will not persist in the worker environment).
If you want to keep your workers running while sending multiple batches of work, you can do the following:
w = workers(...)
Q(function(x) x*2, x=1:5, workers=w)
Q(function(y) y+3, y=1:5, workers=w)
w$cleanup()
Thanks for the prompt reply. Yeah, I also realized that a remotely controlled interactive R session is not what clustermq is mainly intended for -- I was just curious whether that's easily doable within clustermq, since if so it would be quite convenient, without me having to deal with IP addresses/ports etc. explicitly. I have tried both the rmote and the remoter packages before, and while they are very useful, they did not fully satisfy my needs. But anyway, if no workaround within clustermq exists, I think this issue can be closed.
> Can you explain a bit what the underlying problem is you are trying to solve?
Well, briefly: my institution hosts RStudio Server on a remote machine with shared and very limited memory and CPUs. I wanted to find a way to use an R session on another HPC node (so that I can request exactly the resources I need) while still retaining an interactive workflow within RStudio, i.e. being able to send code and receive small result objects, e.g. ggplot2 objects for visualization and R notebook knitting.
Edit: I have since looked at the remoter package again, and luckily, with some tweaking, I've got it to work for my use case. Combining it with clustermq completely solved my issue.
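The tweaks themselves aren't described here, but one plausible arrangement for combining the two packages (a sketch under assumptions, not the author's actual setup; the hostname, port, and password below are placeholders) is to run remoter::server() in an ordinary SGE job, attach from RStudio with remoter::client(), and fan out parallel work from inside the remote session with clustermq:

```r
# On the HPC node, inside a regular SGE job (e.g. submitted with qsub),
# start a remoter server to host the persistent interactive R session:
#   remoter::server(port = 55555, password = "<password>")

# In RStudio, attach to that session (get the hostname from qstat, say):
#   remoter::client(addr = "<node-hostname>", port = 55555, password = "<password>")

# Everything typed at the client now evaluates on the HPC node, so heavy
# parallel work can still fan out from there via clustermq:
library(clustermq)
res = Q(function(x) x * 2, x = 1:5, n_jobs = 2)
```

This keeps the interactive session and the batch parallelism separate: remoter provides the persistent remote REPL, while clustermq does what it is designed for.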
I have got clustermq set up with an SGE template, and parallel computation with the Q function works well: I see multiple jobs being submitted successfully, and the jobs exit with results returned properly once the computation finishes. I do not have a clear understanding of how it works internally, but for certain use cases I'm trying to make things more interactive, i.e. to have an R session constantly running on a persistent worker until I manually kill it, and to interactively send objects to this remote R session, perform computation, and receive objects back from it. For this, I first created a persistent worker, then explored using worker$send_call to run code interactively, which works fine within each call, e.g. a toy example assigning a, b, and n on the worker. However, there is something I missed -- I imagined that there is a persistent R session running on the worker, but this seems not to be the case: when I subsequently tried to retrieve a, b, or n, they were not found, and ls() returned character(0). Is this because each call actually starts a new independent R session which exits when the call completes? Is there a workaround to achieve my aim of a fully interactive and persistent R session on a remote worker?
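For reference, here is a hedged reconstruction of the toy experiment described above -- the exact calls were not preserved in this thread, so the send_call() expressions (and the assumption that it accepts a quoted expression) are guesses; only the reported outcome, ls() coming back empty, is taken from the text:

```r
library(clustermq)

w = workers(n_jobs = 1)          # persistent worker, launched via the SGE template

# Assignments inside a single call work while that call runs:
w$send_call(quote({ a <- 1; b <- 2; n <- a + b; n }))

# ...but a later call reportedly sees none of them:
w$send_call(quote(ls()))         # reported result: character(0)

w$cleanup()
```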