Open mandel59 opened 5 years ago
Indeed. I see two approaches here:
Shell scripts have a primitive but effective way to transmit and run a program remotely: Pack it into a string and ship it off to the target machine in the hope that it has the same flavor of shell preinstalled.
Doing the same thing for TopShell wouldn't be hard, but requiring the target machine to have TopShell preinstalled is rather inconvenient.
Another thing that counts against this approach is that you'd probably want to ship closures rather than just program strings.
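The string-shipping approach can be sketched roughly as follows. This is a Python illustration rather than TopShell code; `build_ssh_argv` is a hypothetical helper, it assumes a POSIX `sh` on the target, and it only builds the argument vector instead of contacting a host.

```python
import shlex
from typing import List


def build_ssh_argv(user: str, host: str, script: str) -> List[str]:
    """Build an argv that asks ssh to run `script` in a remote POSIX sh.

    ssh joins its command arguments into one string and hands that string
    to the remote user's shell, so the script is quoted once to survive it.
    """
    remote_command = "sh -c " + shlex.quote(script)
    return ["ssh", f"{user}@{host}", remote_command]


# Hypothetical target and script, for illustration only.
argv = build_ssh_argv("user", "example.com", "ls -d /etc/*/")
print(argv)
```

The resulting list could be handed to `subprocess.run(argv)`. Note that only a string travels: any local value the script needs has to be spliced into the text by hand, which is the closure problem mentioned above.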
One thing I've considered is this:
basedir = "/etc"
dirs <- Ssh.doRemotely { user: "user", host: "example.com" } #(
files <- File.listStatus basedir,
dirs = (file <- files, [file.name | file.isDirectory]),
Task.of dirs
)
Where `#e` is syntax for serializing the expression `e` into something that can be transmitted, deserialized, and executed on the target machine. This would be combined with a typing rule like "if `e : t` then `#e : Program t`", some kind of mechanism to check that `e` can be serialized at all, and, last but not least, a way to actually run the resulting program on the target machine in the absence of a preinstalled TopShell.
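To make the idea more concrete, here is a small Python sketch of what a quoted expression might carry. Everything in it is an assumption for illustration: the `Program` class, the list-based expression encoding, the `run_program` interpreter, and the `list_dirs` action are hypothetical and are not part of TopShell. The point is that the quoted form ships both the expression and the local values it closes over (here `basedir`).

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Program(Generic[T]):
    """A quoted expression plus the local values it closes over."""
    expr: Any                 # nested lists and strings, JSON-friendly
    captured: Dict[str, Any]  # free variables copied from the local scope

    def serialize(self) -> str:
        # json.dumps raises TypeError for values it cannot encode, which
        # doubles as a crude "can e be serialized at all?" check.
        return json.dumps({"expr": self.expr, "captured": self.captured})


def run_program(payload: str, actions: Dict[str, Any]) -> Any:
    """Deserialize and evaluate a program; stands in for the remote runner."""
    data = json.loads(payload)
    env = dict(data["captured"])

    def ev(node: Any) -> Any:
        if isinstance(node, list):
            op, *args = node
            if op == "var":
                return env[args[0]]
            if op == "call":
                name, *call_args = args
                return actions[name](*[ev(a) for a in call_args])
            raise ValueError(f"unknown node: {op}")
        return node  # anything else is treated as a literal

    return ev(data["expr"])


# Local side: quote the expression and capture `basedir`.
basedir = "/etc"
quoted: Program[list] = Program(
    expr=["call", "list_dirs", ["var", "basedir"]],
    captured={"basedir": basedir},
)
payload = quoted.serialize()

# "Remote" side: a fake file system and the one action it offers.
fake_fs = {"/etc": [("ssh", True), ("hosts", False), ("cron.d", True)]}


def list_dirs(path: str) -> list:
    return [name for name, is_dir in fake_fs[path] if is_dir]


result = run_program(payload, {"list_dirs": list_dirs})
print(result)
```

In this sketch the interpreter plays the role of the runtime that would have to exist on the target machine, which is exactly the part that is hard to provide without a preinstalled TopShell.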
Of course, this is still just a vague idea and would require a non-trivial amount of work to implement.
What are your thoughts on this? Do you have a different solution in mind?
If expressions themselves are serializable, like C#'s expression trees of lambda expressions, the special quasi-quoting syntax `#e` might not be needed. Moreover, this might let `Program` be a monad: providing `ShellScriptProgram.of` and `ShellScriptProgram.flatMap` as the implementation of a shell script builder would make the binding syntax `x <- e1, e2` available.
Other code builder implementations like `JavaScriptProgram` or `SqlProgram` could also be introduced.
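A rough Python sketch of such a builder, to show how `of` and `flatMap` could generate a script instead of running anything. The class layout, the `capture` helper, the `v0`, `v1`, ... variable naming, and the final `echo` are all assumptions made for this illustration; quoting is left to the caller, and the monad laws hold only up to equivalent scripts, not identical text.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A builder takes the next free variable index and returns
# (script lines so far, shell word holding the result, next free index).
Builder = Callable[[int], Tuple[List[str], str, int]]


@dataclass(frozen=True)
class ShellScriptProgram:
    build: Builder

    @staticmethod
    def of(word: str) -> "ShellScriptProgram":
        """Wrap a shell word (quoted by the caller) without emitting code."""
        return ShellScriptProgram(lambda n: ([], word, n))

    @staticmethod
    def capture(command: str) -> "ShellScriptProgram":
        """A program whose result is the stdout of `command`."""
        return ShellScriptProgram(lambda n: ([], f"$({command})", n))

    def flat_map(
        self, f: Callable[[str], "ShellScriptProgram"]
    ) -> "ShellScriptProgram":
        def build(n: int) -> Tuple[List[str], str, int]:
            lines, word, n1 = self.build(n)
            var = f"v{n1}"
            # Bind the result to a fresh variable and pass a quoted
            # reference to it into the continuation.
            rest, result, n2 = f(f'"${var}"').build(n1 + 1)
            return lines + [f"{var}={word}"] + rest, result, n2

        return ShellScriptProgram(build)

    def render(self) -> str:
        lines, word, _ = self.build(0)
        return "\n".join(lines + [f"echo {word}"])


# Count the directories under /etc, expressed as two chained steps.
script = (
    ShellScriptProgram.capture("ls -d /etc/*/")
    .flat_map(lambda dirs: ShellScriptProgram.capture(f"echo {dirs} | wc -l"))
    .render()
)
print(script)
```

One consequence of this design is that the function passed to `flat_map` receives a symbolic reference such as `"$v0"` rather than an actual value, so it cannot branch on the data locally; any branching has to be emitted as shell code.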
Pure computations always run on the local worker with the current `Ssh.do` implementation. How can the whole data flow be computed on the remote side? For instance, consider the following code:
At first glance, it seems that the files are listed and the directories among them are selected on the remote server, and then the directories are returned to the local machine, like the following shell script line:
Actually, it doesn't work that way. Only actions are sent to and run on the remote server. Pure computations and the pipeline run locally, like:
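A minimal Python sketch of that split, using a hypothetical in-process `FakeRemote` in place of a real SSH session. The class, the `trace` list, and the sample file entries are assumptions for illustration; the sketch only records where each step would run.

```python
from typing import Dict, List

trace: List[str] = []


class FakeRemote:
    """Hypothetical stand-in for an SSH session; runs actions in-process."""

    def __init__(self, files: List[Dict]) -> None:
        self._files = files

    def list_status(self, basedir: str) -> List[Dict]:
        # An action: this is the only step that is sent to the remote.
        trace.append(f"remote: listStatus {basedir}")
        return list(self._files)


def select_dir_names(files: List[Dict]) -> List[str]:
    """Pure computation: with the behaviour described above, it runs locally."""
    trace.append("local: select directories")
    return [f["name"] for f in files if f["isDirectory"]]


remote = FakeRemote([
    {"name": "ssh", "isDirectory": True},
    {"name": "hosts", "isDirectory": False},
])
files = remote.list_status("/etc")  # the full listing comes back first
dirs = select_dir_names(files)      # filtering happens after the transfer
print(dirs)
print(trace)
```

The practical cost is that the whole file listing crosses the connection before any filtering happens, whereas a single remote shell line would return only the directories.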