Closed kendonB closed 5 years ago
That's interesting, I thought parLapply and family actually use the network in the snow implementation.
At first glance, it does not look straightforward (and I'm not sure it is possible at all) to add clustermq-like support for parLapply clusters. I will investigate.
For your use case, is it possible to use parallel foreach instead of parLapply? This is already supported.
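A minimal sketch of what that could look like, assuming the clustermq and foreach packages are installed. The "multiprocess" scheduler is used here only so the example runs on a local machine; the number of jobs, the memory value and the toy loop body are placeholders, not taken from the original use case.

```r
# Assumption: run workers as local processes instead of on an HPC scheduler.
# Set the option before clustermq is first used.
options(clustermq.scheduler = "multiprocess")

library(foreach)

# Register clustermq as the parallel backend for %dopar%
clustermq::register_dopar_cmq(n_jobs = 2, memory = 1024)

# Each iteration is shipped to a clustermq worker
res <- foreach(i = 1:3) %dopar% sqrt(i)
```

Existing parLapply(cl, X, fun) calls would need to be rewritten as foreach loops for this to apply.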
Will try foreach. Thanks heaps!
@mschubert do you intend to use parallel foreach to solve #64, or are these issues separate?
@wlandau They're separate. Here it's using cmq as a backend; #64 would not work with the backend because it uses parallel::mcparallel (which is not available on Windows).
I find parLapply can be frustrating as it seems to move data via disk in an excruciatingly slow manner.
That's interesting, I thought parLapply and family actually use the network in the snow implementation.
Correct: parLapply(cl, ...) iterates over parallel cluster nodes ("workers") where the cluster is typically set up using cl <- parallel::makeCluster(..., type). The default is type = "PSOCK" but there is also type = "FORK" and type = "NWS". At least for PSOCK and FORK, all objects communicated back and forth between main and worker(s) go via socket connections, not the file system. I never used NWS, so that one might involve a file system, but my guess is that @kendonB is not using that (or?).
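For reference, a minimal sketch of the PSOCK setup described above, using only the base parallel package. The worker count and the toy function are placeholders.

```r
library(parallel)

# Start two local workers; "PSOCK" is the default type, spelled out for clarity
cl <- makeCluster(2, type = "PSOCK")

# The inputs and the function are serialized and sent to the workers
# over socket connections, and the results come back the same way
res <- parLapply(cl, 1:4, function(i) i^2)

# Always shut the workers down when done
stopCluster(cl)
```

Any large object the function needs has to travel over those same sockets, for example via clusterExport(), which is where transfer speed starts to matter.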
I guess I was wrong. I certainly observe that objects fly around much faster via zeromq than with parLapply, so perhaps it's just that regular sockets are slow. I see the memory change at about 30-40 MB/s, so it takes several minutes to get a 2 GB x 4 cores job going.
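A rough back-of-the-envelope check of those numbers, assuming the 2 GB object is sent to each of the 4 workers one after another at the observed rate (the sequential sending is an assumption, not something measured here):

```r
size_mb_per_worker <- 2 * 1024  # 2 GB per worker, in MB
n_workers <- 4
rate_mb_s <- c(30, 40)          # observed range, MB/s

# Total transfer time in minutes at the slow and fast end of the range
minutes <- size_mb_per_worker * n_workers / rate_mb_s / 60
round(minutes, 1)
```

That comes out at roughly 3.4 to 4.6 minutes, which matches the "several minutes" startup.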
I find parLapply can be frustrating as it seems to move data via disk in an excruciatingly slow manner. Is it straightforward to modify parLapply to use zeromq?