Open dschrempf opened 5 years ago
No good way that I know. Main problem is mwc-random is not well suited for parallelization. It has huge mutable state. Something like PCG or counting generator would work much better in this case.
But since computation morally pure I think it's possible to hide IO and forking behind unsafePerformIO
or unsafeSTtoIOfrom
primitive` I however not sure how well forking will interact with unsafety
Another concern. Number of thread should be determined beforehands since setting number of workers equal to number of cores will produce different result for different setting of RTS
Thank you!
I guess I will try out PCG then. Maybe, I can also separate the random number generation and the (expansive) calculations that follow the random number generation. However, usually I need to create more random numbers within the expansive calculations. For example, I would like to run N Markov processes in parallel.
Dominik
Aleksey Khudyakov notifications@github.com writes:
No good way that I know. Main problem is mwc-random is not well suited for parallelization. It has huge mutable state. Something like PCG or counting generator would work much better in this case.
But since computation morally pure I think it's possible to hide IO and forking behind
unsafePerformIO
or unsafeSTtoIOfrom
primitive` I however not sure how well forking will interact with unsafetyAnother concern. Number of thread should be determined beforehands since setting number of workers equal to number of cores will produce different result for different setting of RTS
Main problem is mwc-random is not well suited for parallelization. It has huge mutable state.
This is true, but it doesn't mean you can't use separate mutable states per core. This means that there is a perfectly fine approach to parallelization here, in fact it parallelizes pretty well, which I can't say the same about PCG, see the benchmarks
Here is a function with a concrete example using mwc-random that can generate an array (which can be converted to a Vector
in O(1) time and memory) of random numbers in parallel with randomArraWS
Here is another a bit lower level example that doesn't depend on massiv
: withSchedulerWS
and generates a list of random numbers in parallel, but that will likely be slower that generating an array.
See my SO answer for more detail on this topic.
computation morally pure I think it's possible to hide IO and forking behind unsafePerformIO
This is only possible with unsplittable stateful generators only if you do deterministic scheduling, which is not the case in above solutions, so generation will have to be done in IO
, but considering we are talking about generation of random numbers I don't think it poses a problem for most scenarios.
Hello, many times I use
mwc-random
to perform a calculation repeatedly, and then I collect the results in form of a list (or vector). Is there a way to parallelize this computation?For example:
I can parallelize this computation by splitting the generator, for example like so
(One needs to activate scoped type variables for this to work). But then, in order to use these generators in parallel, I can only use
mapConcurrently
from theasync
library at the moment which requires theIO
monad, something I do not really want.Is there an easier possibility to perform these computations in a given number of parallel threads, each dragging along their private generator? I was, for example, thinking of
parListChunk
fromControl.Parallel.Strategies
, or similar techniques.Thank you!
PS: I also found https://stackoverflow.com/a/16250010, but I do not really understand what's going on there.