axch opened this issue 8 years ago
Even jobs that run many particles as completely independent chains should benefit, by parallelizing the replication of the initial state across workers. That replication contributed measurably to the latency of my large challenge problem runs, e.g. the Gibbs sampling solution to the homophily problem.
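To make the point concrete: instead of the master copying the initial state once per particle (in series) and shipping each copy out, it can ship the single initial state once per worker and let each worker replicate it locally, in parallel. A minimal sketch of that idea — all names are illustrative, `copy.deepcopy` stands in for whatever trace-copying Venture actually does, and threads stand in for worker processes to keep the sketch self-contained:

```python
import copy
import threading
import multiprocessing as mp

def worker_init(conn):
    """Worker side: receive the initial state once, then replicate it
    locally. The replication cost is paid on the worker, in parallel
    with its peers, instead of serially at the master."""
    state = conn.recv()
    my_particle = copy.deepcopy(state)  # stand-in for Venture's trace copy
    conn.send(my_particle)

def broadcast_initial_state(state, n_workers):
    """Master side: one send per worker; no copies performed here."""
    conns, threads = [], []
    for _ in range(n_workers):
        parent, child = mp.Pipe()
        t = threading.Thread(target=worker_init, args=(child,))
        t.start()
        parent.send(state)
        conns.append(parent)
        threads.append(t)
    out = [c.recv() for c in conns]
    for t in threads:
        t.join()
    return out
```

The master's cost drops from N serial copies plus N sends to N sends of one object; the copies happen concurrently on the workers.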
The current architecture of resampling when running multiprocess is to collect all the particles at the master, perform the resampling copies there, and distribute the results back out to the workers.
Our current regime is that particles are in general quite large (being full execution traces of the model program) and quite expensive to transmit and copy.
In this regime, this architecture is undesirable, because the master is a bottleneck: it performs all the copies (perforce in series), and it carries all the communication necessary to collect the particles.
This architecture is also unnecessary, because the weights alone carry the semantic content of resampling; the particles themselves need not be centralized. We should probably move to an architecture like this:

- each worker sends only its particle's weight(s) to the master;
- the master computes the resampling decisions — which particle each output slot should be a copy of — and sends each worker its instructions;
- the particles themselves are copied and transferred directly between workers, without passing through the master.
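Under such a scheme, the master-side step reduces to computing a resampling plan from the scalar weights alone: which existing particle each output slot should be a copy of. Only this plan (small integers) need travel back out; the heavy particles move between workers. A minimal sketch, assuming simple multinomial resampling — the function name and the choice of multinomial over, say, systematic resampling are illustrative, not necessarily Venture's actual scheme:

```python
import random

def resampling_plan(weights, rng=None):
    """Choose a source-particle index for each output slot, in proportion
    to the given weights (multinomial resampling). Only scalar weights
    are needed; the (large) particles never reach the master."""
    rng = rng or random.Random(0)
    total = sum(weights)
    plan = []
    for _ in weights:
        u = rng.random() * total
        acc = 0.0
        for i, w in enumerate(weights):
            acc += w
            if u < acc:
                plan.append(i)
                break
        else:
            # Guard against floating-point shortfall in the running sum.
            plan.append(len(weights) - 1)
    return plan
```

Given the plan, each worker learns which peer's particle it should adopt; workers whose slot maps to themselves keep their particle and transmit nothing.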
Effects:
Possible intermediate state: proceed as above, but leave the master to actually perform the cross-worker communication, thereby retaining the current structure in which the master just holds pipes to all the workers and they need not talk to each other. This intermediate state is probably still an improvement over the status quo.
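That intermediate state can be sketched as follows: the master pulls each particle that must move over its existing pipe to the source worker and forwards it to the destination, so no worker-to-worker channels are needed. The message protocol and all names here are illustrative, and threads stand in for worker processes to keep the sketch self-contained:

```python
import threading
import multiprocessing as mp

def master_relay(pipes, plan):
    """Intermediate scheme: the master keeps its pipe to each worker and
    relays particles itself, so workers never talk to each other.
    plan[dst] = src means worker dst ends up with worker src's particle."""
    fetched = {}
    # Phase 1: pull each particle that must move through the master.
    for dst, src in enumerate(plan):
        if dst != src and src not in fetched:
            pipes[src].send(("upload", None))
            _, fetched[src] = pipes[src].recv()
    # Phase 2: tell every worker its fate.
    for dst, src in enumerate(plan):
        if dst == src:
            pipes[dst].send(("keep", None))
        else:
            pipes[dst].send(("adopt", fetched[src]))

def worker(conn, particle):
    """Toy worker: serves upload requests, then keeps or adopts."""
    while True:
        msg, payload = conn.recv()
        if msg == "upload":
            conn.send(("particle", particle))
        elif msg == "adopt":
            particle = payload
            break
        else:  # "keep"
            break
    conn.send(("final", particle))

def demo(particles, plan):
    """Run one resampling round, threads standing in for workers."""
    pairs = [mp.Pipe() for _ in particles]
    threads = [threading.Thread(target=worker, args=(child, p))
               for (parent, child), p in zip(pairs, particles)]
    for t in threads:
        t.start()
    master_relay([parent for parent, _ in pairs], plan)
    finals = [parent.recv()[1] for parent, _ in pairs]
    for t in threads:
        t.join()
    return finals
```

Note that each moved particle still crosses the master's pipes twice (source to master, master to destination), but surviving particles that stay put are never transmitted at all — which is the bulk of the win over the status quo.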