probcomp / crosscat

A domain-general, Bayesian method for analyzing high-dimensional data tables
http://probcomp.csail.mit.edu/crosscat/
Apache License 2.0

dha_example_multiprocessing #104

Closed (jdupl123 closed this issue 8 years ago)

jdupl123 commented 8 years ago

The multiprocessing example for DHA is missing the seed argument in its calls to engine. For example, X_L_list, X_D_list = engine.initialize(M_c, M_r, T, n_chains=num_chains) should be X_L_list, X_D_list = engine.initialize(M_c, M_r, T, get_next_seed(), n_chains=num_chains)
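To illustrate the calling pattern described above, here is a minimal sketch of how a seed source could be threaded into each engine call. It is not CrossCat's code. make_seed_source and the initialize function below are hypothetical stand-ins, the 32767 upper bound is an assumption, and the stand-in only mimics the argument order quoted in this report, with the seed as the fourth positional argument.

```python
import random


def make_seed_source(master_seed, max_val=32767):
    """Return a get_next_seed() function driven by one master seed.

    Hypothetical helper for illustration; max_val is an assumed bound.
    """
    rng = random.Random(master_seed)

    def get_next_seed():
        # Each call draws a fresh integer seed in [0, max_val].
        return rng.randint(0, max_val)

    return get_next_seed


def initialize(M_c, M_r, T, seed, n_chains=1):
    """Hypothetical stand-in for engine.initialize; NOT CrossCat's code.

    It only mimics the argument order quoted above (seed is the fourth
    positional argument) and returns one placeholder state per chain.
    """
    chain_rng = random.Random(seed)
    X_L_list = [{"chain_seed": chain_rng.randint(0, 32767)}
                for _ in range(n_chains)]
    X_D_list = [[] for _ in range(n_chains)]
    return X_L_list, X_D_list


get_next_seed = make_seed_source(master_seed=0)
num_chains = 4

# Pass a fresh seed on every engine call, as the corrected line does.
X_L_list, X_D_list = initialize({}, {}, [], get_next_seed(),
                                n_chains=num_chains)
print(len(X_L_list), len(X_D_list))  # prints: 4 4
```

In this stand-in, leaving out the seed raises a TypeError because the argument is required. Drawing every seed from a single master seed keeps a run reproducible while still giving each call a different seed.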

After adding these, the code is still very slow, much slower than the non-multiprocessing version. Also, top shows that it is using only one core rather than the 4 I specified.

riastradh-probcomp commented 8 years ago

The seed bug is fixed in 2de0192bbbc7118a947d1307724336aab6465945. The parallelism part is not so clear. Note that only initialization (drawing Crosscat states from the prior) and analysis (transitioning a Markov chain on Crosscat states that converges to the posterior) are parallelized: once the analysis is complete, the imputation in this example is not parallelized. If you are sure it is using only a single core during initialization or analysis, you can file another issue about that.