Make subrngs reusable in parallel mcmcsample

mohamed82008 commented 2 years ago

In this PR, I attempt to expose the sub random number generators (subrngs) and make them reusable between calls to mcmcsample. This helps preserve the sampling state across calls. Unfortunately, I seem to have broken the reproducibility tests, and I am not sure how my change caused that. I could use an extra pair of eyes.

mohamed82008 commented 2 years ago

> My first question, as always, is: Where is this functionality (supposed to be) used and is it common enough to justify a more complicated implementation and API?

The use case I have in mind is pausing and resuming chain-based sampling reproducibly, so that the result is comparable to running the full chain, whether in serial, multi-threaded, or multi-process mode.
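To make the pause-and-resume idea concrete, here is a minimal sketch using only Julia's `Random` standard library. It is not AbstractMCMC code: `toy_step` and `run_chain` are hypothetical stand-ins for a sampler step and a sampling call, and the sketch assumes that the sampler's only hidden state is the RNG. Under that assumption, keeping the RNG state at the pause point makes the resumed run match the uninterrupted one.

```julia
using Random

# Hypothetical stand-in for one sampler step: a single draw from the RNG.
toy_step(rng) = randn(rng)

# Run `n` steps; `rng` is mutated, so it carries the state forward.
run_chain(rng, n) = [toy_step(rng) for _ in 1:n]

# Uninterrupted run: 10 steps in one go.
full = run_chain(MersenneTwister(42), 10)

# Paused run: 4 steps, snapshot the RNG, then 6 more steps from the snapshot.
rng = MersenneTwister(42)
first_part = run_chain(rng, 4)
saved = copy(rng)                  # RNG state at the pause point
second_part = run_chain(saved, 6)

resumed = vcat(first_part, second_part)
println(full == resumed)
```

Restarting the second part from a freshly seeded RNG instead of `saved` would generally give a different continuation, which is why the state of each RNG has to survive between calls.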

devmotion commented 2 years ago

Hmm, I'm sorry, I don't understand how you want to make it comparable to pausing and resuming a single chain. Even if you set the RNGs in the multichain methods, the samples won't be the same as in the single chain. It also seems you forgot to add this feature to serial sampling.

Or is your motivation that serial, multithreaded and distributed sampling yield different chains? I'm not sure this is actually a problem, as you can get different chains even for the same method, same RNG, same seed and same Julia version on different architectures (and even more so with different Julia versions). The discrepancy between multithreaded and distributed sampling arises only from the fact that we can use fewer RNGs in multithreading and hence reduce the copy overhead. In serial sampling we don't have to sample seeds, and hence we deviate from the other methods.
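The difference described above can be illustrated with a small sketch. This is an assumption-laden toy, not the package's implementation: `serial_draws` and `multichain_draws` are hypothetical helpers, and the multichain variant simply draws one seed per chain from the parent RNG before seeding a fresh sub-RNG with it. Serial sampling draws from the parent RNG directly, so the two streams differ even with the same starting seed, while each scheme stays reproducible on its own.

```julia
using Random

# Serial, single chain: draws come straight from the parent RNG.
serial_draws(seed, nsamples) = rand(MersenneTwister(seed), nsamples)

# Multichain sketch: sample one seed per chain from the parent RNG,
# then give each chain its own freshly seeded sub-RNG.
function multichain_draws(seed, nchains, nsamples)
    parent = MersenneTwister(seed)
    seeds = rand(parent, UInt, nchains)
    return [rand(MersenneTwister(s), nsamples) for s in seeds]
end

serial = serial_draws(1, 5)
chains = multichain_draws(1, 3, 5)

# Same parent seed, but the first chain does not reproduce the serial chain.
println(chains[1] == serial)
# The multichain scheme itself is reproducible for a fixed parent seed.
println(chains == multichain_draws(1, 3, 5))
```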

mohamed82008 commented 2 years ago

Yeah, I figured out a way to do what I want without this PR, so I will close this one. Thanks for your feedback.

TuringLang / AbstractMCMC.jl

Make subrngs reusable in parallel mcmcsample #87