Closed aloctavodia closed 5 years ago
@aloctavodia Can you provide a quick explanation of SMC?
In the case of temperature replica exchange (a la molecular dynamics simulations), there are multiple replicas of the same system being simulated at various temperatures and where neighboring temperatures would occasionally swap as long as some metropolis criteria is met. This method was popularized by Sugita and Okamoto.
Note that one is not only confined to exchanges in temperature space and, in fact, one could also exchange in Hamiltonian space.
While exchanges between neighboring replicas is one aspect of the sampling, the other key is to be able to use the data coming from ALL temperatures (or Hamiltonians) and not only derive your probability distribution strictly from the lowest temperature. In order to accomplish this, one simply needs to use a method called Weighted Histogram Analysis Method (WHAM). And, at the limit of zero-bin widths, WHAM is also often referred to by it's equivalent name of Multistate Bennet's Acceptance Ratio (MBAR).
A well known Python implementation of MBAR by Shirts and Chodera can be found here.
And an open source C++ implementation of (zero-bin width) WHAM can be found here. Full disclosure, I wrote this code but no longer support it as it's been several years since I've last touched it.
Hi @seanlaw, nice to see you here! and thanks for you input, I will check those links and see if they are applicable to SMC. Now that you mention I remember using WHAM years ago in the protein-folding context, but it was a black-box for me :)
In SMC you start at a high temperature and then you move sequentially to lower temperatures. A SMC is somehow similar to a genetic algorithm.
When I said posterior probability I am referring to a tempered posterior.
$p(\theta | y, \beta_m) = p(y | \theta)^{\beta_m} p (\theta)$
where $\beta_m$ is the temperature-like parameter and goes from 0 (just the prior) to 1 (full posterior).
I hope this helps
The visual helped. I think I have a better conceptual idea of how SMC works. In relation to our offline conversation, can you tell me what you mean by "how to recycle high temperature samples"?
When I say "use all samples from all temperatures" using replica exchange, I am referring to how to reconstruct the (posterior?) distribution after all sampling is complete and that would be via WHAM. I'm not sure I am using the same terminology so you'll have to forgive my lack of Bayesian thinking.
I don't know if "recycling" is the better term. But I am also talking about using all samples from all temperatures.
@aloctavodia thanks a lot for starting this! I am glad it works so well for you! Where did you find this nice figure? Did you prepare it yourself? I think we shouldnt use all samples from all temperatures but rather, selective samples from all temperatures?
@junpenglao Actually the current implementation is a mixture of the one you mentioned and this: http://ascelibrary.org/doi/abs/10.1061/%28ASCE%290733-9399%282007%29133%3A7%28816%29
But the original paper where it has been developed is this: http://onlinelibrary.wiley.com/doi/10.1111/j.1467-9868.2006.00553.x/abstract
The other ones more or less introduce it to the engineering and geoscience communities, respectively, the math is basically the same. Just for citation purposes we should give credit to the original developers...
@hvasbath I got the figure from Google Images. Is my intention to prepare one myself mainly for an internal group meeting talk (I could send it to you when ready), I am thinking now that we should probably have a notebook describing the SMC sampler :)
We can use all samples, if they are weighted properly, to estimate any desired quantity. Still the details are elusive to me, I hope things become clear when I read about WHAM and take a second look at https://arxiv.org/abs/1504.05753
I started describing the SMC sampler in a condensed version for a paper I am writing. We could use this as a base and change it otherwise we will get into troubles with copyright later on, if eventually my article gets published ...
Good point @aloctavodia ! Unfortunately, I wont have time to look into it until the mentioned article is in good shape, but maybe October I could have a look as well...
Another paper which might be worth noting and implementing: https://arxiv.org/abs/1103.3965
Is about efficient sampling with as few chains as possible for high-dimensional problems... Maybe the other points work toward the same direction ... For now the number of chains has to be very high compared to number of variables in order to get a valid posterior...
Fixed in #2563
I think this is not off the table and #2563 is rather unrelated or am I missing something?
I agree with @hvasbath this is not off table (at least not yet). But I understand @junpenglao confusion. Let me try to clarify this, the computation of the marginal likelihood in #2563 use important weights from all stages. Those same weights could be used to get (weighted) samples from all stages/temperatures, but at the moment we are not doing that, instead we are just returning all the samples from the last stage. It is not entirely clear to me which strategy is the best, intuitively I see pros/cons for both of them. Anyway, I could start working on this issue in the next few days, if it is ok to you guys.
I see, thanks for the explanation!
I have been using the new Sequential Monte Carlo sampler for some very hard to sample multidimensional distributions with great success, thanks @hvasbath for that!
Currently the SMC sampler only uses the samples from the lowest temperature (latest iteration). I know there are way to recycle/use samples from the highest temperatures. One reference I have found is this one https://arxiv.org/abs/1504.05753. Before trying to implement anything I wonder if someone has other references to share or thoughts about this point.