Open ThomasWarford opened 2 years ago
I think reading this page may have partially answered this for me.
The source for this would be most introductory books on probability theory and statistics: "Probability for Physicists" by Simon Sirca [Springer GTP] or "Applied Stochastic Processes" by Mario Lefebvre [Springer Universitext] would be proper references. You may also want to have a look at the MIT course on probability for EE students.
The reasoning is identical to what one gets taught in QM: $\langle x \rangle = \int \psi(x)^* \hat{x} \psi(x) dx$ or, in essence, $\langle x \rangle = \int \hat{x} P(x) dx$. The latter states that we sum all positions, weighted according to the probability of being within an infinitesimal region around that position. Hence only those positions with a significant value of $P(x)$ (with high probability) contribute a non-vanishing part to the final average. In our case $P(x)$ is discrete and given for each particle by $P(q_i, p_i) = e^{-H(q_i,p_i)/k_B T} / Z$, and we're computing $\langle q \rangle = \sum_i q_i P(q_i)$. Note that in the last part I have deliberately omitted the momentum part. Generally $P(q_i) = \int P(q_i,p) dp$, but since the joint distribution is discrete and the momenta are distinct for all particles, the integration yields only one non-vanishing contribution.
In my mind, since $q_i$ is already generated according to our target distribution, a simple mean should give us the expectation value of $q$. Following this line of thinking, the extra $e^{-H/k_B T}$ factor would only be needed if our samples were uniform.
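To make the distinction concrete, here is a minimal sketch on a toy problem, not the code from this repository. It assumes a 1D harmonic Hamiltonian $H(q) = q^2/2$ with $k_B T = 1$, for which $\langle q^2 \rangle = 1$ exactly; the names `H`, `observable`, `q_boltz` and `q_unif` are made up for the illustration. It compares a plain mean over Boltzmann-distributed samples, a Boltzmann-weighted mean over uniform samples, and what happens if the weight is applied to samples that are already Boltzmann-distributed.

```python
import numpy as np

rng = np.random.default_rng(0)
kT = 1.0          # assumed units: k_B T = 1
n = 200_000       # number of samples (arbitrary choice)

def H(q):
    # Toy Hamiltonian (position part only), assumed for illustration
    return 0.5 * q**2

def observable(q):
    # <q> is zero by symmetry here, so use q^2; exact value is kT = 1
    return q**2

# Case 1: samples already follow exp(-H/kT), so a plain mean suffices
q_boltz = rng.normal(0.0, np.sqrt(kT), size=n)
mean_direct = observable(q_boltz).mean()

# Case 2: uniform samples need the Boltzmann weight (self-normalised)
q_unif = rng.uniform(-10.0, 10.0, size=n)
w_unif = np.exp(-H(q_unif) / kT)
mean_reweighted = np.sum(observable(q_unif) * w_unif) / np.sum(w_unif)

# Case 3: weighting samples that are already Boltzmann-distributed
# counts the weight twice and targets exp(-2H/kT) instead
w_twice = np.exp(-H(q_boltz) / kT)
mean_double = np.sum(observable(q_boltz) * w_twice) / np.sum(w_twice)

print(f"plain mean, Boltzmann samples:    {mean_direct:.3f}")
print(f"weighted mean, uniform samples:   {mean_reweighted:.3f}")
print(f"weighted mean, Boltzmann samples: {mean_double:.3f}")
```

In this toy case the first two estimates should agree with the exact value of 1 up to sampling noise, while the third should land near 0.5, the variance of a density proportional to $e^{-q^2}$. Whether the same double counting occurs in the code under discussion depends on how its samples are generated.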
Unrelated: I wonder whether convergence is slowed by correlation between successive points when each HMC update uses only one leapfrog step. Perhaps performing multiple leapfrog steps per update would reduce this.
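The effect of the number of leapfrog steps on correlation can be checked on a toy problem. The following is a rough sketch, assuming a 1D target $U(q) = q^2/2$ with unit mass and $k_B T = 1$, and a step size of 0.1; `hmc_chain` and `lag1_autocorr` are hypothetical helpers written for this illustration, not functions from this repository.

```python
import numpy as np

def hmc_chain(n_samples, n_leapfrog, eps=0.1, seed=0):
    """Minimal HMC for U(q) = q^2 / 2 (toy target, assumed for illustration)."""
    rng = np.random.default_rng(seed)
    U = lambda q: 0.5 * q**2
    grad_U = lambda q: q
    q = 0.0
    chain = np.empty(n_samples)
    for i in range(n_samples):
        p = rng.normal()                  # fresh momentum for every update
        q_new, p_new = q, p
        # Leapfrog: half kick, alternating drifts and full kicks, half kick
        p_new -= 0.5 * eps * grad_U(q_new)
        for step in range(n_leapfrog):
            q_new += eps * p_new
            if step < n_leapfrog - 1:
                p_new -= eps * grad_U(q_new)
        p_new -= 0.5 * eps * grad_U(q_new)
        # Metropolis accept/reject on the change in total energy
        dH = (U(q_new) + 0.5 * p_new**2) - (U(q) + 0.5 * p**2)
        if rng.random() < np.exp(min(0.0, -dH)):
            q = q_new
        chain[i] = q
    return chain

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a 1D chain."""
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

rho_1 = lag1_autocorr(hmc_chain(20_000, n_leapfrog=1))
rho_15 = lag1_autocorr(hmc_chain(20_000, n_leapfrog=15))
print(f"lag-1 autocorrelation,  1 leapfrog step:  {rho_1:.3f}")
print(f"lag-1 autocorrelation, 15 leapfrog steps: {rho_15:.3f}")
```

For this target a single small step barely moves the position, so successive samples are almost perfectly correlated, whereas a trajectory of total length around 1.5 nearly decorrelates them. The best trajectory length is problem dependent, so the numbers here should not be read as a recommendation for other systems.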
It seems like sometimes we converge towards a value slightly below 0.75. I wonder if this could be because the mode is different to the mean. To clarify, I don't know if this is the case; the mode may be the same as the mean for all I know.
Are there any resources I can read to answer this question? Are there any caveats/conditions when this isn't the case? Does this depend on symmetry/approximations?
Thanks