Anton-Le / PhysicsBasedBayesianInference

Implementation of ensemble-based HMC for multiple architectures
MIT License
0 stars 0 forks source link

On optimal time step #69

Open brunoroca260894 opened 1 year ago

brunoroca260894 commented 1 year ago

Hello,

My idea is to keep the entire trajectories of those accepted moves in the HMC algorithm. That is, I store the entire numerical solution of position and momentum for each particle, and based on the acceptance step for each particle, I keep the corresponding trajectories. Let us assume that we are working with a 2D space, which means that each momentum and position point has two components; that is, each particle's state is given by 4 quantities. Afterwards, I construct a matrix for position trajectories only whose dimension is (numSteps, numDim*numTrajectoriesAccepted). My idea is to find how correlated the accepted trajectories are by means of computing the covariance matrix of the previous matrix. By doing this, I believe I could get some insights into the typical set that is contained in the accepted points' trajectories. My reasoning is that if a point was accepted, then it means it possibly passed for a long time through the typical set.

From such a covariance matrix, I compute its eigenpairs, but I am getting quite confusing results since there are only two nonzero eigenvalues, and the remaining are extremely small, practically zero. This could make sense since I am working with a 2D space and the random variables are independent.

This is the plot showing the sampled points and the theoretical ones. image

This is the plot showing the entire trajectories of the accepted moves for one particle and position only: image

So I am not sure if what I am doing sound like logical.

Regards,

bruno

note: regarding the pull request, we can discuss tomorrow this since I do not want to create extra branches.

Anton-Le commented 1 year ago

To document the results of our discussion in the morning:

Utilising the particle's positions obtained during the creation of the Markov chain (propagation) and then performing a correlation/principal component analysis (singular-value decomposition) on the matrices made up of the coordinates of all of the considered particles should yield a local approximation of the potential with eigenvalues being related to the length scales in each dimension. Using the min/max length scales and the particle's speed would allow us to determine something like an "average best time step size".

Caveats:

  1. This approach requires that we store the positions at each accepted step, hence for larger models we'll likely be limited by the main memory in terms of how many positions we can store.
  2. This approach still requires us to propagate the particles for a few time steps and thus requires a good guess as to the initial step size.

Given 2 I could see this working after one HMC round since at that point we will have an ensemble with new phase-space coordinates of which we could choose a few at random and propagate them backwards in time, storing the intermediate positions for processing.

A separete but related idea would be to store the energy differences (resp. transition acceptance probabilities) for the accepted trajectories and use these to set up a transition probability matrix $P$ for each particle. Using the quotient of the two largest eigenvalues of this transition probability matrix should lead to characteristic time-scale $\tau$ for each particle. Assuming that we use $N$ particles for the sampling and a subset of $M \ll N$ particles to determine the optimal step size, which we propagate for $N{steps}$ steps this latter approach will require the diagonalization of $M$ matrices of size $N{steps} \times N_{steps}$.

Anton-Le commented 1 year ago

Following our discussion what you could try is to apply your idea to a sample of the output phase space configurations after running HMC once.

ThomasWarford commented 1 year ago

I understand there's been some problems with jax, if you upload your current code to your repo I could change it so it works with the current implementation perhaps.