Anton-Le / PhysicsBasedBayesianInference

Implementation of ensemble-based HMC for multiple architectures
MIT License

Ensemble in HMC #98

Open Anton-Le opened 2 years ago

Anton-Le commented 2 years ago

After implementing a case to run HMC with a probabilistic model, I think HMC requires a partial rewrite.

HMC here is a method to evolve an ensemble of particles.

  1. As such it does not have to initialize said ensemble.
  2. An ensemble with updated phase-space coordinates and weights should be returned.

I think we can retain an initialization in HMC iff it is made optional, in the same way as the potential was for the integrator tests: if an initialization function is provided, use it to initialize the ensemble; otherwise assume that the ensemble that has been passed in is already initialized.

As for returning the ensemble: the current implementation returns the positions and momenta of the samples generated along the way. This is great if the goal is to visualise the path of the particles, less so if the goal is to embed the procedure in a larger method that modifies the ensemble of particles prior to re-running HMC. A simple way out would be to update the weights and return the ensemble as a third return value from HMC.
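The two points above could look roughly like the following sketch. All names here (`Ensemble`, `hmc`, `hamiltonian`, the `initializer` keyword) are illustrative assumptions, not the repository's actual API; the integration step is elided.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Ensemble:
    q: np.ndarray                  # positions, shape (n_particles, dim)
    p: np.ndarray                  # momenta, same shape
    weights: np.ndarray = None     # Gibbs weights, set by HMC
    temperature: float = 1.0       # in units of 1/k_B

def hamiltonian(q, p, potential):
    # per-particle energy: potential plus kinetic term
    return potential(q) + 0.5 * np.sum(p**2, axis=-1)

def hmc(ensemble, potential, initializer=None):
    # If an initializer is provided, use it; otherwise the ensemble
    # passed in is assumed to be initialized already.
    if initializer is not None:
        ensemble = initializer(ensemble)
    q, p = ensemble.q, ensemble.p
    # ... leapfrog integration and MH selection would update q, p here ...
    ensemble.q, ensemble.p = q, p
    ensemble.weights = np.exp(-hamiltonian(q, p, potential) / ensemble.temperature)
    # Positions/momenta for visualisation, plus the updated ensemble
    # as a third return value for embedding in a larger method.
    return q, p, ensemble
```

A caller embedding this in a larger loop would then just feed the returned ensemble back in, e.g. `q, p, ens = hmc(ens, pot)`.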

ThomasWarford commented 2 years ago

Ok, that sounds good - a few clarifying questions/statements:

  1. What are we planning to do with HMC going forward?

    • Why do we not care about intermediate HMC steps?
    • Is it because we are ultimately trying to find global minima, not generate samples?
      • If we care about both it may make sense to have two methods, so memory is not wasted on storing history.
  2. Should temperature be an attribute of Ensemble? I think this would be an elegant solution if we plan to decrease the temperature between HMC runs while the ensemble instances are being modified. It would also make ensemble.setMomentum easier to work with.

  3. I think it is best to have the ensemble be an argument to an HMC.xxxx method, rather than an attribute of the HMC class. I say this because it removes confusion over whether HMC.ensemble is updated or not.

  4. Regarding the ensemble.setMomentum method: in a parallel implementation we need to be able to set the momentum of individual particles, so it is probably best to have HMC.setMomentum instead.
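For point 2, a temperature stored on the ensemble would indeed simplify momentum resampling. A minimal sketch, assuming Maxwell-Boltzmann momenta and hypothetical names (`set_momentum`, `k_B=1` code units):

```python
import numpy as np

class Ensemble:
    """Sketch: storing the temperature on the ensemble itself means
    set_momentum needs no extra thermodynamic arguments."""
    def __init__(self, q, mass=1.0, temperature=1.0, k_B=1.0):
        self.q = q                     # positions, shape (n_particles, dim)
        self.mass = mass
        self.temperature = temperature
        self.k_B = k_B
        self.p = np.zeros_like(q)

    def set_momentum(self, rng):
        # Each momentum component ~ N(0, m * k_B * T)
        sigma = np.sqrt(self.mass * self.k_B * self.temperature)
        self.p = rng.normal(0.0, sigma, size=self.q.shape)
```

Decreasing `ensemble.temperature` between HMC runs then automatically narrows the resampled momentum distribution.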

Thanks, and apologies for the wall of text, Tom

brunoroca260894 commented 2 years ago

Hi,

I think we should not initialize the ensemble in the HMC class. It should be an argument containing all relevant information about the particles.

Tom, I do not understand what you meant by intermediate HMC steps. It is my understanding that so far we have not kept the full trajectories generated at each iteration, but have just updated the parameters when a given point is accepted.

I agree with Tom on points 2, 3, and 4. I believe we should have a separate object for the ensemble that is passed to the HMC implementation, so that we modify this argument instead of copying it into a different object when it is passed to HMC. By doing this, we have a single object to modify during the execution of HMC.

Besides, I recall that in our first implementations the ensemble class had a method to retrieve information about all particles. We could use a similar approach if we, for example, plan to access a specific set of particles in the ensemble. By doing this, we could modify the temperature from one HMC step to the next. I think the HMC class should just update an ensemble object, initialized separately, using methods defined within that class.

How shall we move forward?

Regards,

bruno

Anton-Le commented 2 years ago

General response

Regarding the plans with HMC:

HMC is going to be used as a kernel in the final method, that is: an integral part of the code. HMC itself may be augmented by a different propagation scheme. I have already mentioned the u-turn condition, which this scheme would implement.

Regarding the additional samples, the situation is the following: primarily I am interested in generating one set of samples using HMC, i.e., given an ensemble of particles with arbitrary positions and momenta, propagate it for N time steps and select either the new or the old phase-space coordinates based on the MH test. The positions of the new ensemble (which are a mix of new and old positions) are the new samples of the distribution.

When embedded in a larger method, this process is iterated until (hopefully) the particles have congregated at the global minima of the potential. Each iteration (that is, each run of HMC) will thereby yield one set of samples: the positions after HMC is run. The samples themselves generally do not have to be retained, since they are only used to obtain mean values and to check convergence criteria (i.e., the spread of the sample distribution).
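The per-particle MH selection described above, which yields an ensemble mixing new and old coordinates, can be sketched as follows (an illustrative helper, not the repository's implementation; `H` is assumed to map phase-space coordinates to per-particle energies):

```python
import numpy as np

def metropolis_select(q_old, p_old, q_new, p_new, H, rng):
    """Per-particle Metropolis-Hastings test: accept each proposal with
    probability min(1, exp(-(H_new - H_old))). The returned arrays mix
    accepted (new) and rejected (old) phase-space coordinates."""
    dH = H(q_new, p_new) - H(q_old, p_old)
    # comparing in log space avoids overflow of exp(-dH) for very negative dH
    accept = np.log(rng.uniform(size=dH.shape)) < -dH
    q = np.where(accept[:, None], q_new, q_old)
    p = np.where(accept[:, None], p_new, p_old)
    return q, p, accept
```

The returned positions `q` are then exactly the "one set of samples" per HMC run mentioned above.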

Currently (on dev) providing numSamples > 1 runs the integration the given number of times, generating a new set of values each time. This is more than is required for simple HMC, but not enough for the larger method mentioned above.

I will not insist on the removal of history storage, as it may be helpful in generating illustrations for your presentations/reports, and because the memory footprint can be kept low by setting numSamples = 1 to achieve almost the desired result. Still, removing history storage should clean up the code and likely prevent unnecessary memory movement in the GPU version.

On temperature

W.r.t. the temperature I agree with your reasoning. Retaining the temperature of an ensemble would make it an ensemble-specific temperature which could then be used in a thermalisation process among multiple ensembles (sub-ensembles).

Ensemble as an argument to HMC

In principle I agree on the ensemble being an argument to the method that is HMC. Keeping a copy as a component of the class could prevent unintended data movement (device-host) if the entire HMC can be moved to the GPU.

Setting the momentum

Momentum and position are properties of the ensemble and should be handled by the ensemble. It may be a good idea to make an 'initializer function' the only positional argument to the setters. This initializer function would take the array of positions/momenta and the number of particles as positional arguments and have all other parameters baked in. Then we could force sequential/parallel initialization by providing different init functions.
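A minimal sketch of this initializer-function pattern, with hypothetical names (`set_momentum`, `gaussian_momenta`) and parameters baked in via `functools.partial`:

```python
import numpy as np
from functools import partial

class Ensemble:
    """Sketch: the setter takes only an initializer function."""
    def __init__(self, q):
        self.q = q
        self.p = np.zeros_like(q)

    def set_momentum(self, init_fn):
        # init_fn(array, n_particles) -> new momenta; all other
        # parameters are baked into init_fn by the caller.
        self.p = init_fn(self.p, self.q.shape[0])

def gaussian_momenta(arr, n_particles, sigma=1.0, seed=0):
    # Sequential NumPy initialization; a parallel (e.g. JAX) variant
    # with the same (array, n) signature could be swapped in instead.
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, sigma, size=arr.shape)

ens = Ensemble(np.zeros((8, 3)))
ens.set_momentum(partial(gaussian_momenta, sigma=2.0))  # sigma baked in
```

Swapping sequential for parallel initialization then only means passing a different function; the setter itself never changes.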

Remarks not directly related to the implementation

On temperature

The suggestion of an ensemble-local temperature is physically tenuous, especially for ensembles containing only a few particles. Partitioning into sub-ensembles and utilising a "thermalisation" among them is (as far as I can remember) the gist of a theoretical statistical physics paper from ~4 years ago. I'll edit this post to add the link once I find it again.

FYI

There's no problem with having more extensive texts, since the PRs and issues are in part the documentation of this project's development. What you should take heed of are the formatting options. As you can see, you can create a header-subheader etc. structure, as well as format code properly by enclosing it in backticks (or triple backticks for a block). You can also, since a year or so, use basic LaTeX to format equations. The latter just becomes problematic if there are too many tokens conflicting with Markdown.

ThomasWarford commented 2 years ago

Sounds good, I'm making good progress on this so far. I'm making the changes in bugfix-ensembleInHMC, which branches from feature-jax.

ThomasWarford commented 2 years ago

I'm going to set the weights to $\exp(-\frac{H}{k_B T})$; please let me know if this is incorrect.

ThomasWarford commented 2 years ago

^ The changes are in this pull request (bugfix-ensembleInHMC branch). I am wondering whether I should wait for feature-jax to be pulled into dev before merging this.

Anton-Le commented 2 years ago

Alright. The changes are now in the dev branch, so we may consider this issue as resolved.

After trying the code with the linear acceleration model I have noticed one thing: with a bad choice of the centre of the initial distribution of positions, the Gibbs weights become 0 and hence an evaluation of the ensemble-weighted mean turns up NaNs. We'll see how this changes with #115, and then I may have to change the way the weights are handled.
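One possible mitigation for the vanishing-weight NaNs (not the repository's current approach, just a sketch): work with log-weights and shift by their maximum before exponentiating, so that at least one weight stays at 1 regardless of how large the energies are.

```python
import numpy as np

def weighted_mean_from_logw(x, log_w):
    """Weighted mean via shifted log-weights (log-sum-exp idea).
    Raw exp(-H / (k_B T)) underflows to 0 for large H, giving 0/0 = NaN;
    shifting by max(log_w) keeps the largest weight at exactly 1."""
    w = np.exp(log_w - np.max(log_w))
    return np.sum(w[:, None] * x, axis=0) / np.sum(w)
```

Since only weight ratios enter the weighted mean, the shift changes nothing mathematically; it only avoids the floating-point underflow.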

Anton-Le commented 2 years ago

Correction:

The mapping of the parameters via constrain_fn takes care of the trivial guess of initial positions that we perform (at least for symmetric distributions). The NaN problem is still due to vanishing weights, but that is due to an exceptionally high energy of the system compared to the chosen temperature $1/k_B$. Upping the temperature to $10^{5}/k_B$ resolves the problem of NaNs for the weighted mean, though it tends to produce nonsensical parameters after HMC.

ThomasWarford commented 2 years ago

Ok, we'll see how this unfolds.