choderalab / chiron

Differentiable Markov Chain Monte Carlo
https://github.com/choderalab/chiron/wiki
MIT License
14 stars · 1 fork

MC move criteria updates. Should we make this a separate class that gets passed to the move? #26

Open chrisiacovella opened 6 months ago

chrisiacovella commented 6 months ago

Oftentimes we will want to update the proposed-move criteria during a run (e.g., increase/decrease the maximum displacement of a particle) to achieve a target acceptance probability. In PR #21 this is implemented as a function within the specific MC move, since these criteria depend on the type of move (e.g., max volume scaling, max displacement, etc.).

In our meeting on 1-Feb-24, @andrrizzi asked whether this should be a function/class we pass to the move (similar to how we handle PBC with the Space class). In the current scheme, if we want different criteria, we would need to create a child class of the move and override this function; this could lead to a proliferation of additional classes if we end up implementing different criteria. If the criteria lived in a separate class, no changes to the MC move would be needed: we would just pass in a different criteria object.

Right now, the criteria updates are pretty straightforward and follow the typical schemes. I can't really think of many cases where we would want to deviate from the standard approaches (which doesn't mean there aren't any; I'm just not coming up with other use cases at the moment). I will also note that, right now, the parameters inside the updates are static.

It would be good to be able to adjust some of the internal parameters of the criteria update. E.g., in updating the displacement sigma, I have it adjust when the acceptance probability is below 0.4 or above 0.6; being able to modify these tolerances would probably be beneficial.

Passing this as a class, with those parameters set in its constructor, would be a bit cleaner than adding a lot of extra variables to the constructor of the MC move. I think this would fit better with the modular approach we have been taking in the rest of the code (it would be a lot like the Reporter class that is passed to the moves). The drawback is that it increases the number of classes to import and instantiate to get a system running, but I think the flexibility and clarity gained from using a class would be worthwhile.

This would be a pretty straightforward change; a mockup of what it would look like is below.


class BarostatCriteriaUpdate:
    def __init__(self, lower_bound=0.25, upper_bound=0.75, scaling_factor=1.05, maximum_volume_scaling=0.3):
        self.lower_bound = lower_bound
        self.upper_bound = upper_bound
        self.scaling_factor = scaling_factor
        self.maximum_volume_scaling = maximum_volume_scaling

    def update(self, volume_max_scale, acceptance_probability):
        # shrink the proposal when acceptance is too low, grow it (up to a cap) when too high
        if acceptance_probability < self.lower_bound:
            return volume_max_scale / self.scaling_factor
        elif acceptance_probability > self.upper_bound:
            return min(volume_max_scale * self.scaling_factor, self.maximum_volume_scaling)
        return volume_max_scale

criteria = BarostatCriteriaUpdate(lower_bound=0.25, upper_bound=0.75, scaling_factor=1.05, maximum_volume_scaling=0.3)

mc_barostat_move = MonteCarloBarostatMove(
    volume_max_scale=0.1,
    nr_of_moves=10,
    reporter=reporter_barostat,
    report_frequency=1,
    criteria_updater=criteria,
    criteria_update_frequency=50,
)

Internally, the barostat would have an _update_criteria() method that does something like:

class MonteCarloBarostatMove:
    ...

    def _update_criteria(self, criteria):
        self.volume_max_scale = criteria.update(
            self.volume_max_scale, self.n_accepted / self.n_proposed
        )
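To make the interaction concrete, here is a self-contained toy version of this pattern; every name here is an illustrative stand-in, not an existing chiron class:

```python
class MaxDisplacementUpdate:
    """Toy criteria updater: multiplicative adjustment of a proposal
    width toward a target acceptance window (illustrative only)."""

    def __init__(self, lower_bound=0.4, upper_bound=0.6, scaling_factor=1.1):
        self.lower_bound = lower_bound
        self.upper_bound = upper_bound
        self.scaling_factor = scaling_factor

    def update(self, value, acceptance_probability):
        if acceptance_probability < self.lower_bound:
            return value / self.scaling_factor
        if acceptance_probability > self.upper_bound:
            return value * self.scaling_factor
        return value


class ToyDisplacementMove:
    """Illustrative stand-in for an MC move that tracks acceptance
    statistics and periodically delegates the parameter update to an
    injected criteria object."""

    def __init__(self, displacement_sigma, criteria_updater,
                 criteria_update_frequency):
        self.displacement_sigma = displacement_sigma
        self.criteria_updater = criteria_updater
        self.criteria_update_frequency = criteria_update_frequency
        self.n_accepted = 0
        self.n_proposed = 0

    def _update_criteria(self):
        # delegate the update; the move only supplies its statistics
        self.displacement_sigma = self.criteria_updater.update(
            self.displacement_sigma, self.n_accepted / self.n_proposed
        )

    def run(self, n_moves, accept_fn):
        # accept_fn stands in for the Metropolis accept/reject test
        for step in range(1, n_moves + 1):
            self.n_proposed += 1
            if accept_fn(step):
                self.n_accepted += 1
            if step % self.criteria_update_frequency == 0:
                self._update_criteria()
```

The move never needs to know which update scheme is in use; swapping schemes only changes which object is injected.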
jchodera commented 6 months ago

We need to be careful about terminology here. What you are suggesting is that we provide a separate class that specifies how to auto-tune the proposal hyperparameters to improve efficiency, with the idea that this can somehow be generalized across move types to specify different general autotuning behaviors.

I think this is a good idea, but

  1. It isn't the most important near-term action.
  2. It requires some research into general algorithms for auto-tuning proposal hyperparameters in an asymptotically consistent manner.
  3. It then requires that we clearly specify an API for how the moves would interact with the autotuner to register accept/reject statistics (and likely logPaccept, which contains more information than the binary accept/reject), specify the domain of their hyperparameters, and query for updated parameters.

Let's park this issue for now until we address (2) and (3).
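For the record, the API in point (3) could be sketched as an interface like the following; this is entirely hypothetical (none of these names exist in chiron), with a deliberately naive concrete tuner attached only to show the shape of the contract:

```python
from abc import ABC, abstractmethod


class ProposalAutotuner(ABC):
    """Hypothetical interface through which a move would register
    statistics and query updated proposal hyperparameters."""

    @abstractmethod
    def register_parameter(self, name, value, lower, upper):
        """Declare a tunable hyperparameter and its allowed domain."""

    @abstractmethod
    def record_proposal(self, accepted, log_p_accept=None):
        """Report one proposal outcome; log_p_accept carries more
        information than the binary accept/reject decision."""

    @abstractmethod
    def updated_parameters(self):
        """Return a dict mapping parameter names to updated values."""


class AcceptanceWindowTuner(ProposalAutotuner):
    """Toy concrete tuner: multiplicative updates toward a target
    acceptance window, clipped to each parameter's declared domain.
    Not asymptotically consistent; for API illustration only."""

    def __init__(self, lower=0.4, upper=0.6, factor=1.1):
        self._params = {}  # name -> (value, domain_lo, domain_hi)
        self._accepted = 0
        self._proposed = 0
        self.lower, self.upper, self.factor = lower, upper, factor

    def register_parameter(self, name, value, lower, upper):
        self._params[name] = (value, lower, upper)

    def record_proposal(self, accepted, log_p_accept=None):
        self._proposed += 1
        self._accepted += int(accepted)

    def updated_parameters(self):
        rate = self._accepted / max(self._proposed, 1)
        updated = {}
        for name, (value, lo, hi) in self._params.items():
            if rate < self.lower:
                value /= self.factor
            elif rate > self.upper:
                value *= self.factor
            value = min(max(value, lo), hi)  # clip to the declared domain
            self._params[name] = (value, lo, hi)
            updated[name] = value
        return updated
```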

chrisiacovella commented 6 months ago

I actually wasn't thinking about generalizing across move types, but now that you mention it, the move parameter updating code is effectively identical in both the barostat and the displacement move. A more advanced hyperparameter tuning approach would certainly take more thought.

I'm totally fine shelving this for now, but at the same time, making this a bit more general (if only in the context of simple parameter tuning) would be trivial while the PR is still open.
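As a sketch of that "trivial" generalization (class and parameter names are hypothetical), a single acceptance-window class could be configured for both the barostat and the displacement move:

```python
class AcceptanceWindowUpdate:
    """Generic multiplicative update for any scalar proposal parameter;
    the same class could serve volume_max_scale and displacement_sigma."""

    def __init__(self, lower_bound, upper_bound, scaling_factor=1.1,
                 max_value=float("inf")):
        self.lower_bound = lower_bound
        self.upper_bound = upper_bound
        self.scaling_factor = scaling_factor
        self.max_value = max_value

    def update(self, value, acceptance_probability):
        if acceptance_probability < self.lower_bound:
            return value / self.scaling_factor
        if acceptance_probability > self.upper_bound:
            return min(value * self.scaling_factor, self.max_value)
        return value


# the same class, configured per move
barostat_update = AcceptanceWindowUpdate(0.25, 0.75, max_value=0.3)
displacement_update = AcceptanceWindowUpdate(0.4, 0.6)
```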

andrrizzi commented 6 months ago

Definitely not the most urgent, but what I meant during our last meeting is that it might be worthwhile to think about possible ways to treat online parameter modifications consistently in MCMoves and states; by this I mean using a consistent design pattern. For example, if thermodynamic parameters are updated during a simulation by injecting an "updating policy" class, then we could consider a similar design for MCMoves. But if, in the design, states are modified by a higher-level object, then we might want to do the same with MCMoves.

A use case for this could be to optimize a measure of sampling efficiency by searching the joint space of both thermodynamic parameters (e.g., alchemical pathway, lambda schedule) and MCMove parameters (e.g., max displacement, etc.) whose optimal values might be dependent on the thermodynamic state parameters. Think Ray's hyperparameter tuning but for sampling.

jchodera commented 6 months ago

There is some practical advice that may be of use on how to implement a general automated step-size tuning scheme.

Automatically tuning the proposal size in Metropolis-Hastings Monte Carlo algorithms can lead to errors if not done appropriately.

There are some mathematics-heavy approaches describing how to build correct, asymptotically consistent versions of adaptive Monte Carlo algorithms using ideas from sequential Monte Carlo (SMC), but I wonder if there are more practical guides we could use. This chapter seems to have more practical guidance.