Open andrrizzi opened 6 years ago
I think we want to allow multiple restraints for the same alchemical region, rather than requiring these to be in separate regions. Could we simply allow each restraint for each region to have a separate label or name?
We could tackle this at the same time as reworking the restraints to use and store CustomCVForce
s.
I agree. I meant that the implementative solutions would be similar, not that we should assign one restraint for each region. In both cases, we'll have separate variables controlling different parts of the Alchemica/RestraintState
. Sorry if it was unclear.
Here are the required code changes I see for the multi-restraints:
restraints.py
classes to support different names for the restraint variable, defaulting to lambda_restraints
AlchemicalPhase.create(restraint=...)
keyword to support None
, a ReceptorLigandRestraint
or a list of ReceptorLigandRestraint
(we can expand this to a LigandRestraint
-ish thing later)I want to propose the following YAML layout for the multi-restraints
class:
restraint
which supports arbitrary names (like systems
have names). Keys are just like the current experiment:restraint:....
block is now. Add new option lambda_name: X
which takes a string to be cast to lambda_X
, allows each restraint to be tied to either unique or existing values (including sterics and electrostatics)experiment
block to now have a restraints: R
option where R
is either a single or list of restraint(s) defined in the restraints
block.
protocols
block, but people can now just define lambda_X
as many times as required where X
are the names in the lambda_name
keys for the used restraints
lambda_X
in the protocols and throw a warning if there is a defined lambda_X
which does not tie to any force which references it. But OpenMM may actually error if there is no valid target for setParameter
.lambda_X
in the protocol for every lambda_name
since there are use cases where we want the restraint to be on the whole time.Known limitations with this design:
restraints.py
module per restraint. Thoughts on this, @jchodera, @andrrizzi?
This sounds great.
Regarding breaking backward-compatibility: I'm fine with bumping the major version number to 1 at this point since this is the home stretch. We can then start incrementing the major version number every time we break the API, though it might be good to get a few API breaking changes in all at once.
This might also be a good time to rework the restraints to all use CustomCVForce, which would allows us to generalize the restraint classes to all (1) define or reuse CVs (some of which may be shared), (2) store them all each iteration, and (3) compute the restraint via an algebraic energy expression for CustomCVForce.
To fix "auto" protocols, we could include a protocol block that defines waypoints along which auto is supposed to interpolate. That way, users can define a very crude path and have auto come up with the spacing for them.
Have we thought about how multiple alchemical regions might play nicely with this? Maybe there's a chance to support multiple regions in a Topography object for now (if that feature doesn't already exist).
Sounds great, and I'm ok having a list of restraints instead of using the CompoundRestraint
solution. A couple of suggestions:
Add new option
lambda_name: X
which takes a string to be cast tolambda_X
, allows each restraint to be tied to either unique or existing values (including sterics and electrostatics)
In the implementation of the multiple AlchemicalRegion
s, we assigned a name
to the region object and the controlling variables assigned to the factory became, for example, lambda_sterics_name
, lambda_electrostatics_name
. We then have multiple AlchemicalState
assigned to control one region each. The protocol
section is already flexible enough to allow any possible protocol with multiple lambda names, but if we want shortcuts, the restraint could be tied to other lambda
values (e.g., sterics and electrostatics) through a mechanism similar to that implemented by AlchemicalFunction
s.
Is there a way we could expose these features within this framework instead of the lambda_name
keyword? I think we would benefit from making this consistent (if not YAML, at least at the API level) with the alchemical regions implementation.
I have no idea how to fix this unless we also define the order on which the restraints are suppose to apply.
I'm probably missing something, but if we pass a list of restraints, doesn't the list automatically define an order of restraints? Could we use that?
The main problem I see now with the current algorithm is that it's only possible to modify one lambda_X_name
at a time so we won't be able to turn on/off multiple restraints at the same time. This could be also worked around by using something similar to AlchemicalFunction
to enslave multiple lambda_restraint_names
variables to a single lambda_X
and trailblaze over the collective lambda_X
variable.
Is there a way we could expose these features within this framework instead of the lambda_name keyword?
I'm not sure I am following. The multiple regions was so we could select different sets of atoms, we still have no way to control the alchemical process though the alchemical regions block? Could you provide a psuedo-example maybe?
'm probably missing something, but if we pass a list of restraints, doesn't the list automatically define an order of restraints? Could we use that?
Kind of. The larger problem is that not all transformations are monotonic, or independent. We may want to have a restraint off, turn it on, then back off again after the fact. (e.g. the RMSD + harmonic restraint. RMSD starts and ends off so we can compute the standard state correction through the harmonic alone).
The main problem I see now with the current algorithm is that it's only possible to modify one
lambda_X_name
at a time so we won't be able to turn on/off multiple restraints at the same time.
You could change multiple at once. Since each lambda_X
would be its own entry in the protocol
, then you could have multiple lines.
This could be also worked around by using something similar to AlchemicalFunction to enslave multiple...
I have used similar things to this in the past and thought applying it here. However, I would avoid this if at all possible. This adds a layer of obfuscation which is only clear when the lambda path is not step-wise, otherwise it can be very confusing to the end user to have to map "sub lambda I care about Y = fn(Master Lambda)", especially if we have multiple lambda. Since we are exposing all the lambda values, there is no need to obfuscate their values or slave them to another variable.
I'm not sure I am following. The multiple regions was so we could select different sets of atoms, we still have no way to control the alchemical process though the alchemical regions block? Could you provide a psuedo-example maybe?
Yes, sorry, I meant more about the API structure more than semantics. It's not in YANK, but @adw62 and I have been working on it in openmmtools
(in his fork) so there's already a prototype API for this mechanism. This is more or less what the multiple regions API looks like right now
region1 = AlchemicalRegion(name='name1')
region2 = AlchemicalRegion(name='name2', lambda_sterics=0.8)
factory = AlchemicalFactory()
# The factory sets the forces global variables to lambda_sterics_name1 and lambda_sterics_name2.
system = factory.create_alchemical_system(reference_system, alchemical_regions=[region1, region2]
alchemical_state1 = AlchemicalState(region_name='name1', lambda_sterics=0.9)
alchemical_state2 = AlchemicalState(region_name='name2', lambda_sterics=0.8)
compound_state = CompoundState(..., composable_states=[alchemical_state1, alchemical_state2])
compound_state.lambda_sterics_name1 = 0.5
what I imagined for the restraints is something like
restraint1 = Harmonic(name='name1')
restraint2 = Boresch(name='name2')
# The restraint add forces controlled by lambda_restraints_name1 and lambda_restraints_name2
restraint1.restrain_state(thermodynamic_state)
restraint2.restrain_state(thermodynamic_state)
restraint_state1 = RestraintState(name='name1', lambda_restraints=0.9)
restraint_state2 = RestraintState(name='name2', lambda_restraints=0.8)
compound_state = CompoundState(..., composable_states=[restraint_state1, restraint_state2])
compound_state.lambda_restraints_name1 = 0.5
and to enslave the lambda_restraints
variables to a single one to use something like what already exist in alchemy
compound_state.lambda_restraints_name1 = LambdaFunction('lambda_X')
compound_state.lambda_restraints_name2 = LambdaFunction('1 - lambda_X')
compound_state.labmda_X = 0.6 # This changes both variables.
I already had in mind to have a single parent class for both AlchemicalState
and RestraintState
so this may be a good opportunity to tackle that.
I think keeping the two APIs as similar as possible will minimize the time the user will have to spend getting familiar with them.
Thanks for the example, that helped. I like this plan, but I still have questions which may be better resolved in person, and after I think about this some more from a design perspective. Let me think on it and get back to you.
Sounds good! I'll be back in the lab in a couple of days anyway.
I did some brainstorming on the states.py
framework, and I think we can accommodate the system we talked about with minimal modifications in CompoundThermodynamicState
provided we expose the dynamic lambda_restraints_arbitraryname
in __getattr__
.
I'll probably end up implementing an initial version of the common parent class of AlchemicalState
and RestraintState
too for testing, so we might as well take advantage of that.
I'll also have to modify the radially symmetric restraint force classes in openmmtools/forces.py
to be able to use global variable names other than lambda_restraints
.
Since we don't have an analytical correction for the RMSD restraint, we may need to introduce the ability to run with multiple restraints (e.g. RMSD + harmonic) so that we can turn off the rigidifying restraint at the end of the protocol while still reducing the space accessible by the ligand in the decoupled state and being able to bring the ligand to standard concentration.
The only idea I had so far that would make the design very extensible is to have a
CompoundRestraint
class that can be passed as a single restraint.Controlling multiple restraint at the same time is essentially the same problem of controlling multiple alchemical regions at the same time, so we should consider these two together. For now, the solution that I and @adw62 adopted for alchemical regions is to have each global variable controlled by a different
AlchemicalState
, which is assigned to a particular region so something similar may work also forRestraintState
.