michellab / BioSimSpace

Code and resources for the EPSRC BioSimSpace project.
https://biosimspace.org
GNU General Public License v3.0
77 stars 19 forks source link

Virtual sites #53

Open jmichel80 opened 5 years ago

jmichel80 commented 5 years ago

As far as I am aware Sire doesn't currently have data structures to describe virtual sites. Virtual sites are becoming more and more popular and are found in a growing number of topologies (e.g. GROMACS, OpenMM), not just for handling water models, mainly because new ligand forcefields make extensive use of v-sites to implement off-centre charges in MD simulations (e.g. OPLS, QUBE)

For instance in a current collaboration we are writing parsers to load in OpenMM XML topologies in Sire. While I think we can come up with a clunky solution to attach the parameters to a molecule using a property dictionary, I wonder if we should think about a more general solution to read/write virtual site(s).

chryswoods commented 5 years ago

A virtual site in Sire is an atom that has null properties for the things that are not necessary (eg bonds, LJ etc). We treat them in the same way as the M atom in TIP4/5P. You can use connectivity to connect them to atoms so that they are moved properly by intermolecular moves (if so desired) and, of course, they are correctly translated and rotated by rigid body moves. In actuality, an atom in Sire is really a virtual site on which you have added forcefield properties (as an Atom contains no intrinsic information other than its name and which CutGroup it belongs in in which molecule)

jmichel80 commented 5 years ago

It is also necessary to keep track of the parameters needed to compute the v-site coordinates from the position of other atoms.

See e.g. section 22.5 in http://docs.openmm.org/7.0.0/userguide/theory.html

´´´ OutOfPlaneSite: The virtual site location is computed as a weighted average of the positions of three particles and the cross product of their relative displacements: r= r

1

+ w

12

r

12

+ w

13

r

13

+ w

cross

( r

12

× r

13

)

r

r 1 + w 12 r 12 + w 13 r 13 + w c r o s s ( r 12 × r 13 ) where r12=r2−r1 r 12

r 2 − r 1 and r13=r3−r1 r 13

r 3 − r 1 . This allows the virtual site to be located outside the plane of the three particles. ´´´

Should we add this info as a custom atom property?

Sent from my iPhone

On 24 Feb 2019, at 08:05, Christopher Woods notifications@github.com wrote:

A virtual site in Sire is an atom that has null properties for the things that are not necessary (eg bonds, LJ etc). We treat them in the same way as the M atom in TIP4/5P. You can use connectivity to connect them to atoms so that they are moved properly by intermolecular moves (if so desired) and, of course, they are correctly translated and rotated by rigid body moves. In actuality, an atom in Sire is really a virtual site on which you have added forcefield properties (as an Atom contains no intrinsic information other than its name and which CutGroup it belongs in in which molecule)

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.

chryswoods commented 5 years ago

There are two options:

  1. We have SireMol::GeometryPerturbations that are used to compute and update the coordinates of atoms as constraints. Currently these are constraints applied based on bond, angle and dihedral values (used for single-topology perturbations), but could be sub-classed to apply more complex constraints. These can only act on atoms/points which are all in the same molecule. This derives from SireMol::Perturbation and then SireBase::Property, and is designed to be added a property of a molecule. Typically these should be collected together into a SireMol::Perturbations object as the "perturbations" property of the molecule. There is nothing to stop you reusing this code for virtual sites.

  2. We have SireSystem::DistanceComponent which provides constraints that operate on arbitrary algebraic expressions of up to three distance pairs, where the distances are calculated between "PointRef" objects that can be the coordinates of atoms, any of the "centre" types of molecules or atom selections, or an arbitrary means of calculating a point from a set of atomic coordinates. These can act between atoms/points across multiple molecules (as well as only single molecules if desired). This derives from SireSystem::GeometryConstraint, then SireSystem::Constraint, then SireBase::Property, and was designed to be added as a System property. Typically this would be packaged together into the SireSystem::Constraints container as the system's "constraints" property. There is nothing stopping you adding this as a molecule property that acted on specific atoms.

Both of the above are applied as constraints, e.g. after every move they are recalculated and reset consistently. They don't apply yet in somd as you would have to calculate possibly quite complex lagrangians. I think that the DistanceComponent constraint class is the better approach as this can handle generic PointRefs, which make it very easy to specify really very complex constraints (e.g. you would use a PointRef to refer to a weighted average of the positions of particles, then a DoubleDistanceComponent with an algebraic expression to refer to the cross-product of the distances.

The difficult part will not be how we implement them constraint, but rather how we can interpret our super-generic constraints back to write them out in the language of virtual sites etc. There isn't yet an interoperable way to represent this information in different codes, so I suspect we would just special-case for each type of virtual site we see and add a metadata tag to the property to say that it implements a virtual site of type X.

jmichel80 commented 5 years ago

I agree option 2 sounds like the way to go and that it’s not easy to support interoperability of virtual sites as it is still a bit the Wild West of molecular simulations.

I think that to meet deadlines we will have a go first at a simplistic implementation to get the data to SOMD and work out how to support v-sites for FEP. I think we are limited to support the default v-sites in OpenMM or else we would need to write our own plugin to handle force calculations.

Once we know how to make it work I will work on a rewrite to use SireSystem::DistanceComponent. There are a few other things that are overdue (e.g. use SireMM:amberparameters, switch to the new parser, make the implementation handle perturbations in a arbitrary collection of molecules, use groups, support PME, handle mixed MD/MC.) and I want to plan well an overhaul of the code.

Sent from my iPhone

On 24 Feb 2019, at 13:30, Christopher Woods notifications@github.com wrote:

There are two options:

We have SireMol::GeometryPerturbations that are used to compute and update the coordinates of atoms as constraints. Currently these are constraints applied based on bond, angle and dihedral values (used for single-topology perturbations), but could be sub-classed to apply more complex constraints. These can only act on atoms/points which are all in the same molecule. This derives from SireMol::Perturbation and then SireBase::Property, and is designed to be added a property of a molecule. Typically these should be collected together into a SireMol::Perturbations object as the "perturbations" property of the molecule. There is nothing to stop you reusing this code for virtual sites.

We have SireSystem::DistanceComponent which provides constraints that operate on arbitrary algebraic expressions of up to three distance pairs, where the distances are calculated between "PointRef" objects that can be the coordinates of atoms, any of the "centre" types of molecules or atom selections, or an arbitrary means of calculating a point from a set of atomic coordinates. These can act between atoms/points across multiple molecules (as well as only single molecules if desired). This derives from SireSystem::GeometryConstraint, then SireSystem::Constraint, then SireBase::Property, and was designed to be added as a System property. Typically this would be packaged together into the SireSystem::Constraints container as the system's "constraints" property. There is nothing stopping you adding this as a molecule property that acted on specific atoms.

Both of the above are applied as constraints, e.g. after every move they are recalculated and reset consistently. They don't apply yet in somd as you would have to calculate possibly quite complex lagrangians. I think that the DistanceComponent constraint class is the better approach as this can handle generic PointRefs, which make it very easy to specify really very complex constraints (e.g. you would use a PointRef to refer to a weighted average of the positions of particles, then a DoubleDistanceComponent with an algebraic expression to refer to the cross-product of the distances.

The difficult part will not be how we implement them constraint, but rather how we can interpret our super-generic constraints back to write them out in the language of virtual sites etc. There isn't yet an interoperable way to represent this information in different codes, so I suspect we would just special-case for each type of virtual site we see and add a metadata tag to the property to say that it implements a virtual site of type X.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.