mosdef-hub / gmso

Flexible storage of chemical topology for molecular simulation
https://gmso.mosdef.org
MIT License
52 stars 32 forks source link

Support tabulated potentials? #79

Open mikemhenry opened 5 years ago

mikemhenry commented 5 years ago

It would be killer to support this with the sympy stuff: https://hoomd-blue.readthedocs.io/en/stable/module-md-pair.html#hoomd.md.pair.table

Not sure if it is feasible or not, maybe if the functional form doesn't exist, it could warn then use tabulated? If hoomd supported tabulated bonds and angles, then you could do pretty arbitrary forms

mattwthompson commented 5 years ago

Yes, it's a major motivation of this package and something we necessarily intend to support. Not quite sure how to do it, as most of the sympy stuff should be ignored in a tabulated potential. But in principle we should just store an r array and a U(r) array - in my head this should be straightforward.

ahy3nz commented 5 years ago

I'm not sure how costly it would be to store a tabulated potential in memory, but it seems easy enough to call sympFunction.evalf({var_dict}) for a range of vars. Hoomd doc for nonbonded tabulated pair.

If we follow hoomd, we need r (the independent var), U(r), and then the force/gradient (fortunately for us, sympy can compute gradients) as a function of r

mikemhenry commented 5 years ago

Oh that's nice, I'm embarrassed to admit I've made mistakes before when calculating a gradient of an analytical form

On Mon, Mar 11, 2019 at 2:20 PM Alex Yang notifications@github.com wrote:

I'm not sure how costly it would be to store a tabulated potential in memory, but it seems easy enough to call sympFunction.evalf({var_dict}) for a range of vars. Hoomd doc for nonbonded tabulated pair https://hoomd-blue.readthedocs.io/en/stable/module-md-pair.html#hoomd.md.pair.table .

If we follow hoomd, we need r (the independent var), U(r), and then the force/gradient (fortunately for us, sympy can compute gradients) as a function of r

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/mattwthompson/topology/issues/79#issuecomment-471683592, or mute the thread https://github.com/notifications/unsubscribe-auth/ALOI3v_SOwv3JaQfXY-TzVO62qD0sodGks5vVqxrgaJpZM4bpTAv .

mattwthompson commented 5 years ago

I mean these are typically going to be 1-D arrays of length few hundred to maybe a few thousand. This would be 3 (r, U(r), F(r)) of these arrays for every atom type and, optionally, custom cross-interactions. I guess this could, in the limit of a wild system, approach the order of a million or so lines of 1-D arrays, but this does not seem dauntingly large on paper, no?

ahy3nz commented 5 years ago

I think this is sort of similar to computing NB potentials - we don't enumerate all the NB pairs, but we know how to enumerate all the NB pairs when it comes to writing. Similarly, do we need to enumerate all the tabulated potentials, or only enumerate upon i/o?

mattwthompson commented 5 years ago

I'm pretty sure ParmEd stores the cross-interactions only if it is told to, and can generate the cross-interactions at the point of the writer, if necessary. Or at least this is how I think we should try to handle it

ahy3nz commented 5 years ago

Discussion so far: analogous to Potential and a variety of subclasses BondType(Potential), AngleType(Potential), we are thinking of a class TabulatedPotential with respective subclasses. Potential and TabulatedPotential will be different classes, but share similar inheritance.

Notes: Implementations of TabulatedPotentials will probably require a way compute gradients (we are unsure the best way to approach systematically/conveniently computing the gradient since these methods tend to differ among users). At the very least, we'd have to have arrays of independent_variables and its associated energy

mattwthompson commented 5 years ago

Are there any engines that require only r and U(r) but not F(r)? Because a starting point would be easier to put together if we could get away with that.

joaander commented 4 years ago

HOOMD requires both V(r) and -1/r dV/dr and corresponding derivatives for bond, angle, and dihedral potentials. Given only a discretized V(r) on a grid, there is no way to compute the force without making some big assumptions and approximations about the form of the potential.

mattwthompson commented 4 years ago

Could you elaborate on the last point? Is it too risky to use a finite difference to estimate forces? If this is problematic, we could just require that any tabulated potential must include forces, instead of leaving them optional.

joaander commented 4 years ago

I was referring to some other form of assumption, such as treating the potential as a cubic spline. There is nothing wrong with cubic splines, but any assumptions about the functional form should be made explicit.

As for using something like finite differences, why use a approximation that is not even valid at the end points when the analytical value is known? I haven't tested it explicitly, but I would be concerned about the quality of energy conservation in a simulation using forces derived from finite differences. HOOMD adopts the "explicit is better than implicit" strategy and requires forces along with energies. If users wish to compute their forces from finite differences or some other method, they are free, but not required, to do so. Given the massive interest in tabulated potentials, we could put some work into validating and improving them in HOOMD v3. The existing implementation is very ad-hoc.

In any case, other engines make different assumptions and perform interpolation differently (e.g. interpolating on r vs r**2). While the sympy expression method is general and can be extended to any of these backends with no data loss, an explicit table storage method (needed for potentials that don't have analytical forms) needs to clearly define whether there are restrictions on the type of r values. If there are no restrictions, we will need to define how the given function is interpolated onto the final grid used by the simulation.

mattwthompson commented 4 years ago

Maybe we should separate out these ideas into discrete modules/functions. So the tabulated potential class would need to be passed in some minimum amount of information to be reasonable (at least r, V(r), and some derivative that defines the force, possibly other terms). But also include some set of other functions that can modify these, while making explicit the potential pitfalls of each method.