openforcefield / openff-toolkit

The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools. Documentation available at http://open-forcefield-toolkit.readthedocs.io

http://openforcefield.org

MIT License

316 stars 92 forks source link

Large difference in 1-4 energies #1019

Closed callumjd closed 3 years ago

callumjd commented 3 years ago

I have followed the "Replacing ligand parameters in an already-parametrized system" procedure exactly, I am seeing a large difference between 1-4 elec and 1-4 vdw after applying openFF ligand parameters. Is this expected / known issue?

Running parmed:

Before parm BRD4-1.prmtop loadCoordinates BRD4-1.crds energy quit

gives: Bond = 93.2778504 Angle = 1678.5817639 Dihedral = 1550.2065868 1-4 vdW = 1213.0573270 1-4 Elec = 5617.5794490 vdWaals = 11318.5631489 Elec. = -116434.4232931 TOTAL = -94963.1571672

After parm complex.prmtop loadCoordinates complex.inpcrd energy quit

gives: Bond = 92.6749096 Angle = 1801.8589508 Dihedral = 1551.1422956 1-4 vdW = 9.7745874 1-4 Elec = -41.0324225 vdWaals = 11318.5631490 Elec. = -116434.4168318 TOTAL = -101701.4353620

https://github.com/openforcefield/openff-toolkit/blob/master/examples/swap_amber_parameters/swap_existing_ligand_parameters.ipynb

j-wags commented 3 years ago

Thanks for writing in, @callumjd. I'm able to reproduce this on my computer, and will start looking into the source of the difference.

j-wags commented 3 years ago

Still investigating, but one lead is that the complex.prmtop file has far more SCEE and SCNB_SCALE_FACTOR entries, many of which are 1.00E+10. IIRC, these values are the denominator for scaling 1-4 interactions, so having a humongous number there would explain why the magnitude of the 1-4 interactions is so reduced in the energies for that system.

If this is the case, then the next question is why these 1E+10s are being filled in there.

jchodera commented 3 years ago

Whoa. That is very much not good.

We'll have to see whether this is a ParmEd, OpenMM, or openff-toolkit issue, how far this extends back, and whether it impacts any of the ongoing benchmark results.

jchodera commented 3 years ago

Also, this seems like fairly important area for test coverage to be extended, with hindsight being 20/20. That would also help track down where this happened if we need to use git-bisect down the road.

j-wags commented 3 years ago

The ligand that gets openff parameters has 25 atoms, which, since we treat each as a unique "atom type" for compatibility with atom-type-based formats, should add 25 atom types. So I'd expect the prmtop sections describing interactions between atom types (with a size proportional to n_atom_types^2) to be quadratically larger.

The only places where ParmEd touches the value 1e10 is in this method, which handles cases where it may need to fill in 1-3 and 1-4 scaling values. 1e10 would be a reasonable value for a 1-3 scaling factor. So maybe the ParmEd-outputted system (complex.prmtop) has added 1-3 scaling values?

So, there should be a difference of 25 atom types somewhere, and potentially also the addition of 1-3 scaling factors. In the files I showed in the previous post, there are 196 entries for the SCNB section in complex.prmtop, and 5620 in system.prmtop. I'm playing with some math to confirm that there's a difference of 25 atom types.

(I've also been tracking down a bug in Topology.from_openmm where the atom order/positions may be scrambled in some cases (#1018). But I don't think that's the root cause here, since mis-ordered positions would show super high bond energies)

Need to run for a meeting, I'll try to get back to this today.

j-wags commented 3 years ago

I worked with @mattwthompson on this for a while today, and we'll meet back up tomorrow. Here's a summary of what we've found:

The minimum reproducing example is:

import requests
repo_url = 'https://raw.githubusercontent.com/MobleyLab/benchmarksets/master/input_files/'
sources = {
    'system.prmtop' : repo_url + 'BRD4/prmtop-coords/BRD4-1.prmtop',
    'system.crd'   : repo_url + 'BRD4/prmtop-coords/BRD4-1.crds',
    'ligand.sdf'   : repo_url + 'BRD4/sdf/ligand-1.sdf',
    'ligand.pdb'   : repo_url + 'BRD4/pdb/ligand-1.pdb'
}
for (filename, url) in sources.items():
    r = requests.get(url)
    open(filename, 'w').write(r.text)

#Read AMBER to ParmEd Structure object
import parmed
in_prmtop = 'system.prmtop'
in_crd = 'system.crd'
orig_structure = parmed.amber.AmberParm(in_prmtop, in_crd)

pieces = orig_structure.split()

from openff.toolkit.topology import Molecule, Topology
from simtk.openmm.app import PDBFile

ligand_off_molecule = Molecule('ligand.sdf')
ligand_pdbfile = PDBFile('ligand.pdb')
ligand_off_topology = Topology.from_openmm(ligand_pdbfile.topology, 
                                           unique_molecules=[ligand_off_molecule])

# Load the SMIRNOFF-format Parsley force field
from openff.toolkit.typing.engines.smirnoff import ForceField
force_field = ForceField('openff_unconstrained-1.0.0.offxml')

ligand_system = force_field.create_openmm_system(ligand_off_topology)
new_ligand_structure = parmed.openmm.load_topology(ligand_off_topology.to_openmm(),
                                                   ligand_system,
                                                   xyz=pieces[1][0].positions,
                                                  )

# Create a new, empty system
complex_structure = parmed.Structure()

# Add the protein
complex_structure += pieces[0][0]

# Add the ligand
complex_structure += new_ligand_structure

complex_structure.coordinates = orig_structure.coordinates[:len(complex_structure.atoms)]
complex_structure.box_vectors = orig_structure.box_vectors
complex_structure.save('protein_ligand.prmtop', overwrite=True)
complex_structure.save('protein_ligand.inpcrd', overwrite=True)

echo "parm protein_ligand.prmtop" > parmed.in
echo "loadCoordinates protein_ligand.inpcrd" >> parmed.in
echo "energy" >> parmed.in
echo "quit" >> parmed.in
parmed -i parmed.in

Experiments

If we just import the original file, slice out the protein, and write it to an AMBER prmtop (no solvent or ligand), the energies look reasonable.
If we rename the OpenFF atoms so that they don't collide with existing protein atom names, the problem isn't fixed.
Adding the new ParmEd keyword condense_atom_types=False to the parmed.openmm.load_topology call doesn't fix it.
Running complex_structure.update_dihedral_exclusions() before writing the system doesn't fix it.
Reversing the order of component addition (protein vs. ligand first) doesn't fix it.

If, instead of


# Create a new, empty system
complex_structure = parmed.Structure()

Add the protein

complex_structure += pieces[0][0]

Add the ligand

complex_structure += new_ligand_structure


we run

```python
# Create a new, empty system
# complex_structure = parmed.Structure()

# Add the protein
complex_structure = pieces[0][0]

# Add the ligand
complex_structure += new_ligand_structure

then some characteristics of the original AMBER structure (like 1-4 scaling factors, I suspect) get inherited, and the final energies look a lot better:

Computing a single-point energy for complex.prmtop
Bond     =           92.6749096     Angle    =         1801.8589508
Dihedral =         1551.1422956     1-4 vdW  =       683015.2642038
1-4 Elec =         5126.4418463     vdWaals  =        11318.5631490
Elec.    =      -116419.1903971
TOTAL    =       586486.7549580

This gets most terms reasonably close, except for a big discrepancy in 1-4 vdW. This is what I'll start working on tomorrow.

Observations

GROMACS is complaining about a net charge of 3.0000 on the resulting system. Are the ions not balanced? Is the protonation state of the ligand being misinterpreted?

Other Possibilities

Is it possible that ParmEd can't tell the difference between propers and impropers, and so some 1-3 pairs (which should have scaling of 0) are getting the 1-4 scale value instead?
Are old ligand atom types are somehow sticking around getting mangled with new atom types in ParmEd?
Are 1-3 interactions are somehow being added to the 1-4 section in the new file, then not getting matched up correctly in the final system?

j-wags commented 3 years ago

Ok, so two changes are making these numbers look more believable, though I'd like to do more rigorous validation next week:

The change from my previous post, where we don't begin assembling the combined system with a blank parmed.Structure, but instead start with a copy of pieces[0][0] (the protein from the original input), which is type AmberParm. This method seems to preserve the original 1-4 scalings instead of setting them to 1e10.
Adding complex_structure.update_dihedral_exclusions() right before writing the prmtop and inpcrd files

So, my current reproducing example, which now includes the ions and water, and gets pretty close to the intended energies, is:

import requests
repo_url = 'https://raw.githubusercontent.com/MobleyLab/benchmarksets/master/input_files/'
sources = {
    'system.prmtop' : repo_url + 'BRD4/prmtop-coords/BRD4-1.prmtop',
    'system.crd'   : repo_url + 'BRD4/prmtop-coords/BRD4-1.crds',
    'ligand.sdf'   : repo_url + 'BRD4/sdf/ligand-1.sdf',
    'ligand.pdb'   : repo_url + 'BRD4/pdb/ligand-1.pdb'
}
for (filename, url) in sources.items():
    r = requests.get(url)
    open(filename, 'w').write(r.text)

#Read AMBER to ParmEd Structure object
import parmed
in_prmtop = 'system.prmtop'
in_crd = 'system.crd'
orig_structure = parmed.amber.AmberParm(in_prmtop, in_crd)

pieces = orig_structure.split()

from openff.toolkit.topology import Molecule, Topology
from simtk.openmm.app import PDBFile

ligand_off_molecule = Molecule('ligand.sdf')
ligand_pdbfile = PDBFile('ligand.pdb')
ligand_off_topology = Topology.from_openmm(ligand_pdbfile.topology, 
                                           unique_molecules=[ligand_off_molecule])

# Load the SMIRNOFF-format Parsley force field
from openff.toolkit.typing.engines.smirnoff import ForceField
force_field = ForceField('openff_unconstrained-1.0.0.offxml')

ligand_system = force_field.create_openmm_system(ligand_off_topology)
new_ligand_structure = parmed.openmm.load_topology(ligand_off_topology.to_openmm(),
                                                   ligand_system,
                                                   xyz=pieces[1][0].positions,
                                                   condense_atom_types=False
                                                  )

# Create a new, empty system
#complex_structure = parmed.Structure()

import copy
# Add the protein
complex_structure = copy.deepcopy(pieces[0][0])

# Add the ligand
complex_structure += new_ligand_structure

# Add ions
just_ion1_structure = parmed.Structure()
just_ion1_structure += pieces[2][0]
just_ion1_structure *= len(pieces[2][1])

just_ion2_structure = parmed.Structure()
just_ion2_structure += pieces[3][0]
just_ion2_structure *= len(pieces[3][1])

complex_structure += just_ion1_structure
complex_structure += just_ion2_structure

# Add waters

just_water_structure = parmed.Structure()
just_water_structure += pieces[4][0]
just_water_structure *= len(pieces[4][1])

complex_structure += just_water_structure

complex_structure.coordinates = orig_structure.coordinates
complex_structure.box_vectors = orig_structure.box_vectors
complex_structure.update_dihedral_exclusions()
complex_structure.save('whole_shebang.prmtop', overwrite=True)
complex_structure.save('whole_shebang.inpcrd', overwrite=True)

Checking with ParmEd looks pretty close to the original system energies too:

echo "parm whole_shebang.prmtop" > parmed.in
echo "loadCoordinates whole_shebang.inpcrd" >> parmed.in
echo "energy" >> parmed.in
echo "quit" >> parmed.in
parmed -i parmed.in

Computing a single-point energy for whole_shebang.prmtop
Bond     =           92.6749096     Angle    =         1801.8589508
Dihedral =         1551.1422956     1-4 vdW  =         1222.8319144
1-4 Elec =         5575.5902675     vdWaals  =        11318.5631490
Elec.    =      -116419.1903971
TOTAL    =       -94856.5289103

So, the things to pick up on next week are:

Are the 1-4 Elec and vdW terms in the combined system exactly the sum of the protein's and ligand's 1-4 terms? They're the only molecules in the system with 1-4 pairs so this addition should work out.
Starting with an AmberParm restricts us to AMBER output. Is there any way to write out to GROMACS format like in the original example?
(from my previous post) Why did GROMACS think this system had a net charge of +3?
Why is the vdWaals term in the final system precisely equal to the system before parameter replacement? Is this due to the AMBER lineage of the openff-1.0.0 force field?

jchodera commented 3 years ago

It's looking like this may come down to the implementation choices made in the ParmEd Structure += operator, which copies a bunch of terms, but has to make decisions about preserving 1-4 scale factors. Definitely worth raising an issue there so that there can be a more detailed discussion of what the behavior should be and whether there should be warnings issued in cases where it would do unexpected things.

callumjd commented 3 years ago

Thanks for digging into this. For what it's worth, if starting from an empty Structure, you can convert either piece to type AmberParm:

complex_structure = parmed.Structure()

complex_structure += parmed.amber.AmberParm.from_structure(pieces[0][0])

By doing this I am getting correct agreement of energies of the saved prmtop.

j-wags commented 3 years ago

Hm, @callumjd I'm getting "bad" values when I use the code in your last post. Could you double check?

Here's my reproducing code with your suggested change:

import requests
repo_url = 'https://raw.githubusercontent.com/MobleyLab/benchmarksets/master/input_files/'
sources = {
    'system.prmtop' : repo_url + 'BRD4/prmtop-coords/BRD4-1.prmtop',
    'system.crd'   : repo_url + 'BRD4/prmtop-coords/BRD4-1.crds',
    'ligand.sdf'   : repo_url + 'BRD4/sdf/ligand-1.sdf',
    'ligand.pdb'   : repo_url + 'BRD4/pdb/ligand-1.pdb'
}
for (filename, url) in sources.items():
    r = requests.get(url)
    open(filename, 'w').write(r.text)

#Read AMBER to ParmEd Structure object
import parmed
in_prmtop = 'system.prmtop'
in_crd = 'system.crd'
orig_structure = parmed.amber.AmberParm(in_prmtop, in_crd)

pieces = orig_structure.split()

from openff.toolkit.topology import Molecule, Topology
from simtk.openmm.app import PDBFile

ligand_off_molecule = Molecule('ligand.sdf')
ligand_pdbfile = PDBFile('ligand.pdb')
ligand_off_topology = Topology.from_openmm(ligand_pdbfile.topology, 
                                           unique_molecules=[ligand_off_molecule])

# Load the SMIRNOFF-format Parsley force field
from openff.toolkit.typing.engines.smirnoff import ForceField
force_field = ForceField('openff_unconstrained-1.0.0.offxml')

ligand_system = force_field.create_openmm_system(ligand_off_topology)
new_ligand_structure = parmed.openmm.load_topology(ligand_off_topology.to_openmm(),
                                                   ligand_system,
                                                   xyz=pieces[1][0].positions,
                                                   condense_atom_types=False
                                                  )

# Create a new, empty system
complex_structure = parmed.Structure()

import copy
# Add the protein
complex_structure += parmed.amber.AmberParm.from_structure(pieces[0][0])

# Add the ligand
complex_structure += new_ligand_structure

# Add ions
just_ion1_structure = parmed.Structure()
just_ion1_structure += pieces[2][0]
just_ion1_structure *= len(pieces[2][1])

just_ion2_structure = parmed.Structure()
just_ion2_structure += pieces[3][0]
just_ion2_structure *= len(pieces[3][1])

complex_structure += just_ion1_structure
complex_structure += just_ion2_structure

# Add waters

just_water_structure = parmed.Structure()
just_water_structure += pieces[4][0]
just_water_structure *= len(pieces[4][1])

complex_structure += just_water_structure

complex_structure.coordinates = orig_structure.coordinates
complex_structure.box_vectors = orig_structure.box_vectors
complex_structure.update_dihedral_exclusions() # (This doesn't make a difference)
complex_structure.save('whole_shebang.prmtop', overwrite=True)
complex_structure.save('whole_shebang.inpcrd', overwrite=True)

echo "parm whole_shebang.prmtop" > parmed.in
echo "loadCoordinates whole_shebang.inpcrd" >> parmed.in
echo "energy" >> parmed.in
echo "quit" >> parmed.in
parmed -i parmed.in

outputs:

Bond     =           92.6749096     Angle    =         1801.8589508
Dihedral =         1551.1422956     1-4 vdW  =            9.7745874
1-4 Elec =          -69.1106003     vdWaals  =        11318.5631490
Elec.    =      -116419.1903971
TOTAL    =      -101714.2871050

callumjd commented 3 years ago

Hi @j-wags , apologies, I should've been more explicit. Using parmed to convert the Structure to AmberParm each time you append to the complex_structure works:

# Create a new, empty system
complex_structure = parmed.Structure()

import copy
# Add the protein
complex_structure += parmed.amber.AmberParm.from_structure(pieces[0][0])

# Add the ligand
complex_structure += parmed.amber.AmberParm.from_structure(new_ligand_structure)

# Add ions
just_ion1_structure = parmed.Structure()
just_ion1_structure += pieces[2][0]
just_ion1_structure *= len(pieces[2][1])

just_ion2_structure = parmed.Structure()
just_ion2_structure += pieces[3][0]
just_ion2_structure *= len(pieces[3][1])

complex_structure += parmed.amber.AmberParm.from_structure(just_ion1_structure)
complex_structure += parmed.amber.AmberParm.from_structure(just_ion2_structure)

# Add waters

just_water_structure = parmed.Structure()
just_water_structure += pieces[4][0]
just_water_structure *= len(pieces[4][1])

complex_structure += parmed.amber.AmberParm.from_structure(just_water_structure)

complex_structure.coordinates = orig_structure.coordinates
complex_structure.box_vectors = orig_structure.box_vectors
complex_structure.update_dihedral_exclusions() # (This doesn't make a difference)
complex_structure.save('whole_shebang.prmtop', overwrite=True)
complex_structure.save('whole_shebang.inpcrd', overwrite=True)

Gives:

Bond     =           92.6749096     Angle    =         1801.8589508
Dihedral =         1551.1422956     1-4 vdW  =         1213.0573272
1-4 Elec =         5617.4905656     vdWaals  =        11318.5631490
Elec.    =      -116434.4168318
TOTAL    =       -94839.6296341
Done!

j-wags commented 3 years ago

Oh, right you are! Thanks for the clarification.

j-wags commented 3 years ago

I've been working this over with @mattwthompson during my available time the last few weeks, and there are still some loose threads I'd to get to the end of, but I don't have time to do that before our next release. Since @callumjd's suggestion is above is a clear improvement (and very possibly totally correct), I'm going to go ahead and update the example to convert everything to AmberParms as they get added to complex_structure

I'm attaching my debugging notebook here, since it's the most complete breadcrumb trail for picking this search up again later.

swap_existing_ligand_parameters_debugging.ipynb.tar.gz

callumjd commented 3 years ago

@j-wags in the resulting "whole_shebang.prmtop" there is a mix of SCNB=1.0 and SCNB=2.0 scaling factors (also mix of SCEE=1.2 and SCEE=1.0). Should we be forcing everything to have the "amber" scaling factors like this? Or does OpenFF ligand require SCNB=1.0 and SCEE=1.0?

# FIXME: ParmEd does not assign scee and scnb to the dihedrals
# If this block is commented out, energies won't match between
# ParmEd and tleap -generated PRMTOPS! I guess this is a bug in
# ParmEd because the generated FRCMOD files are incorrect.
if with_fixes:
    for dihedral in structure.dihedral_types:
        dihedral.scee = 1.2
        dihedral.scnb = 2.0

taken from here: https://github.com/openforcefield/openff-toolkit/issues/440

j-wags commented 3 years ago

Yeah, that's why I'm leaving this issue open... I'm fairly certain that the changes in #1033 make the example "less wrong" so I'd like those to be packaged in our next release, but I'm not convinced that these numbers are "less wrong" enough to be "right".

So with the SCEE and SCNB scaling factors, I know the following:

The OpenFF ligand should have 1-4 scaling factors of 1.2 and 2.0
OpenMM Systems don't draw a distinction between proper and improper torsions
AMBER expects every unique dihedral type to have explicit 1-4 scaling factors

So my thinking is that ParmEd reads an OpenMM system and has to pick apart which dihedrals are propers vs. impropers. But to be safe, it stores them all. Then, when it writes out to the prmtop, the impropers wind up next to the propers in the "dihedrals" list (which gives them entries in SCEE and SCNB as well). But the improper scaling factors should be special, since the first and last atoms in an improper aren't 1-4 to each other, they're 1-3.

BUT, if they were 1-3, then I'd expect their scaling factor to be set in such a way as to zero out the final energy, so the scaling factor should be 1e10 or something. However the "weird" scaling factors are printed as 1.0. BUT, if the SCEE=1.0 scaling factors were really being used, I'd expect really large ligand electrostatics and vdW energies, which I don't see in the Just ligand as AmberParm energy evaluation from the notebook. So my running hypothesis is that sander is ignoring the 1.0 scaling factors and zeroing out the 1-3 interaction energies, which would yield the "correct" behavior that we want.

The other possibility is that some of the ligand propers are, for some reason, getting 1-4 scaling factors of 1.0, and that's what I'd like to look into next.

Basically, what we do know is that

(protein+AmberParm(ligand) system's 1-4 energy components as evaluated by sander) \= (protein system's 1-4 energy components as evaluated by sander) + (AmberParm(ligand) system's 1-4 energy components as evaluated by sander).

This is reassuring, but it doesn't rule out that there's some mangling happening in the AmberParm(ligand) construction or the as evaluated be sander part.

I think the most concise way to check the remaining uncertainties is to see whether (AmberParm(ligand) system's 1-4 energy components as evaluated by sander) \= (ligand system's 1-4 energy components as evaluated by OpenMM)

which I'll start looking into after the 0.10.0 release (since it's blocking a lot of work internally)

j-wags commented 3 years ago

Alright, I've validated that the ligand's energy as evaluated by OpenMM matches the ligand's energy as evaluated by sander when I set the nonbonded methods to be PME (and that each individual component agrees). So I think we can say the the ligand's energy in the combined systems made by this workflow really are correct.

Getting the sander ligand energy with PME

# Create a new, empty system
complex_structure = parmed.Structure()

# Add the ligand
complex_structure += parmed.amber.AmberParm.from_structure(new_ligand_structure)

# Add the protein
#complex_structure += pieces[0][0]

import numpy as np

n_protein_atoms = len(pieces[0][0].atoms)
n_ligand_atoms = len(new_ligand_structure.atoms)

complex_structure.coordinates = orig_structure.coordinates[n_protein_atoms:
                                                           n_protein_atoms + n_ligand_atoms]
complex_structure.box_vectors = orig_structure.box_vectors
complex_structure.save('ligand_as_amberparm.prmtop', overwrite=True)
complex_structure.save('ligand_as_amberparm.inpcrd', overwrite=True)

!echo "parm ligand_as_amberparm.prmtop" > parmed.in
!echo "loadCoordinates ligand_as_amberparm.inpcrd" >> parmed.in
!echo "energy Ewald nodisper" >> parmed.in
!echo "quit" >> parmed.in
!parmed -i parmed.in

yields

...
Computing a single-point energy for ligand_as_amberparm.prmtop
Bond     =           10.0616164     Angle    =          148.3914518
Dihedral =            4.2740128     1-4 vdW  =            9.7745872
1-4 Elec =          -41.0055802     vdWaals  =           -2.1825966
Elec.    =            2.1506644
TOTAL    =          131.4641557
Done!

Getting the OpenMM ligand (in a way that doesn't send info through parmed except for coordinates)

import copy

# Turn off electrostatics 1-4s and print difference
modified_ligand_system = copy.deepcopy(ligand_system)

nbf = [force for force in modified_ligand_system.getForces() if type(force) is openmm.NonbondedForce][0]
for exc_idx in range(nbf.getNumExceptions()):
    p1, p2, q, r, d = nbf.getExceptionParameters(exc_idx)
    nbf.setExceptionParameters(exc_idx, p1, p2, 0, r, d)
results = compare_system_energies(ligand_system, 
                                  modified_ligand_system, 
                                  pieces[1][0].positions, 
                                  rtol=100, atol=100)
# print valence terms on the first go-through
print('bonds', results[0]['HarmonicBondForce'].in_units_of(unit.kilocalorie/unit.mole))
print('angles', results[0]['HarmonicAngleForce'].in_units_of(unit.kilocalorie/unit.mole))
print('torsions', results[0]['PeriodicTorsionForce'].in_units_of(unit.kilocalorie/unit.mole))
print('electrostatics 1-4', (results[0]['NonbondedForce']-results[1]['NonbondedForce']).in_units_of(unit.kilocalorie/unit.mole))

# Turn off vdw 1-4s and print difference
modified_ligand_system = copy.deepcopy(ligand_system)
nbf = [force for force in modified_ligand_system.getForces() if type(force) is openmm.NonbondedForce][0]
for exc_idx in range(nbf.getNumExceptions()):
    p1, p2, q, r, d = nbf.getExceptionParameters(exc_idx)
    nbf.setExceptionParameters(exc_idx, p1, p2, q, r, 0)
results = compare_system_energies(ligand_system, 
                                  modified_ligand_system, 
                                  pieces[1][0].positions,
                                  rtol=100, atol=100)
print('vdw 1-4', (results[0]['NonbondedForce']-results[1]['NonbondedForce']).in_units_of(unit.kilocalorie/unit.mole))

# Turn off electrostatics and print difference
modified_ligand_system = copy.deepcopy(ligand_system)
#print(dir(modified_ligand_system))
nbf = [force for force in modified_ligand_system.getForces() if type(force) is openmm.NonbondedForce][0]
#print([i for i in dir(nbf) if 'xce' in i])
for p_idx in range(nbf.getNumParticles()):
    q, r, d = nbf.getParticleParameters(p_idx)
    nbf.setParticleParameters(p_idx, 0, r, d)
results = compare_system_energies(ligand_system, 
                                  modified_ligand_system, 
                                  pieces[1][0].positions,
                                  rtol=100, atol=100)
print('electrostatics', (results[0]['NonbondedForce']-results[1]['NonbondedForce']).in_units_of(unit.kilocalorie/unit.mole))

# Turn off vdW and print difference
modified_ligand_system = copy.deepcopy(ligand_system)
nbf = [force for force in modified_ligand_system.getForces() if type(force) is openmm.NonbondedForce][0]
for p_idx in range(nbf.getNumParticles()):
    q, r, d = nbf.getParticleParameters(p_idx)
    nbf.setParticleParameters(p_idx, q, r, 0)
results = compare_system_energies(ligand_system, 
                                  modified_ligand_system, 
                                  pieces[1][0].positions,
                                  rtol=100, atol=100)
print('vdw', (results[0]['NonbondedForce']-results[1]['NonbondedForce']).in_units_of(unit.kilocalorie/unit.mole))

yields

bonds 10.061340623786299 kcal/mol
angles 148.3913151757439 kcal/mol
torsions 4.274002680587039 kcal/mol
electrostatics 1-4 -41.00698084493896 kcal/mol
vdw 1-4 9.774545410623057 kcal/mol
electrostatics 2.1530100310053926 kcal/mol
vdw -2.183389025482119 kcal/mol

So, I think that the new pathway does successfully send the ligand parameters through ParmEd, and the previous posts show that the protein energy is preserved, so the combined system is likely well-behaved. The small differences in nonbonded components are probably due to differences in PME internals. If I had more patience, I could

figure out how to do both evaluations using cutoff nonbonded to get closer precision than PME
really dig into the prmtop file format to see why there are a small number of 0's in the outputted SCEE and SCNB blocks

but the above comparison shows that there is no significant difference in the outcome.

My now-much-longer scratch notebook for this issue, which have useful breadcrumbs for future ParmEd validation. (Woe be unto ye if you actually need to use this!) swap_existing_ligand_parameters_scratch.ipynb.gz

Any objections to closing this issue?

mattwthompson commented 3 years ago

No objections from me - I think the energy differences observed here (0.003 kcal/mol energy differences resulting from PME and other errors decades smaller) are well within an acceptable error tolerance.

callumjd commented 3 years ago

Looks great, thanks for the comprehensive look @j-wags . My simulations in amber using openFF are no longer blowing up :)