choderalab / perses

Experiments with expanded ensembles to explore chemical space
http://perses.readthedocs.io
MIT License
181 stars 51 forks source link

RESTCapableHybridTopologyFactory energy validation fails for small molecules in solvent #1098

Open zhang-ivy opened 2 years ago

zhang-ivy commented 2 years ago

When helping @ijpulidos with https://github.com/choderalab/perses/pull/1065, we noted that when generating a RESTCapableHybridTopologyFactory for CCC->CCCC or CCCC->CCC, the endstate energy validation test (validate_endstate_energy_point) was failing due to an energy mismatch in the nonbonded energies:

HarmonicBondForce -- og: 0.2884397711171602, hybrid: 0.2884397711171602
HarmonicAngleForce -- og: 0.2349820210951548, hybrid: 0.2349820210951548
PeriodicTorsionForce -- og: 0.02640257025445306, hybrid: 0.02640273818280852
Nonbondeds -- og: 2204.7912252873475, hybrid: 2204.905449988426
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Input In [7], in <cell line: 3>()
      1 # original
      3 for endstate in [0, 1]:
----> 4     validate_endstate_energies_point(htf, endstate=endstate)

File ~/miniconda3/envs/perses-paper4-dev/lib/python3.9/site-packages/perses/tests/utils.py:992, in validate_endstate_energies_point(input_htf, endstate, minimize)
    987 nonbonded_hybrid_values = [components_hybrid[key] for key in components_hybrid.keys()
    988                            if key not in bonded_keys_other_to_hybrid.values()]
    989 print(
    990     f"Nonbondeds -- og: {components_other['NonbondedForce']}, hybrid: {np.sum(nonbonded_hybrid_values)}"
    991 )
--> 992 assert np.isclose([components_other['NonbondedForce']], np.sum(nonbonded_hybrid_values))
    994 print(f"Success! Energies are equal at lambda {endstate}!")

AssertionError: 

I investigated and found that the energy mismatch comes from the dispersion/long range correction. If I turn the dispersion correction off in the old and new systems (before generating the htf), then the test passes.

I next wondered whether the energy mismatch arises because I split up the steric interactions into separate forces -- one for scaled interactions (CustomNonbondedForce_sterics) and one for nonscaled interactions (NonbondedForce_sterics). Splitting up the interactions across two forces could cause the long range correction to be computed slightly differently than if all interactions were computed in one force.

To test this, I commented out these lines that add interaction groups to the CustomNonbondedForce_sterics so that the CustomNonbondedForce_sterics now contains all scaled and nonscaled interactions. I then generated a htf, deleted the NonbondedForce_sterics (previously used for nonscaled steric interactions), and then ran the energy validation test and it passed.

HarmonicBondForce -- og: 0.2884397711171602, hybrid: 0.2884397711171602
HarmonicAngleForce -- og: 0.2349820210951548, hybrid: 0.2349820210951548
PeriodicTorsionForce -- og: 0.02640257025445306, hybrid: 0.02640273818280852
Nonbondeds -- og: 2204.791225287342, hybrid: 2204.7922074738963
Success! Energies are equal at lambda 0!
HarmonicBondForce -- og: 0.16658791561165553, hybrid: 0.16658791561165553
HarmonicAngleForce -- og: 213.64543195162605, hybrid: 213.64543195162597
PeriodicTorsionForce -- og: 0.02748946113190618, hybrid: 0.027489572865829537
Nonbondeds -- og: 2206.729493784189, hybrid: 2206.7304108909084
Success! Energies are equal at lambda 1!

Note that this long range correction energy discrepancy only seems to affect the solvent phase for small molecules, not the complex phase. For example, if I run bace2 with ligands 9 and 15 (in data/bace_example/Bace_ligands_shifted.sdf), the test passes:

HarmonicBondForce -- og: 1964.2670795763452, hybrid: 1964.2670795763452
HarmonicAngleForce -- og: 5112.6081405340865, hybrid: 5112.608140534087
PeriodicTorsionForce -- og: 8121.348763932069, hybrid: 8121.350440506907
Nonbondeds -- og: -339386.6319977214, hybrid: -339386.6401120513
Success! Energies are equal at lambda 0!
HarmonicBondForce -- og: 1967.2323969685465, hybrid: 1967.2323969685465
HarmonicAngleForce -- og: 5129.712507300623, hybrid: 5129.712507300623
PeriodicTorsionForce -- og: 8122.048463707788, hybrid: 8122.05014139802
Nonbondeds -- og: -339271.6857312901, hybrid: -339271.6938485515
Success! Energies are equal at lambda 1!

The nonbonded energies are large enough here that the long range correction discrepancy isn't noticeable, so the test passes.

However, the energy discrepancy is present when I run solvent phase for the same Bace2 ligands:

HarmonicBondForce -- og: 4.885286522448572, hybrid: 4.885286522448571
HarmonicAngleForce -- og: 31.586481788165568, hybrid: 31.586481788165568
PeriodicTorsionForce -- og: 10.842469171597777, hybrid: 10.842479000280184
Nonbondeds -- og: 1961.5807174908334, hybrid: 1961.7100125097277
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Input In [3], in <cell line: 5>()
      2 from perses.tests.utils import validate_endstate_energies_point, validate_endstate_energies_md
      5 for endstate in [0, 1]:
----> 6     validate_endstate_energies_point(htf, endstate=endstate)

File ~/miniconda3/envs/perses-paper4-dev/lib/python3.9/site-packages/perses/tests/utils.py:992, in validate_endstate_energies_point(input_htf, endstate, minimize)
    987 nonbonded_hybrid_values = [components_hybrid[key] for key in components_hybrid.keys()
    988                            if key not in bonded_keys_other_to_hybrid.values()]
    989 print(
    990     f"Nonbondeds -- og: {components_other['NonbondedForce']}, hybrid: {np.sum(nonbonded_hybrid_values)}"
    991 )
--> 992 assert np.isclose([components_other['NonbondedForce']], np.sum(nonbonded_hybrid_values))
    994 print(f"Success! Energies are equal at lambda {endstate}!")

AssertionError: 

I am raising this issue so that we can discuss this at the next dev sync. I'm not sure if we want to avoid running small molecule transformations with the RESTCapableHybridTopologyFactory in the future, or if we just want to make a note in the docstring of the class / in the tests that this is a known problem.

zhang-ivy commented 2 years ago

Note that Iván generated htfs using the vanilla HybridTopologyFactory for CCC->CCCC and the energy validation test yields the following results:

with dispersion correction off:

zero state error: 1.96084504011651e-05, one state error: 0.0005264199693243432

with dispersion correction on:

zero state error: -0.8331827929622193, one state error: -0.6370180463873197

so I think the dispersion correction energy discrepancy is likely a problem in the HybridTopologyFactory as well, since there are two forces handling steric interactions as well.

zhang-ivy commented 2 years ago

We discussed this at the perses dev sync 9/6/22 and we decided that we should keep this issue open to address this discrepancy in the future.