OpenFreeEnergy / openfe

The Open Free Energy toolkit
https://docs.openfree.energy
MIT License
138 stars 20 forks source link

Warning error during equilibration!! #357

Closed mayank-kohli closed 1 year ago

mayank-kohli commented 1 year ago

Hi,

I am getting this warning message during equilibration step and after some time the equilibration step kills and give same repeating error. WARNING: Potential energy is NaN after 0 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart...

Is there any solution with respect to this error? or am I doing something wrong?

Thanks

IAlibay commented 1 year ago

Hi @mayank-kohli, thanks for trying out OpenFE. Could you give us a bit more information about the system you are running (is this a publicly available test case we can debug?) and the hardware (are you using a GPU, etc...) & software (which version of OpenFE are you currently using)? This should help us narrow down the issue.

mayank-kohli commented 1 year ago

Hi,

Thank you for replying.

  1. I am using a publicly available protein structure.
  2. I am using a GPU (nvidia A100)
  3. OpenFe Version : 0.7.2

The Tutorial is running fine, I didn't encounter any error.

IAlibay commented 1 year ago

@mayank-kohli are the ligands being transformed something you are able to share? Otherwise probably an equally useful question - are you seeing similar issues on other test systems?

mayank-kohli commented 1 year ago

I am using the co-crystal ligand and its analogues as input. I have tried with one more system (different Protein), in that case, I didn't encounter any error.

IAlibay commented 1 year ago

@mayank-kohli - generally getting that error happens because your system has encountered a high energy state that it can't escape from. If it's only affecting the one ligand pair, then it effectively means that however that system is setup does not play well, could be; bad initial coordinates, a poor mapping between the two ligands, or some kind of deeper forcefield issue.

Is there anything in the mapping or the initial coordinates that could indicate this to be a problematic system?

mayank-kohli commented 1 year ago

The difference between the 2 ligands is not much, there is only a methyl replacement as currently I am testing the different systems I didn't gone for much.

I tried with 2 types of pdbs, one from the pdb database and the other one I took the output of energy minimisation from gromacs for that I used amber99sb ff. In both the cases I encountered same.

What ff is being used here for protein? Can we try with a different ff? If it is doable? I am not sure about that part.

mayank-kohli commented 1 year ago

I Tried one more thing. What I did was I increased the number of minimization steps to 10000. It passed the equilibration state, but after that also I am getting the warnings, as you can see below

INFO: Iteration 781/10000 INFO: ******************************************************************************** WARNING: Potential energy is NaN after 0 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart... WARNING: Potential energy is NaN after 1 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart... WARNING: Potential energy is NaN after 2 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart... WARNING: Potential energy is NaN after 3 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart... WARNING: Potential energy is NaN after 4 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart... WARNING: Potential energy is NaN after 5 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart... WARNING: Potential energy is NaN after 6 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart... WARNING: Potential energy is NaN after 7 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart... WARNING: Potential energy is NaN after 8 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart... WARNING: Potential energy is NaN after 9 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart... WARNING: Potential energy is NaN after 10 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart... WARNING: Potential energy is NaN after 11 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart... WARNING: Potential energy is NaN after 12 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart... WARNING: Potential energy is NaN after 13 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart... WARNING: Potential energy is NaN after 14 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart... WARNING: Potential energy is NaN after 15 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart... WARNING: Potential energy is NaN after 16 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart... WARNING: Potential energy is NaN after 17 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart... INFO: Iteration took 105.013s. INFO: Estimated completion in 1 day, 5:11:05.025108, at 2023-Apr-22-01:39:43 (consuming total wall clock time 1 day, 7:39:25.750199). INFO: ******************************************************************************** INFO: Iteration 782/10000 INFO: ******************************************************************************** INFO: Iteration took 21.832s. INFO: Estimated completion in 1 day, 5:12:56.644407, at 2023-Apr-22-01:41:57 (consuming total wall clock time 1 day, 7:41:39.202004). INFO: ******************************************************************************** INFO: Iteration 783/10000 INFO: ******************************************************************************** INFO: Iteration took 21.202s. INFO: Estimated completion in 1 day, 5:14:40.511856, at 2023-Apr-22-01:44:02 (consuming total wall clock time 1 day, 7:43:44.272383). INFO: ******************************************************************************** INFO: Iteration 784/10000

It got killed after a while with following error: ERROR: Potential energy is NaN after 20 attempts of integration with move LangevinSplittingDynamicsMove CRITICAL: Propagating replica 5 at state 5 resulted in a NaN!

IAlibay commented 1 year ago

@mayank-kohli did you manually inspect the mapping between the two ligands? Are there any odd correspondences present?

mayank-kohli commented 1 year ago

I checked it mapping there are no odd correspondences between them.

IAlibay commented 1 year ago

This is indeed very odd. I still suspect that the culprit here is possibly the initial pose or some odd force field assignment for the ligand.

Could you confirm if a conventional MD simulation of both end state molecules is stable?

Could you also ensure that the following settings entry is set? I just want to eleminate the chance that the OpenCL or CPU platforms are being used for some reason.

 settings.engine_settings.compute_platform = "CUDA"

What ff is being used here for protein?

As of openfe v0.7 the default for the protein force field is FF14SB. This is unlikely to be the issue in my opinion (unless you are using multivalent ions). The likelihood is that the issue is down to how the force field is being assigned to the ligand via OpenFF. Currently only OpenFF force fields are handled here so changing things up is unlikely to have an influence. We are in the process of updating this to allow a wider range of force field types.

Can we try with a different ff? If it is doable? I am not sure about that part.

You can set the forcefields to be applied to the protein & solvent by changing:

settings.forcefield_settings.forcefields

It expects a list of openmmforcefields xml files. Currently it is set to:

    forcefields = [
        "amber/ff14SB.xml",  # ff14SB protein force field
        "amber/tip3p_standard.xml",  # TIP3P and recommended monovalent ion parameters
        "amber/tip3p_HFE_multivalent.xml",  # for divalent ions
        "amber/phosaa10.xml",  # Handles TPO entries
    ]
IAlibay commented 1 year ago

@mayank-kohli just as a short update here we did identify an issue with the default mass scaling that seem sto be causing more integrator failures. If conventional MD simulations work fine, then it may be worth trying:

settings.forcefield_settings.hydrogen_mass = 3.0
mayank-kohli commented 1 year ago

Hi, after adding the line it is running fine.

Didn't got that error again.

Thank you, for the fix.