Valdes-Tresanco-MS / gmx_MMPBSA

gmx_MMPBSA is a new tool based on AMBER's MMPBSA.py aiming to perform end-state free energy calculations with GROMACS files.
https://valdes-tresanco-ms.github.io/gmx_MMPBSA/
GNU General Public License v3.0
226 stars 65 forks source link

[Question]: Calculated energies are 10 x higher than found in other calculations #385

Closed wulfbear closed 1 year ago

wulfbear commented 1 year ago

My Question is...

Hi all,

I know that the energies achieved via mmpbsa calculations from different tools are not necessarily comparable between different tools. However, I was very suprised when comparing my latest results from gmx_MMPBSA with tools like g_MMPBSA and pulling mean force experiments in Gromacs: gmx_MMPBSA yielded results approximately 10 x higher than what was achieved with g_MMPBSA and appr. 8 x higher than PMF experiments suggested. I have used decomposition analysis in gmx_MMPBSA and tried GB and PB (linear and non-linear) approaches using the standard parameters except for salt concentration of 0, as there are only 3 ions in my system for charge neutralization. My simulation contains a peptide and a polymeric surfaces. At the end of the simulation, the peptide is bound to the polymeric surface. I used the final frame of the trajectory of the simulations for gmx_MMPBSA and g_MMPBSA. All other parameters used in gmx_MMPBSA were left unchanged (except salt conc. / ionic strength). I did not use APBS. Do you have any idea what could have caused these discrepancies? Or did anyone use gmx_MMPBSA for a similar analysis and got more suitable results, esp. regarding PMF experiments where the results were at least comparable in the order of magnitude?

Kind regards!

Valdes-Tresanco-MS commented 1 year ago

Can you attach the final results files for both programs? In principle, the internal energy should not be very different. By comparing it we could know the reason for the differences. It is also important to consider that gmx_MMPBSA requires centered trajectories and removes PBC, while g_mmpbsa does not (I think).

wulfbear commented 1 year ago

Hi,

Yes I will attach some example files for g_MMPBSA and gmx_MMPBSA. I centered the trajectories and removed all pbc effects before starting either of the analysis. Thank you! decomp_linear_solver_2_0_0.csv final_contrib_energy_0_0_Run_2.csv

Valdes-Tresanco-MS commented 1 year ago

The differences are notable. Can you attach the final results files (not the decomposition files, in gmx_MMPBSA it is called FINAL_RESULTS_MMPBSA.dat)?

wulfbear commented 1 year ago

Yes, thank you very much. I will attach the FINAL_RESULTS_MMPBSA.dat file as .csv here: linear_PB_solver_2_0_0.csv

Thank you for your time and assistance!

Valdes-Tresanco-MS commented 1 year ago

Please, attach the results files for both programs. Also, attach the parameter files and the gmx_MMPBSA.log

wulfbear commented 1 year ago

Hi, Here are the files for gmx_MMPBSA: gmx_MMPBSA.txt (I cannot find a .log, that's the only text document I get from running gmx_MMPBSA): gmx_MMPBSA.log

MMPBSA.in : mmpbsa_decomp_all.in.txt

And for g_mmpbsa: Results file: full_energy.csv summary_energy.csv

Log file: g_mmpbsa_log.txt

parameter file: mmpbsa_in.txt

I hope that's all you need. If there's anything else I can do to help, let me know please.

Valdes-Tresanco-MS commented 1 year ago

There is a difference between the results, but it does not necessarily mean that they are incorrect. The main differences are in the VDWAALS (-45.51 gmx_MMPBSA and -70.722 g_mmpbsa) and EPB (22.01 gmx_MMPBSA with indi = 1 and 18.254 g_mmpbsa with pdie = 2). Note that gmx_MMPBSA reports the values in kcal/mol, while g_mmpbsa is in kJ/mol, so I will convert it to kcal/mol. Although the difference is appreciable, it does not correspond to 10x, as initially reported. Let's look at the differences part by part. The most worrying difference is VDWAALS, because, in principle, both values should be similar since they depend on the force field. I am not sure which value is correct, but at least those obtained with gmx_MMPBSA using sander are reproduced with ParmEd, cpptraj, and OpenMM. The electrostatic component does not seem to be important in this case, and its contribution is low. So it can be discarded. On the other hand, the polar component of solvation (EPB) has notable differences. Considering that EPB scales almost linearly with respect to the internal dielectric constant (indi and pdie), the difference is noticeable. To compare, both calculations must be done with the same parameters. In this case, the internal dielectric constant for gmx_MMPBSA was 1.0, while for g_mmpbsa it was 2.0. Although differences between the Amber and APBS solver are expected, they should not be large, as long as the same parameters are used. For the non-polar component of solvation, you should not use inp=2 in gmx_MMPBSA, since it is optimized for the Tang, Yang & Luo radii, which is not available in this version of gmx_MMPBSA and is optimized only for protein and nucleic acids. Similarly, for comparisons, you must set the same parameters. Using inp=1 does not do the decomposition and therefore does not obtain the EDISPER value, which in this case, can be incorrect given the system characteristics. By changing this parameter, the final energy should be negative.

Some additional considerations. As I mentioned above, it is important to check the parameters you are defining since you will get different results (for example, the salt concentration is different, for g_mmpbsa nconc = 0.15, while for gmx_MMPBSA istrng = 0.0). I'm not sure what your goal is for setting the cutoff for decomposition to 50 angstroms (print_res="within 50"), but it can be problematic for analysis. I am referring to resource consumption and graphics complexity, not to the calculation. Let me know any doubts Mario S.

wulfbear commented 1 year ago

Hi, Thank you very much for the detailed explanation and anaylsis. It was indeed an oversight of mine regarding the different reporting units. I have updated the title to remove the "10 x higher" bit. However, I just noticed, esp. for the decomposition analysis the values for gmx_MMPBSA per residue are also higher than for g_mmpbsa. E.g., for ILE-2 Total energies: gmx_MMPBSA: 14.80 kcal/mol --> 61.92 kJ/mol g_MMPBSA: 0.4 kJ/mol And for LYS-2 gmx_MMPBSA: -17.00 kcal/mol --> 71.12 kJ/mol g_MMPBSA: -8.55 kJ/mol etc. I will report back on the decompositions once I have adresed the differences in my analysis setups.

Regarding the differences: I will repeat the gmx_MMPBSA calculations with respects to your mentioned differences. I will set the salt concentration to meet the g_MMPBSA salt concentration of 0.15 M, and I will also change the internal dielectric constant to match both programs. I will use inp=1. Regarding the cutoff within 50 A, I have several different binding modes of peptide on polymer and wanted to include all residues in all orientations. Is there any "incorrectness" / source of error that is involved when calculating for all residues instead of only directly involved ones? The computational cost is not a limiting factor at this point in time. Thank you again, I will report back if I find anything else of interest!

Valdes-Tresanco-MS commented 1 year ago

If there are considerable differences in the final results, it is expected that they also exist in the decomposition. We have already been asked several times why there are differences, but the truth is that we have devoted more time to the gmx_MMPBSA development than to its comparison in energy terms with other programs. However, our comparison with MMPBSA.py, which is the program from which it is derived, shows minimal differences, probably due to the topologies conversion, the trajectory evolution in both programs, etc., as we clarify in the manuscript. As I mentioned before, the most worrying and curious thing is that the values of VWAALS and EEL are so different, even though it only depends on the FF. For the comparison to be convincing, these values should be similar, as they are independent of all parameters. I have no idea why this big difference (partly because I don't know how g_mmpbsa works). Regarding the decomposition, the answer is no. You can perform the calculation with all residues, there is even a keyword for it (print_res = "all"). The difficulty is in the result visualization since a lot of graphs are generated, and these with a lot of data. Anyway, gmx_MMPBSA_ana has an option in the initial dialog that allows you to hide the residues that do not contribute up to a certain cutoff.