isayevlab / Auto3D_pkg

Auto3D generates low-energy conformers from SMILES/SDF
MIT License
148 stars 34 forks source link

What is the unit of the AIMNet predicted energies ? #42

Closed bbaillif closed 1 year ago

bbaillif commented 1 year ago

Describe the bug I am trying to use AIMNet to optimize structure of generated conformers for some (potentially charged) molecules. Optimization is working fine but the energy values seem off. The energy is being saved in the sdf file, for example when I use the calc_spe from SPE.py to calculate energy from a small molecule, the outputs energy in eV and saved in the output file in Hartree. But eV values are around -10000, making Hartree value around -370. Is there a problem with my implementation (I am using the aimnet2nqed_pc14iall_b97m_sae.jpt version) or is the AIMNet output a different unit ?

To Reproduce Steps to reproduce the behavior: Load the given sdf (rename the file because I was unable to upload a .sdf) to_opt_AIMNET_E.txt Run it through the calc_spe function in SPE.py

Expected behavior I except to see energy in eV which is reasonable (e.g. close to 0 since I estimated my molecule to have energy around -10 kcal/mol)

System information:

LiuCMU commented 1 year ago

Hey, thank you so much for the detailed description, and sorry for the late reply!

The energy unit in the output SDF file is Hartree. 1 Hartree = 27.211386245988 eV

The term in your file is the absolute energy. For molecules, the absolute energy is always a very negative value. Did you mean relative energy?

I estimated my molecule to have energy around -10 kcal/mol

bbaillif commented 1 year ago

Thank you for your answer! The -10 kcal/mol is the energy output from the MMFF94s force field (in RDKit) on this single molecule system, so I believe it is a potential energy. Then maybe there might be something I don't understand about energy i.e. absolute vs relative/potential energy. If 1 Hartree = 627.5 kcal/mol, I should find around -0.015 Hartree for my molecule? In the Auto3D publication, the range of energies in your graphs (Figure 8) are in the range [-30;30] kcal/mol. How do you retrieve these values from the model output ?

LiuCMU commented 1 year ago

This value of -10kcal/mol is probably some relative potential energy, but I'm unsure about their reference. @isayev Could you comment on the MMFF94 output?

For our paper, the definition is $\Delta G = G_r - G_p$, where $G_r$ and $G_p$ are the absolute Gibbs free energy of the two tautomers. $\Delta G$ is the Gibbs free energy of the tautomerization reaction. The Gibbs free energy was calcualted using ASE where ANI-2xt was the calculator.

$\Delta G$ is relative energy, so the scale is samll, (i.e. few kcal/mol); $G_r$ and $G_p$ are absolute energy, the scale is - thousands kcal/mol.

In the Auto3D publication, the range of energies in your graphs (Figure 8) are in the range [-30;30] kcal/mol. How do you retrieve these values from the model output ?

bbaillif commented 1 year ago

Thanks a lot for the details, that makes sense. I was confused because the range of absolute energies I obtain with the MMFF94 potential energy computation (between -500 and 500 kcal/mol, therefore between (roughly) -1 and 1 Hartree) in RDKit is completely different from the range of absolute energies I have with AIMNet or ANI2xt (between -4000 and 0 Hartree), for a set of lowest-energy conformers from GEOM Drugs containing only CNOS atoms. I should only look at relative energies (between conformers) computed with each method, which I did, and they have the same range (between -20 and 20 kcal/mol) and appear correlated. So everything is fine, it was a misunderstanding on my side between absolute and relative energy.