isayev / ANI1_dataset

A data set of 20 million calculated off-equilibrium conformations for organic molecules
MIT License

parameters used for DFT calculations? #5

Open timaro opened 6 years ago

timaro commented 6 years ago

We've been trying to do some DFT calculations to replicate the energies of a subset of the conformers in the data files, and while we get close (with the same potential), we haven't been able to exactly replicate the numbers. For example, for the lowest-energy conformer of gdb11_s08-1279, the energy in the file is -345.269812 (hartree; rounded), and we obtain -345.269936 with a coarse grid, and -345.269864 with the finest grid.

It's also worth noting that we obtain different self-interaction energies than stated in the README. For example, we're off by at least 0.005 hartree for oxygen (-75.041 vs the stated -75.036).

Could you provide more information on the exact version of the software you used, as well as any parameters that might be causing the discrepancy?
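To put the discrepancies quoted above in more familiar units, here is a minimal sketch that converts them from hartree to kcal/mol. The energies are the ones stated in this comment; the helper name `diff_kcal_per_mol` and the variable names are illustrative only.

```python
# Standard conversion factor: 1 hartree is approximately 627.5095 kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.5095


def diff_kcal_per_mol(e_ref_hartree, e_new_hartree):
    """Return (e_new - e_ref), converted from hartree to kcal/mol."""
    return (e_new_hartree - e_ref_hartree) * HARTREE_TO_KCAL_PER_MOL


# Lowest-energy conformer of gdb11_s08-1279 (values quoted in this comment).
file_energy = -345.269812    # energy stored in the data file
coarse_grid = -345.269936    # our result with a coarse grid
finest_grid = -345.269864    # our result with the finest grid

# Oxygen self-interaction energy (README value vs. our value).
oxygen_readme = -75.036
oxygen_ours = -75.041

coarse_diff = diff_kcal_per_mol(file_energy, coarse_grid)
finest_diff = diff_kcal_per_mol(file_energy, finest_grid)
oxygen_diff = diff_kcal_per_mol(oxygen_readme, oxygen_ours)

print(f"coarse grid vs file: {coarse_diff:.3f} kcal/mol")
print(f"finest grid vs file: {finest_diff:.3f} kcal/mol")
print(f"oxygen vs README:    {oxygen_diff:.3f} kcal/mol")
```

Both molecular differences come out below 0.1 kcal/mol in magnitude, while the oxygen difference is roughly 3 kcal/mol, which is why the atomic values stand out more than the conformer energies.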

isayev commented 6 years ago

@timaro let us check! We mostly used Gaussian 09 (G09), and probably G16 for some calculations, with tight SCF criteria. If you use Jaguar or any other code, it is a real pain in the butt to match energies between different codes.

Jussmith01 commented 6 years ago

We computed the energies using Gaussian 09. If you are using a different package, there is a very good chance your total energies will differ; this is expected when comparing different implementations of a given DFT functional. However, relative energies should be VERY close from package to package.

As for the self-interaction energies, you are correct: they were mistakenly computed with wB97xD instead of wB97x. I did extensive testing on our ML model (ANI) to determine whether this hurts performance and found that it does not. The results using the self-interaction energies from wB97xD and wB97x were statistically identical.
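The point about relative energies can be checked with a short script. This is a rough sketch, not the procedure used to build the dataset: the function names are hypothetical, and the energies below are made-up placeholders (not values from the data files), chosen so that the second set differs from the first by a constant offset.

```python
def relative_energies(total_energies, ref_index=0):
    """Energies relative to one reference conformer (same units as the input)."""
    ref = total_energies[ref_index]
    return [e - ref for e in total_energies]


def max_relative_deviation(energies_a, energies_b, ref_index=0):
    """Largest absolute disagreement in relative energies between two packages."""
    if len(energies_a) != len(energies_b):
        raise ValueError("both packages must report the same conformers")
    rel_a = relative_energies(energies_a, ref_index)
    rel_b = relative_energies(energies_b, ref_index)
    return max(abs(a - b) for a, b in zip(rel_a, rel_b))


# Placeholder total energies in hartree for three conformers (illustrative only).
package_a = [-345.2698, -345.2650, -345.2601]
# Assumption: the second package differs by a constant shift in total energy.
package_b = [e - 0.0001 for e in package_a]

deviation = max_relative_deviation(package_a, package_b)
print(f"max deviation in relative energies: {deviation:.2e} hartree")
```

A constant shift in total energy cancels once every conformer is referenced to the same structure, so only conformer-dependent differences between the packages remain in the comparison.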