theochem / horton

HORTON: Helpful Open-source Research TOol for N-fermion systems
http://theochem.github.io/horton/
GNU General Public License v3.0
92 stars 40 forks source link

Horton integrals are normalized-weirdly. #39

Closed matt-chan closed 7 years ago

matt-chan commented 8 years ago
matt-chan commented 8 years ago

@crisely09 @fhqgfss to add integral examples.

matt-chan commented 8 years ago

Okay. I've got a minimal testcase. I hope you can install PySCF + libcint. If it's not possible, you can just take a look at the integrals I've given. Psi4 gives the same numbers as PySCF, but I can't find the input file I used last time =/.

from horton import *
import numpy as np

from horton.pyscf_wrapper import gobasis

# Hartree-Fock calculation
# ------------------------

# Construct a molecule from scratch
mol = IOData(title='dinitrogen')
mol.coordinates = np.array([[0.0, 0.0, 0.0]])
mol.numbers = np.array([2])

# Create a Gaussian basis set
obasis = get_gobasis(mol.coordinates, mol.numbers, 'cc-pvdz')
obasis2 = gobasis(mol.coordinates, mol.numbers, 'cc-pvdz')

# Create a linalg factory
lf = DenseLinalgFactory(obasis.nbasis)

# Compute Gaussian integrals
olp = obasis.compute_overlap(lf)
print "="*50
print "HORTON OVERLAP"
print olp._array
print "="*50
print "PYSCF OVERLAP"
olp2 = obasis2.compute_overlap(lf)
print olp2._array
print "="*50

And the output:

================================================================================
 _ __ _
/ (..) \ Welcome to HORTON 2.0.0!
\/ || \/
 |_''_|  HORTON is written and maintained by by Toon Verstraelen (1).

         This version contains contributions from Toon Verstraelen (1),
         Katharina Boguslawski (2), Pawel Tecmer (2), Farnaz Heidar-Zadeh (2),
         Matthew Chan (2), Taewon D. Kim (2), Yilin Zhao (2), Steven
         Vandenbrande (1), Derrick Yang (2), Cristina E. González-Espinoza (2),
         Peter A. Limacher (2), Diego Berrocal (2), Ali Malek (2) and Paul W.
         Ayers (2)

         (1) Center for Molecular Modeling, Ghent University, Belgium.
         (2) The Ayers Group at McMaster University, Ontario, Canada.

         More information about HORTON can be found on this website:
         http://theochem.github.com/horton/

         The purpose of this log file is to track the progress and quality of a
         computation. Useful numerical output may be written to a checkpoint
         file and is accessible through the Python scripting interface.

================================================================================

User:           mattchan
Machine info:   Linux localhost.localdomain 4.5.5-201.fc23.x86_64 #1 SMP Sat May 21 15:29:49 UTC
                2016 x86_64
Time:           2016-06-03T19:09:07.482741
Python version: 2.7.11 (default, Mar 31 2016, 20:46:51) [GCC 5.3.1 20151207 (Red Hat 5.3.1-2)]
HORTON version: 2.0.0
Current Dir:    /home/mattchan/Documents/Backup/pycharm/horton
Command line:   ./rhf_he_dense.py
HORTON module:  /home/mattchan/Documents/Backup/pycharm/horton/horton/log.pyc

Initialized: <horton.gbasis.cext.GOBasis object at 0x7f478556d120>
  Number of basis functions         : 5
  Number of normalization constants : 7
  Maximum shell type                : 1
  Center     0 :  S3 S1 P1

==================================================
HORTON OVERLAP
[[ 0.35119416  0.37581818  0.          0.          0.        ]
 [ 0.37581818  1.          0.          0.          0.        ]
 [ 0.          0.          1.          0.          0.        ]
 [ 0.          0.          0.          1.          0.        ]
 [ 0.          0.          0.          0.          1.        ]]
==================================================
PYSCF OVERLAP
[[ 1.          0.63416773  0.          0.          0.        ]
 [ 0.63416773  1.          0.          0.          0.        ]
 [ 0.          0.          1.          0.          0.        ]
 [ 0.          0.          0.          1.          0.        ]
 [ 0.          0.          0.          0.          1.        ]]
==================================================
User:           mattchan
Machine info:   Linux localhost.localdomain 4.5.5-201.fc23.x86_64 #1 SMP Sat May 21 15:29:49 UTC
                2016 x86_64
Time:           2016-06-03T19:09:07.495777
Python version: 2.7.11 (default, Mar 31 2016, 20:46:51) [GCC 5.3.1 20151207 (Red Hat 5.3.1-2)]
HORTON version: 2.0.0
Current Dir:    /home/mattchan/Documents/Backup/pycharm/horton
Command line:   ./rhf_he_dense.py
HORTON module:  /home/mattchan/Documents/Backup/pycharm/horton/horton/log.pyc

@tovrstra Is this okay?

tovrstra commented 8 years ago

OK. That is clear. I'll check it out.

tovrstra commented 8 years ago

Thank you for the minimal example. I can reproduce the problem.

I've also tested this with Gaussian. (See attached archive gaussian.zip). When asking Gaussian to print out the basis in the log file, it gives me the following:

AO basis set in the form of general basis input (Overlap normalization):
      1 0
 S   3 1.00       0.000000000000
      0.3836000000D+02  0.4017607538D-01
      0.5770000000D+01  0.2613680748D+00
      0.1240000000D+01  0.7930712395D+00
 S   1 1.00       0.000000000000
      0.2976000000D+00  0.1000000000D+01
 P   1 1.00       0.000000000000
      0.1275000000D+01  0.1000000000D+01

while the EMSL data for He cc-pvdz are

#BASIS SET: (4s,1p) -> [2s,1p]
He    S
     38.3600000              0.0238090
      5.7700000              0.1548910
      1.2400000              0.4699870
He    S
      0.2976000              1.0000000
He    P
      1.2750000              1.0000000

Gaussian is renormalizing the contraction before it starts the calculation. I assume that is the behavior we'd like to have in HORTON as well.

The scale factor is 1.6874322895 and 1/(1.6874322895)**2 = 0.3511941634, which is the first diagonal element in your test.

The remaining question is where this renormalization should happen. The constructor of the class GOBasisContraction seems ideal. That way, this problem is solved for all basis sets that are generated from the files in data/basis. I would not change anything in the GOBasis class as that would cause troubles when reading files from programs that don't follow these normalization conventions, e.g. CP2K. I'll see if that works.

matt-chan commented 7 years ago

I hate to bring things back from the dead (especially bugs), but it looks like we haven't completely fixed this Toon =/. I don't think this is a repeat of the unfixed bug? (it was in high-basis ERIs)

I've given an example of a few more failing tests cases in the https://github.com/QuantumElephant/olli repo. The water molecule fails for a moderate sized basis set. (water at 3-21G), in overlap, kinetic, and nuclear attraction. ERIs are fine strangely enough.

Here are the shells present:

Initialized: <horton.gbasis.cext.GOBasis object at 0x7fe2a867ca10> Number of basis functions : 13 Number of normalization constants : 21 Maximum shell type : 1 Center 0 : S2 S1 Center 1 : S3 S2 P2 S1 P1 Center 2 : S2 S1

tovrstra commented 7 years ago

No problem. If there is something suspicious, we should take a closer look. I'm compiling PySCF to try this locally.

tovrstra commented 7 years ago

The normalization is the same in both programs (diagonal of overlap is one in both cases).

The difference is that HORTON and PySCF follow incompatible ordering conventions of the "SP" generalized contractions:

It is mostly a matter of taste. I find the HORTON approach more intuitive because it does not change the order w.r.t. the basis set file.

tovrstra commented 7 years ago

P.S. I would not call this a HORTON bug, so I'm closing again. Feel free to reopen if there is more going on.

tovrstra commented 7 years ago

P.P.S. There is one argument to slightly prefer the HORTON ordering of the SP basis functions: it is the same as that of codes that make use of generalized contractions to speed up the calculation of the integrals (e.g. Gaussian).

matt-chan commented 7 years ago

Hmm I see. It would be interesting to see what happens in PSI4 etc. This change would be a bit frustrating for integral module portability if each one were different.

tovrstra commented 7 years ago

Yep, we should indeed know what other codes are doing with this. I never checked this case explicitly because I did not anticipate this. I'll make a new issue...