evaleev / libint

Libint: high-performance library for computing Gaussian integrals in quantum mechanics
Other
226 stars 95 forks source link

Wrong numerical value produced for electrostatic potential #99

Closed bruneval closed 6 years ago

bruneval commented 6 years ago

Hi, Using LIBINT for the electrostatic potential gives a wrong result when switching from version 2.3.1 to version 2.4.2. Compiling LIBINT 2.4.2 with the very same configure options as I used to use for LIBINT 2.3.1

$ ../configure --prefix=/opt/libint-2.4.2/ --enable-1body=0 --enable-eri=0 --enable-eri3=0 --enable-eri2=0 --enable-contracted-ints --with-max-am=5 --with-opt-am=2 --with-cxxgen=g++ --with-cxxgen-optflags=-O1 --with-cxx=g++ --with-cxx-optflags=-O1

gives then wrong results for electrostatic potential for some integrals. Using the intel compiler on a different machine just confirms the conclusion. I found that only the integrals for Gaussian functions with l >= 2 and located on two different centers yield the wrong values.

I assumed that I did not need to change anything in my code when switching LIBINT version. Was I wrong? Best, Fabien

evaleev commented 6 years ago

Dear Fabien:

I have done my best to reproduce, but I can't. I used hartree-fock++ to compute energy of few small molecules with 2.3.1 and master (energies matched to all digits) as well as some unit tests.

Would you mind compiling and running the following simple program? It needs to be compiled just like hartree-fock++ (i.e. with -I/path/to/eigen3 etc.).


#include <libint2.hpp>
#include <iostream>
#include <iomanip>

using namespace libint2;

int main() {

  libint2::initialize();

  auto atoms = std::vector<Atom>{ {8, 0.,0.,0.}, {8, 0.,0.,2.}, {1, 0.,-1.,-1.}, {1, 0.,1.,3.}};
  std::vector<Shell> obs{Shell{{1.0, 3.0}, {{2, true, {1.0, 0.3}}}, {{0.0, 0.0, 0.0}}},
                         Shell{{2.0, 5.0}, {{2, true, {1.0, 0.2}}}, {{1.0, 1.0, 1.0}}}};
  {
    const auto lmax = std::min(3,LIBINT2_MAX_AM_elecpot);
    auto engine = Engine(Operator::nuclear, 2, lmax);
    engine.set_params(make_point_charges(atoms));

    engine.compute(obs[0], obs[0]);
    std::cout << std::setw(20) << std::setprecision(15);
    for(int i=0; i!=25; ++i)
      std::cout << std::scientific << engine.results()[0][i] << std::endl;
    std::cout << std::endl;

    engine.compute(obs[0], obs[1]);
    for(int i=0; i!=25; ++i)
      std::cout << std::scientific << engine.results()[0][i] << std::endl;

  }

  libint2::finalize();

}

Here's the reference output:

-1.238239259091998e+01
0.000000000000000e+00
0.000000000000000e+00
-5.775996163160049e-02
0.000000000000000e+00
0.000000000000000e+00
-1.301230978657952e+01
-6.796143730068988e-02
0.000000000000000e+00
1.139389632827834e-01
0.000000000000000e+00
-6.796143730068988e-02
-1.343732979153083e+01
0.000000000000000e+00
-1.478824785355970e-02
-5.775996163160049e-02
0.000000000000000e+00
0.000000000000000e+00
-1.284475452992947e+01
0.000000000000000e+00
0.000000000000000e+00
1.139389632827834e-01
-1.478824785355970e-02
0.000000000000000e+00
-1.241040347301479e+01

-4.769186621041819e-01
-9.303619356400431e-01
-1.559058302243514e+00
-9.290824121864600e-01
-5.835786921473129e-04
-1.159266418436018e+00
-3.770080831197964e-01
9.572841308198474e-01
-8.291498398421207e-01
-1.663667687168316e+00
-2.171951144148577e+00
1.074249956874296e+00
2.128355904665372e+00
1.074590109905394e+00
-3.485163651594458e-03
-1.160865205880651e+00
-8.344173649626901e-01
9.566621490332916e-01
-3.760919234260182e-01
1.660514988916377e+00
-1.120272634615116e-03
-1.385603731947886e+00
-2.105750177166632e-03
1.380654897976564e+00
2.115041199099945e+00

The second block of integrals is <d|Vnuc|d>, with bra and ket functions on different centers. In my testing 2.3.1 and master match to 10^-14 or so.

Thanks for reporting this!

bruneval commented 6 years ago

Dear Eduard, Thank you for your quick reply. I have not managed to get to the bottom of the problem yet. But here is where I stand right now: I need first to correct my previous post. The problem actually shows up when calculating the <p|Vnucl|d> integrals with the Gaussians sitting on two different centers. When calculating the same integrals but swaping the bra and the ket, both versions yield the correct result. I understood that there is no limitation about the angular momentum ordering for the one-body terms in LIBINT (as opposed to the ERI). Am I correct? Did you change something about this in the last version?

Then, I compiled your piece of C++ code and modified it so to calculate a <p|Vnucl | d> shell pair. And the result comes out correct, irrespective to the LIBINT version! My bet is that the error/problem comes from my calling sequence to LIBINT. However I am using the old fashioned way of calling it. My code MOLGW is in Fortran, where I inserted a few C++ routines to call LIBINT. I do not know what to test next... In case you have a minute, here is my routine https://github.com/bruneval/molgw/blob/master/src/libint_onebody.cc

Thank you for your help. Fabien

evaleev commented 6 years ago

Fabien:

as of ec798e38eeedc605ad5989120b5976d07765c445 electrostatic potentials are evaluated with HRR. Unfortunately to be able to use the fused multiply add operations it is necessary to provide AB_i for leftward (ket to bra) HRR and and BA_i for rightward (bra to ket) HRR. The following modifications fix the problem:

      double AB_x = B[0] - A[0];
      double AB_y = B[1] - A[1];
      double AB_z = B[2] - A[2];
      if (amA >= amB) {
        int12->AB_x[0] = AB_x;
        int12->AB_y[0] = AB_y;
        int12->AB_z[0] = AB_z;
      }
      else {
        int12->BA_x[0] = -AB_x;
        int12->BA_y[0] = -AB_y;
        int12->BA_z[0] = -AB_z;
      }
      int12->PA_x[0] = A[0] - P[0] ;
      int12->PA_y[0] = A[1] - P[1] ;
      int12->PA_z[0] = A[2] - P[2] ;
      int12->PB_x[0] = B[0] - P[0] ;
      int12->PB_y[0] = B[1] - P[1] ;
      int12->PB_z[0] = B[2] - P[2] ;
      int12->PC_x[0] = C[0] - P[0] ;
      int12->PC_y[0] = C[1] - P[1] ;
      int12->PC_z[0] = C[2] - P[2] ;

      ab2 =  AB_x * AB_x + AB_y * AB_y + AB_z * AB_z;

      int12->_0_Overlap_0_x[0] = sqrt( M_PI / alphaP ) * cA[icontrdepthA] * cB[icontrdepthB] * pow(-1,am)
          * exp( - alphaA[icontrdepthA] * alphaB[icontrdepthB] * AB_x * AB_x / alphaP );
      int12->_0_Overlap_0_y[0] = sqrt( M_PI / alphaP )
          * exp( - alphaA[icontrdepthA] * alphaB[icontrdepthB] * AB_y * AB_y / alphaP );
      int12->_0_Overlap_0_z[0] = sqrt( M_PI / alphaP )
          * exp( - alphaA[icontrdepthA] * alphaB[icontrdepthB] * AB_z * AB_z / alphaP );

Sorry about that ... not really sure how to avoid such issues in the future (other than use Engine/high-level C++ API).

Thanks for using libint!

p.s. I noted that your geometric prefactors differ from their semantic meaning by -1 (AB should be A - B, PA = P - A). It may cause weird phase issues down the road for integrals without inversion symmetry.

bruneval commented 6 years ago

Eduard, Maybe one day, I'll be able to use high-level API from C++. But until this, I'd like to thank you for solving this issue in my part.

Thank you also for the PS note. This solves an issue that I had swept under the carpet a long time: my coding had to include an extra phase factor in order to get the correct results. Now it's all clean. Best,

Fabien

evaleev commented 6 years ago

Fabien:

The changes needed are simple ... but do introduce dependence on Eigen and C++11. Your current code for electrostatic potential ints will look something along the lines of this:

/// needs to be defined once ... should return a vector of Engines if want to multithread
Engine& libint_engine() {
  static Engine engine_(Operator::overlap, 1, 0);
  return engine_;
}

extern "C" {
void libint_elecpot(int amA, int contrdepthA , double A [] , double alphaA [], double cA [],
                    int amB, int contrdepthB , double B [] , double alphaB [], double cB [],
                    double C [],
                    double elecpotAB [] ) {

  auto& engine = libint_engine();
  engine.set(Operator::nuclear)
      .set_max_nprim(std::max(contrdepthA,contrdepthB))
      .set_max_l(std::max(amA,amB))
      .set_params( std::vector<std::pair<double, std::array<double, 3>>>{std::make_pair(1.0, std::array<double, 3>{C[0], C[1], C[2]})} );

  engine.compute(Shell(std::vector<double>(alphaA, alphaA + contrdepthA), {{amA, false, std::vector<double>(cA, cA + contrdepthA)}}, {A[0], A[1], A[2]}),
                 Shell(std::vector<double>(alphaB, alphaB + contrdepthB), {{amB, false, std::vector<double>(cB, cB + contrdepthB)}}, {B[0], B[1], B[2]}));

  auto* ints = engine.results()[0];
  assert(ints != nullptr);
  for( int i12=0; i12 < INT_NCART(amA) * INT_NCART(amB) ; ++i12 ) {
    elecpotAB[i12] = ints[i12] ;
  }
}
}

This allows also use of solid harmonic Gaussians, etc. Also avoids the inefficient malloc/free of Libint_*_t per every shell pair. Etc.

Best regards,

Ed.

bruneval commented 6 years ago

Thank you Eduard. I confess that your coding with engines is very compact and elegant. Then the Boys function calculation will be performed with your subroutines instead of mine. And I was wondering about the importance of the time spent in the Boys function over the total time dedicated to the integrals. Do you have an experience about this? Many thanks and sorry for harassing you with questions. Fabien

evaleev commented 6 years ago

No problem. Boys function evaluation dominates for low-quanta Gaussians, but for basis sets with d and higher shells recurrence starts to dominate. The Boys function engine in libint is reasonably well optimized for x86 hardware (within reason) ... if you think your Boys engine if faster I'd appreciate a pull request (or even can work myself to integrate it).

bruneval commented 6 years ago

I was thinking quite the opposite: it's likely that your Boys function implementation is more efficient than mine... I'll try to perform some tests and let you know. Thanks!