theochem / cgbasis

C++ library for Gaussian basis function evaluation & integrals
https://gbasis.readthedocs.io/en/latest/index.html
GNU Lesser General Public License v3.0

Only use C++ for interfacing with Libint #33

Open matt-chan opened 5 years ago

matt-chan commented 5 years ago

The C++ part is a bit beefier than it needs to be right now, especially if we transition to using PySCF for vanilla integrals and only want to keep GBasis for the exotic integrals. It's a little difficult to make modifications to the code right now because of it.

There are two parts I think we can simplify: the one-electron integrals, and the two-electron integrals.

For the 1e ints, we can probably replace the C++ implementation entirely with Python (and little snippets of Cython). I think this shouldn't be too much slower.
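As a rough illustration of how small a pure-Python one-electron integral can be, here is a minimal sketch of the overlap between two unnormalized s-type primitive Gaussians, using the Gaussian product theorem. The function name `overlap_ss` and its signature are made up for this example and are not part of GBasis; contraction, normalization, and higher angular momentum are deliberately left out.

```python
import math


def overlap_ss(alpha, center_a, beta, center_b):
    """Overlap of two unnormalized s-type primitive Gaussians.

    exp(-alpha |r - A|^2) and exp(-beta |r - B|^2) overlap as
    (pi / p)^(3/2) * exp(-mu |A - B|^2), with p = alpha + beta
    and mu = alpha * beta / p.
    """
    p = alpha + beta
    mu = alpha * beta / p
    # Squared distance between the two centers.
    r2 = sum((a - b) ** 2 for a, b in zip(center_a, center_b))
    return (math.pi / p) ** 1.5 * math.exp(-mu * r2)


# Two primitives on the same center, and the same pair 1 bohr apart.
same_center = overlap_ss(1.0, (0.0, 0.0, 0.0), 1.0, (0.0, 0.0, 0.0))
displaced = overlap_ss(1.0, (0.0, 0.0, 0.0), 1.0, (0.0, 0.0, 1.0))
```

The displaced value is smaller than the same-center value by a factor of exp(-mu |A - B|^2), which is the only place the geometry enters for s functions.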

For 2e ints, one way we might be able to decrease the C++ load is to only use C++ to ask libint for integrals of specific shells, 4 shells (i.e. <ss|ss>) at a time. It would effectively replace the gbw code with Python, which would then drive the integral generation and also perform the cart_to_pure transformation. It won't save us from having to write code to interface with libint when we want new kernels, but at least we won't have to dig through that much C++ to do it. Also, we could (finally) switch over to the standard sub-shell ordering everyone else is using with less C++ hacking.
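To make the "4 shells at a time" idea concrete, here is a minimal sketch of the Python-side driver loop. Everything in it is an assumption for illustration: `compute_quartet` is a hypothetical callable standing in for the thin C++/libint wrapper, and it is assumed to return a flat, row-major buffer for one shell quartet. The cart_to_pure step and permutational symmetry are omitted.

```python
from itertools import product


def build_eri(shell_sizes, compute_quartet):
    """Assemble all two-electron integrals from per-quartet blocks.

    shell_sizes[a] is the number of basis functions in shell a.
    compute_quartet(a, b, c, d) is a hypothetical backend call that
    returns a flat, row-major list of integrals for that quartet.
    """
    # Offset of each shell's first basis function in the full basis.
    offsets = []
    nbasis = 0
    for size in shell_sizes:
        offsets.append(nbasis)
        nbasis += size

    eri = {}
    nshell = len(shell_sizes)
    for a, b, c, d in product(range(nshell), repeat=4):
        block = compute_quartet(a, b, c, d)
        # product() varies the last index fastest, i.e. row-major order.
        local = product(
            range(shell_sizes[a]),
            range(shell_sizes[b]),
            range(shell_sizes[c]),
            range(shell_sizes[d]),
        )
        for value, (i, j, k, l) in zip(block, local):
            key = (offsets[a] + i, offsets[b] + j, offsets[c] + k, offsets[d] + l)
            eri[key] = value
    return eri, nbasis


def fake_quartet(a, b, c, d, sizes=(1, 3)):
    """Fake backend: each value is its own position in the flat buffer."""
    count = sizes[a] * sizes[b] * sizes[c] * sizes[d]
    return [float(n) for n in range(count)]


eri, nbasis = build_eri([1, 3], fake_quartet)
```

A dict keyed by basis-function indices keeps the sketch dependency-free; a real implementation would presumably fill a NumPy array instead.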

An added benefit of writing code this way is that we can ask for integrals 1 set of shells at a time. This is useful for direct-SCF/CI algorithms and makes our memory issues largely moot (there still is probably some use for Cholesky/DF etc).
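A minimal sketch of what this buys a direct-SCF style algorithm: each quartet block is contracted with the density matrix as soon as it is computed and then discarded, so the full four-index tensor never exists in memory. As before, `compute_quartet` is a hypothetical stand-in for the libint wrapper (assumed to return a flat, row-major buffer), and screening and symmetry are left out.

```python
from itertools import product


def coulomb_direct(shell_sizes, density, compute_quartet):
    """Build J[i][j] = sum_kl (ij|kl) D[k][l] one shell quartet at a time."""
    offsets = []
    nbasis = 0
    for size in shell_sizes:
        offsets.append(nbasis)
        nbasis += size

    coulomb = [[0.0] * nbasis for _ in range(nbasis)]
    nshell = len(shell_sizes)
    for a, b, c, d in product(range(nshell), repeat=4):
        # Only this one block is held in memory at any time.
        block = compute_quartet(a, b, c, d)
        local = product(
            range(shell_sizes[a]),
            range(shell_sizes[b]),
            range(shell_sizes[c]),
            range(shell_sizes[d]),
        )
        for value, (i, j, k, l) in zip(block, local):
            row, col = offsets[a] + i, offsets[b] + j
            coulomb[row][col] += value * density[offsets[c] + k][offsets[d] + l]
    return coulomb


sizes = [1, 2]


def unit_quartet(a, b, c, d):
    """Fake backend: every integral in the block equals 1."""
    return [1.0] * (sizes[a] * sizes[b] * sizes[c] * sizes[d])


density = [[1.0] * 3 for _ in range(3)]
coulomb = coulomb_direct(sizes, density, unit_quartet)
```

With all integrals and all density elements set to 1, every element of the result is simply the number of (k, l) pairs, which makes the fake backend a convenient sanity check.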

One (fairly obvious?) thing to note about asking libint for specific sets of shells: we should probably initialize the libint C++ object only once, when the Python class is instantiated. The initialization cost on the C++ side would be too heavy otherwise.
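The "initialize once" pattern could look roughly like the sketch below. The class names, the `engine_factory` argument, and the `compute` method are all hypothetical; `FakeEngine` merely stands in for whatever wrapped C++ object ends up holding the libint state, and counts how often it is constructed.

```python
class ShellQuartetIntegrals:
    """Hypothetical Python front end that owns one long-lived backend engine."""

    def __init__(self, basis, engine_factory):
        # The expensive setup happens exactly once, at instantiation.
        self._engine = engine_factory(basis)

    def quartet(self, a, b, c, d):
        # Cheap per-call path: reuse the engine created in __init__.
        return self._engine.compute(a, b, c, d)


class FakeEngine:
    """Stand-in for a wrapped C++ engine; counts constructions."""

    created = 0

    def __init__(self, basis):
        FakeEngine.created += 1
        self.basis = basis

    def compute(self, a, b, c, d):
        return [0.0]


ints = ShellQuartetIntegrals(basis=["s", "s"], engine_factory=FakeEngine)
results = [ints.quartet(*q) for q in [(0, 0, 0, 0), (0, 1, 0, 1), (1, 1, 1, 1)]]
```

After three quartet requests the engine has still been constructed only once, which is the property that matters here.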

Another possibility to simplify the 2e integrals might be to look into the new libint C++11 interface. The documentation says you should be able to implement new operators/kernels, but I haven't seen any code yet...

tovrstra commented 5 years ago

@matt-chan Can you clarify the following please:

I'm generally in favor of moving part of the C++ code to Python, if it does not completely kill performance. We need to have an idea of the Python overhead:

If we encounter serious degradations in performance, there may still be a need to move some loops back to C++ code. In any case, the C++ code is too beefy, but this can also be solved by simplifying it. It is definitely not my best C++.
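One rough way to get a feel for the Python overhead is to count how many shell quartets a Python loop would have to visit and multiply by a measured per-call cost. The sketch below does that with the standard `timeit` module. The function names are made up for this example, the no-op call only approximates the cost of crossing into a compiled wrapper, and the timing includes a small lambda wrapper, so treat the result as an order-of-magnitude estimate only.

```python
import timeit


def unique_quartet_count(nshell):
    """Number of shell quartets (ab|cd) that are unique under 8-fold symmetry."""
    npair = nshell * (nshell + 1) // 2
    return npair * (npair + 1) // 2


def per_call_overhead(ncall=100_000):
    """Average wall time in seconds of one trivial four-argument Python call."""

    def noop(a, b, c, d):
        return None

    timer = timeit.Timer(lambda: noop(0, 1, 2, 3))
    return timer.timeit(number=ncall) / ncall


def estimated_loop_overhead(nshell, ncall=100_000):
    """Rough pure-Python call overhead in seconds for looping over all unique quartets."""
    return unique_quartet_count(nshell) * per_call_overhead(ncall)


quartets_for_10_shells = unique_quartet_count(10)
```

Comparing `estimated_loop_overhead(nshell)` with the time the compiled backend needs for the same quartets would indicate whether the loop itself is worth moving back into C++.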

P.S. There is no such thing as a standard integral ordering. PySCF and PSI4 are consistent with Wikipedia, but when you start looking at other codes, there is no general agreement on ordering. Not only does the order vary, but sign conventions can differ too: dzz in one code can be -dzz in another, etc. That said, I'm fine with switching to the Wikipedia conventions. It is the de facto standard for millennials and we need to look ahead. Another option is to edit Wikipedia, of course. :p
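Since both the order and the signs can differ, converting between two conventions amounts to a signed permutation. A minimal sketch, assuming each convention is written as a list of labels with an optional leading "-" for a flipped sign; the two conventions below are purely illustrative and are not claimed to match any particular code.

```python
def parse_convention(convention):
    """Map each bare label to its (position, sign) within a convention."""
    table = {}
    for position, label in enumerate(convention):
        sign = -1.0 if label.startswith("-") else 1.0
        table[label.lstrip("-")] = (position, sign)
    return table


def convert(values, source, target):
    """Reorder and re-sign values from the source to the target convention."""
    source_table = parse_convention(source)
    converted = [0.0] * len(target)
    for label, (position, sign) in parse_convention(target).items():
        source_position, source_sign = source_table[label]
        # Undo the source sign, then apply the target sign.
        converted[position] = sign * source_sign * values[source_position]
    return converted


# Illustrative conventions for the five pure d functions.
convention_a = ["c0", "c1", "s1", "c2", "s2"]
convention_b = ["s2", "s1", "c0", "-c1", "c2"]

in_b = convert([1.0, 2.0, 3.0, 4.0, 5.0], convention_a, convention_b)
back_in_a = convert(in_b, convention_b, convention_a)
```

Because the mapping is a signed permutation, converting there and back recovers the original values, which makes for a cheap round-trip test whenever a new convention is added.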

tovrstra commented 5 years ago

@PaulWAyers, @FarnazH and I have been discussing this point. Our conclusion was that we should split the project into a pure-Python gbasis and a C++ cgbasis with a Python wrapper:

PaulWAyers commented 5 years ago

To elaborate on what @tovrstra said, the idea is that cgbasis is "performant" and gbasis is "flexible". So some sacrifice of readability is acceptable in cgbasis (it is inevitable, though of course it should be avoided as much as possible), whereas readability/flexibility is imperative in gbasis, where some performance compromises are inevitable (but, again, should be avoided as much as possible).