susilehtola opened 1 year ago
This, in fact, brings up two points that require careful consideration and a likely refactor of the input interface:
Practically, these changes are easy to make, but getting the interface right (w/o replicating a ton of code) will likely be a bit challenging. N.B. #76 also requires some thought about how we specify inputs, so it probably warrants a full refactor of the interface.
As far as I understand, GauXC currently operates by evaluating the electron density on the grid from the density matrix:
$$ n({\bf r}) = \sum_{\mu \nu} P_{\mu \nu} \chi_\mu({\bf r}) \chi_\nu ({\bf r}) $$
This approach gets high FLOPs, since it can be formulated with efficient intermediates: first compute the matrix multiplication $p_\mu({\bf r}) = \sum_\nu P_{\mu \nu} \chi_\nu({\bf r})$ and then get the electron density from $n({\bf r}) = \sum_\mu p_\mu ({\bf r}) \chi_\mu({\bf r})$ with $N_\text{AO}^2 N_\text{grid}$ operations.
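The density-matrix route can be sketched in a few lines of NumPy; the array shapes and random data here are stand-ins for illustration, not GauXC's actual API:

```python
import numpy as np

rng = np.random.default_rng(0)
n_ao, n_grid = 30, 200

# chi[mu, g]: value of AO basis function mu at grid point g (random stand-in).
chi = rng.standard_normal((n_ao, n_grid))
# Symmetric density matrix P (random stand-in for a converged SCF density).
P = rng.standard_normal((n_ao, n_ao))
P = P + P.T

# Intermediate p_mu(r) = sum_nu P_{mu nu} chi_nu(r):
# a single (N_AO x N_AO) @ (N_AO x N_grid) matmul, ~ N_AO^2 * N_grid operations.
p = P @ chi
# Density n(r) = sum_mu p_mu(r) chi_mu(r): elementwise product plus a reduction.
n = np.einsum("mg,mg->g", p, chi)

# The naive contraction over both AO indices gives the same result.
n_ref = np.einsum("mn,mg,ng->g", P, chi, chi)
assert np.allclose(n, n_ref)
```

The point of the intermediate is that the expensive step is a dense matrix multiplication, which runs at high FLOP rates on both CPUs and GPUs.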
However, when you have a large basis set and few occupied orbitals (the extreme case is the Perdew-Zunger self-interaction correction, where you need to evaluate Fock matrices for individual occupied orbitals), you can compute the density faster from
$$ n({\bf r}) = \sum_i f_i |\psi_i ({\bf r})|^2 = \sum_i f_i \left|\sum_\mu C_{\mu i} \chi_\mu({\bf r})\right|^2. $$
Evaluating the orbitals takes $N_\text{MO} N_\text{AO} N_\text{grid}$ effort, which is the rate-determining step. The operation count is therefore reduced by a factor of $N_\text{MO}/N_\text{AO}$ relative to the density-matrix route. The speedup is realized in a dense basis set, where sparsity is not significant.
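The orbital-based route can be sketched the same way; since $P = C f C^T$ for the occupied orbitals, both routes must agree, which the snippet checks (again with random stand-in data, not GauXC's API):

```python
import numpy as np

rng = np.random.default_rng(1)
n_ao, n_mo, n_grid = 30, 4, 200   # few occupied orbitals, larger basis

chi = rng.standard_normal((n_ao, n_grid))   # AO values on the grid
C = rng.standard_normal((n_ao, n_mo))       # occupied MO coefficients C[mu, i]
f = np.full(n_mo, 2.0)                      # occupation numbers

# Orbital values psi_i(r) = sum_mu C_{mu i} chi_mu(r):
# N_MO * N_AO * N_grid work, the rate-determining step.
psi = C.T @ chi
# Density n(r) = sum_i f_i |psi_i(r)|^2: only N_MO * N_grid extra work.
n_orb = np.einsum("i,ig->g", f, psi**2)

# Density-matrix route for comparison: P = C diag(f) C^T, then N_AO^2 * N_grid work.
P = (C * f) @ C.T
n_dm = np.einsum("mg,mg->g", P @ chi, chi)

assert np.allclose(n_orb, n_dm)
```

For $N_\text{MO} = 4$ occupied orbitals in an $N_\text{AO} = 30$ basis, the orbital route does roughly $4/30$ of the work of the density-matrix route, matching the cost argument above.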