BoothGroup / Vayesta

A Python package for wave function-based quantum embedding
Apache License 2.0

Solver rewrite #66

Closed cjcscott closed 1 year ago

cjcscott commented 1 year ago

This is the long-awaited rewrite of all cluster solvers to use a single shared interface for generating all integrals within the cluster. The major change in this PR is the introduction of a new class of solvers, under the temporary folder vayesta/solver_rewrite/ (to be moved to vayesta/solver/ prior to merging), which use a new set of intermediate ClusterHamiltonian classes to generate all information about the cluster within a calculation, without direct reference to any fragments.

This can take the form of either an appropriate effective pyscf mean-field or the base integrals themselves; in either case all information is expressed purely within the active space of a given cluster, avoiding any reference to the environmental orbitals. Compared to our previous approaches this allows much more code reuse, and so much more streamlined code within the new solvers themselves.

As an example, a new solver which takes as input a pyscf mean-field object could be written as

from somewhere import solverclass

class NewSolver(ClusterSolver):
    def kernel(self):
        mfclus, frozen = self.hamil.to_pyscf_mf(allow_dummy_orbs=True)
        mysolver = solverclass(mfclus, frozen=frozen)
        mysolver.kernel()
        self.wf = ...  # Whatever's required to set up a wavefunction

class UNewSolver(UClusterSolver, NewSolver):
    pass

# From this point only if the approach supports coupled electron-boson clusters.
class EB_NewSolver(ClusterSolver):
    def kernel(self):
        mfclus, frozen = self.hamil.to_pyscf_mf(allow_dummy_orbs=True)
        mysolver = solverclass(mfclus, couplings=self.hamil.couplings, bos_freqs=self.hamil.bos_freqs)
        mysolver.kernel()
        self.wf = ...  # Whatever's required to set up a wavefunction

class EB_UNewSolver(UClusterSolver, EB_NewSolver):
    pass

Here, the allow_dummy_orbs keyword determines whether dummy virtual orbitals can be included in the mean-field object, to then be frozen without affecting the calculation, in the case that the spin channels have unequal numbers of orbitals. The indices of any such orbitals are returned in frozen, which is otherwise None. If this parameter is set to False, a NotImplementedError is raised whenever dummy orbitals would be required; the fix in that case would be for the solver in question to support freezing specified orbitals.

If the one- and two-body integrals can be used directly, this would become

from somewhere import solverclass

class NewSolver(ClusterSolver):
    def kernel(self):
        h1, eris = self.hamil.get_integrals(with_vext=True)

        # If this approach still requires equal numbers of orbitals in both spin
        # channels this check can be added; it's currently automatic when using an
        # intermediate mean-field object.
        # self.hamil.assert_equal_spin_channels()

        mysolver = solverclass(h1, eris)
        mysolver.kernel()
        self.wf = ...  # Whatever's required to set up a wavefunction

class UNewSolver(UClusterSolver, NewSolver):
    pass

...

The new solvers can be obtained via a call to self.get_solver(solver) within all subclasses of qemb.fragment. This automatically generates a Hamiltonian for the current cluster (appropriately spin (un)restricted, and purely fermionic or coupled electron-boson), then obtains the required solver class. An equivalent call to self.check_solver(solver) uses similar code to identify whether the current fragment-solver configuration is supported, without any overhead from generating the Hamiltonian, allowing us to avoid long lists of valid solvers maintained on a per-embedding-method basis. Obviously, there could still be issues with support for different functionality within ewf for different solvers, so I might need to add some more wavefunction conversions, but this is hopefully broadly reasonable.
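
As a rough sketch of the intended call pattern from within a fragment, in the same schematic spirit as the solver examples above (the class name, the solver-string handling, and the assumption that get_solver returns a ready-to-run solver instance are all illustrative rather than taken verbatim from the PR):

class MyFragment(Fragment):  # some subclass of qemb.fragment
    def kernel(self, solver="CCSD"):
        # check_solver tests support for this fragment/solver combination without
        # building the cluster Hamiltonian.
        if not self.check_solver(solver):
            raise NotImplementedError(f"Solver '{solver}' not supported for this fragment")
        # get_solver builds the appropriate ClusterHamiltonian (restricted/unrestricted,
        # purely fermionic or electron-boson) and returns the matching solver for it.
        cluster_solver = self.get_solver(solver)
        cluster_solver.kernel()
        self.wf = cluster_solver.wf  # convert/store the resulting wavefunction as needed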

Implementation of solver methods currently in master:

Current limitations:

  1. Using an intermediate pyscf mf object currently requires equal numbers of active alpha and beta orbitals. This could be avoided by introducing additional dummy orbitals which are then frozen, without affecting the calculation. This currently isn't implemented, but once added we'll have the same limitation as the current methods, i.e. requiring that the solver approach supports freezing orbitals. I'd lean towards having this before merging, to avoid feature regression, but could be persuaded otherwise. I've now pushed an update which implements this, allowing treatment of arbitrary spin-symmetry-broken CAS spaces with the CISD and CCSD solvers, hopefully removing any feature regression compared to the previous implementations. I've also updated the above examples for these features.
  2. I've left couplings alone for now, but there shouldn't be any additional surprises in their implementation. I figured @maxnus would probably be best placed to set this up, but I'd be happy to get them working otherwise. I've implemented TCCSD in the new solvers, which I quite like as a demonstration of the benefits of the new approach. I don't know what's currently supported in terms of tailoring/coupling between CCSD fragments, as I can't find any obviously pertinent tests.
  3. We currently explicitly construct and store the 4-centre eris within the active space of each cluster during calculations, which could cause memory issues for very large clusters. We've previously discussed directly compressing the cderi to allow the use of RI within the cluster space, but I haven't explored this any further yet. Hopefully this isn't critical, but shout if it is. DF-MP2 and DF-RCCSD are now added, matching previous functionality.

Currently all tests not caught by the caveats above pass on my machine, while those requiring unsupported functionality obviously fail. As usual, we want all tests to pass (or a good justification for why a result has changed) before merging - please excuse the initial failures!

obackhouse commented 1 year ago

Also required before merging this:

cjcscott commented 1 year ago

One other thing to do

maxnus commented 1 year ago

That's a lot of changes! I know we talked about this before, but just to make sure we are on the same page: what does this rewrite solve / enable us to do?

Can we still do DF-CCSD/ MP2 in the cluster with this?

cjcscott commented 1 year ago

The rewrite structures solvers to not refer to the environmental orbitals at any point, while providing a single set of objects (ClusterHamiltonian) to generate all required integrals within the cluster automatically. This basically ensures different solvers aren't going through totally different procedures to achieve this, which makes implementing anything modifying cluster interactions much more straightforward to achieve, while also avoiding the N^3 memory cost of the environmental orbitals.

Can we still do DF-CCSD/ MP2 in the cluster with this?

Point 3 above was basically saying that currently no density fitting within the cluster was implemented, but courtesy of a delayed flight over the weekend I had a look, and actually getting CDERIs for use with DF-MP2 solvers was straightforward. I've also implemented compression of the auxiliary basis within the cluster to reduce peak memory costs for large clusters, but since this scales as O(N_aux N_clus^4) I've left it optional and not used by default.
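
For concreteness, one way such a compression could be done is sketched below (an illustration, not necessarily the algorithm in this PR): reshape the dense three-centre array to (naux, nclus^2), take an SVD, and discard singular values below a tolerance. For naux > nclus^2 the SVD is exactly the O(N_aux N_clus^4) step mentioned above.

import numpy as np

def compress_cderi(cderi, tol=1e-12):
    # Illustrative sketch: cderi is a dense (naux, nclus, nclus) array of
    # three-centre integrals L_P,pq within the cluster active space.
    naux, nclus, _ = cderi.shape
    mat = cderi.reshape(naux, nclus * nclus)
    # SVD over the orbital-pair index; cost ~ O(naux * nclus^4) once naux > nclus^2.
    _, s, vt = np.linalg.svd(mat, full_matrices=False)
    # Truncation relative to the largest singular value (the thresholding
    # convention here is just a choice for the sketch).
    nkeep = int(np.count_nonzero(s > tol * s[0]))
    # Rotated, truncated cderi; contracting it with itself reproduces (pq|rs)
    # to within the truncation tolerance.
    return (s[:nkeep, None] * vt[:nkeep]).reshape(nkeep, nclus, nclus)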

Actually constructing a full DensityFitting object within the cluster could be quite involved, so general DF support within the cluster is unlikely. For DF-CCSD specifically, given that it actually just constructs all ERI components as dense blocks except for vvvv within ao2mo, I'm not sure how large our benefit from density-fitting there would be, since our number of virtual orbitals shouldn't blow up as quickly as in conventional calculations. However, you would know better than me on this - does that sound reasonable?

maxnus commented 1 year ago

The rewrite structures solvers to not refer to the environmental orbitals at any point, while providing a single set of objects (ClusterHamiltonian) to generate all required integrals within the cluster automatically. This basically ensures different solvers aren't going through totally different procedures to achieve this, which makes implementing anything modifying cluster interactions much more straightforward to achieve, while also avoiding the N^3 memory cost of the environmental orbitals.

Environment orbitals could also be generated just-in-time via diagonalization, if the solver requires them, and discarded after the solver has completed, avoiding the storage. They are also somewhat important for the symmetry operations, where a check is performed that a given symmetry operation is not just valid with respect to the atomic coordinates but also with respect to the mean-field solution - checking the fragment space only might get most of the cases where the mean-field symmetry is broken, but possibly not all of them?

I've also implemented compression of the auxiliary basis within the cluster to reduce peak memory costs for large clusters, but since this scales as O(N_aux N_clus^4) I've left it optional and not used by default.

This is nice, does this work for the full (i.e. as needed for CCSD) integrals?

For DF-CCSD specifically, given that it actually just constructs all ERI components as dense blocks except for vvvv within ao2mo, I'm not sure how large our benefit from density-fitting there would be, since our number of virtual orbitals shouldn't blow up as quickly as in conventional calculations. However, you would know better than me on this - does that sound reasonable?

I added the DF-CCSD solver in the past (note that pyscf.pbc always uses ccsd.RCCSD), since I couldn't solve the diamond cluster with ~420 orbitals otherwise. 300 or so of these orbitals were virtual, so they make up most of the memory cost of the cluster integrals, as they should, subspace calculation or not.
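
(For scale, and ignoring permutational symmetry: in double precision the full four-index ERIs for ~420 orbitals are roughly 420^4 x 8 B ≈ 0.25 TB, whereas the purely occupied oooo block for the ~120 occupied orbitals is only ~120^4 x 8 B ≈ 1.7 GB, so essentially all of the storage involves virtual indices.)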

cjcscott commented 1 year ago

Environment orbitals could also be generated just-in-time via diagonalization, if the solver requires them, and discarded after the solver has completed, avoiding the storage. They are also somewhat important for the symmetry operations, where a check is performed that a given symmetry operation is not just valid with respect to the atomic coordinates but also with respect to the mean-field solution - checking the fragment space only might get most of the cases where the mean-field symmetry is broken, but possibly not all of them?

Actually getting rid of the environmental orbitals may not be possible, but this at least means we don't have to use them unnecessarily here. Tbh I'm mainly interested in the benefit of not having to reimplement screening approaches multiple times, without any straightforward way to test those implementations!

This is nice, does this work for the full (i.e. as needed for CCSD) integrals?

It didn't, but it was a pretty straightforward adjustment to add, so I've just pushed a commit doing just that. The tolerance is adjustable in the call for the CDERIs, but defaults to 1e-12, which shouldn't introduce any noticeable error.

I added the DF-CCSD solver in the past (note that pyscf.pbc always uses ccsd.RCCSD), since I couldn't solve the diamond cluster with ~420 orbitals otherwise. 300 or so of these orbitals were virtual, so they make up most of the memory cost of the cluster integrals, as they should, subspace calculation or not.

That's fair, I could take another look at including that functionality - we could probably do a straightforward subclass of RCCSD with modifications similar to those in dfccsd.py. We basically just need to replace RCCSD.ao2mo and RCCSD._add_vvvv, which isn't too bad!

maxnus commented 1 year ago

Actually getting rid of the environmental orbitals may not be possible

What makes you think so now?

but this at least means we don't have to use them unnecessarily here

One thing we should also mention is that the PySCF solvers can also work without environment orbitals - instead of CCSD(mo_coeff, mo_occ, frozen=frozen), one can use CCSD(mo_coeff[:,active], mo_occ[active], frozen=None). (This works for frozen virtual orbitals as-is, but for frozen occupied orbitals it also requires a small modification of eris.fock.)

In this respect, the removal of environment orbitals does not necessarily require the introduction of a "dummy mean-field".
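
A minimal sketch of that pattern with plain PySCF, assuming an RHF reference and dropping only high-lying virtual orbitals (the system and the active-space choice below are purely illustrative):

import numpy as np
from pyscf import gto, scf, cc

# Toy system; only the call pattern matters here.
mol = gto.M(atom="O 0 0 0; H 0 0.757 0.587; H 0 -0.757 0.587", basis="cc-pvdz")
mf = scf.RHF(mol).run()

nocc = np.count_nonzero(mf.mo_occ > 0)
active = np.arange(nocc + 10)  # all occupied orbitals plus the 10 lowest virtuals

# The CCSD object never sees the discarded ("environment") virtuals and no frozen
# list is needed. As noted above, dropping occupied orbitals this way would
# additionally require a small modification of eris.fock.
mycc = cc.CCSD(mf, mo_coeff=mf.mo_coeff[:, active], mo_occ=mf.mo_occ[active], frozen=None)
mycc.kernel()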

Tbh I'm mainly interested the benefit in terms of not having to reimplement screening approaches multiple times without straightforward ways to test the implementations!

Could you elaborate on how the removal of environment orbitals, or more generally this PR, helps you with this?

cjcscott commented 1 year ago

What makes you think so now?

As in, they'll still be needed for all the things you point out - but could be calculated on demand.

In this respect, the removal of environment orbitals does not necessarily require the introduction of a "dummy mean-field".

Sure, that's one other approach to get to the same result for CCSD. But given the abundance of different ERI objects for different methods in pyscf, we end up with a ton of modifications specific to each individual solver that are difficult to test (for inevitable bugs down the line). At the same time, you can end up having to modify quantities in the full-system basis for local calculations, which is conceptually nasty.

This is specifically where this is helpful for screened interactions. Using modified dense ERIs within the cluster in this approach is straightforward, applies to all solvers directly, and avoids interacting directly with the specifics of how any one method treats the ERIs.
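
As a rough sketch of what that pattern could look like (delta_eris below is a hypothetical dense screening correction and the opts attribute is purely illustrative; get_integrals is the interface shown in the examples above):

class ScreenedNewSolver(ClusterSolver):
    def kernel(self):
        # Dense integrals purely within the cluster active space.
        h1, eris = self.hamil.get_integrals(with_vext=True)
        # Hypothetical screening correction with the same shape as eris; how it is
        # obtained is irrelevant to the solver itself.
        eris = eris + self.opts.delta_eris
        mysolver = solverclass(h1, eris)
        mysolver.kernel()
        self.wf = ...  # Whatever's required to set up a wavefunction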

That's fair, I could take another look at including that functionality - we could probably do a straightforward subclass of RCCSD with modifications similar to those in dfccsd.py. We basically just need to replace RCCSD.ao2mo and RCCSD._add_vvvv, which isn't too bad!

Given previous support was only for DF-RCCSD this was actually straightforward to add. If applicable, the effective mean-field object now has with_df populated with a compressed _cderi array.

cjcscott commented 1 year ago

I've added TCCSD and dumping to HDF5 solvers, along with checking that CCSD initial guesses are functional in the new approach. This just leaves

cjcscott commented 1 year ago

I've fixed a few things leading to test failures

maxnus commented 1 year ago

As in, they'll still be needed for all the things you point out - but could be calculated on demand.

The symmetry check via the environment orbitals could be replaced with checking the cluster orbitals + effective 1-electron Hamiltonian or Fock matrix inside the cluster space, although this is a somewhat delayed test (only once the cluster space has been built, vs on fragment creation).

It's also not completely equivalent, since, e.g., a single He atom placed 1 m away from the molecule would break the symmetry as detected via the environment orbitals, but not as detected via the effective mean-field potential. But I guess this is fine, since a cluster with the same cluster orbitals + effective potential should still give the same result in this case, even though it's not technically a symmetry of the system.

Sure, that's one other approach to get to the same result for CCSD. But given the abundance of different ERI objects for different methods in pyscf, we end up with a ton of modifications specific to each individual solver that are difficult to test (for inevitable bugs down the line). At the same time, you can end up having to modify quantities in the full-system basis for local calculations, which is conceptually nasty.

This is specifically where this is helpful for screened interactions. Using modified dense ERIs within the cluster in this approach is straightforward, applies to all solvers directly, and avoids interacting directly with the specifics of how any one method treats the ERIs.

This is true, but the different types of ERIs also have some justification, as different amounts of ERI blocks are needed depending on the solver (e.g. MP2 vs CCSD), integral symmetry (RHF vs UHF, real vs complex), and density-fitting ("vvvv" vs "vvL"). Frequently there are also incore and outcore versions. Does this rewrite still make sure that we use the minimum required resources?

Given previous support was only for DF-RCCSD this was actually straightforward to add. If applicable, the effective mean-field object now has with_df populated with a compressed _cderi array.

OK that's nice - how many auxiliaries per cluster orbital do we generally get with the compression?

cluster tailoring - @maxnus, did I see you added tests for this to master in the last few days, so it is currently supported functionality?

We have two "tailorings" - a "TCCSD solver" and tailoring from other fragments. The second approach offers all the functionality of the former approach, but we might want to keep the former approach anyway, for its simplicity. Both approaches are working on master.

the EWF MP2 energies work with the new approach (the full_shape=True option in make_fragment_dm2cumulant was very useful).

Does this mean you construct the full cluster 2-DM for MP2 fragments? This would limit the applicability of the DM-energy functional for MP2 fragments, as for example with 300 occupied and 700 virtual orbitals the full 2-DM would require 8 TB for storage, but the occ-vir-occ-vir part only 350 GB.
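
(Assuming double precision and no use of permutational symmetry, those figures follow directly: (300 + 700)^4 x 8 B = 8 TB for the full 2-DM, versus (300 x 700)^2 x 8 B ≈ 0.35 TB for the occ-vir-occ-vir block.)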

cjcscott commented 1 year ago

This is true, but the different types of ERIs also have some justification, as different amounts of ERI blocks are needed depending on the solver (e.g. MP2 vs CCSD), integral symmetry (RHF vs UHF, real vs complex), and density-fitting ("vvvv" vs "vvL"). Frequently there are also incore and outcore versions. Does this rewrite still make sure that we use the minimum required resources?

So when using density fitting, you can now either request this when using a pyscf.RHF object, or obtain the cderis directly from the Hamiltonian regardless of spin symmetry. The former approach is used in the DF-CCSD implementation, while the latter is used in DF-MP2. The interface via a dummy mean-field falls back to eris if density fitting can't be used, and these can also be grabbed directly from the Hamiltonian.

The main shortcoming of all this at present is the inability to perform calculations with ERIs held out-of-core (with the important exception of DF-CCSD, which can generate outcore eris from the provided cderis).

OK that's nice - how many auxiliaries per cluster orbital do we generally get with the compression?

I haven't done any systematic testing, but it's system-dependent whether you get a considerable reduction beyond the dimension of the product space spanned by the original cderis - but even at that limit it can be a fair reduction.

We have two "tailorings" - a "TCCSD solver" and tailoring from other fragments. The second approach offers all the functionality of the former approach, but we might want to keep the former approach anyway, for its simplicity. Both approaches are working on master.

Yeah, the TCCSD is implemented and I can have a look at the tailoring from other fragments.

Does this mean you construct the full cluster 2-DM for MP2 fragments? This would limit the applicability of the DM-energy functional for MP2 fragments, as for example with 300 occupied and 700 virtual orbitals the full 2-DM would require 8 TB for storage, but the occ-vir-occ-vir part only 350 GB.

That's the current workaround, just to show we're getting the correct results; moving to using the Hamiltonian itself to get the ERIs for a cluster calculation will get rid of this.