Errors when running UHF AFM states with density fitting and minimal fragments

cjcscott commented 1 year ago

Hello! This is an issue for discussion of the test failures in #69; I didn't want to confuse the discussion there so thought a different place to discuss would be useful. I was going to add a PR for discussion and a fix, but I'm currently unsure of the best way to nicely fix this.

The test failures are interesting; it's the test from the spin-symmetry-broken UHF fix I merged a while back but it didn't come up till now due to the tests not running for all PRs at that point. I can't reproduce it on my local or zombie python installs (aka all tests pass for me), both of which used the setup.py for all dependencies and seem to be within what we support, but @abhishekkhedkar09 has successfully reproduced it.

It looks to be an error coming from within the pyscf UCCSD ao2mo functionality when called within a UCISD calculation to generate an initial UFCI guess. The explicit error from the logs is

>           Lvv[blk] = lib.pack_tril(Lpq[:,va,va].reshape(-1,nvira,nvira))
E           ValueError: cannot reshape array of size 0 into shape (0,0)

../../../.local/lib/python3.7/site-packages/pyscf/cc/uccsd.py:1004: ValueError

and the relevant end of the stack trace is

vayesta/solver/fci.py:256: in get_cisd_init_guess
    return super().get_cisd_init_guess()
vayesta/solver/fci.py:102: in get_cisd_init_guess
    cisd.kernel()
vayesta/solver/cisd.py:22: in kernel
    if eris is None: eris = self.get_eris()
vayesta/solver/ccsd.py:91: in get_eris
    self.eris = self.base.get_eris_object(self.solver)
vayesta/core/util.py:439: in wrapped
    res = func(self, *args, **kwargs)
vayesta/core/qemb/uqemb.py:196: in get_eris_object
    eris = postscf_ao2mo(postscf, fock=fock, mo_energy=mo_energy, e_hf=e_hf)
vayesta/core/ao2mo/postscf_ao2mo.py:69: in postscf_ao2mo
    eris = postscf.ao2mo(mo_coeff)
../../../.local/lib/python3.7/site-packages/pyscf/ci/ucisd.py:947: in ao2mo
    return uccsd._make_df_eris_outcore(self, mo_coeff)

This looks to arise when inferring one shape dimension when another is specified to be zero size for an array of zero size, which makes sense since the obvious approach to calculate this would result in zero divided by zero and that index of the shape doesn't actually change the size of resulting matrix. As a minimal example consider the following:

In [1]: import numpy as np

In [2]: t = np.zeros((0,0))

In [3]: t.reshape(0,0)
Out[3]: array([], shape=(0, 0), dtype=float64)

In [4]: t.reshape(0,10)
Out[4]: array([], shape=(0, 10), dtype=float64)

In [5]: t.reshape(-1,10)
Out[5]: array([], shape=(0, 10), dtype=float64)

In [6]: t.reshape(-1,0)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-133516b31434> in <module>
----> 1 t.reshape(-1,0)

ValueError: cannot reshape array of size 0 into shape (0)

The neatest solution would be changing the call in pyscf to > Lvv[blk] = lib.pack_tril(Lpq[:,va,va].reshape(Lpq.shape[0],nvira,nvira)) but given this is very edge case I don't know if that's likely to be accepted. Otherwise I guess catching this error in this edge case might be an OK modification? However, I wanted to discuss with others first.

My initial guess on why I can't reproduce it is that I have a .pyscf_conv.py set up to set a very large maximum memory, which might mean that only incore eris are used and so if the issue is within uccsd._make_df_eris_outcore I don't ever encounter it. However, initial tests with Abhi suggest he still gets failure with this modification so who knows.

Anyway, any help on what a nice fix would look like would be appreciated!

obackhouse commented 1 year ago

-1 in a reshape is kinda lazy programming that everyone is guilty of, is there a use case outside of embedding for UCCSD with zero alpha virtual orbitals? If so then a PR to PySCF would be advisable, otherwise I think the best fix would be to not allow out-of-core DF when there are zero virtuals in one of the spin channels

obackhouse commented 1 year ago

and the bug will be platform dependent because switching to the out-of-core depends on current memory usage on your machine as well as mol.max_memory

maxnus commented 1 year ago

I stumbled upon this before ... in my opinion, this could even be fixed in NumPy, by deriving -1 in 0-sized arrays as follows: 1) Take a subarray by removing all 0-sized axes 2) Get a sub-target-shape by removing all 0's from the target shape 3) Determine the extent of the -1 axis from the size of the subarray and the sub-target-shape in the usual way for non-0-sized arrays.

Finally, if all axes are 0-sized, take the -1-axis to be 0-sized as well.

This would extend the scope of the -1 notation. But unlikely this would get accepted quickly, so probably make a PR to PySCF :grimacing:

cjcscott commented 1 year ago

-1 in a reshape is kinda lazy programming that everyone is guilty of

Definitely- I've used it in a few places where I hadn't considered the possibility of an error like this, which is why I was slightly surprised by it!

is there a use case outside of embedding for UCCSD with zero alpha virtual orbitals?

Hypothetically you can specify an active space for a UCCSD calculation which freezes all occupied or virtual orbitals in the system, and I don't think there's an explicit disclaimer that this won't work, but it feels a little contrived. Also, there are further issues that crop up after fixing this initial error in pyscf, so I think you would just have to ensure that all such integrals are computed incore.

For now, I've settled for removing the initial guess from the test- PR just about to be up.

cjcscott commented 1 year ago

After a few iterations and other issues brought up by different approaches to this issue, I've moved to just using explicit ERIs for this test since the issues raised by this are specific to the density fitted ERI routines. I'm going to leave this issue up unless anyone disagrees since this will come up again if we try to run large-scale Hubbard models with density fitting and single-site fragments; the obvious solution there is to just use two-site impurities.

BoothGroup / Vayesta

Errors when running UHF AFM states with density fitting and minimal fragments #83