BoothGroup / Vayesta

A Python package for wave function-based quantum embedding
Apache License 2.0
32 stars 7 forks source link

SCMF test failures #91

Closed cjcscott closed 1 year ago

cjcscott commented 1 year ago

I've been getting some weird behaviour in the SCMF tests on master. I get consistent failures which obviously aren't replicated on the CI- specifically, when running test_scmf.py I get

============================================================================================= short test summary info =============================================================================================
FAILED vayesta/tests/core/scmf/test_scmf.py::SCMF_Test::test_brueckner - AttributeError: 'EWF' object has no attribute 'diis'
FAILED vayesta/tests/core/scmf/test_scmf.py::SCMF_UHF_Test::test_brueckner - AttributeError: 'UEWF' object has no attribute 'diis'
===================================================================================== 2 failed, 2 passed, 3 warnings in 3.36s =====================================================================================

These trace back to the SCMF kernel, l79 of vayesta/core/scmf/scmf.py

        diis = (self.get_diis() if self.diis else None)

What seems to be happening here is that since the SCMF kernel is called via the command (in qemb.py, with self as the Embedding class)

        self.kernel = self.with_scmf.kernel.__get__(self)

which seems to result in calling with_scmf.kernel but with self equal to the embedding object (I'm not hugely familiar with the expected behaviour here, maybe @maxnus knows?). I'm basing this off the result of running a calculation on the branch debug_scmf, which adds logging at the start of SCMF.kernel

        self.log.warning("In SCMF, self is: %s", str(type(self)))
        self.log.warning(f"self.with_scmf.diis: {self.with_scmf.diis}")
        self.log.warning(f"self.diis: {self.diis}")

and gives me the result when running any SCMF calculation of

[WARNING] |  In SCMF, self is: <class 'vayesta.ewf.ewf.EWF'>
[WARNING] |  self.with_scmf.diis: True
EWF(mf: DensityFitting-RHF object of <class 'pyscf.df.df_jk.density_fit.<locals>.DensityFitting'>, solver: 'CCSD') <vayesta.core.scmf.pdmet.PDMET_RHF object at 0x7f710d6c9310>
True
Traceback (most recent call last):
  File "/home/charlie/calc/bugfix/scmf/test.py", line 24, in <module>
    rewf.kernel()
  File "/home/charlie/code/Vayesta/vayesta/core/scmf/scmf.py", line 81, in kernel
    self.log.warning(f"self.diis: {self.diis}")
                                   ^^^^^^^^^
AttributeError: 'EWF' object has no attribute 'diis'

If this is what's going on, an obvious fix is just to ensure that all statements within the SCMF kernels correctly anticipate that self is an EWF object. However, I have no idea why this isn't showing up for me in the pDMET tests or just in general- @abhishekkhedkar09 can't reproduce this behaviour on that branch. I'm using python 3.11, which might have changed some behaviour affecting this maybe?

obackhouse commented 1 year ago

assuming it's a change to the behaviour of this:

https://github.com/BoothGroup/Vayesta/blob/be227313c17c10d183e3dc9c7a522099cc426b6a/vayesta/core/qemb/qemb.py#L1634C1-L1635

cjcscott commented 1 year ago

Sorry, I should have referred to there- that's where the expression

        self.kernel = self.with_scmf.kernel.__get__(self)

which I think is problematic is.

Exactly the same expression is used in the pDMET code, so something that was confusing me was how those tests were passing. This was answered when I noticed this https://github.com/BoothGroup/Vayesta/blob/be227313c17c10d183e3dc9c7a522099cc426b6a/vayesta/tests/core/scmf/test_scmf.py#L32C5-L35 ; the pDMET test just returns without running anything, so will always pass.

We should either remove the test entirely or fix whatever issue made this necessary in the first place; from the look of it this shortcircuit of the test was added in the same commit as the rest of the test, so I guess we've never actually been testing the pDMET functionality?

Also, I don't understand the use of __get__(self) in the first place; if we want self to be the SCMF object within SCMF.kernel then replacing the above with

self.kernel = self.with_scmf.kernel

accomplishes that while being (I would argue) much clearer, and avoids this issue in testing. If we want self within SCMF.kernel to be the Embedding object having SCMF objects as subclasses of Embedding (which could be initialised similarly and use multiple inheritance to combine with EWF, DMET etc) seems a more natural solution, though definitely harder to implement.

cjcscott commented 1 year ago

Running a few more tests I can confirm that the issue here is python 3.11 related, and that the above fixes these test failures. Our options are either to fix this behaviour, or change our requirements to not support python 3.11. Pyscf already supports 3.11 since v.2.2.1, so updating might actually be surprisingly straightforward (as shown by these being the only test failures I encountered).

The main issues from what @obackhouse has said are for pybind11/ctypes based code, but I think our main (possibly only?) such code is kao2gmo which we already have a pure python driver for. We could just change this to use the python driver if 3.11 is being used as a fix for now- if this seems reasonable after some discussion I'd be happy to implemente a quick PR doing this along with the above change to SCMF.

maxnus commented 1 year ago

So self.method = func.__get__(self) binds function func to the instance. self.method = func would just set it as an attribute, which happens to be callable. A bound method will receive the class instance implicitly in a call, whereas a callable attribute would not. In this case, however, with_scmf.kernel is already a bound method, just bound to a different class entirely - we are rebinding it to our embedding object and I am somehow astonished that the whole thing worked in the first place. I would expect exactly the error that Python 3.11 throws, but why did this not happen in previous versions?

As a side note, this whole dynamic patching implementation was just a quick and dirty way to get something working, but it's neither elegant nor robust. Embedding and SCMF both know about each other, which is a bad sign. The whole thing should probably be implemented via delegation or inheritance pattern.

cjcscott commented 1 year ago

I had wondered what on earth that code was doing, as the only reason I could see for the presence of self in the assignment was trying to rebind it from with_scmf to Embedding. Python 3.11 made a lot of changes under the hood, so I'm guessing __get__ respecting passage of the full signature of bound functions (aka requiring specification of self and passing this along) is one of them. A quick test trying __get__() in this call returns an error supporting this.

So, given that the current code is written to expect self to remain as an SCMF object and pyscf uses callable attributes the whole time I don't see an issue for now with just using self.method = func. It's better than the current situation as even if inelegant it should at least work on it's own terms.

A reimplementation at some point sounds plausibly essential if we want to support SCMF in any robust sense going forward. I have some general thoughts on this and will put up an issue for discussion after some more experimentation with code structures.

I'll put together a PR to apply this patch for now.

cjcscott commented 1 year ago

While fixing this, I checked for any similar syntaxes which might have broken and came across https://github.com/BoothGroup/Vayesta/blob/be227313c17c10d183e3dc9c7a522099cc426b6a/vayesta/core/qemb/qemb.py#L228-L235 in Embedding.__init__; do we ever use this overwrite functionality @maxnus? As it seems to be untested and, if the SCMF binding is dodgy, to also be a bit dodgy.

cjcscott commented 1 year ago

This was fixed in #99 so closing issue; a full rewrite is out of scope for now.

Given the overwrite functionality seems unused and untested I'll mention it in #95.