theochem / iodata

Python library for reading, writing, and converting computational chemistry file formats and generating input files.
https://iodata.readthedocs.io/
GNU Lesser General Public License v3.0
130 stars 46 forks source link

2-electron reduced density matrices #281

Open PaulWAyers opened 2 years ago

PaulWAyers commented 2 years ago

We should support 2-electron reduced density matrices from PySCF and PyCI.

Some (very old) work on 2-electron reduced density matrices was done by @matt-chan it seems: https://github.com/theochem/iodata/pull/23

It would be useful to be able to include 2DMs (total and spin-resolved) in IOData, however, for interfacing with CI-postprocessing methods and ChemTools DM analysis tools. In the near(ish) term it would be helpful for the intracule/extracule stuff being looked at by @Ali-Tehrani and the EOM stuff being looked at by @gabrielasd .

There is already an object for this, https://github.com/theochem/iodata/blob/234ab7df57c8a141376fe83b674211b8bbe2b757/iodata/iodata.py#L165

So the thing that is basically necessary is the ability to fill this field from PySCF and PyCI (and potentially other things). This might almost be an IOData script or it could entail more thorough support for PySCF and PyCI more generally.

It might also be nice to support GQCP, which includes density matrices: https://github.com/GQCG/GQCP/blob/3e6fecc6589d6e971c7ff1062dd1b5cbd58431b8/gqcp/include/DensityMatrix/Orbital2DM.hpp https://github.com/GQCG/GQCP/blob/54e6c178ff66716232105bb2aa8029fe4091a3bb/gqcp/include/DensityMatrix/G2DM.hpp

(@msricher might find these implementations especially interesting).

We should also be able to directly support two_rdms with MO basis. A name like two_rdms_mo would work fine.

In some sense, one thing we (may) need is to basically have wrappers for calculations that use Pythonic codes (PySCF, PyCI, GQCPy) which store the output as iodata objects.

  1. We need some new functionality for 1- and 2-electron integrals and reduced density matrices in AO and MO basis sets, plus perhaps interconversion between them. This is an update to the "core" of iodata. The MO-AO and AO-MO transformations are supported in gbasis but I would not add a gbasis dependence but just use the same (very short) code snippets.
  2. Separate: we should have some example scripts/wrappers that show how to use IOData to run PySCF, PyCI, GQCPy, Psi4Numpy, etc. etc. etc. and store/save the data as IOData objects. (Psi4Numpy a lot more boring since it already uses a form of JSON that, at least for now, we support.) This probably belongs in a separate \scripts\ folder since it is not core functionality of iodata and will break whenever these codes make API-breaking changes.
msricher commented 2 years ago

Hi, I am working on this here: https://github.com/msricher/iodata/tree/rdms_mo Do you (@PaulWAyers) have an example of the transformation between AO and MO bases?

PaulWAyers commented 2 years ago

I think there is a little code in gbasis. @gabrielasd might know where it is. There might also be code directly in Thomas's linear response code, which we ahve a fork of in Ayerslab. https://github.com/AyersLab/LinearResponse

PaulWAyers commented 12 months ago

I think that @msricher @RichRick1 and @gabrielasd have worked on this. I'm going to tentatively assign the issue to them.

msricher commented 12 months ago

So I should wrap PySCF, PyCI etc. in order to output an IOData object with the two_rdms["postscf[spin_]mo"] field filled? Where should this go?

PaulWAyers commented 12 months ago

I think that's a good strategy. I'd ask @leila-pujal and @FarnazH what they think the right approach is, since @leila-pujal is working on the PySCF <-> IOData interface, and the same strategy should be used here (and for PyCI, etc..)

@gabrielasd did you get DMs from DICE too? That could also be interesting to include.

gabrielasd commented 12 months ago

Yes, we can get them from DICE, although I'm basically using PySCF as its interface, so it would be supporting it in the same PySCF wrapper. But we could also make a wrapper for DICE alone based on their examples files.

leila-pujal commented 12 months ago

@msricher, I had the same doubt too when I was thinking about what direction to go when interfacing with PySCF for both IOData/Chemtools, because IOData is based on reading files and with PySCF you should access the object. I thought the way would be to write the wrapper in a similar to the one in Gbasis but more complete, because if I remember correctly it only reads basis information but not molecular coefficients. That's what I did for interfacing Chemtools with PySCF (https://github.com/QC-Devs/chemtools/pull/5). @msricher, I know you have extensive experience with PySCF and you may even have some code written already but let me know if you need help dealing with PySCF outside the reduced density matrices.

msricher commented 12 months ago

IOData can already store molecular orbitals. I assume this includes the coefficients (e.g. from a Hartree-Fock computation)? To store (spin-resolved) 1- and 2- RDMs, we need to store the coefficient matrix C (if not already supported), as well as the RDM arrays themselves.

RDMs are large. I would recommend storing an indicator of whether the MOs are Restricted, Unrestricted, or Generalized. Then we can, for example, with Restricted MOs, only store the (αα, ββ) 1-(spin-)RDM blocks and the (αααα, ββββ, αβαβ) 2-(spin-)RDM blocks. Similar but with less symmetry for Unrestricted, and for Generalized we need to store the entire (N, N) and (N, N, N, N) RDM arrays.

PyCI outputs restricted and generalized RDMs, so it'll be easy to implement.

I'm not sure where PySCF can output 2-RDMs... what should the wrapper do, specifically?

Please comment on my proposed method, let me know how you'd like it done specifically, and I will generate PyCI bindings. If we're happy with those I'll look into PySCF. PyCI, PySCF, DICE, other Python codes all give RDMs as numpy arrays, so is a specific wrapper for each code really necessary? I don't know what you want, so please advise.

leila-pujal commented 12 months ago

For your first and second question, I guess the wrapper should deal with IOData arguments, but I am not sure if IOData is ready to store 2-RDMs, that's what needs to be checked. For your last question, I guess we may only need one code as long as conventions for this numpy arrays are well established.

msricher commented 12 months ago

Ok. The convention I gave above isn't standard, and it seems there's no convention. Let's use my convention, then. PyCI already supports it, and everything else can be made to support it.

Restricted Spin RDM aa, bb aaaa, bbbb, abab

Unrestricted Spin RDM aa, bb aaaa, bbbb, abab, baba (not sure about this)

Generalized Spin RDM nn nnnn

Restricted/unrestricted/generalized spatial RDM nn nnnn

So arrays of this type can be put into the two_rdms["post_scf_mo"] dict. Something like: { "aa": np.ndarray, "bb": np.ndarray, "aaaa": np.ndarray, "bbbb": np.ndarray, "abab": np.ndarray}. Maybe there should also be a type field with entries ∈ {"restricted", "unrestricted", "generalized"}, unless that's already there in the MolecularOrbitals field/class instance/whatever(?).

And should the wrappers to other Python libraries go here in the IOData repo somewhere? The gbasis wrapper is in gbasis....

leila-pujal commented 12 months ago

If that convention works for you right now I don't see a problem with it. I know PySCF follows a specific convention but I don't remember which one. IOData has a kind property in the MolecularOrbitals class with types; "restricted", "unrestricted" and "generalized". Generalized is not supported by IOData at the moment. I feel for now we only need to worry about the wrapper between IOData and PySCF. As I mentioned before, I think Gbasis only deals with the mol object from PySCF, which only contains the basis and molecular information but does not deal with coefficients as you'll need the calculation object from PySCF.

msricher commented 12 months ago

OK, can we have a meeting?

leila-pujal commented 12 months ago

Yes, I just messaged you on Teams (Queen's)

gabrielasd commented 12 months ago

I'm not sure where PySCF can output 2-RDMs... what should the wrapper do, specifically? Yes, PySCF can give 2-RDMs (https://pyscf.org/user/ci.html#id3), although the spin-resolved one may not be available for all their wfn methods.