Open linminhtoo opened 1 year ago
I have an update: I found this open PR (unmerged since 2018) on psi4numpy:
https://github.com/psi4/psi4numpy/pull/36/files
I adapted the code and it works, though I couldn't specify the dft_functional = "WB97X-D" parameter. I believe it doesn't matter for the initial guess.
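import numpy as np
import psi4

mol = psi4_geo  # molecule object defined earlier (not shown)
# can't specify 'scf__dft_functional': "WB97X-D" here, it is rejected as not valid
psi4.set_options({'basis': 'def2-svp',
                  'scf__reference': 'rhf',
                  # 'scf__dft_functional': "WB97X-D",
                  'e_convergence': 1e-8})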
# Integral generation from Psi4's MintsHelper
wfn = psi4.core.Wavefunction.build(mol, psi4.core.get_global_option('BASIS'))
# t = time.time()
mints = psi4.core.MintsHelper(wfn.basisset())
S = np.asarray(mints.ao_overlap())
# Get nbf and ndocc for closed shell molecules
nbf = S.shape[0]
ndocc = wfn.nalpha()
print('\nNumber of occupied orbitals: %d' % ndocc)
print('Number of basis functions: %d' % nbf)
# Set SAD basis sets
nbeta = wfn.nbeta()
psi4.core.prepare_options_for_module("SCF")
sad_basis_list = psi4.core.BasisSet.build(wfn.molecule(), "ORBITAL",
    psi4.core.get_global_option("BASIS"), puream=wfn.basisset().has_puream(),
    return_atomlist=True)
sad_fitting_list = psi4.core.BasisSet.build(wfn.molecule(), "DF_BASIS_SAD",
    psi4.core.get_option("SCF", "DF_BASIS_SAD"), puream=wfn.basisset().has_puream(),
    return_atomlist=True)
# Use Psi4 SADGuess object to build the SAD Guess
SAD = psi4.core.SADGuess.build_SAD(wfn.basisset(), sad_basis_list) # , ndocc, nbeta
SAD.set_atomic_fit_bases(sad_fitting_list)
SAD.compute_guess()
D = SAD.Da()
sad_guess_manual = D.to_array()
However, when I compare this sad_guess_manual with the density matrix from the full SCF with maxiter = 0, they are not close :(
# run full SCF but limit maxiter to 0
psi4.set_options(
{
"scf__reference": "rhf",
"scf__maxiter": 0,
"scf__fail_on_maxiter": False
}
)
energy_sad, wfn_sad = psi4.energy('scf/def2-svp', dft_functional="WB97X-D", molecule=psi4_geo, return_wfn=True)
density_mat_0iters = wfn_sad.Da().to_array()
np.isclose(sad_guess_manual, density_mat_0iters, atol=1e-5).sum() / (density_mat_0iters.shape[0] ** 2)
>> 0.08549818 # should be close to 1.00 but no :/
I think this must mean that even setting maxiter = 0 already evolves the initial guess.
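As a complement to the element-wise np.isclose fraction above, here is a minimal sketch of a few other ways to compare two density matrices. The helper name compare_densities and the toy 2x2 matrices are made up for illustration; the electron count relies on the standard identity Tr(D S) = number of electrons described by D, so both matrices should give the same count if they describe the same molecule in the same basis.

```python
import numpy as np

def compare_densities(D1, D2, S):
    """Return simple difference metrics for two density matrices (hypothetical helper)."""
    diff = D1 - D2
    return {
        "max_abs_diff": float(np.max(np.abs(diff))),
        "frobenius_diff": float(np.linalg.norm(diff)),
        # Tr(D S) counts the electrons described by each density
        "electrons_1": float(np.trace(D1 @ S)),
        "electrons_2": float(np.trace(D2 @ S)),
    }

# Toy matrices with made-up numbers (orthonormal basis, so S is the identity)
S = np.eye(2)
D_diag = np.array([[1.0, 0.0], [0.0, 0.0]])   # no off-diagonal coupling
D_mixed = np.array([[0.5, 0.5], [0.5, 0.5]])  # same electron count, different shape
report = compare_densities(D_diag, D_mixed, S)
print(report)
```

Two densities can carry the same number of electrons and still differ strongly element by element, which is the situation this kind of report makes visible.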
Note that the SAD guess in Psi4 is not the best one possible, as it is not symmetry aware. Such a SAD guess is available in PySCF, and it is also directly accessible from Python.
I have planned to rectify the situation in Psi4 in the future, but I have some other projects to finish before that.
Thanks for this very useful pointer! I will give PySCF a try. However, my main concern with PySCF is that, after getting the SAD guess from it, I would have to reorder the rows and columns of the density matrix so that it aligns with the ordering in psi4. I believe these two programs do not use the same ordering (but I'm not certain). I need to do this because most of my workflow is centred on psi4, and a large number of density matrices have already been calculated with psi4.
If you call Psi4 with maxiter=0, it builds the Fock matrix from the SAD density, diagonalizes it to give you orbitals, and then builds a new density from these orbitals.
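The diagonalize-and-rebuild step described above can be sketched in plain NumPy. This is only an illustration of the textbook procedure for a closed-shell case, not Psi4's actual implementation; the function name density_from_fock and the 2x2 numbers are made up, and a real Fock matrix would have to come from the SAD density.

```python
import numpy as np

def density_from_fock(F, S, ndocc):
    """Solve F C = S C eps and rebuild the (alpha) density from occupied orbitals."""
    # Symmetric (Loewdin) orthogonalization matrix X = S^(-1/2)
    s_vals, s_vecs = np.linalg.eigh(S)
    X = s_vecs @ np.diag(s_vals ** -0.5) @ s_vecs.T
    # Diagonalize the Fock matrix in the orthogonalized basis
    eps, C_ortho = np.linalg.eigh(X @ F @ X)
    # Back-transform to the AO basis and keep the ndocc lowest orbitals
    C_occ = (X @ C_ortho)[:, :ndocc]
    return C_occ @ C_occ.T

# Toy 2x2 example with made-up numbers (not a real molecule)
S = np.array([[1.0, 0.4], [0.4, 1.0]])
F = np.array([[-1.0, -0.5], [-0.5, -0.8]])
D = density_from_fock(F, S, ndocc=1)
electrons = np.trace(D @ S)                 # equals ndocc
is_idempotent = np.allclose(D @ S @ D, D)   # holds for a density built from orbitals
print(D, electrons, is_idempotent)
```

Because the new density is built from orbitals that mix basis functions on different centres, it generally has non-zero off-diagonal elements even when the input guess had none.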
Ah, that makes sense; this would explain why the density matrices are no longer the same. I did some visualization and figured this was the case as well (it already had non-zero off-diagonal values, which don't exist in the SAD guess). Thanks.
Would you have any thoughts/concerns on doing the SAD guessing in PySCF and then doing the row/col re-ordering?
For starters, I know how psi4 orders the different magnetic quantum numbers for each angular momentum, so I need to check PySCF's ordering. If they're different, I'll need to reorder the rows and columns. Similarly, I don't know if the two programs order the basis functions within each basis set identically (e.g. in def2-svp you have multiple sets of s and p orbitals for an element, and there could be different ways of ordering them).
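If only the order of the magnetic quantum numbers within each shell differs, the index map could be built along these lines. This is a sketch under two loud assumptions: the shells themselves appear in the same order in both programs (exactly the open question above), and the per-shell orderings in source_order and target_order are placeholder values for illustration, not verified conventions of psi4 or PySCF. The function name build_permutation is also made up.

```python
def build_permutation(shell_l, source_order, target_order):
    """Return perm where perm[i] is the source index of target basis function i.

    shell_l: angular momentum of each shell, in basis-set order
    source_order / target_order: dict mapping l to the list of m values
    in the order each program stores them (placeholders, must be checked)
    """
    perm = []
    offset = 0
    for l in shell_l:
        src = source_order[l]
        # Position of each m value inside the source shell
        pos = {m: k for k, m in enumerate(src)}
        perm.extend(offset + pos[m] for m in target_order[l])
        offset += len(src)
    return perm

# Placeholder orderings, NOT verified against either program
source_order = {0: [0], 1: [0, 1, -1]}
target_order = {0: [0], 1: [1, -1, 0]}
shell_l = [0, 0, 1]  # two s shells followed by one p shell
perm = build_permutation(shell_l, source_order, target_order)
print(perm)
```

The resulting list can then be used to index both the rows and the columns of the density matrix.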
What do you need the guesses for? PySCF can also be used to run similar calculations as Psi4.
I don't know if there are differences between the basis function conventions between Psi4 and PySCF. Unfortunately, quantum chemistry programs are not interoperable.
I'm trying to build an ML model that can predict the converged density matrix. To verify whether the model is of any value, I wish to plug the predictions into a quantum chemistry program.
The problem is that the dataset I'm using (QMugs) used psi4 to calculate "ground truth" energies and density matrices at the DFT level. So my ML model is learning to output density matrices with the ordering convention used by psi4. If I wish to plug it into different software, like PySCF, I believe I'll have to do some re-ordering or transformations...
Would simply re-ordering the rows/columns not work? (My understanding was that if the basis set is identical, and one program just uses, say, px py pz while another uses pz py px, a reordering would suffice, but I'm not exactly a quantum chemistry expert...)
Similarly, I wish to compare the convergence rates of my ML model's predicted density matrices against default initial guesses, and also just look at the matrices themselves to compare what they look like (for my own understanding/analysis).
Well, I've worked on initial guesses in J. Chem. Theory Comput. 15, 1593 (2019) and J. Chem. Phys. 152, 144105 (2020); I hope you are aware of these works, the first one being especially topical for what you want to do.
If you need quantum chemistry expertise, feel free to reach out. I honestly don't know if it would just be a question of reordering px, py, and pz, or whether there are also differences in the basis functions' normalization and phase.
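To make the distinction concrete, here is a minimal NumPy sketch of what a conversion would look like if it involved a reordering plus a per-function sign or normalization factor. The function names and the numbers are hypothetical; the real permutation and factors for any pair of programs would have to be worked out from their conventions. The check at the end uses the fact that Tr(D S), the electron count, must not change under such a change of convention.

```python
import numpy as np

def convert_density(D, perm, coeff_scale=None):
    """Reorder a density matrix and optionally rescale it (hypothetical helper).

    perm[i] is the source index of target function i.
    coeff_scale[i] is the factor by which the orbital coefficient of target
    function i changes (for example -1 for a phase flip).
    """
    perm = np.asarray(perm)
    D_new = D[np.ix_(perm, perm)]
    if coeff_scale is not None:
        c = np.asarray(coeff_scale, dtype=float)
        D_new = D_new * np.outer(c, c)
    return D_new

def convert_overlap(S, perm, coeff_scale=None):
    """The overlap matrix transforms with the inverse factors."""
    perm = np.asarray(perm)
    S_new = S[np.ix_(perm, perm)]
    if coeff_scale is not None:
        c = np.asarray(coeff_scale, dtype=float)
        S_new = S_new / np.outer(c, c)
    return S_new

# Toy symmetric matrices with made-up numbers
D = np.array([[0.9, 0.2, 0.1], [0.2, 0.5, 0.3], [0.1, 0.3, 0.6]])
S = np.array([[1.0, 0.1, 0.0], [0.1, 1.0, 0.2], [0.0, 0.2, 1.0]])
perm = [2, 0, 1]
coeff_scale = [1.0, -1.0, 1.0]  # hypothetical phase flip on one function
D_new = convert_density(D, perm, coeff_scale)
S_new = convert_overlap(S, perm, coeff_scale)
print(np.trace(D @ S), np.trace(D_new @ S_new))
```

Comparing the overlap matrices produced by the two programs for the same molecule and basis is one way to check whether a candidate permutation and set of factors is right.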
If you are willing to read a bunch of Fortran code, MOKIT by Jingxiang Zou is the closest thing there is to a wavefunction converter between quantum chemistry programs, and it does support both PySCF and Psi4. So in theory you could glean all of the necessary information about the conventions used by them and write your own translator.
You could try to use MOKIT itself, but unfortunately it is still lacking direct conversion between most programs and has the bizarre limitation of having to run Gaussian first to generate an FCHK file even if you just want to convert between two different programs.
Hello, I wish to build the SAD guess independently in psi4 without running any SCF calculation. Is this possible?
I came across this page in the docs: https://psicode.org/psi4manual/master/api/psi4.core.SADGuess.html?highlight=print#psi4.core.SADGuess
but I can't seem to run the function compute_guess(), which causes my kernel/terminal to crash.
Here's a code snippet:
The other alternative is to run a "dummy" SCF calculation, but set maxiter = 0 and fail_on_maxiter = False. But I don't know if this will give me the actual initial guess or whether it would have undergone some further transformations. It is also not as fast as I'd like: it takes 7 secs on 8 threads for the above mol with 23 atoms. I suspect the full SCF calculation has a lot of overhead in setting up different parts of the SCF procedure, which I'd like to avoid (I have millions of molecules I need to get the SAD guess for).
Thank you!