Closed peastman closed 2 years ago
We should save the converged wavefunction.
- If we have the wavefunction, we can relatively cheaply compute any additional electronic properties.
- If we decide to recompute the dataset with a higher-accuracy method, the current wavefunction could be used as an initial guess to reduce the computational cost of the higher-accuracy method.
In the past there were problems saving the wavefunction with Psi4, but hopefully they are fixed in the latest release.
Saving the wavefunction seems like a good idea, but is it doable in terms of storage?
Computed benzene with wB97X-D/def2-TZVPPD:
import psi4
psi4.set_memory('32 GB')
benzene = psi4.geometry("""
H 1.2194 -0.1652 2.1600
C 0.6825 -0.0924 1.2087
C -0.7075 -0.0352 1.1973
H -1.2644 -0.0630 2.1393
C -1.3898 0.0572 -0.0114
H -2.4836 0.1021 -0.0204
C -0.6824 0.0925 -1.2088
H -1.2194 0.1652 -2.1599
C 0.7075 0.0352 -1.1973
H 1.2641 0.0628 -2.1395
C 1.3899 -0.0572 0.0114
H 2.4836 -0.1022 0.0205
""")
energy, wfn = psi4.energy('wB97X-D/def2-TZVPPD', molecule=benzene, return_wfn=True)
wfn.to_file('benzene')
The wavefunction size is 8.1 MB.
I don't know for sure what QCArchive can handle, but I suspect that won't be practical. For a molecule that size, the coordinates and forces together take 288 bytes. Adding in a few other values and some metadata brings it up to around 1 KB. Storing the wavefunction increases the storage requirements by 3-4 orders of magnitude!
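The estimate above can be reproduced with a few lines of arithmetic. This is a rough sketch that assumes 12 atoms and 4-byte (single-precision) values, which is what yields the 288-byte figure; the roughly 1 KB per-conformation record and the 8.1 MB wavefunction size are the numbers quoted in the comments above.

```python
import math

n_atoms = 12            # benzene: 6 C + 6 H
bytes_per_value = 4     # assumption: single-precision floats
# 3 Cartesian components per atom, for both coordinates and forces
coords_and_forces = n_atoms * 3 * 2 * bytes_per_value

record_bytes = 1_000         # ~1 KB per conformation including metadata (quoted above)
wavefunction_bytes = 8.1e6   # 8.1 MB wavefunction file (quoted above)

ratio = wavefunction_bytes / record_bytes
orders_of_magnitude = math.log10(ratio)

print(coords_and_forces)               # 288
print(round(orders_of_magnitude, 1))   # 3.9
```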
@jthorton and @pavankum will have to chime in with which properties are supported by QCEngine/QCFractal/QCArchive and can reasonably be captured.
Instead of the wavefunction, we can save the orbital coefficients and eigenvalues, which are good enough for most properties and also for reconstructing the wavefunction. A crude example of restarting from the orbital coefficients:
import psi4
import numpy as np
psi4.set_memory('32 GB')
benzene = psi4.geometry("""
H 1.2194 -0.1652 2.1600
C 0.6825 -0.0924 1.2087
C -0.7075 -0.0352 1.1973
H -1.2644 -0.0630 2.1393
C -1.3898 0.0572 -0.0114
H -2.4836 0.1021 -0.0204
C -0.6824 0.0925 -1.2088
H -1.2194 0.1652 -2.1599
C 0.7075 0.0352 -1.1973
H 1.2641 0.0628 -2.1395
C 1.3899 -0.0572 0.0114
H 2.4836 -0.1022 0.0205
""")
energy, wfn = psi4.energy('wB97X-D/def2-TZVPPD', molecule=benzene, return_wfn=True)
alpha_orb_coeffs = wfn.Ca().np
eigen_vals = wfn.epsilon_a().np
nalpha = wfn.nalpha()
print("a and b densities same: ", wfn.same_a_b_dens())
print("a and b orbs same: ", wfn.same_a_b_orbs())
Density = np.dot(alpha_orb_coeffs[:, :nalpha], alpha_orb_coeffs[:, :nalpha].T)
# compare with a tolerance; elementwise == on floats is rarely all True
print(np.allclose(Density, wfn.Da().np))
# Changing orbitals to orbitals read from file (here, stored in variables)
psi4.core.clean()
new_scf, new_wfn = psi4.energy('hf/def2-tzvppd', molecule=benzene, return_wfn=True)
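# the HF and wB97X-D orbitals are expected to differ at this point
print(np.allclose(new_wfn.Ca().np, wfn.Ca().np))
# alpha and beta orbitals are the same here, so reuse the alpha set for both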
new_wfn.Ca().np[:] = alpha_orb_coeffs
new_wfn.epsilon_a().np[:] = eigen_vals
new_wfn.Cb().np[:] = alpha_orb_coeffs
new_wfn.epsilon_b().np[:] = eigen_vals
# write to the scratch file that Psi4 reads when the guess option is set to READ
my_file = new_wfn.get_scratch_filename(180) + '.npy'
new_wfn.to_file(my_file)
psi4.set_options({'guess': 'read'})
energy = psi4.energy('wb97x-d/def2-TZVPPD', molecule=benzene)
Maybe @jthorton has a more polished way to construct a new wfn object instead of replacing the orbital coefficients of another energy calculation. Anyway, those orbitals and eigenvalues would be on the order of tens of kilobytes.
Some properties we would be interested in are Wiberg/Mayer bond indices and dipole and quadrupole moments (already listed above). ESPs can be built from the orbital coefficients after we reconstruct the wavefunction.
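A minimal sketch of what storing only the orbital coefficients and eigenvalues could look like, using NumPy alone. The arrays here are synthetic stand-ins (random orthonormal columns, with a hypothetical basis size `nbf`), not Psi4 output, and the orthonormal-basis assumption is made only so the density check is easy to state. With real data the arrays would come from `wfn.Ca().np` and `wfn.epsilon_a().np` as in the example above.

```python
import io
import numpy as np

rng = np.random.default_rng(0)
nbf, nocc = 60, 21   # hypothetical basis size; benzene has 21 doubly occupied orbitals

# Synthetic stand-in for the MO coefficient matrix: orthonormal columns from QR.
coeffs, _ = np.linalg.qr(rng.standard_normal((nbf, nbf)))
eigenvalues = np.sort(rng.standard_normal(nbf))

# Write both arrays to one compressed archive (in memory here; a file path also works).
buffer = io.BytesIO()
np.savez_compressed(buffer, coeffs=coeffs, eigenvalues=eigenvalues)
stored_bytes = buffer.getbuffer().nbytes

# Reload and rebuild the alpha density from the occupied block, D = C_occ C_occ^T.
buffer.seek(0)
loaded = np.load(buffer)
c_occ = loaded["coeffs"][:, :nocc]
density = c_occ @ c_occ.T

# Compare with a tolerance rather than ==, since these are floating-point arrays.
print(np.allclose(density, coeffs[:, :nocc] @ coeffs[:, :nocc].T))
print(np.isclose(np.trace(density), nocc))  # trace equals nocc for orthonormal orbitals
print(stored_bytes)
```

Keeping only the occupied columns would shrink the archive further, at the cost of losing the virtual orbitals.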
This seems like a good compromise.
Saving the coefficients isn't a substitute for also computing and storing useful quantities. Even if it only took 1 second to recompute them for each conformation, it would still take weeks for the entire dataset. How about including the following?
- DIPOLE
- QUADRUPOLE
- WIBERG_LOWDIN_INDICES
- MAYER_INDICES
- MBIS_CHARGES
Psi4 also supports Distributed Multipole Analysis, which is another way of computing atomic charges and multipoles. I don't know how it compares to MBIS.
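As a rough sketch, the five properties listed above could be requested from a converged wavefunction through Psi4's `oeprop` function. The helper name `compute_properties` is hypothetical, and Psi4 is imported inside it so the snippet loads even where Psi4 is not installed. How the results are named and retrieved afterwards differs between Psi4 versions, so that part is left out.

```python
# Property keywords taken from the list above.
PROPERTIES = [
    "DIPOLE",
    "QUADRUPOLE",
    "WIBERG_LOWDIN_INDICES",
    "MAYER_INDICES",
    "MBIS_CHARGES",
]


def compute_properties(wfn):
    """Run one-electron property analysis on a converged wavefunction (hypothetical helper)."""
    import psi4  # imported lazily; needs a working Psi4 installation

    psi4.oeprop(wfn, *PROPERTIES)
    return wfn
```

It would be called on the wavefunction returned by `psi4.energy(..., return_wfn=True)`, as in the benzene examples earlier in the thread.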
Closing since version 1 is now released.
What quantities do we want to compute and include in the dataset? Energies and forces are of course essential, but there are other things we could also include. A good principle is that if it's cheap to compute something, and if it might potentially be useful to someone, we might as well include it. Here is a list of quantities that Psi4 can compute: https://psicode.org/psi4manual/master/oeprop.html. Here are some to consider.