openmm / spice-dataset

A collection of QM data for training potential functions
MIT License

Charged molecules in PubChem subset and probably wrong version of psi4 #88

Status: Closed (KuzmaKhrabrov closed this issue 1 year ago)

KuzmaKhrabrov commented 1 year ago

Thank you for the awesome dataset! Could you please specify how DFT was run for charged molecules? For instance, for key='135240259' the SMILES is CC(=O)c1c(C)[nH]n2ccc[n+]12, which has a positively charged nitrogen.

When I run psi4.energy on this molecule's conformers, I get the following error:

Traceback (most recent call last):
  File "/mnt/sdd/khrabrov/test_psi4/test_optimize_db.py", line 26, in <module>
    energy = psi4.energy(FUNCTIONAL_STRING, molecule=molecule)
  File "/mnt/sdd/khrabrov/anaconda3/envs/psi4_14/lib/python3.9/site-packages/psi4/driver/driver.py", line 540, in energy
    return driver_cbs._cbs_gufunc(energy, name, ptype='energy', **kwargs)
  File "/mnt/sdd/khrabrov/anaconda3/envs/psi4_14/lib/python3.9/site-packages/psi4/driver/driver_cbs.py", line 1962, in _cbs_gufunc
    ptype_value, wfn = func(method_name, return_wfn=True, molecule=molecule, **kwargs)
  File "/mnt/sdd/khrabrov/anaconda3/envs/psi4_14/lib/python3.9/site-packages/psi4/driver/driver.py", line 597, in energy
    wfn = procedures['energy'][lowername](lowername, molecule=molecule, **kwargs)
  File "/mnt/sdd/khrabrov/anaconda3/envs/psi4_14/lib/python3.9/site-packages/psi4/driver/procrouting/proc.py", line 2390, in run_scf
    scf_wfn = scf_helper(name, post_scf=False, **kwargs)
  File "/mnt/sdd/khrabrov/anaconda3/envs/psi4_14/lib/python3.9/site-packages/psi4/driver/procrouting/proc.py", line 1513, in scf_helper
    scf_wfn = scf_wavefunction_factory(name, base_wfn, core.get_option('SCF', 'REFERENCE'), **kwargs)
  File "/mnt/sdd/khrabrov/anaconda3/envs/psi4_14/lib/python3.9/site-packages/psi4/driver/procrouting/proc.py", line 1186, in scf_wavefunction_factory
    wfn = core.RHF(ref_wfn, superfunc)
RuntimeError:
Fatal Error: RHF: RHF reference is only for singlets.

Was the Unrestricted Kohn-Sham method run for such cases?

Thank you, and sorry if I am missing something.

Moreover, with psi4 1.4.1 I get an additional error: psi4.driver.p4util.exceptions.ValidationError: Energy method "wb97m-d3(bj)" is not available. Did you mean? wb97m-d3bj

peastman commented 1 year ago

This sample file shows the exact settings used.

KuzmaKhrabrov commented 1 year ago

> This sample file shows the exact settings used.

Here is the sample file I generated; it gives the same error.

memory 180 gb
psi4_io.set_default_path("/tmp")

molecule sample{
O -1.9383972883224487 -0.3324943482875824 0.1915983110666275
C -0.7970702052116394 -0.3854754865169525 -0.23744431138038635
C -0.3731110394001007 3.341222047805786 1.7387802600860596
C 0.9978048801422119 3.278031349182129 1.9703782796859741
C 0.30128148198127747 0.33942824602127075 0.4828353822231293
C 1.5993709564208984 -0.005525985732674599 0.8094029426574707
C -0.8819715976715088 2.3309483528137207 0.8672145009040833
N 2.267454147338867 1.0642337799072266 1.36270272731781
N 1.4058966636657715 2.2186310291290283 1.1726529598236084
N 0.20880404114723206 1.6683127880096436 0.6274194717407227
C -0.5656470656394958 -1.442466378211975 -1.282893419265747
C 2.118178606033325 -1.4368139505386353 0.5283620953559875
H -1.0247801542282104 4.065515518188477 2.306086540222168
H 1.5733904838562012 3.964181423187256 2.694013833999634
H -1.9362047910690308 2.071005344390869 0.5402769446372986
H 3.1573872566223145 1.1172802448272705 1.0032920837402344
H 0.5354040265083313 -1.356088638305664 -1.4606988430023193
H -1.009232521057129 -1.1683963537216187 -2.3007001876831055
H -0.9567346572875977 -2.3669962882995605 -0.8609621524810791
H 2.960803508758545 -1.6326771974563599 1.2235361337661743
H 1.4156734943389893 -2.2082016468048096 0.7009861469268799
H 2.5835976600646973 -1.4817557334899902 -0.43494096398353577
units ang
symmetry c1
noreorient
nocom
}

set {
 wcombine false
 }

gradient('wB97M-D3BJ/def2-TZVPPD')
clean()

peastman commented 1 year ago

The first line of the molecule needs to specify the charge and spin multiplicity. See https://psicode.org/psi4manual/master/tutorial.html.
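As a rough illustration of the point above, the following Python sketch writes a Psi4 molecule block whose first line holds the charge and spin multiplicity. The helper name molecule_block is hypothetical; it is not part of Psi4 or of the SPICE scripts. Only the first two atoms of the sample file are listed to keep the example short, and the trailing keywords are copied from that file.

```python
def molecule_block(name, atoms, charge, multiplicity):
    """Return the text of a Psi4 molecule block.

    The first line inside the braces is "<charge> <multiplicity>".
    atoms is a list of (symbol, x, y, z) tuples in angstroms.
    This helper is hypothetical and only formats text.
    """
    lines = [f"molecule {name} {{", f"{charge} {multiplicity}"]
    lines += [f"{sym} {x} {y} {z}" for sym, x, y, z in atoms]
    # Same trailing keywords as the sample input in this thread.
    lines += ["units ang", "symmetry c1", "noreorient", "nocom", "}"]
    return "\n".join(lines)


# Two atoms only; a real input lists every atom of the conformer.
atoms = [
    ("O", -1.9383972883224487, -0.3324943482875824, 0.1915983110666275),
    ("C", -0.7970702052116394, -0.3854754865169525, -0.23744431138038635),
]
block = molecule_block("sample", atoms, charge=1, multiplicity=1)
print(block)
```

The printed block can be pasted into an input file in place of the original molecule section; the rest of the file stays as it was.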

KuzmaKhrabrov commented 1 year ago

> The first line of the molecule needs to specify the charge and spin multiplicity. See https://psicode.org/psi4manual/master/tutorial.html.

Thank you! Is there a procedure by which these were generated for the PubChem molecules? Setting 1 1 (charge 1, multiplicity 1) worked fine for that case.
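A quick electron count suggests why 1 1 works here and why the input without a charge line failed. The sketch below tallies the atoms listed in the sample file (1 O, 8 C, 3 N, 10 H) and checks electron parity. The function names are hypothetical, and the reading of the RHF error as an odd electron count under an assumed default charge of 0 is an interpretation, not something stated in this thread.

```python
# Atomic numbers for the elements that appear in the sample file.
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8}


def electron_count(formula, charge):
    """Total electrons for an element-count mapping and a net charge."""
    protons = sum(ATOMIC_NUMBER[sym] * n for sym, n in formula.items())
    return protons - charge


def singlet_possible(formula, charge):
    """A singlet (multiplicity 1) needs an even number of electrons."""
    return electron_count(formula, charge) % 2 == 0


# Atom counts taken from the molecule block posted in this thread.
formula = {"O": 1, "C": 8, "N": 3, "H": 10}

neutral = electron_count(formula, 0)  # charge line missing, charge 0 assumed
cation = electron_count(formula, 1)   # charge line "1 1"

print(neutral, singlet_possible(formula, 0))
print(cation, singlet_possible(formula, 1))
```

With a charge of 0 the count is odd, so a closed-shell singlet is impossible; with a charge of 1 it is even, which is consistent with the restricted calculation running once the 1 1 line is present.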

peastman commented 1 year ago

This script handles submitting molecules to QCFractal. Here is the code that selects the charge and multiplicity.

https://github.com/openmm/spice-dataset/blob/d616cfaa346671705f10404eceac1bcf890a6a4d/submission/submit.py#L24-L28
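For readers who only have the SMILES string, one way to obtain a net charge is to sum the formal charges written on bracket atoms. The sketch below is not the linked submit.py code; it is a simplified, regex-based illustration with a hypothetical function name, and a cheminformatics toolkit would be a more robust choice for real data. The multiplicity of 1 is an assumption that the molecule is closed-shell.

```python
import re

# Bracket atoms such as [n+], [O-], [Fe+2] carry explicit formal charges.
BRACKET_ATOM = re.compile(r"\[([^\]]*)\]")
# A run of '+' or '-' signs, optionally followed by a digit count.
CHARGE = re.compile(r"(\++|-+)(\d*)")


def total_formal_charge(smiles):
    """Sum the formal charges written on bracket atoms (hypothetical helper)."""
    total = 0
    for body in BRACKET_ATOM.findall(smiles):
        match = CHARGE.search(body)
        if match is None:
            continue  # e.g. [nH] or [C@@H]: no charge written
        signs, digits = match.groups()
        sign = 1 if signs[0] == "+" else -1
        # "+2" means magnitude 2; "++" also means magnitude 2.
        magnitude = int(digits) if digits else len(signs)
        total += sign * magnitude
    return total


smiles = "CC(=O)c1c(C)[nH]n2ccc[n+]12"  # key '135240259' from this thread
charge = total_formal_charge(smiles)
multiplicity = 1  # assumption: closed-shell singlet
print(f"{charge} {multiplicity}")
```

For the molecule discussed in this issue, the printed line matches the 1 1 setting that was reported to work.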

KuzmaKhrabrov commented 1 year ago

Thanks a lot for the answers and the link!