MolSSI / QCEngine

Quantum chemistry program executor and IO standardizer (QCSchema).
https://molssi.github.io/QCEngine/
BSD 3-Clause "New" or "Revised" License
162 stars 78 forks source link

ESP grid files from psi4. #434

Open bismuthadams1 opened 8 months ago

bismuthadams1 commented 8 months ago

Is your feature request related to a problem? Please describe.

I am trying to produce grid_esp and grid_properties alongside a number of one-electron properties (https://psicode.org/psi4manual/1.5.0/oeprop.html). However, I'm not sure if there is a way of telling qcengine to keep the grid_esp.dat and grid_field.dat output files. As expected, psi4 can't find the 'Fatal Error: Unable to open the grid.dat file.' I've tried to set the scratch directory as the CWD, but I believe the a temporary subdirectory is created when the compute procedure is initiated. Here is my code

from openff.toolkit import Molecule
from openff.recharge.esp import ESPSettings
from openff.recharge.grids import LatticeGridSettings
from openff.recharge.grids import GridSettingsType, GridGenerator
from qcelemental.models.procedures import OptimizationInput, QCInputSpecification
from source.conformers.conformer_gen import Conformers
from openff.units import unit
from qcelemental.models.common_models import Model
from openff.recharge.utilities.molecule import smiles_to_molecule
import qcengine
import os
import numpy as np

CWD = os.getcwd()

#generate water test molecule as openff.toolkit.Molecule
test_mol =  smiles_to_molecule('[H]O[H]')
conformer_list = Conformers.generate(test_mol, generation_type='rdkit')
conformer_list[0]
qc_mol =  test_mol.to_qcschema(conformer=0)

#Setup geometry optimisation
hf_model = Model(method="hf", basis="6-31G*")
spec = QCInputSpecification(model=hf_model, keywords={}, driver="gradient")
opt_spec = OptimizationInput(
            initial_molecule=qc_mol,
            input_specification=spec,
            keywords={"coordsys": "dlc", 
                      "program": "psi4"}                                        
        )

opt = qcengine.compute_procedure(opt_spec, "geometric")
print(f'geometry optimiztion was {opt.error}')
#return optimized molecule
optmized_mol = opt.final_molecule

#Generate grid.dat file for grid_esp and grid_field
grid_settings = LatticeGridSettings(
        type="fcc", spacing=0.5, inner_vdw_scale=1.4, outer_vdw_scale=2.0
    )

grid = GridGenerator.generate(test_mol, optmized_mol.geometry*unit.angstrom, grid_settings)

grid = grid.to(unit.angstrom).m
np.savetxt("grid.dat", grid, delimiter=" ", fmt="%16.10f")
#compute one-electron properties.
opt_input_2 = { "molecule" : optmized_mol,
              "driver" : "energy",
              "model" : {"method":"scf","basis":"6-31G*"},
              "protocols":{"wavefunction":"all","stdout":True,"native_files":"all"},
              "keywords":{"scf_properties":["GRID_ESP", "GRID_FIELD","MULLIKEN_CHARGES", "LOWDIN_CHARGES", "DIPOLE", "QUADRUPOLE", "MBIS_CHARGES"]}                               
              }

opt_2 = qcengine.compute(opt_input_2, "psi4", task_config={"scratch_directory":CWD,"scratch_messy":True})

print(opt_2.dict())

esp = (np.loadtxt("grid_esp.dat").reshape(-1, 1) * unit.hartree / unit.e)

electric_field = (np.loadtxt("grid_field.dat")* unit.hartree/ (unit.e * unit.bohr)))

What would you suggest as a solution in this case? Or does a feature not exist yet to allow reading/writing to grid.dat files in qcengine

loriab commented 8 months ago

This is the frustrating case where what you want is allowed, but not all the connections are exposed to the user, partly because it wasn't clear to what extent extra files should be in the schema when they could influence, say, return_result. That's not an issue here. So the options are to:

  1. hook up the existing extra_infiles and extra_outfiles arguments to execute to slots the user can actually fill out in AtomicInput or TaskConfig.
  2. Do exactly what you tried to do with writing the file to scratch, running job in same, passing "messy" so you ca grab the result. This works with most programs, but I hit exactly what you did about the intermediate quarantine directory that can't be overwritten.
  3. Fix up psi4 to be more schema-ready so that the contents of grid.dat can be passed in with a keyword, and grid_esp.dat is returned in native_files if present. https://github.com/psi4/psi4/blob/master/psi4/driver/schema_wrapper.py#L716 This is the right next step, but you don't want to be dealing with editing psi4 or waiting around for the next release.

Since I was thwarted by (2) and (3) isn't iterable on your side, I went with (1) and cheated by using extras. Is it within your power to tweak your copy of QCEngine? I've made some changes at https://github.com/MolSSI/QCEngine/pull/436/files#diff-d66842b418a1c03f41438563803c987c9defd1811ea0af131e285ba540c8c0d6 . With those in place, I can run a simplified version of your input and get the esp grid back in native_files. Would you want to try this out? Thanks for prodding this issue :-)

import pprint
import qcelemental
import qcengine

qc_mol =  qcelemental.models.Molecule.from_data("""
O 0.0 0.0 0.0
H 1.0 0.0 0.0
H 0.0 1.0 0.0
units au
""")

griddat = """\
0.0  0.0  0.0
1.1  1.3  1.4
"""

#compute one-electron properties.
opt_input_2 = { "molecule" : qc_mol,
              "driver" : "energy",
              "model" : {"method":"scf","basis":"6-31G*"},
              "protocols":{"wavefunction":"all","stdout":True,"native_files":"all"},
              "keywords":{"scf_properties":["GRID_ESP", "GRID_FIELD","MULLIKEN_CHARGES", "LOWDIN_CHARGES", "DIPOLE", "QUADRUPOLE", "MBIS_CHARGES"]},
              "extras": {
                "extra_infiles": {"grid.dat": griddat},
                "extra_outfiles": ["grid_esp.dat"],
              },
              }

opt_2 = qcengine.compute(opt_input_2, "psi4")

pprint.pprint(opt_2.dict(), width=200)

Then the pprint includes the grid_esp.dat file.

              'schema_version': 2,
              'symbols': array(['O', 'H', 'H'], dtype='<U1'),
              'validated': True},
 'native_files': {'grid_esp.dat': '   80.9950252294\n    0.1108045511\n',
                  'input': '{\n'
                           ' "id": null,\n'
                           ' "schema_name": "qcschema_input",\n'
                           ' "schema_version": 1,\n'
                           ' "molecule": {\n'
                           '  "schema_name": "qcschema_molecule",\n'
bismuthadams1 commented 3 months ago

Hi Loriab, I must've completely missed your reply! I will try this ASAP.