openforcefield / openff-bespokefit

Automated tools for the generation of bespoke SMIRNOFF format parameters for individual molecules.
https://docs.openforcefield.org/bespokefit
MIT License

GetMMFFPartialCharge #219

Closed GlockPL closed 1 year ago

GlockPL commented 1 year ago

Description: I'm trying to find parameters for a zinc ligand. The problem is that one of the steps uses RDKit's MMFF94s force field to assign charges. That force field doesn't have Zn among its atom types, so the step fails. Is there a workaround for this? Currently the problem seems circular: I need zinc parameters in order to calculate zinc parameters.

When I switch to PfizerFragmenter I come across another problem: "Failed to generate SMIRKS patterns that match both the parent and torsion fragments: (28, 29, 30, 31)"

Output:

[✓] bespoke executor launched

1. preparing the bespoke workflow                                                                                                                                                                          

[✓] 1 molecules found
[✓] fitting schemas generated

2. submitting the workflow                                                                                                                                                                                 

[✓] the following workflows were submitted
┏━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ID ┃ SMILES                                                                                              ┃ NAME    ┃ FILE                                       ┃
┡━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 1  │ C=C(C)O[Zn]1Oc2c(CN3CCC[C@H]3C(O[Zn])(c3ccccc3)c3ccccc3)cc(C)cc2CN2CCC[C@H]2C(c2ccccc2)(c2ccccc2)O1 │ cd24332 │ cd24332probaaldolowa_def2tzvp_some_cut.mol │
└────┴─────────────────────────────────────────────────────────────────────────────────────────────────────┴─────────┴────────────────────────────────────────────┘

3. running the fitting pipeline                                                                                                                                                                            

⠸ fragmenting the molecule
[19:50:07] UFFTYPER: Unrecognized charge state for atom: 0
[19:50:07] UFFTYPER: Unrecognized atom type: Zn+2 (0)
[19:50:07] UFFTYPER: Unrecognized charge state for atom: 39
[19:50:07] UFFTYPER: Unrecognized atom type: Zn2+2 (39)
[x] fragmentation failed

 {"type": "AttributeError", "message": "'NoneType' object has no attribute 'GetMMFFPartialCharge'", "traceback": "Traceback (most recent call last):\n  File                                               
 \"/home/glock/anaconda3/envs/bespokefit/lib/python3.9/site-packages/celery/app/trace.py\", line 451, in trace_task\n    R = retval = fun(*args, **kwargs)\n  File                                         
 \"/home/glock/anaconda3/envs/bespokefit/lib/python3.9/site-packages/celery/app/trace.py\", line 734, in __protected_call__\n    return self.run(*args, **kwargs)\n  File                                  
 \"/home/glock/anaconda3/envs/bespokefit/lib/python3.9/site-packages/openff/bespokefit/executor/services/fragmenter/worker.py\", line 37, in fragment\n    fragmenter.fragment(molecule,                   
 target_bond_smarts=target_bond_smarts)\n  File \"/home/glock/anaconda3/envs/bespokefit/lib/python3.9/site-packages/openff/fragmenter/fragment.py\", line 916, in fragment\n    result =                   
 self._fragment(molecule, target_bond_smarts)\n  File \"/home/glock/anaconda3/envs/bespokefit/lib/python3.9/site-packages/openff/fragmenter/fragment.py\", line 1012, in _fragment\n    molecule =         
 assign_elf10_am1_bond_orders(\n  File \"/home/glock/anaconda3/envs/bespokefit/lib/python3.9/site-packages/openff/fragmenter/chemi.py\", line 42, in assign_elf10_am1_bond_orders\n                        
 molecule.apply_elf_conformer_selection()\n  File \"/home/glock/anaconda3/envs/bespokefit/lib/python3.9/site-packages/openff/toolkit/topology/molecule.py\", line 2342, in apply_elf_conformer_selection\n 
 toolkit_registry.call(\n  File \"/home/glock/anaconda3/envs/bespokefit/lib/python3.9/site-packages/openff/toolkit/utils/toolkit_registry.py\", line 356, in call\n    raise e\n  File                     
 \"/home/glock/anaconda3/envs/bespokefit/lib/python3.9/site-packages/openff/toolkit/utils/toolkit_registry.py\", line 352, in call\n    return method(*args, **kwargs)\n  File                             
 \"/home/glock/anaconda3/envs/bespokefit/lib/python3.9/site-packages/openff/toolkit/utils/rdkit_wrapper.py\", line 1581, in apply_elf_conformer_selection\n    self.assign_partial_charges(molecule_copy,  
 \"mmff94\")\n  File \"/home/glock/anaconda3/envs/bespokefit/lib/python3.9/site-packages/openff/toolkit/utils/rdkit_wrapper.py\", line 1221, in assign_partial_charges\n    [\n  File                      
 \"/home/glock/anaconda3/envs/bespokefit/lib/python3.9/site-packages/openff/toolkit/utils/rdkit_wrapper.py\", line 1222, in <listcomp>\n    mmff_properties.GetMMFFPartialCharge(i)\nAttributeError:       
 'NoneType' object has no attribute 'GetMMFFPartialCharge'\n"}                                                                                                                                             

outputs have been saved to mol_lig.json                                                                                                                                                                    

worker: Warm shutdown (MainProcess)

worker: Warm shutdown (MainProcess)

worker: Warm shutdown (MainProcess)

Code for PfizerFragmenter:

from openff.bespokefit.workflows import BespokeWorkflowFactory
from openff.qcsubmit.common_structures import QCSpec

factory = BespokeWorkflowFactory(
    # Define the starting force field that will be augmented with bespoke 
    # parameters.
    initial_force_field="openff-2.0.0.offxml",
    # Change the level of theory that the reference QC data is generated at
    default_qc_specs=[
        QCSpec(
            method="gfn2xtb",
            basis=None,
            program="xtb",
            spec_name="xtb",
            spec_description="gfn2xtb",
        )
    ]
)

from openff.fragmenter.fragment import PfizerFragmenter
factory.fragmentation_engine = PfizerFragmenter()

from openff.toolkit.topology import Molecule

input_molecule = Molecule.from_file('cd24332probaaldolowa_def2tzvp_some_cut.mol')

workflow_schema = factory.optimization_schema_from_molecule(
    molecule=input_molecule
)

from openff.bespokefit.executor import BespokeExecutor, BespokeWorkerConfig, wait_until_complete

with BespokeExecutor(
    n_fragmenter_workers = 8,
    n_optimizer_workers = 8,
    n_qc_compute_workers = 8,
    qc_compute_worker_config=BespokeWorkerConfig(n_cores=1)
) as executor:
    # Submit our workflow to the executor
    task_id = executor.submit(input_schema=workflow_schema)
    # Wait until the executor is done
    output = wait_until_complete(task_id)

if output.status == "success":
    # Save the resulting force field to an OFFXML file
    output.bespoke_force_field.to_file("output-ff.offxml")
elif output.status == "errored":
    # Or print the error message if unsuccessful
    print(output.error)

This is the result of running just the fragmenter:

FragmentationResult(parent_smiles='[Zn:1][O:4][C:49]([c:34]1[c:22]([H:71])[c:12]([H:61])[c:8]([H:57])[c:13]([H:62])[c:23]1[H:72])([c:35]1[c:24]([H:73])[c:14]([H:63])[c:9]([H:58])[c:15]([H:64])[c:25]1[H:74])[C@@:53]1([H:101])[N:38]([C:43]([c:32]2[c:20]([H:69])[c:31]([C:42]([H:82])([H:83])[H:84])[c:21]([H:70])[c:33]3[c:30]2[O:3][Zn:40]([O:2][C:7](=[C:6]([H:55])[H:56])[C:41]([H:79])([H:80])[H:81])[O:5][C:50]([c:36]2[c:26]([H:75])[c:16]([H:65])[c:10]([H:59])[c:17]([H:66])[c:27]2[H:76])([c:37]2[c:28]([H:77])[c:18]([H:67])[c:11]([H:60])[c:19]([H:68])[c:29]2[H:78])[C@@:54]2([H:102])[N:39]([C:44]3([H:87])[H:88])[C:46]([H:91])([H:92])[C:48]([H:95])([H:96])[C:52]2([H:99])[H:100])([H:85])[H:86])[C:45]([H:89])([H:90])[C:47]([H:93])([H:94])[C:51]1([H:97])[H:98]', fragments=[Fragment(smiles='[H][c:31]1[c:20]([H:69])[c:32]([C:43]([N:38]2[C:45]([H:89])([H:90])[C:47]([H:93])([H:94])[C:51]([H:97])([H:98])[C@@:53]2([C:49]([H])([H])[H])[H:101])([H:85])[H:86])[c:30]2[c:33]([c:21]1[H:70])[C:44]([H:87])([H:88])[N:39]1[C:46]([H:91])([H:92])[C:48]([H:95])([H:96])[C:52]([H:99])([H:100])[C@@:54]1([H:102])[C:50]([H])([H])[O:5][Zn:40]([H])[O:3]2', bond_indices=(38, 43)), Fragment(smiles='[H][c:31]1[c:20]([H:69])[c:32]([H])[c:30]2[c:33]([c:21]1[H:70])[C:44]([H:87])([H:88])[N:39]1[C:46]([H:91])([H:92])[C:48]([H:95])([H:96])[C:52]([H:99])([H:100])[C@@:54]1([H:102])[C:50]([c:36]1[c:26]([H:75])[c:16]([H:65])[c:10]([H:59])[c:17]([H:66])[c:27]1[H:76])([c:37]1[c:28]([H:77])[c:18]([H:67])[c:11]([H:60])[c:19]([H:68])[c:29]1[H:78])[O:5][Zn:40]([H])[O:3]2', bond_indices=(36, 50)), Fragment(smiles='[H][c:31]1[c:20]([H:69])[c:32]([H])[c:30]2[c:33]([c:21]1[H:70])[C:44]([H:87])([H:88])[N:39]1[C:46]([H:91])([H:92])[C:48]([H:95])([H:96])[C:52]([H:99])([H:100])[C@@:54]1([H:102])[C:50]([c:36]1[c:26]([H:75])[c:16]([H:65])[c:10]([H:59])[c:17]([H:66])[c:27]1[H:76])([c:37]1[c:28]([H:77])[c:18]([H:67])[c:11]([H:60])[c:19]([H:68])[c:29]1[H:78])[O:5][Zn:40]([H])[O:3]2', bond_indices=(37, 50)), 
Fragment(smiles='[H][C:43]([N:38]1[C:45]([H:89])([H:90])[C:47]([H:93])([H:94])[C:51]([H:97])([H:98])[C@:53]1([C:49]([O:4][Zn:1])([c:34]1[c:22]([H:71])[c:12]([H:61])[c:8]([H:57])[c:13]([H:62])[c:23]1[H:72])[c:35]1[c:24]([H:73])[c:14]([H:63])[c:9]([H:58])[c:15]([H:64])[c:25]1[H:74])[H:101])([H:85])[H:86]', bond_indices=(34, 49)), Fragment(smiles='[H][c:31]1[c:20]([H:69])[c:32]([C:43]([N:38]2[C:45]([H:89])([H:90])[C:47]([H:93])([H:94])[C:51]([H:97])([H:98])[C@@:53]2([C:49]([H])([H])[H])[H:101])([H:85])[H:86])[c:30]2[c:33]([c:21]1[H:70])[C:44]([H:87])([H:88])[N:39]1[C:46]([H:91])([H:92])[C:48]([H:95])([H:96])[C:52]([H:99])([H:100])[C@@:54]1([H:102])[C:50]([H])([H])[O:5][Zn:40]([H])[O:3]2', bond_indices=(32, 43)), Fragment(smiles='[H][C:43]([N:38]1[C:45]([H:89])([H:90])[C:47]([H:93])([H:94])[C:51]([H:97])([H:98])[C@:53]1([C:49]([O:4][Zn:1])([c:34]1[c:22]([H:71])[c:12]([H:61])[c:8]([H:57])[c:13]([H:62])[c:23]1[H:72])[c:35]1[c:24]([H:73])[c:14]([H:63])[c:9]([H:58])[c:15]([H:64])[c:25]1[H:74])[H:101])([H:85])[H:86]', bond_indices=(49, 53)), Fragment(smiles='[H][C:43]([N:38]1[C:45]([H:89])([H:90])[C:47]([H:93])([H:94])[C:51]([H:97])([H:98])[C@:53]1([C:49]([O:4][Zn:1])([c:34]1[c:22]([H:71])[c:12]([H:61])[c:8]([H:57])[c:13]([H:62])[c:23]1[H:72])[c:35]1[c:24]([H:73])[c:14]([H:63])[c:9]([H:58])[c:15]([H:64])[c:25]1[H:74])[H:101])([H:85])[H:86]', bond_indices=(35, 49)), Fragment(smiles='[H][c:31]1[c:20]([H:69])[c:32]([H])[c:30]2[c:33]([c:21]1[H:70])[C:44]([H:87])([H:88])[N:39]1[C:46]([H:91])([H:92])[C:48]([H:95])([H:96])[C:52]([H:99])([H:100])[C@@:54]1([H:102])[C:50]([H])([H])[O:5][Zn:40]([O:2][C:7](=[C:6]([H:55])[H:56])[C:41]([H:79])([H:80])[H:81])[O:3]2', bond_indices=(2, 40)), 
Fragment(smiles='[H][C:43]([N:38]1[C:45]([H:89])([H:90])[C:47]([H:93])([H:94])[C:51]([H:97])([H:98])[C@:53]1([C:49]([O:4][Zn:1])([c:34]1[c:22]([H:71])[c:12]([H:61])[c:8]([H:57])[c:13]([H:62])[c:23]1[H:72])[c:35]1[c:24]([H:73])[c:14]([H:63])[c:9]([H:58])[c:15]([H:64])[c:25]1[H:74])[H:101])([H:85])[H:86]', bond_indices=(4, 49)), Fragment(smiles='[H][c:31]1[c:20]([H:69])[c:32]([H])[c:30]2[c:33]([c:21]1[H:70])[C:44]([H:87])([H:88])[N:39]1[C:46]([H:91])([H:92])[C:48]([H:95])([H:96])[C:52]([H:99])([H:100])[C@@:54]1([H:102])[C:50]([H])([H])[O:5][Zn:40]([O:2][C:7](=[C:6]([H:55])[H:56])[C:41]([H:79])([H:80])[H:81])[O:3]2', bond_indices=(2, 7))], provenance={'creator': 'openff.fragmenter', 'version': '0.2.0', 'options': {'functional_groups': {'hydrazine': '[NX3:1][NX3:2]', 'hydrazone': '[NX3:1][NX2:2]', 'nitric_oxide': '[N:1]-[O:2]', 'amide': '[#7:1][#6:2](=[#8:3])', 'amide_n': '[#7:1][#6:2](-[O-:3])', 'amide_2': '[NX3:1][CX3:2](=[OX1:3])[NX3:4]', 'aldehyde': '[CX3H1:1](=[O:2])[#6:3]', 'sulfoxide_1': '[#16X3:1]=[OX1:2]', 'sulfoxide_2': '[#16X3+:1][OX1-:2]', 'sulfonyl': '[#16X4:1](=[OX1:2])=[OX1:3]', 'sulfinic_acid': '[#16X3:1](=[OX1:2])[OX2H,OX1H0-:3]', 'sulfinamide': '[#16X4:1](=[OX1:2])(=[OX1:3])([NX3R0:4])', 'sulfonic_acid': '[#16X4:1](=[OX1:2])(=[OX1:3])[OX2H,OX1H0-:4]', 'phosphine_oxide': '[PX4:1](=[OX1:2])([#6:3])([#6:4])([#6:5])', 'phosphonate': '[P:1](=[OX1:2])([OX2H,OX1-:3])([OX2H,OX1-:4])', 'phosphate': '[PX4:1](=[OX1:2])([#8:3])([#8:4])([#8:5])', 'carboxylic_acid': '[CX3:1](=[O:2])[OX1H0-,OX2H1:3]', 'nitro_1': '[NX3+:1](=[O:2])[O-:3]', 'nitro_2': '[NX3:1](=[O:2])=[O:3]', 'ester': '[CX3:1](=[O:2])[OX2H0:3]', 'tri_halide': '[#6:1]([F,Cl,I,Br:2])([F,Cl,I,Br:3])([F,Cl,I,Br:4])'}, 'scheme': 'Pfizer'}, 'toolkits': [('RDKitToolkitWrapper', '2022.09.1'), ('AmberToolsToolkitWrapper', '21.0'), ('BuiltInToolkitWrapper', None)]})


j-wags commented 1 year ago

Hi @GlockPL,

Thanks for the detailed issue description.

Unfortunately right now OpenFF's domain of applicability (that is, our infrastructure and force fields) only encompasses CHONPS+halogen elements. We may handle coordinated metals in the future (and in theory our force field specification is already ready for metals), but at the moment our published FFs and infrastructure will fail at multiple points if an input has metals.
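As a rough illustration of that element coverage, a pre-flight check along the following lines could flag an input before it reaches the fragmenter. This is only a sketch: `SUPPORTED_ELEMENTS` and `unsupported_elements` are hypothetical names, not part of any OpenFF package, and the set simply spells out "CHONPS + halogen" as described above.

```python
from typing import Iterable, Set

# Assumption: this set spells out the "CHONPS + halogen" coverage described
# in the comment above; it is not read from any OpenFF package.
SUPPORTED_ELEMENTS = frozenset({"C", "H", "O", "N", "P", "S", "F", "Cl", "Br", "I"})


def unsupported_elements(symbols: Iterable[str]) -> Set[str]:
    """Return the element symbols that fall outside SUPPORTED_ELEMENTS."""
    return {symbol for symbol in symbols if symbol not in SUPPORTED_ELEMENTS}


# Hypothetical input: element symbols of a small zinc alkoxide fragment.
symbols = ["C", "H", "H", "H", "O", "Zn"]

bad = unsupported_elements(symbols)
if bad:
    print(f"Unsupported elements: {sorted(bad)}")
```

How the list of symbols is obtained from a toolkit molecule depends on the toolkit version, so that step is deliberately left out here.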

mattwthompson commented 1 year ago

You also might want to drop down to version 0.1.3 - release cannot complete the full fitting process and so we've yoinked it.

GlockPL commented 1 year ago

Ok, thank you for the info, I'll try downgrading.