Closed TimoSommer closed 1 month ago
Hi there! thanks for using the package! You can try creating your own FunctionalGroupFactory
that you use in your own code by making a subclass of stk.FunctionalGroupFactory
. Just copy paste this in your code and there should be no need to monkeypatch stk. Lemme know if this works (I've not run it but am reasonably confident)!
fwiw, do you have a minimum reproducible example for this error? I think @andrewtarzia works a lot with transition metals has not run into the issue, could be useful for our test suite if nothing else.
import stk
class TimoFunctionalGroupFactory(stk.FunctionalGroupFactory):
def __init__(self, smarts, bonders, deleters, placers=None):
self._smarts = smarts
self._bonders = bonders
self._deleters = deleters
self._placers = bonders if placers is None else placers
def get_functional_groups(self, molecule):
for atom_ids in _get_atom_ids(self._smarts, molecule):
atoms = tuple(molecule.get_atoms(atom_ids))
yield stk.GenericFunctionalGroup(
atoms=atoms,
bonders=tuple(atoms[i] for i in self._bonders),
deleters=tuple(atoms[i] for i in self._deleters),
placers=tuple(atoms[i] for i in self._placers),
)
def _get_atom_ids(query, molecule):
rdkit_mol = molecule.to_rdkit_mol()
yield from rdkit_mol.GetSubstructMatches(
query=rdkit.MolFromSmarts(query),
)
I'm close this as solved, lemme know if I should re-open.
Sorry for the delay, I just came back to this. I have prepared a test .mol file of a complex in which this issue arises. The issue can be replicated with
import rdkit
import stk
testmol = stk.BuildingBlock.init_from_file('testmol.mol')
rdkit_mol = testmol.to_rdkit_mol()
rdkit.Chem.AllChem.SanitizeMol(rdkit_mol)
The complex fails because rdkit doesn't like Ga atoms with a valence of 4 and fails with an rdkit.Chem.rdchem.AtomValenceException
. These kind of Valence Exceptions happen with rdkit all the time when dealing with metal atoms.
About the stk code, I do see the necessity of running an rdkit sanitization before the SMARTS query. Our SMARTS querys are quite trivial though from from what I know, so I might also try to refactor it in a way in which we don't need this anymore. Thank you!
Hey @TimoSommer , I hope the fix lukas mentioned works.
I will note that for handling metals in rdkit/stk, the trick (I use trick because the "chemistry" becomes questionable here) is to treat the metal-organic bonds as "dative" bonds (bond order 9 in stk). So when I change the Ga-included bonds from 1 atom1id atom2id
to 9 atom1id atom2id
in the mol file you sent, your example code ran fine. The trick is that when dative bonds are used (in the right direction), they do not add to the number of atoms in the valence checker.
See the example Metal-Organic Cage Construction
in https://stk.readthedocs.io/en/stable/_autosummary/stk.cage.Cage.html
If you use a metal complex topology graph to build a molecule, stk handles this for free, but for cages and other topology graphs you must use the DativeReactionFactory.
Oh great, that trick seems to work well! Thanks for your help!
Hey, first of all thanks for this amazing package!
I am using it for transition metal complexes, and I have noticed one re-occuring problem where rdkit throws an error because it cannot sanitize a molecule due to issues with the transition metal, which happens with rdkit all the time. For me, outcommenting this line has so far shown to work well, but it would be nice to implement this as option into the code, since otherwise I would have to monkeypatch stk all the time, which is dangerous if versions change.
The issue is in the function
get_atom_ids()
insrc/stk/_internal/functional_group_factories/utilities.py
. In this file, it would be great if the linerdkit.SanitizeMol(rdkit_mol)
could optionally be skipped and this could be exposed in the classSmartsFunctionalGroupFactory()
so that a user can turn this off if necessary.I wonder, has this issue been seen before? I cannot believe that I am the first to discover this, since stk is used so much for metals.