Closed nickvandewiele closed 9 years ago
Safe conversion to RDKit molecules for SMILES and image generation is relied upon by many other tools (parts of the website, the importer script, output log generation, etc.) that even if the kinetic model from an RMG run is not in error, this is still a real error.
Is there any way to represent carbene safely in RDKit, and we just need to fix up our Molecule.toRDKitMol()
method?
At the moment it's basically just
for index, atom in enumerate(self.vertices):
rdAtom = Chem.rdchem.Atom(atom.element.symbol)
rdAtom.SetNumRadicalElectrons(atom.radicalElectrons)
rdkitmol.AddAtom(rdAtom)
rdAtomIndices[atom] = index
and makes no effort to tell RDKit about the lone pair.
At first sight, I don't see how to copy the lonePairs
attribute from an RMG atom to an RDKit molecule, which explains why this wasn't done in the first place in Molecule.toRDKitMol()
.
An overview of the RDKit Atom methods:
rdAtom.ClearProp rdAtom.GetDoubleProp rdAtom.GetIsAromatic rdAtom.GetNumImplicitHs rdAtom.GetSymbol rdAtom.IsInRing rdAtom.SetDoubleProp rdAtom.SetMonomerInfo
rdAtom.DescribeQuery rdAtom.GetExplicitValence rdAtom.GetIsotope rdAtom.GetNumRadicalElectrons rdAtom.GetTotalDegree rdAtom.IsInRingSize rdAtom.SetFormalCharge rdAtom.SetNoImplicit
rdAtom.GetAtomicNum rdAtom.GetFormalCharge rdAtom.GetMass rdAtom.GetOwningMol rdAtom.GetTotalNumHs rdAtom.Match rdAtom.SetHybridization rdAtom.SetNumExplicitHs
rdAtom.GetBonds rdAtom.GetHybridization rdAtom.GetMonomerInfo rdAtom.GetPDBResidueInfo rdAtom.GetTotalValence rdAtom.NeedsUpdatePropertyCache rdAtom.SetIntProp rdAtom.SetNumRadicalElectrons
rdAtom.GetBoolProp rdAtom.GetIdx rdAtom.GetNeighbors rdAtom.GetProp rdAtom.HasProp rdAtom.SetAtomicNum rdAtom.SetIsAromatic rdAtom.SetProp
rdAtom.GetChiralTag rdAtom.GetImplicitValence rdAtom.GetNoImplicit rdAtom.GetPropNames rdAtom.HasQuery rdAtom.SetBoolProp rdAtom.SetIsotope rdAtom.UpdatePropertyCache
rdAtom.GetDegree rdAtom.GetIntProp rdAtom.GetNumExplicitHs rdAtom.GetSmarts rdAtom.InvertChirality rdAtom.SetChiralTag rdAtom.SetMass
Since the recent molecule representation update which transformed the way we represent singlet carbene molecules, a conversion problem has arisen.
We execute the following script:
Converting the triplet carbene into identifiers goes OK:
RDKit is clever enough to handle multiple unpaired electrons.
However, according to the new scheme, the singlet carbene does not have unpaired electrons anymore. We store them in a lone pair now instead. You get this now:
Not only does the RDKit conversion go wrong, resulting in too many hydrogens for carbene (4 instead of 2), nowhere do we store the multiplicity of carbene in the augmented inchi. The latter would currently only sum the number of radical electrons. In case of singlet carbene this returns 0, and as such is not appended to the inchi string.
SMILES, and InChI identifiers are currently only used in few occasions in RMG:
Practically speaking,
The current repercussions are limited. Labels are not used to compare species. Thermo for carbene will never originate from QM.
However, any new design features that would utilize string identifiers like SMILES or InChI would suffer from our decision of using lone pairs for singlet carbenes.