rwest closed this issue 11 years ago
From the SMILES 'CC1=CC=CO1' (or, equivalently, 'Cc1ccco1'), OpenBabel generates an invalid aromatic structure:
Trying
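from rmgpy.molecule import Molecule

# Triple quotes are needed because the adjacency list spans several lines.
mol = Molecule().fromAdjacencyList("""
1 C 0 {2,D} {5,S}
2 C 0 {1,D} {3,S}
3 O 0 {2,S} {4,S}
4 C 0 {3,S} {5,D} {6,S}
5 C 0 {1,S} {4,D}
6 C 0 {4,S}
""")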
generates the proper structure:
but then trying:
mol2 = Molecule().fromSMILES(mol.toSMILES())
gives the same atom type error.
So, perhaps there's some way for OpenBabel to recognize and fix such aromaticity problems? Adding obmol.Kekulize() before building the Molecule object didn't seem to help, and it would probably lead to problems with multi-ring structures anyway.
@nickvandewiele - you're the aromaticity expert around here - what do you think?
Here (in lines 64-89) is a (very) rough modification to Molecule.fromOBMol that seems to fix this issue. It looks insane, but it works.
Perhaps it would be a good idea here to implement a robust Kekulization method before attempting to assign atom types to the molecule.
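To make the Kekulization idea concrete, here is a minimal, library-free sketch. It is not the RMG-Py or OpenBabel implementation; the function name `kekulize`, the `needs_double` flag, and the atom numbering (borrowed from the adjacency list above) are all assumptions made for illustration. The idea is that each aromatic carbon needs exactly one double bond, while a furan-type oxygen contributes a lone pair and needs none, so the problem reduces to finding a perfect matching among the atoms that need a double bond.

```python
from typing import Dict, List, Optional, Tuple

Bond = Tuple[int, int]


def kekulize(needs_double: Dict[int, bool],
             bonds: List[Bond]) -> Optional[Dict[Bond, int]]:
    """Assign order 1 or 2 to each aromatic bond (hypothetical helper).

    Every atom flagged True in needs_double must end up with exactly one
    double bond; atoms flagged False (e.g. a furan oxygen) get none.
    Returns None when no such assignment exists.
    """
    neighbors: Dict[int, List[int]] = {a: [] for a in needs_double}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    pending = sorted(a for a, flag in needs_double.items() if flag)
    partner: Dict[int, int] = {}

    def solve(i: int) -> bool:
        # Skip atoms that already received their double bond.
        while i < len(pending) and pending[i] in partner:
            i += 1
        if i == len(pending):
            return True
        a = pending[i]
        for b in neighbors[a]:
            if needs_double[b] and b not in partner:
                partner[a], partner[b] = b, a
                if solve(i + 1):
                    return True
                # Backtrack and try the next neighbor.
                del partner[a], partner[b]
        return False

    if not solve(0):
        return None
    return {(a, b): 2 if partner.get(a) == b else 1 for a, b in bonds}


# Ring of 2-methylfuran, numbered as in the adjacency list above
# (atom 3 is the oxygen; the methyl carbon 6 is outside the ring).
FURAN_RING = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
FURAN_FLAGS = {1: True, 2: True, 3: False, 4: True, 5: True}

furan_orders = kekulize(FURAN_FLAGS, FURAN_RING)
```

With the oxygen flagged as not needing a double bond, the sketch recovers the 1=2 and 4=5 double bonds of the hand-written adjacency list. If every ring atom is treated like a carbon, no assignment exists for a five-membered ring and the function returns None. Backtracking is exponential in the worst case, so a real implementation for fused multi-ring systems would want a proper matching algorithm.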
Of course, the other alternative is to simply define Ob and Obf atom types...
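A toy illustration of why a missing atom type produces the error, and what defining Ob would change. The lookup table below is a hypothetical, heavily reduced stand-in and not RMG-Py's actual atom type definitions; the bond labels S, D and B (single, double, benzene) and the function name `atom_type` are assumptions for this sketch.

```python
from typing import Dict, Iterable, Tuple

Key = Tuple[str, Tuple[str, ...]]

# Hypothetical reduced table: (element, sorted bond labels) -> atom type.
# Bonds to hydrogens are counted as "S".
ATOM_TYPES: Dict[Key, str] = {
    ("C", ("S", "S", "S", "S")): "Cs",
    ("C", ("D", "S", "S")): "Cds",
    ("C", ("B", "B", "S")): "Cb",
    ("O", ("S", "S")): "Os",
    ("O", ("D",)): "Od",
}


def atom_type(element: str, bond_labels: Iterable[str],
              table: Dict[Key, str] = ATOM_TYPES) -> str:
    """Look up an atom type from the element and its bond labels."""
    key = (element, tuple(sorted(bond_labels)))
    try:
        return table[key]
    except KeyError:
        raise ValueError(
            f"Unable to determine atom type for atom {element}.") from None


# Extending the table with an assumed "Ob" entry for an oxygen
# that carries two benzene-type bonds.
EXTENDED_TYPES = dict(ATOM_TYPES)
EXTENDED_TYPES[("O", ("B", "B"))] = "Ob"
```

In the Kekulé form the ring oxygen has two single bonds and resolves to Os. In the aromatic form it has two benzene-type bonds, which the reduced table cannot match, so the lookup fails unless an entry such as Ob is added.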
Hmm. Aromaticity is tricky stuff! http://www.dalkescientific.com/writings/diary/archive/2007/11/30/opensmiles_and_aromaticity.html
But some way to Kekulize would probably help us. If we can off-load it to an external library like OpenBabel then great, because they'll have already put in lots of thought and debugging.
Anyone know what RMG-Java thinks of these structures? What "type" is that O atom?
ok, here are my thoughts and comments on this:
furan thermo with RMG-Java on master (commit ca. july 2012) through the ThermoDataEstimator with adjacency list gives the following thermochemistry:
-4.3 74.95 20.79 27.14 32.51 36.99 44.07 49.03 56.08
all carbons are Cds, oxygen Os. furan ring correction (DH°f(298K) 6.89 kcal/mol) is applied.
NIST webbook reports DH°f(298K) of -6.6 kcal/mol (pretty old, 1991) so 2.3 kcal off. not that bad, i'd say.
btw: is the Molecule().fromSMILES() a vital method in RMG-Py?
otherwise i'd propose a very clever patch: "do not use SMILES parsers in case of tricky aromatic-like compounds"
I think Molecule().fromSMILES() is vital for various parts of the website.
Try 2-methyl furan: http://rmg.mit.edu/adjacencylist/CC1=CC=CO1
Gives "Unable to determine atom type for atom O."