ReactionMechanismGenerator / RMG-Py

Python version of the amazing Reaction Mechanism Generator (RMG).
http://reactionmechanismgenerator.github.io/RMG-Py/

molecule.fromSMILES(CC1=CC=CO1) gives "Unable to determine atom type for atom O." #90

Closed rwest closed 11 years ago

rwest commented 12 years ago

Try 2-methylfuran: http://rmg.mit.edu/adjacencylist/CC1=CC=CO1

This gives the error "Unable to determine atom type for atom O."

stroiano commented 12 years ago

From the SMILES 'CC1=CC=CO1' (or, equivalently, 'Cc1ccco1'), OpenBabel generates an invalid aromatic structure.

Building the molecule from an adjacency list instead:

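from rmgpy.molecule import Molecule

# 2-methylfuran in Kekule form: explicit single (S) and double (D) bonds.
# A triple-quoted string is needed because the adjacency list spans several lines.
mol = Molecule().fromAdjacencyList('''
    1 C 0 {2,D} {5,S}
    2 C 0 {1,D} {3,S}
    3 O 0 {2,S} {4,S}
    4 C 0 {3,S} {5,D} {6,S}
    5 C 0 {1,S} {4,D}
    6 C 0 {4,S}
''')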

generates the proper structure, but then round-tripping it through SMILES:

mol2 = Molecule().fromSMILES(mol.toSMILES())

gives the same atom type error.

So, perhaps there's some way for OpenBabel to recognize and fix such aromaticity problems? Adding obmol.Kekulize() before building the Molecule object didn't seem to help and would probably lead to problems with multi-ring structures anyway.

rwest commented 12 years ago

@nickvandewiele - you're the aromaticity expert around here - what do you think?

stroiano commented 12 years ago

Here (in lines 64-89) is a (very) rough modification to Molecule.fromOBMol that seems to fix this issue. It looks insane, but it works.

Perhaps it would be a good idea here to implement a robust Kekulization method before attempting to assign atom types to the molecule.

Of course, the other alternative is to simply define Ob and Obf atom types...
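To make the Kekulization idea above concrete, here is a minimal, library-free sketch. It is not the modification to Molecule.fromOBMol mentioned in this thread, nor OpenBabel's algorithm: the function name `kekulize`, its arguments, and the rule that the caller decides which atoms need a double bond are all assumptions made for illustration. It treats Kekulization as a small matching problem, pairing up the atoms that need a double bond by backtracking, while atoms such as the furan oxygen (which donates a lone pair) keep only single bonds.

```python
# Hypothetical sketch, not RMG-Py or OpenBabel code.

def kekulize(bonds, needs_double):
    """Assign 'S'/'D' orders to the bonds of an aromatic system.

    bonds: list of (i, j) atom-index pairs between aromatic atoms.
    needs_double: set of atom indices that must end up with exactly one
        double bond (e.g. aromatic carbons). Atoms left out of this set,
        such as the furan oxygen, keep only single bonds.
    Returns a dict {(i, j): 'S' or 'D'}, or None if no Kekule form exists.
    """
    # Only bonds joining two atoms that both need a double bond can be double.
    candidates = [(i, j) for i, j in bonds
                  if i != j and i in needs_double and j in needs_double]

    def search(unmatched, chosen):
        if not unmatched:
            return chosen  # every atom that needed a double bond has one
        atom = min(unmatched)
        for i, j in candidates:
            if atom not in (i, j):
                continue
            other = j if i == atom else i
            if other in unmatched:
                found = search(unmatched - {atom, other}, chosen | {(i, j)})
                if found is not None:
                    return found
        return None  # dead end: backtrack

    doubles = search(frozenset(needs_double), frozenset())
    if doubles is None:
        return None
    return {bond: ('D' if bond in doubles else 'S') for bond in bonds}


# Ring of 'Cc1ccco1': atom 0 is the methyl carbon (not in the ring),
# atoms 1-4 are the aromatic carbons, atom 5 is the oxygen.
ring_bonds = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
print(kekulize(ring_bonds, needs_double={1, 2, 3, 4}))
```

For the 2-methylfuran ring this should reproduce the alternating pattern of 'CC1=CC=CO1', with both C-O bonds single. A real implementation would also have to decide automatically which heteroatoms donate a lone pair (pyrrole-type N versus pyridine-type N, for example), which is where most of the difficulty lies.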

rwest commented 12 years ago

Hmm. Aromaticity is tricky stuff! http://www.dalkescientific.com/writings/diary/archive/2007/11/30/opensmiles_and_aromaticity.html

But some way to Kekulize would probably help us. If we can off-load it to an external library like OpenBabel then great, because they'll have already put in lots of thought and debugging.

Anyone know what RMG-Java thinks of these structures? What "type" is that O atom?

nickvandewiele commented 12 years ago

ok, here are my thoughts and comments on this:

nickvandewiele commented 12 years ago

Furan thermo from RMG-Java on master (commit from ca. July 2012), run through the ThermoDataEstimator with an adjacency list, gives the following thermochemistry:

-4.3 74.95 20.79 27.14 32.51 36.99 44.07 49.03 56.08

All carbons are typed Cds and the oxygen Os. The furan ring correction (DH°f(298K) 6.89 kcal/mol) is applied.

The NIST WebBook reports a DH°f(298K) of -6.6 kcal/mol (pretty old, 1991), so the estimate is 2.3 kcal/mol off. Not that bad, I'd say.
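For anyone who wants to redo the comparison, here is a small sketch that parses the output line and computes the deviation from the NIST value. The column labels are an assumption (the thread does not state them); only the first value is tied to DH°f(298K) by the discussion above. The variable names are illustrative.

```python
# Column labels below are an ASSUMPTION about the estimator's output order;
# only the first column (DH°f at 298 K, kcal/mol) is confirmed by the thread.
ASSUMED_LABELS = ["H298", "S298", "Cp300", "Cp400", "Cp500",
                  "Cp600", "Cp800", "Cp1000", "Cp1500"]

line = "-4.3 74.95 20.79 27.14 32.51 36.99 44.07 49.03 56.08"
thermo = dict(zip(ASSUMED_LABELS, (float(x) for x in line.split())))

h298_estimate = thermo["H298"]   # kcal/mol, RMG-Java group-additivity estimate
h298_nist = -6.6                 # kcal/mol, value quoted from the NIST WebBook

KCAL_TO_KJ = 4.184               # thermochemical calorie
deviation = round(h298_estimate - h298_nist, 2)
deviation_kj = round(deviation * KCAL_TO_KJ, 2)

print("estimate - NIST = %s kcal/mol (%s kJ/mol)" % (deviation, deviation_kj))
```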

nickvandewiele commented 12 years ago

Btw: is Molecule().fromSMILES() a vital method in RMG-Py?

Otherwise I'd propose a very clever patch: "do not use SMILES parsers in case of tricky aromatic-like compounds"

rwest commented 12 years ago

I think Molecule().fromSMILES() is vital for various parts of the website.