epam / Indigo

Universal cheminformatics toolkit, utilities and database search tools
http://lifescience.opensource.epam.com
Apache License 2.0

InChI molecule with amide groups is rendered incorrectly #90

Open mkviatkovskii opened 7 years ago

mkviatkovskii commented 7 years ago

Original bug report from the GGA repository: when L-glutamine is loaded from InChI, the double bond is drawn as C=N instead of C=O.

from indigo import Indigo
from indigo_inchi import IndigoInchi
from indigo_renderer import IndigoRenderer

i = Indigo()
ii = IndigoInchi(i)
ir = IndigoRenderer(i)

# Load L-glutamine from its StdInChI and render it
inchi_m = ii.loadMolecule('InChI=1S/C5H10N2O3/c6-3(5(9)10)1-2-4(7)8/h3H,1-2,6H2,(H2,7,8)(H,9,10)/t3-/m0/s1')
ir.renderToFile(inchi_m, 'inchi.png')

# Round-trip the same molecule through SMILES and render it again
smiles_m = i.loadMolecule(inchi_m.smiles())
ir.renderToFile(smiles_m, 'smiles.png')

smiles.png: (image)
inchi.png: (image)

dan2097 commented 7 years ago

This is not actually an Indigo bug. StdInChI intentionally does not encode which tautomer is being described, so that it is a tautomer-independent representation. When the InChI library produces a structure from a StdInChI, a tautomer is chosen arbitrarily but deterministically, and it is unfortunately the uncommon one for amides. In other words, Indigo is just depicting the structure it is given.

Usually it is best to use SMILES for depictions (e.g. N[C@@H](CCC(N)=O)C(=O)O), as it precisely specifies bond orders and charges, and to use StdInChI for checking whether two molecules are chemically identical. As another example, NCC(=O)O and [NH3+]CC(=O)[O-] give the same StdInChI, and hence are depicted identically if the StdInChI is used as input.
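
The tautomer independence is visible in the InChI string itself: the hydrogen (/h) layer of the L-glutamine InChI above lists some hydrogens as "mobile", shared among a group of atoms rather than attached to one of them. The following is a minimal sketch using only the Python standard library. The helper `mobile_h_groups` is hypothetical (it is not part of Indigo or the InChI library), and it only handles the simple `(H<n>,a,b,...)` form that appears in this example.

```python
import re

# StdInChI for L-glutamine, copied from the original report
INCHI = ("InChI=1S/C5H10N2O3/c6-3(5(9)10)1-2-4(7)8"
         "/h3H,1-2,6H2,(H2,7,8)(H,9,10)/t3-/m0/s1")


def mobile_h_groups(inchi):
    """Return [(hydrogen_count, [atom_numbers])] for each mobile-H group.

    Hypothetical helper for illustration only: it looks for the layer that
    starts with 'h' and extracts parenthesised groups such as '(H2,7,8)'.
    """
    # Skip the version prefix and the formula; find the layer beginning with 'h'
    layers = inchi.split("/")[2:]
    h_layer = next((layer[1:] for layer in layers if layer.startswith("h")), "")

    groups = []
    for count, atoms in re.findall(r"\(H(\d*),([\d,]+)\)", h_layer):
        n_hydrogens = int(count) if count else 1  # a bare 'H' means one hydrogen
        groups.append((n_hydrogens, [int(a) for a in atoms.split(",")]))
    return groups


print(mobile_h_groups(INCHI))  # [(2, [7, 8]), (1, [9, 10])]
```

The group `(H2,7,8)` says that two hydrogens are shared between atoms 7 and 8, both of which are bonded to atom 4 in the connection layer (`4(7)8`). Assuming the usual InChI numbering, in which carbons come first and the remaining elements follow in formula order, these are the amide N and O. The string therefore does not say whether the structure is the amide (C=O, NH2) or the imidic acid (C=N, OH), which is why a reader of the StdInChI has to pick one.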

twall commented 3 years ago

I'm seeing something similar with benzamide. @dan2097 Is this the same issue?

Rendered via InChI: (image)

Rendered via SMILES: (image)

dan2097 commented 3 years ago

@twall Yes, it is: both tautomers of benzamide have the same StdInChI.

twall commented 2 years ago

Is there some sort of option to give to the inchi->smiles conversion to set a preference?