ncats / lychi

Layered Chemical Identifier
Apache License 2.0

Add support for keto-enol tautomerism #11

Open caodac opened 10 years ago

caodac commented 10 years ago

Handle keto-enol tautomerism like the structure described in this paper (shown below).

[image: keto-enol]

olegursu commented 10 years ago

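lychi output for the file tests/standardizer_keto_enol3.sdf.

Input: [image: input] https://f.cloud.github.com/assets/3780620/2015655/f2c8fd06-87bf-11e3-8d04-0f35820b79b5.png

Output: [image: output] https://f.cloud.github.com/assets/3780620/2015697/420d9074-87c1-11e3-9595-918d115c1d27.png

I am not sure why the output has keto instead of enol for records 1 and 2, in ring C.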

tylerperyea commented 10 years ago

Oleg, do you mean they have enol instead of keto? The outputs are all enols. This is not aesthetically pleasing (or chemically predominant), but it shouldn't matter for standardization to a hash, right?

olegursu commented 10 years ago

Hi Tyler,

Ring C is supposed to have one keto and one enol; in the output, both are in keto form.

According to the JOC paper Trung pointed to, this is a special case where the enol form is more stable than the keto form. I can understand the point about aesthetics, but if both the enol and keto forms get the same hash and are displayed in keto form, then it is hard for me to see how this special case is handled differently by lychi.

tylerperyea commented 10 years ago

I think the display is something of an accident and isn't meant to be taken seriously; it's more of a debugging tool. We could switch to a different canonical tautomer by rules if we find it necessary. But as long as they all get the same hash, it shouldn't matter.

The only important difference from InChI is that these 4 equivalent structures do not receive the same InChI, but they do receive the same Lychi hash. Basically, InChI can't yet handle a keto/enol combined with a distant mobile hydrogen. Is that what you're asking about?

olegursu commented 10 years ago

Hi Tyler,

Thank you for the clarification. This is a special case, and I see that InChI can't handle it, but because it is a special case it can probably be handled by adding a rule.

caodac commented 10 years ago

Sorry about the confusion. The paper is only used as a justification for supporting such long-range keto-enol tautomerism. It was never our intention to use it to pick the preferred form (something which is beyond the scope of any standardizer). We do, however, have a tautomer "force field" that is used to select a preferred form; it just happens that for this particular case the enol didn't get a favorable score compared to the keto.
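
The separation discussed in this thread, where the hash identifies the whole tautomer class while a score only chooses which member to display, can be illustrated with a toy sketch. This is not lychi's code: the function names, the SMILES-string inputs, and the scoring rule are all hypothetical, and the tautomer set is assumed to have been enumerated already.

```python
import hashlib


def tautomer_class_hash(tautomers):
    """Hash the whole tautomer set, so every member maps to the same value.

    Assumption: `tautomers` is an already-enumerated collection of
    canonical SMILES strings for one compound (enumeration not shown).
    """
    key = "|".join(sorted(set(tautomers)))  # order-independent key
    return hashlib.sha1(key.encode("utf-8")).hexdigest()[:12]


def preferred_form(tautomers, score):
    """Pick the form to display: highest score, ties broken alphabetically."""
    return min(set(tautomers), key=lambda t: (-score(t), t))


def toy_score(smiles):
    """Hypothetical stand-in for a tautomer scoring function.

    It simply counts carbonyl-like "=O" substrings, so it favors keto forms.
    """
    return smiles.count("=O")


keto = "CC(=O)C"   # acetone, keto form
enol = "CC(O)=C"   # acetone, enol form

# Same hash no matter which form was supplied first.
h1 = tautomer_class_hash([keto, enol])
h2 = tautomer_class_hash([enol, keto])

# The displayed form depends only on the scoring rule, not on the hash.
shown = preferred_form([enol, keto], toy_score)
```

Swapping `toy_score` for a rule that favors the enol would change `shown` but leave `h1` and `h2` untouched, which is why the displayed form has no effect on deduplication.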