ncats / lychi

Layered Chemical Identifier
Apache License 2.0

Isotope Perception #8

Status: Closed (tylerperyea closed 10 years ago)

tylerperyea commented 10 years ago

Compounds that differ only in the isotope of one atom sometimes share no layers in their hashes. I believe this is a bug, though it may be by design.

Example 1

Iothalamic acid

Consider the various isotopically enriched forms of iothalamic acid: [image: isotope]

None of the above share any layers of the lychi hash, as you can see from the output:

CNC(=O)C1=C([131I])C(NC(C)=O)=C(I)C(C(O)=O)=C1I            MONO_SUB  9294TS2JN-NS2GK3VWU6-N69MTFGPKFU-N6U76FGFF34C
CNC(=O)C1=C(I)C(C(O)=O)=C([131I])C(NC(C)=O)=C1I            MONO_SUB  MJ4GHJUB3-37XZ9WZ6P5-35BMVBQL8G6-3567XZBU869G
CNC(=O)C1=C([131I])C(C(O)=O)=C(I)C(NC(C)=O)=C1I            MONO_SUB  F5QJZ4FYL-LL2D31KA4X-LXCYK4978LC-LXCJLCYPHSJZ
CNC(=O)C1=C([131I])C(C(O)=O)=C([131I])C(NC(C)=O)=C1I       DI_SUB    5SCL4MFSZ-Z71PN9SJPP-ZPWTG3CD1SR-ZPRG7AZ57V5Y
CNC(=O)C1=C([131I])C(C(O)=O)=C(I)C(NC(C)=O)=C1[131I]       DI_SUB    58CNGBXRJ-JB98H1963G-JGJP5R392U1-JG1XQX6DK7M1
CNC(=O)C1=C([131I])C(NC(C)=O)=C([131I])C(C(O)=O)=C1I       DI_SUB    PS852WNDH-HFJ4X5AUVA-HATLUZGA8KG-HAGHQ57QT37R
CNC(=O)C1=C([131I])C(C(O)=O)=C([131I])C(NC(C)=O)=C1[131I]  TRI_SUB   YDGUTZDXP-PYVYW99MKH-PHL5R2UNMCM-PHMWU77U2LYV
CNC(=O)C1=C(I)C(C(O)=O)=C(I)C(NC(C)=O)=C1I                 NO_SUB    D1DBNGVNG-G9T7D2UU8L-GLA8MR5PGYK-GLKRLMDQ31TX

Ideally, these would all be the same up to the very last level of the hash.
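To make that expectation concrete, here is a minimal Python sketch that counts how many leading layers a set of hash keys has in common. It assumes the layers are simply the dash-separated blocks of the key, read left to right; `shared_leading_layers` is a hypothetical helper written for this illustration and is not part of lychi. The keys are copied from the output above.

```python
def shared_leading_layers(keys):
    """Count the leading dash-separated layers that every key shares.

    Assumption: each "-"-separated block of a key is one layer.
    """
    split_keys = [key.split("-") for key in keys]
    count = 0
    # zip(*...) walks the keys layer by layer, in parallel.
    for layer_values in zip(*split_keys):
        if len(set(layer_values)) != 1:
            break  # first layer where the keys disagree
        count += 1
    return count


# Hash keys reported above, one per substitution pattern.
reported = [
    "9294TS2JN-NS2GK3VWU6-N69MTFGPKFU-N6U76FGFF34C",  # MONO_SUB
    "5SCL4MFSZ-Z71PN9SJPP-ZPWTG3CD1SR-ZPRG7AZ57V5Y",  # DI_SUB
    "YDGUTZDXP-PYVYW99MKH-PHL5R2UNMCM-PHMWU77U2LYV",  # TRI_SUB
    "D1DBNGVNG-G9T7D2UU8L-GLA8MR5PGYK-GLKRLMDQ31TX",  # NO_SUB
]

print(shared_leading_layers(reported))  # 0
```

For a four-layer key, the behavior requested here would correspond to this count being 3 across all isotopic variants.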

caodac commented 10 years ago

Here are the fixed hash keys:

CNC(=O)C1=C([131I])C(NC(C)=O)=C(I)C(C(O)=O)=C1I            MONO_SUB  D1DBNGVNG-G9T7D2UU8L-GLXN2UBNFSF-GLFZW6WY3KZK
CNC(=O)C1=C(I)C(C(O)=O)=C([131I])C(NC(C)=O)=C1I            MONO_SUB  D1DBNGVNG-G9T7D2UU8L-GLXN2UBNFSF-GLFAGRX61W8A
CNC(=O)C1=C([131I])C(C(O)=O)=C(I)C(NC(C)=O)=C1I            MONO_SUB  D1DBNGVNG-G9T7D2UU8L-GLXN2UBNFSF-GLF8SAYA45KS
CNC(=O)C1=C([131I])C(C(O)=O)=C([131I])C(NC(C)=O)=C1I       DI_SUB    D1DBNGVNG-G9T7D2UU8L-GLXN2UBNFSF-GLF2MSCT17P6
CNC(=O)C1=C([131I])C(C(O)=O)=C(I)C(NC(C)=O)=C1[131I]       DI_SUB    D1DBNGVNG-G9T7D2UU8L-GLXN2UBNFSF-GLF41DADXHW9
CNC(=O)C1=C([131I])C(NC(C)=O)=C([131I])C(C(O)=O)=C1I       DI_SUB    D1DBNGVNG-G9T7D2UU8L-GLXN2UBNFSF-GLF1YR8MBFU9
CNC(=O)C1=C([131I])C(C(O)=O)=C([131I])C(NC(C)=O)=C1[131I]  TRI_SUB   D1DBNGVNG-G9T7D2UU8L-GLXN2UBNFSF-GLFJPYP9U5UN
CNC(=O)C1=C(I)C(C(O)=O)=C(I)C(NC(C)=O)=C1I                 NO_SUB    D1DBNGVNG-G9T7D2UU8L-GLXN2UBNFSF-GLF3JS6KB6T7

tylerperyea commented 10 years ago

Very nice!