Open leeharland opened 10 years ago
I confirm the issue. It has been submitted to the GGA forum: https://groups.google.com/forum/#!topic/indigo-bugs/3QcPVrTSKMw
Just adding this comment so that it's easier for me (@StefanSenger) to watch this issue.
Same, no answer from GGA
@valt
From Stefan Senger: The following two URIs relate to tautomers of Sildenafil: http://ops.rsc.org/OPS1213082 http://ops.rsc.org/OPS1794066
If I take the SMILES string CCCC1=NN(C2=C1N=C(NC2=O)C3=C(C=CC(=C3)S(=O)(=O)N4CCN(CC4)C)OCC)C for OPS1213082 and perform a Tanimoto similarity search, the other tautomer (http://ops.rsc.org/OPS1794066) appears among the hits only when I use a threshold below 0.9, because its relevance is 0.89. Since OPS1794066 is a tautomer of the query, I would expect it to have a relevance of 1 (or at least very close to 1). When we calculated the Tanimoto similarity with ChemAxon, the similarity index was indeed 1.0. Chemists would definitely find such a low relevance for tautomers counterintuitive.
Is there anything that can be done so that Indigo produces a similarity index for tautomers that is at least closer to what one would expect (ideally 1.0)?
Is there some documentation that explains how the fingerprints are calculated?
Just to note: I haven't tried other tautomers. I am only assuming that the behaviour would be similar.
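
For anyone following along, here is a minimal sketch of how a Tanimoto coefficient on bit fingerprints drops below 1 as soon as two structures set different bits. It does not use Indigo's actual fingerprints, which I have not inspected; the bit sets `query` and `tautomer` are hypothetical and the counts are chosen only so that the result lands near the 0.89 reported above.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(bits_a | bits_b)
    if union == 0:
        # Convention assumed here: two empty fingerprints count as identical.
        return 1.0
    return len(bits_a & bits_b) / union


# Hypothetical fingerprints (NOT real Indigo output): 100 on-bits each,
# 94 shared, 6 unique to each structure, e.g. from fragments around the
# atoms whose bonding pattern differs between the two tautomers.
query = set(range(100))
tautomer = set(range(6, 100)) | set(range(100, 106))

print(round(tanimoto(query, query), 2))     # identical fingerprints
print(round(tanimoto(query, tautomer), 2))  # 94 shared / 106 in the union
```

If the fingerprints are built from the drawn structure without any tautomer normalisation, a handful of differing bits is enough to push the score under a 0.9 threshold. A score of 1.0 would then require either canonicalising tautomers before fingerprinting or using a tautomer-insensitive fingerprint, which may be what the ChemAxon calculation effectively did (an assumption on my part).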