moldescriptor / molecule-descriptors-webtool

A simple, user-friendly website where users can input molecules (as strings) and use RDKit, an open-source Python package, to calculate various descriptors.
Apache License 2.0

Tanimoto-similarity function #3

Open kjemist opened 2 days ago

kjemist commented 2 days ago

What is this function?

Add a function to the website that compares one set of molecules with another set and returns pairwise Tanimoto similarity scores (from 0.0, no similarity, to 1.0, identical).

The function is rdkit.DataStructs.cDataStructs.TanimotoSimilarity() (https://www.rdkit.org/docs/source/rdkit.DataStructs.cDataStructs.html). Useful tutorial: https://www.youtube.com/watch?v=3qzZbaUzo9M
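For anyone unfamiliar with the score itself, here is a minimal pure-Python sketch of the Tanimoto coefficient: the number of shared "on" bits divided by the number of bits set in either fingerprint. The function name `tanimoto` and the use of plain sets of bit indices are assumptions for illustration only; this is not RDKit's implementation, and returning 0.0 for two empty sets is a convention picked here.

```python
def tanimoto(bits_a: set[int], bits_b: set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(bits_a | bits_b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.0.
        return 0.0
    # Shared on-bits divided by on-bits present in either fingerprint.
    return len(bits_a & bits_b) / union


# Two fingerprints sharing bits 2 and 3 out of four distinct bits in total.
print(tanimoto({1, 2, 3}, {2, 3, 4}))  # 0.5
```

A score of 1.0 therefore means the two fingerprints have exactly the same bits set, and 0.0 means they share none.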

What should it look like?

Two text boxes: one for the "template" SMILES structures, and another for the set that will be compared with the template structures. The output should be one .csv file for each template molecule that is compared.

E.g. the template set is composed of molecules A and B. The comparison set is molecules X, Y and Z.

The output is two CSV files:

If molecule A (CCN) is compared with molecules X (CCC) and Y (NCN), the corresponding CSV file will look like:

Pair Tanimoto-score
CCN-CCC 0.8
CCN-NCN 0.4
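
A rough sketch of how the output above could be produced, assuming one CSV per template molecule. The names `rdkit_score` and `template_csv` are hypothetical, and the choice of RDKit's default topological fingerprint (`Chem.RDKFingerprint`) is an assumption, since the issue does not say which fingerprint to use. The scoring function is passed in as a parameter so the CSV part can be tried without RDKit installed.

```python
import csv
import io
from typing import Callable, Iterable


def rdkit_score(smiles_a: str, smiles_b: str) -> float:
    """Tanimoto similarity of two SMILES strings via RDKit fingerprints."""
    # Imported here so the CSV helper below also works without RDKit installed.
    from rdkit import Chem, DataStructs

    mol_a = Chem.MolFromSmiles(smiles_a)
    mol_b = Chem.MolFromSmiles(smiles_b)
    if mol_a is None or mol_b is None:
        raise ValueError(f"Could not parse SMILES: {smiles_a!r} / {smiles_b!r}")
    # Assumption: default RDKit topological fingerprint; the issue names none.
    fp_a = Chem.RDKFingerprint(mol_a)
    fp_b = Chem.RDKFingerprint(mol_b)
    return DataStructs.TanimotoSimilarity(fp_a, fp_b)


def template_csv(template: str, comparison_set: Iterable[str],
                 score_fn: Callable[[str, str], float]) -> str:
    """Return CSV text with one row per (template, comparison) pair."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Pair", "Tanimoto-score"])
    for smiles in comparison_set:
        score = round(score_fn(template, smiles), 3)
        writer.writerow([f"{template}-{smiles}", score])
    return buffer.getvalue()
```

With RDKit available, calling `template_csv("CCN", ["CCC", "NCN"], rdkit_score)` once per template molecule and saving each returned string to its own file would give one CSV per template. The actual scores depend on the fingerprint chosen, so they will not necessarily match the illustrative 0.8 and 0.4 above.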
kjemist commented 2 days ago

See this repository for a working Tanimoto script:

https://github.com/kjemist/TanimotoSimilarity