UnixJunkie / molenc

MolEnc: a molecular encoder using rdkit and OCaml.
BSD 3-Clause "New" or "Revised" License

support some form of atom scan for some ligands #29

Closed UnixJunkie closed 5 years ago

UnixJunkie commented 5 years ago

E.g. the sodium scan from Sheridan.

Sheridan, R. P. (2019). Interpretation of QSAR Models by Coloring Atoms According to Changes in Predicted Activity: How Robust Is It? Journal of Chemical Information and Modeling.

UnixJunkie commented 5 years ago

The '*' wildcard (unknown atom) symbol from the OpenSMILES specification looks like the perfect candidate for the replacement atom.

This tool should be a filter that reads one SMILES string and writes several SMILES strings, probably using only rdkit.
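A minimal sketch of such a filter, assuming RDKit is installed. The function name `atom_scan` is hypothetical and this is not necessarily how the project implements it; it only illustrates replacing one atom at a time with the '*' dummy atom (atomic number 0 in RDKit).

```python
from rdkit import Chem


def atom_scan(smiles):
    """Return one SMILES per atom, with that atom replaced by '*'.

    Hypothetical helper for illustration only.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"cannot parse SMILES: {smiles!r}")
    variants = []
    for idx in range(mol.GetNumAtoms()):
        rw = Chem.RWMol(mol)  # editable copy; the input molecule is untouched
        atom = rw.GetAtomWithIdx(idx)
        atom.SetAtomicNum(0)  # atomic number 0 is RDKit's dummy atom, written as '*'
        # Drop element-specific annotations that make no sense on a wildcard.
        atom.SetFormalCharge(0)
        atom.SetIsotope(0)
        atom.SetNumExplicitHs(0)
        atom.SetNoImplicit(True)
        rw.UpdatePropertyCache(strict=False)
        variants.append(Chem.MolToSmiles(rw))
    return variants


if __name__ == "__main__":
    for variant in atom_scan("CCO"):
        print(variant)
```

Symmetric atoms yield identical canonical SMILES, so a caller may want to deduplicate the output while keeping track of which atom indexes each variant stands for.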

UnixJunkie commented 5 years ago

There is a depth parameter. The default is 1, i.e. scan one atom at a time. If depth > 1, one atom plus all its neighbors up to the given depth are scanned together.
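One way to read the depth parameter is as a breadth-first walk over the bond graph. The sketch below is library-free and rests on an assumption: depth 1 means the atom alone, depth 2 adds its direct neighbors, and so on. The function name `scan_group` and the adjacency-dict input are hypothetical.

```python
from collections import deque


def scan_group(adjacency, start, depth=1):
    """Return the sorted atom indexes scanned together when visiting `start`.

    `adjacency` maps each atom index to the indexes of its bonded neighbors.
    Assumption: depth=1 is the atom alone; each extra unit adds one bond shell.
    """
    if depth < 1:
        raise ValueError("depth must be >= 1")
    seen = {start}
    frontier = deque([(start, 1)])
    while frontier:
        atom, level = frontier.popleft()
        if level == depth:
            continue  # do not expand beyond the requested depth
        for neighbor in adjacency[atom]:
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, level + 1))
    return sorted(seen)


# Heavy-atom chain C0-C1-C2-O3 (e.g. propan-1-ol)
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(scan_group(chain, 1, depth=1))  # [1]
print(scan_group(chain, 1, depth=2))  # [0, 1, 2]
```

Each group returned this way would then be replaced by '*' atoms in a single output SMILES.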

UnixJunkie commented 5 years ago

Note that this is an internal scan of the molecule (trying to simplify it). An external scan is also possible: looking for potential improvements to the molecule by putting functional groups at key places, e.g. a methyl scan of the molecule.
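For the external case, a methyl scan could look roughly like the following, assuming RDKit is installed. The function name `methyl_scan` is hypothetical and the sketch is not taken from the project; it attaches one methyl group to each atom that still carries a hydrogen.

```python
from rdkit import Chem


def methyl_scan(smiles):
    """Return one SMILES per hydrogen-bearing atom, with a methyl attached there.

    Hypothetical helper for illustration only.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"cannot parse SMILES: {smiles!r}")
    variants = []
    for atom in mol.GetAtoms():
        if atom.GetTotalNumHs() == 0:
            continue  # no hydrogen to substitute on this atom
        rw = Chem.RWMol(mol)
        target = rw.GetAtomWithIdx(atom.GetIdx())
        # If the hydrogen was written explicitly (e.g. [nH]), consume one.
        if target.GetNumExplicitHs() > 0:
            target.SetNumExplicitHs(target.GetNumExplicitHs() - 1)
        new_idx = rw.AddAtom(Chem.Atom(6))  # new carbon; its Hs become implicit
        rw.AddBond(atom.GetIdx(), new_idx, Chem.BondType.SINGLE)
        Chem.SanitizeMol(rw)
        variants.append(Chem.MolToSmiles(rw))
    return variants


if __name__ == "__main__":
    for variant in methyl_scan("CCO"):
        print(variant)
```

Other functional groups could be scanned the same way by swapping the single added carbon for a larger fragment.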

UnixJunkie commented 5 years ago

this would be pretty cool and useful

UnixJunkie commented 5 years ago

This needs a specific script. Also note that an rdkit script should do the one-atom-at-a-time editing of the input molecule.

UnixJunkie commented 5 years ago

Maybe one ring at a time, one linker at a time, and one side chain at a time would also be useful modes.

UnixJunkie commented 5 years ago

A simple tool to generate suitable SMILES for this is in bin/molenc_scan.py.