greglandrum / rdkit-blog

RDKit blog
https://greglandrum.github.io/rdkit-blog/

rdkit-blog/posts/2024-05-31-scaffold-splits-and-murcko-scaffolds1 #24

Open utterances-bot opened 2 months ago

utterances-bot commented 2 months ago

RDKit blog - The problem(s) with scaffold splits, part 1

I get a bit ranty… again.

https://greglandrum.github.io/rdkit-blog/posts/2024-05-31-scaffold-splits-and-murcko-scaffolds1.html

rgasper commented 2 months ago

Thanks for this analysis, Greg. Very clear and concise.

hucheng-aidd commented 2 months ago

Thanks, Greg. Scaffold splitting is indeed tricky. Some "me too" scaffolds from so-called scaffold-hopping exercises would have overlapping SAR, which makes scaffold splitting meaningless if the goal is to train AI/ML pKi models. I wonder whether the "rare scaffolds" could be used to train the model, which would then be cross-validated with data from the "common scaffolds". Happy to have a chat if you would like: hucheng.aidd@gmail.com.
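
For readers following along, the rare-versus-common idea above could be sketched roughly as follows. This is only an illustration, not something from the blog post: the function name `rare_common_split`, the `rare_max_count` threshold, and the hand-written scaffold SMILES are all hypothetical, and the scaffolds are assumed to have been computed beforehand (for example as Murcko scaffolds).

```python
from collections import Counter


def rare_common_split(scaffolds, rare_max_count=2):
    """Split compound indices by how often each compound's scaffold occurs.

    scaffolds: one precomputed scaffold identifier (e.g. a scaffold SMILES)
    per compound. Compounds whose scaffold occurs at most `rare_max_count`
    times go to the training set; all others go to the test set.
    The threshold is an arbitrary, illustrative choice.
    """
    counts = Counter(scaffolds)
    train_idx, test_idx = [], []
    for i, scaffold in enumerate(scaffolds):
        if counts[scaffold] <= rare_max_count:
            train_idx.append(i)
        else:
            test_idx.append(i)
    return train_idx, test_idx


# Hypothetical scaffold SMILES standing in for precomputed scaffolds.
scaffolds = [
    "c1ccccc1", "c1ccccc1", "c1ccccc1",  # common scaffold (3 compounds)
    "c1ccncc1",                          # rare scaffold (2 compounds in total)
    "c1ccc2ccccc2c1",                    # rare scaffold (1 compound)
    "c1ccncc1",
]
train_idx, test_idx = rare_common_split(scaffolds, rare_max_count=2)
```

Because the split is made per scaffold, no scaffold appears on both sides. It does not address the overlapping-SAR concern raised in the same comment, since distinct scaffolds can still be close analogues.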

metma99 commented 1 month ago

As usual, very helpful, Greg! Quick question, if I may: if you were to choose the smallest Murcko scaffold for each assay, how would the results change?