Predicting compound activity from phenotypic profiles and chemical structures

https://doi.org/10.1101/2020.12.15.422887

Recent advances in deep learning enable using chemical structures and phenotypic profiles to accurately predict assay results for compounds virtually, reducing the time and cost of screens in the drug-discovery process. We evaluate the relative strength of three high-throughput data sources—chemical structures, images (Cell Painting), and gene-expression profiles (L1000)—to predict compound activity using a sparse historical collection of 16,170 compounds tested in 270 assays for a total of 585,439 readouts. All three data modalities can predict compound activity with high accuracy in 6-10% of assays tested; replacing million-compound physical screens with computationally prioritized smaller screens throughout the pharmaceutical industry could yield major savings. Furthermore, the three profiling modalities are complementary, and in combination they can predict 21% of assays with high accuracy, and 64% if lower accuracy is acceptable. Our study shows that, for many assays, predicting compound activity from phenotypic profiles and chemical structures might accelerate the early stages of the drug-discovery process.

gitter-lab / pharmaco-image

Predicting compound activity from phenotypic profiles and chemical structures #19